mirror of https://gitee.com/bigwinds/arangodb
175 lines
7.6 KiB
Plaintext
175 lines
7.6 KiB
Plaintext
!CHAPTER Example Setup
|
|
|
|
Setting up a working master-slave replication requires two ArangoDB instances:
|
|
* **master**: this is the instance that all data-modification operations should be directed to
|
|
* **slave**: on this instance, we'll start a replication applier, and this will fetch data from the
|
|
master database's write-ahead log and apply its operations locally
|
|
|
|
For the following example setup, we'll use the instance *tcp://master.domain.org:8529* as the
|
|
master, and the instance *tcp://slave.domain.org:8530* as a slave.
|
|
|
|
The goal is to have all data from the database *_system* on master *tcp://master.domain.org:8529*
|
|
be replicated to the database *_system* on the slave *tcp://slave.domain.org:8530*.
|
|
|
|
On the **master**, nothing special needs to be done, as all write operations will automatically be
|
|
logged in the master's write-ahead log.
|
|
|
|
To start replication on the **slave**, make sure there currently is no replication applier running
|
|
in the slave's *_system* database:
|
|
|
|
```js
|
|
db._useDatabase("_system");
|
|
require("org/arangodb/replication").applier.stop();
|
|
```
|
|
|
|
The *stop* operation will terminate any replication activity in the _system database on the slave.
|
|
|
|
|
|
!SECTION Initial synchronization
|
|
|
|
After that, we perform an initial sync of the slave with data from the master. To do this,
|
|
execute the following commands on the slave:
|
|
|
|
```js
|
|
db._useDatabase("_system");
|
|
require("org/arangodb/replication").sync({
|
|
endpoint: "tcp://master.domain.org:8529",
|
|
username: "myuser",
|
|
password: "mypasswd",
|
|
verbose: false
|
|
});
|
|
```
|
|
|
|
Username and password only need to be specified when the master server requires authentication.
|
|
To check what the synchronization is currently doing, supply set the *verbose* option to *true*.
|
|
If set, the synchronization will create log messages with the current synchronization status.
|
|
|
|
**Warning**: The sync command will replace data in the slave database with data from the master
|
|
database! Only execute these commands if you have verified you are on the correct server, in the
|
|
correct database!
|
|
|
|
The sync operation will return an attribute named *lastLogTick* which we'll need to note. The
|
|
last log tick will be used as the starting point for any subsequent replication activity. Let's
|
|
assume we got the following last log tick:
|
|
|
|
```js
|
|
{
|
|
"lastLogTick" : "40694126",
|
|
...
|
|
}
|
|
```
|
|
|
|
!SECTION Continuous synchronization
|
|
|
|
Now, we could start the replication applier in the slave database using the last log tick.
|
|
However, there is one thing to consider: replication on the slave will be running until the
|
|
slave gets shut down. When the slave server gets restarted, replication will be turned off again.
|
|
To change this, we first need to configure the slave's replication applier and set its
|
|
*autoStart* attribute.
|
|
|
|
Here's the command to configure the replication applier with several options:
|
|
|
|
```js
|
|
db._useDatabase("_system");
|
|
require("org/arangodb/replication").applier.properties({
|
|
endpoint: "tcp://master.domain.org:8529",
|
|
username: "myuser",
|
|
password: "mypasswd",
|
|
autoStart: true,
|
|
autoResync: true,
|
|
adaptivePolling: true,
|
|
includeSystem: false,
|
|
requireFromPresent: false,
|
|
idleMinWaitTime: 0.5,
|
|
idleMaxWaitTime: 1.5,
|
|
verbose: false
|
|
});
|
|
```
|
|
|
|
An important consideration for replication is whether data from system collections (such as
|
|
*_graphs* or *_users*) should be applied. The *includeSystem* option controls that. If set to
|
|
*true*, changes in system collections will be replicated. Otherwise, they will not be replicated.
|
|
It is often not necessary to replicate data from system collections, especially because it may
|
|
lead to confusion on the slave because the slave needs to have its own system collections in
|
|
order to start and keep operational.
|
|
|
|
The *requireFromPresent* attribute controls whether the applier will start synchronizing in case
|
|
it detects that the master cannot provide data for the initial tick value provided by the slave.
|
|
This may be the case if the master does not have a big enough backlog of historic WAL logfiles,
|
|
and when the replication is re-started after a longer pause. When *requireFromPresent* is set to
|
|
*true*, then the replication applier will check at start whether the start tick from which it starts
|
|
or resumes replication is still present on the master. If not, then there would be data loss. If
|
|
*requireFromPresent* is *true*, the replication applier will abort with an appropriate error message.
|
|
If set to *false*, then the replication applier will still start, and ignore the data loss.
|
|
|
|
The *autoResync* option can be used in conjunction with the *requireFromPresent* option as follows:
|
|
when both *requireFromPresent* and *autoResync* are set to *true* and the master cannot provide the
|
|
log data the slave had requested, the replication applier will stop as usual. But due to the fact
|
|
that *autoResync* is set to true, the slave will automatically trigger a full resync of all data with
|
|
the master. After that, the replication applier will go into continuous replication mode again.
|
|
Additionally, setting *autoResync* to *true* will trigger a full re-synchronization of data when
|
|
the continuous replication is started and detects that there is no start tick value.
|
|
|
|
Note that automatic re-synchronization (*autoResync* option set to *true*) may transfer a lot of
|
|
data from the master to the slave and can therefore be expensive. Still it's turned on here so
|
|
there's less need for manual intervention.
|
|
|
|
Now it's time to start the replication applier on the slave using the last log tick we got
|
|
before:
|
|
|
|
```js
|
|
db._useDatabase("_system");
|
|
require("org/arangodb/replication").applier.start("40694126");
|
|
```
|
|
|
|
This will replicate all operations happening in the master's system database and apply them
|
|
on the slave, too.
|
|
|
|
After that, you should be able to monitor the state and progress of the replication
|
|
applier by executing the *state* command on the slave server:
|
|
|
|
```js
|
|
db._useDatabase("_system");
|
|
require("org/arangodb/replication").applier.state();
|
|
```
|
|
|
|
Please note that stopping the replication applier on the slave using the *stop* command
|
|
should be avoided. The reason is that currently ongoing transactions (that have partly been
|
|
replicated to the slave) will be need to be restarted after a restart of the replication
|
|
applier. Stopping and restarting the replication applier on the slave should thus only be
|
|
performed if there is certainty that the master is currently fully idle and all transactions
|
|
have been replicated fully.
|
|
|
|
Note that while a slave has only partly executed a transaction from the master, it might keep
|
|
a write lock on the collections involved in the transaction.
|
|
|
|
You may also want to check the master and slave states via the HTTP APIs
|
|
(see [HTTP Interface for Replication](../HttpReplications/README.md)).
|
|
|
|
|
|
!SECTION Initial synchronization from the ArangoShell
|
|
|
|
The *sync* may take a long time to complete. If it's called from the ArangoShell, the connection
|
|
may time out, which will effectively discard the result of the *sync* operation. Therefore in the
|
|
ArangoShell, the optional *async* attribute can be used to start the synchronization as a background
|
|
process on the slave. If the *async* attribute is set to *true*, the call to *sync* will return
|
|
almost instantly with an id string. Using this id string, the status of the sync job on the slave
|
|
can be queried using the *getSyncResult* function as follows:
|
|
|
|
```js
|
|
db._useDatabase("_system");
|
|
var replication = require("org/arangodb/replication");
|
|
var id = replication.sync({
|
|
endpoint: "tcp://master.domain.org:8529",
|
|
username: "myuser",
|
|
password: "mypasswd",
|
|
async: true
|
|
});
|
|
|
|
print(replication.getSyncResult(id));
|
|
```
|
|
|
|
*getSyncResult* will return *false* as long as the synchronization is not complete, and return the
|
|
synchronization result otherwise.
|
|
|