!CHAPTER Example Setup

Setting up a working master-slave replication requires two ArangoDB instances:

* **master**: this is the instance that all data-modification operations should be directed to
* **slave**: on this instance, we'll start a replication applier, and this will fetch data from the master database's write-ahead log and apply its operations locally

For the following example setup, we'll use the instance *tcp://master.domain.org:8529* as the master, and the instance *tcp://slave.domain.org:8530* as a slave.

The goal is to have all data from the database *_system* on master *tcp://master.domain.org:8529* be replicated to the database *_system* on the slave *tcp://slave.domain.org:8530*.

On the **master**, nothing special needs to be done, as all write operations will automatically be logged in the master's write-ahead log.

To start replication on the **slave**, make sure there currently is no replication applier running in the slave's *_system* database:

```js
db._useDatabase("_system");
require("@arangodb/replication").applier.stop();
```

The *stop* operation will terminate any replication activity in the *_system* database on the slave.

!SECTION Initial synchronization

After that, we perform an initial sync of the slave with data from the master. To do this, execute the following commands on the slave:

```js
db._useDatabase("_system");
require("@arangodb/replication").sync({
  endpoint: "tcp://master.domain.org:8529",
  username: "myuser",
  password: "mypasswd",
  verbose: false
});
```

Username and password only need to be specified when the master server requires authentication. To check what the synchronization is currently doing, set the *verbose* option to *true*. If set, the synchronization will create log messages with the current synchronization status.

**Warning**: The sync command will replace data in the slave database with data from the master database!
Only execute these commands if you have verified you are on the correct server, in the correct database!

The sync operation will return an attribute named *lastLogTick* which we'll need to note. The last log tick will be used as the starting point for any subsequent replication activity. Let's assume we got the following last log tick:

```js
{
  "lastLogTick" : "40694126",
  ...
}
```

!SECTION Continuous synchronization

Now, we could start the replication applier in the slave database using the last log tick. However, there is one thing to consider: replication on the slave will only run until the slave gets shut down. When the slave server gets restarted, replication will be turned off again. To change this, we first need to configure the slave's replication applier and set its *autoStart* attribute.

Here's the command to configure the replication applier with several options:

```js
db._useDatabase("_system");
require("@arangodb/replication").applier.properties({
  endpoint: "tcp://master.domain.org:8529",
  username: "myuser",
  password: "mypasswd",
  autoStart: true,
  autoResync: true,
  autoResyncRetries: 2,
  adaptivePolling: true,
  includeSystem: false,
  requireFromPresent: false,
  idleMinWaitTime: 0.5,
  idleMaxWaitTime: 1.5,
  verbose: false
});
```

An important consideration for replication is whether data from system collections (such as *_graphs* or *_users*) should be applied. The *includeSystem* option controls that. If set to *true*, changes in system collections will be replicated. Otherwise, they will not be replicated. It is often not necessary to replicate data from system collections, especially because doing so may lead to confusion on the slave: the slave needs to have its own system collections in order to start and stay operational.

The *requireFromPresent* attribute controls whether the applier will start synchronizing in case it detects that the master cannot provide data for the initial tick value provided by the slave.
This may be the case if the master does not have a big enough backlog of historic WAL logfiles, and when the replication is restarted after a longer pause. When *requireFromPresent* is set to *true*, the replication applier will check at start whether the start tick from which it starts or resumes replication is still present on the master. If it is not, there would be data loss. If *requireFromPresent* is *true*, the replication applier will abort with an appropriate error message. If set to *false*, the replication applier will still start, and ignore the data loss.

The *autoResync* option can be used in conjunction with the *requireFromPresent* option as follows: when both *requireFromPresent* and *autoResync* are set to *true* and the master cannot provide the log data the slave had requested, the replication applier will stop as usual. But because *autoResync* is set to *true*, the slave will automatically trigger a full resync of all data with the master. After that, the replication applier will go into continuous replication mode again. Additionally, setting *autoResync* to *true* will trigger a full re-synchronization of data when the continuous replication is started and detects that there is no start tick value.

Note that automatic re-synchronization (*autoResync* option set to *true*) may transfer a lot of data from the master to the slave and can therefore be expensive. Still, it's turned on here so there's less need for manual intervention.

The *autoResyncRetries* option can be used to control the number of resynchronization retries that will be performed in a row when automatic resynchronization is enabled and kicks in. Setting this to *0* will effectively disable *autoResync*. Setting it to some other value will limit the number of retries that are performed. This helps prevent endless retries in case resynchronizations always fail.

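
The interplay of *requireFromPresent*, *autoResync* and *autoResyncRetries* can be easier to follow when written down as a small decision function. The following is only a toy model of the behavior described above, not ArangoDB's implementation: the function name `describeApplierStart`, the `masterHasStartTick` flag and the returned labels are made up for illustration. It covers the case of a start tick that the master can no longer serve, and leaves out the separate case where no start tick value exists at all.

```js
// Toy model of the applier's start-up decision as described in the text.
// NOT ArangoDB code: function name, parameters and labels are hypothetical.
function describeApplierStart(options, masterHasStartTick) {
  // The master can still serve the requested tick: replication resumes.
  if (masterHasStartTick) {
    return "resume";
  }
  // The tick is gone, but the applier was not told to insist on it.
  if (options.requireFromPresent !== true) {
    return "resume-ignoring-data-loss";
  }
  // The applier stops. A full resync follows only if autoResync is on
  // and autoResyncRetries is not 0 (0 effectively disables autoResync).
  var resyncEnabled = options.autoResync === true &&
                      options.autoResyncRetries !== 0;
  return resyncEnabled ? "stop-then-full-resync" : "abort-with-error";
}

// With the settings from the example configuration, a missing tick is ignored:
console.log(describeApplierStart(
  { requireFromPresent: false, autoResync: true, autoResyncRetries: 2 },
  false
));
```

Setting *requireFromPresent* to *true* in the same call would instead select the resync branch, which is the combination to consider when silent data loss on the slave is not acceptable.
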
Now it's time to start the replication applier on the slave using the last log tick we got before:

```js
db._useDatabase("_system");
require("@arangodb/replication").applier.start("40694126");
```

This will replicate all operations happening in the master's system database and apply them on the slave, too.

After that, you should be able to monitor the state and progress of the replication applier by executing the *state* command on the slave server:

```js
db._useDatabase("_system");
require("@arangodb/replication").applier.state();
```

Please note that stopping the replication applier on the slave using the *stop* command should be avoided. The reason is that currently ongoing transactions (that have partly been replicated to the slave) will need to be restarted after a restart of the replication applier. Stopping and restarting the replication applier on the slave should thus only be performed if there is certainty that the master is currently fully idle and all transactions have been replicated fully.

Note that while a slave has only partly executed a transaction from the master, it might keep a write lock on the collections involved in the transaction.

You may also want to check the master and slave states via the HTTP APIs (see [HTTP Interface for Replication](../HttpReplications/README.md)).

!SECTION Initial synchronization from the ArangoShell

The *sync* operation may take a long time to complete. If it's called from the ArangoShell, the connection may time out, which will effectively discard the result of the *sync* operation. Therefore, in the ArangoShell, the optional *async* attribute can be used to start the synchronization as a background process on the slave. If the *async* attribute is set to *true*, the call to *sync* will return almost instantly with an id string.
Using this id string, the status of the sync job on the slave can be queried using the *getSyncResult* function as follows:

```js
db._useDatabase("_system");
var replication = require("@arangodb/replication");

var id = replication.sync({
  endpoint: "tcp://master.domain.org:8529",
  username: "myuser",
  password: "mypasswd",
  async: true
});

print(replication.getSyncResult(id));
```

*getSyncResult* will return *false* as long as the synchronization is not complete, and return the synchronization result otherwise.
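
Since *getSyncResult* returns *false* until the job is done, a script will usually call it repeatedly. The following is a minimal sketch of such a polling loop, not part of the replication module: the helper name `waitForSyncResult`, its parameters and the one-second pause are assumptions made for illustration. The pause function is supplied by the caller; in the ArangoShell, `require("internal").wait` is assumed to be a suitable choice.

```js
// Poll an asynchronous sync job until it finishes or the attempts run out.
// Hypothetical helper, not part of "@arangodb/replication".
//   replication: object exposing getSyncResult(id), as described above
//   id:          the id string returned by sync() with async: true
//   waitFn:      caller-supplied function that pauses for N seconds
//   maxAttempts: upper bound on the number of status checks
function waitForSyncResult(replication, id, waitFn, maxAttempts) {
  for (var attempt = 0; attempt < maxAttempts; attempt++) {
    var result = replication.getSyncResult(id);
    if (result !== false) {
      // The job has finished; hand back whatever the sync produced.
      return result;
    }
    waitFn(1); // pause one second before asking again
  }
  return null; // still running after maxAttempts checks
}

// Possible use in the ArangoShell (assumes require("internal").wait exists):
//
//   var replication = require("@arangodb/replication");
//   var result = waitForSyncResult(replication, id, require("internal").wait, 600);
//   if (result !== null) {
//     replication.applier.start(result.lastLogTick);
//   }
```

Passing the module and the pause function in as arguments keeps the loop itself free of shell-specific calls. A `null` return only means the attempt limit was reached; the sync job may still be running on the slave.
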