From 89cf804dbe7fb5683985619a86492ec408010579 Mon Sep 17 00:00:00 2001 From: Jan Steemann Date: Mon, 29 Jul 2013 13:52:57 +0200 Subject: [PATCH 1/3] fixed documentation errors --- Documentation/UserManual/FoxxManager.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/UserManual/FoxxManager.md b/Documentation/UserManual/FoxxManager.md index d862d0cbdb..6be67545d8 100644 --- a/Documentation/UserManual/FoxxManager.md +++ b/Documentation/UserManual/FoxxManager.md @@ -64,9 +64,9 @@ There is currently one application installed. It is called "aardvark" and it is a system application. You can safely ignore system applications. We are now going to install the hello world application. It is called -"hello-world" - no suprise there. +"hello-foxx" - no surprise there. - unix> foxx-manager install hallo-world /example + unix> foxx-manager install hello-foxx /example Application app:hello-foxx:1.2.2 installed successfully at mount point /example The second parameter `/example` is the mount path of the application. You should now @@ -87,7 +87,7 @@ command. You can install the application again under different mount path. - unix> foxx-manager install hallo-world /hello + unix> foxx-manager install hello-foxx /hello Application app:hello-foxx:1.2.2 installed successfully at mount point /hello You now have to separated instances of the same application. 
They are completely From 7276c27fb6720c0345a57f7386e9ad8972cf9724 Mon Sep 17 00:00:00 2001 From: Jan Steemann Date: Mon, 29 Jul 2013 14:09:53 +0200 Subject: [PATCH 2/3] updated documentation --- .../ImplementorManual/HttpReplication.md | 57 ++++++++++++++++--- .../ImplementorManual/HttpReplicationTOC.md | 7 ++- Documentation/RefManual/Replication.md | 17 +----- Documentation/RefManual/ReplicationTOC.md | 8 +-- 4 files changed, 58 insertions(+), 31 deletions(-) diff --git a/Documentation/ImplementorManual/HttpReplication.md b/Documentation/ImplementorManual/HttpReplication.md index a2ec57ce67..ffc28a8f12 100644 --- a/Documentation/ImplementorManual/HttpReplication.md +++ b/Documentation/ImplementorManual/HttpReplication.md @@ -7,7 +7,44 @@ HTTP Interface for Replication {#HttpReplication} Replication {#HttpReplicationIntro} =================================== -This is an introduction to ArangoDB's Http replication interface. +This is an introduction to ArangoDB's HTTP replication interface. + +The HTTP replication interface serves four main purposes: +- fetch initial data from a server (e.g. for an initial synchronisation of data, or backups) +- administer the replication logger (starting, stopping, querying state) +- fetch the changelog from a server (used for incremental synchronisation of changes) +- administer the replication applier (starting, stopping, configuring, querying state) + +Replication Dump Commands {#HttpReplicationDumpCommands} +-------------------------------------------------------- + +The `inventory` method can be used to query an ArangoDB server's current +set of collections plus their indexes. Clients can use this method to get an +overview of which collections are present on the server. They can use this information +to either start a full or a partial synchronisation of data, e.g. to initiate a backup +or the incremental data synchronisation. 
+ +@anchor HttpReplicationInventory +@copydetails triagens::arango::RestReplicationHandler::handleCommandInventory + +The `dump` method can be used to fetch data from a specific collection. As the +results of the dump command can be huge, it may not return all data from a collection +at once. Instead, the dump command may be called repeatedly by replication clients +until there is no more data to fetch. The dump command will not only return the +current documents in the collection, but also document updates and deletions. + +To get to an identical state of data, replication clients should apply the individual +parts of the dump results in the same order as they are served to them. + +@anchor HttpReplicationDump +@copydetails triagens::arango::RestReplicationHandler::handleCommandDump + + +Replication Logger Commands {#HttpReplicationLoggerCommands} +------------------------------------------------------------ + +The logger commands allow starting, stopping, and fetching the current state of +the replication logger. @anchor HttpReplicationLoggerStart @copydetails triagens::arango::RestReplicationHandler::handleCommandLoggerStart @@ -20,19 +57,21 @@ This is an introduction to ArangoDB's Http replication interface. @anchor HttpReplicationLoggerState @copydetails triagens::arango::RestReplicationHandler::handleCommandLoggerState -@CLEARPAGE +To query the latest changes logged by the replication logger, the HTTP interface +also provides the `logger-follow` method. + +This method should be used by replication clients to incrementally fetch updates +from an ArangoDB instance. 
+ @anchor HttpReplicationLoggerFollow @copydetails triagens::arango::RestReplicationHandler::handleCommandLoggerFollow -@CLEARPAGE -@anchor HttpReplicationInventory -@copydetails triagens::arango::RestReplicationHandler::handleCommandInventory +Replication Applier Commands {#HttpReplicationApplierCommands} +-------------------------------------------------------------- -@CLEARPAGE -@anchor HttpReplicationDump -@copydetails triagens::arango::RestReplicationHandler::handleCommandDump +The applier commands allow remotely starting, stopping, and querying the state and +configuration of an ArangoDB server's replication applier. -@CLEARPAGE @anchor HttpReplicationApplierGetConfig @copydetails triagens::arango::RestReplicationHandler::handleCommandApplierGetConfig diff --git a/Documentation/ImplementorManual/HttpReplicationTOC.md b/Documentation/ImplementorManual/HttpReplicationTOC.md index 509e08ed2c..4174ac9793 100644 --- a/Documentation/ImplementorManual/HttpReplicationTOC.md +++ b/Documentation/ImplementorManual/HttpReplicationTOC.md @@ -3,12 +3,15 @@ TOC {#HttpReplicationTOC} - @ref HttpReplication - @ref HttpReplicationIntro + - @ref HttpReplicationDumpCommands + - @ref HttpReplicationInventory "GET /_api/replication/inventory" + - @ref HttpReplicationDump "GET /_api/replication/dump" + - @ref HttpReplicationLoggerCommands - @ref HttpReplicationLoggerStart "PUT /_api/replication/logger-start" - @ref HttpReplicationLoggerStop "PUT /_api/replication/logger-stop" - @ref HttpReplicationLoggerState "GET /_api/replication/logger-state" - @ref HttpReplicationLoggerFollow "GET /_api/replication/logger-follow" - - @ref HttpReplicationInventory "GET /_api/replication/inventory" - - @ref HttpReplicationDump "GET /_api/replication/dump" + - @ref HttpReplicationApplierCommands - @ref HttpReplicationApplierGetConfig "GET /_api/replication/applier-config" - @ref HttpReplicationApplierSetConfig "PUT /_api/replication/applier-config" - @ref HttpReplicationApplierStart "PUT 
/_api/replication/applier-start" diff --git a/Documentation/RefManual/Replication.md b/Documentation/RefManual/Replication.md index 8508116f8c..6d8388f803 100644 --- a/Documentation/RefManual/Replication.md +++ b/Documentation/RefManual/Replication.md @@ -7,18 +7,11 @@ Replication Events{#RefManualReplication} The replication logger in ArangoDB will log all events into the `_replication` system collection. It will only log events when the logger is enabled. -Continuous Replication Log{#RefManualReplicationContinuous} -=========================================================== - -Replication log events are made available to replication clients via the API at -`/_api/replication/logger-follow`. This API can be called by clients to fetch -replication log events repeatedly. - The following sections describe in detail the structure of the log events returned by this API. Replication Event Types{#RefManualReplicationEventTypes} --------------------------------------------------------- +======================================================== The following replication event types will be logged by ArangoDB 1.4: @@ -53,7 +46,7 @@ value is a sequence number and is used by the replication applier to determine whether a replication event was already processed. Examples{#RefManualReplicationExamples} ---------------------------------------- +======================================= - 1000: the replication logger was stopped: @@ -440,9 +433,3 @@ event that is neither a ocument/edge operation nor a `transaction commit` event) should abort the ongoing transaction and discard all buffered operations. It can then consider the current transaction as failed. -Collections{#RefManualReplicationCollections} ---------------------------------------------- - -The replication logger will only log events that affect user-defined collections. 
Any -events for system collections (collections with names that start with an underscore) are -not logged by the replication logger, and thus cannot be fetched from the continuous log. diff --git a/Documentation/RefManual/ReplicationTOC.md b/Documentation/RefManual/ReplicationTOC.md index acdccc8c66..1f0895d312 100644 --- a/Documentation/RefManual/ReplicationTOC.md +++ b/Documentation/RefManual/ReplicationTOC.md @@ -2,8 +2,6 @@ TOC {#RefManualReplicationTOC} ==================================== - @ref RefManualReplication - - @ref RefManualReplicationContinuous - - @ref RefManualReplicationEventTypes - - @ref RefManualReplicationExamples - - @ref RefManualReplicationTransactions - - @ref RefManualReplicationCollections + - @ref RefManualReplicationEventTypes + - @ref RefManualReplicationExamples + - @ref RefManualReplicationTransactions From 6cbf835adf65fcbfa778ac21420c9279322ed0a6 Mon Sep 17 00:00:00 2001 From: Jan Steemann Date: Mon, 29 Jul 2013 15:21:35 +0200 Subject: [PATCH 3/3] updated manual --- Documentation/Makefile.files | 1 + Documentation/UserManual/Transactions.md | 12 -- Documentation/UserManual/UserManual.md | 1 + .../UserManual/UserManualReplication.md | 129 ++++++++++++++++++ .../UserManual/UserManualReplicationTOC.md | 12 ++ 5 files changed, 143 insertions(+), 12 deletions(-) create mode 100644 Documentation/UserManual/UserManualReplication.md create mode 100644 Documentation/UserManual/UserManualReplicationTOC.md diff --git a/Documentation/Makefile.files b/Documentation/Makefile.files index 96cfa1164b..b0d1db3dfb 100644 --- a/Documentation/Makefile.files +++ b/Documentation/Makefile.files @@ -152,6 +152,7 @@ WIKI = \ UserManualArangosh \ UserManualFoxx \ UserManualFoxxManager \ + UserManualReplication \ UserManualWebInterface \ jsUnity diff --git a/Documentation/UserManual/Transactions.md b/Documentation/UserManual/Transactions.md index 69255fce7f..ad673f06aa 100644 --- a/Documentation/UserManual/Transactions.md +++ 
b/Documentation/UserManual/Transactions.md @@ -4,7 +4,6 @@ Transactions {#Transactions} @NAVIGATE_Transactions @EMBEDTOC{TransactionsTOC} - Introduction {#TransactionsIntroduction} ======================================== @@ -25,7 +24,6 @@ These *ACID* properties provide the following guarantees: transaction durability is configurable in ArangoDB, as is the durability on collection level. - Transaction invocation {#TransactionsInvocation} ================================================ @@ -54,7 +52,6 @@ data retrieval and/or modification operations, and at the end automatically commit the transaction. If an error occurs during transaction execution, the transaction is automatically aborted, and all changes are rolled back. - Declaration of collections ========================== @@ -104,7 +101,6 @@ Even without specifying them, it is still possible to read from such collections from within a transaction, but with relaxed isolation. Please refer to @ref TransactionsLocking for more details. - Declaration of data modification and retrieval operations ========================================================= @@ -189,7 +185,6 @@ case, the user can return any legal Javascript value from the function: } }); - Examples ======== @@ -303,7 +298,6 @@ start. 
The following example using a cap constraint should illustrate that: /* we now have these keys back: [ "key2", "key3", "key4" ] */ - Cross-collection transactions ============================= @@ -359,7 +353,6 @@ transaction abort and roll back all changes in all collections: db.c1.count(); /* 0 */ db.c2.count(); /* 0 */ - Passing parameters to transactions {#TransactionsParameters} ============================================================ @@ -391,7 +384,6 @@ Some example that uses collections: } }); - Disallowed operations {#TransactionsDisallowedOperations} ========================================================= @@ -403,7 +395,6 @@ If an attempt is made to carry out any of these operations during a transaction, ArangoDB will abort the transaction with error code `1653 (disallowed operation inside transaction)`. - Locking and isolation {#TransactionsLocking} ============================================ @@ -474,7 +465,6 @@ transaction. The total lock wait time may thus be much higher than the value of To avoid both deadlocks and non-repeatable reads, all collections used in a transaction should always be specified if known in advance. - Durability {#TransactionsDurability} ==================================== @@ -549,7 +539,6 @@ synchronisation for multi-collection transactions in ArangoDB. The disk sync speed of the system will thus be the most important factor for the performance of multi-collection transactions. - Limitations {#TransactionsLimitations} ====================================== @@ -588,4 +577,3 @@ It is legal to not declare read-only collections, but this should be avoided if possible to reduce the probability of deadlocks and non-repeatable reads. Please refer to @ref TransactionsLocking for more details. 
- diff --git a/Documentation/UserManual/UserManual.md b/Documentation/UserManual/UserManual.md index 9f54b252ea..d3372476a1 100644 --- a/Documentation/UserManual/UserManual.md +++ b/Documentation/UserManual/UserManual.md @@ -17,6 +17,7 @@ ArangoDB's User Manual (@VERSION) {#UserManual} @CHAPTER_REF{UserManualFoxxManager} @CHAPTER_REF{UserManualFoxx} @CHAPTER_REF{UserManualActions} +@CHAPTER_REF{UserManualReplication} @CHAPTER_REF{Transactions} @CHAPTER_REF{CommandLine} @CHAPTER_REF{Glossary} diff --git a/Documentation/UserManual/UserManualReplication.md b/Documentation/UserManual/UserManualReplication.md new file mode 100644 index 0000000000..2a202b7dfc --- /dev/null +++ b/Documentation/UserManual/UserManualReplication.md @@ -0,0 +1,129 @@ +Replication {#UserManualReplication} +==================================== + +@NAVIGATE_UserManualReplication +@EMBEDTOC{UserManualReplicationTOC} + +Introduction {#UserManualReplicationIntro} +========================================== + +Starting with ArangoDB 1.4, ArangoDB comes with an optional master-slave replication. + +The replication is asynchronous and eventually consistent, meaning that slaves will +*pull* changes from the master and apply them locally. Data on a slave may be +behind the state of data on the master until the slave has fetched and applied all +changes. + +Transactions are honored in replication, i.e. changes by a replicated transaction will +become visible on the slave atomically. + +It is possible to connect multiple slaves to the same master. Slaves should be used as +read-only instances though, because otherwise conflicts may occur that cannot be solved +automatically in ArangoDB 1.4. +This is also the reason why master-master replication is not supported. + +Components {#UserManualReplicationComponents} +============================================= + +ArangoDB's replication consists of two main components, which can be used together or +separately: the *replication logger* and the *replication applier*. 
+ +Using both components on two ArangoDB servers provides master-slave replication between +the two, but there are also additional use cases. + +Replication Logger {#UserManualReplicationLogger} +------------------------------------------------- + +The purpose of the replication logger is to log all changes that modify data. +The replication logger will produce an ongoing stream of change events. That stream, +or specific parts of the stream, can be queried by clients via an HTTP API. + +An example client for this is the ArangoDB replication applier. +The ArangoDB replication applier will continuously query the stream of change events +the replication logger will write. It will apply "new" changes locally to get to +the same state of data as the logger server. + +External systems (e.g. indexers) could also incrementally query the log stream from +the replication logger. Using this approach, one could feed external systems with all +data modification operations done in ArangoDB. + +The replication logger will write all change events to a system collection named +`_replication`. The events are thus persisted and will still be present after a server +restart or crash. + +ArangoDB will only log changes if the replication logger is turned on. Should there be +any data modifications while the replication logger is turned off, these events will +be lost for replication. + +The replication logger will mainly log events that affect user-defined collections. +Operations on ArangoDB's system collections (collections with names that start with +an underscore) are intentionally excluded from replication. + +There is exactly one replication logger present in an ArangoDB database. + +Replication Applier {#UserManualReplicationApplier} +--------------------------------------------------- + +The purpose of the replication applier is to read data from a remote stream of change +events from a data provider and apply them locally. The applier is thus using the +*pull* principle. 
+ +Normally, one would connect an ArangoDB replication applier to an ArangoDB replication +logger. This would make the applier fetch all data from the logger server incrementally. +The data on the applier thus will be a copy of the data on the logger server, and the +applier server can be used as a read-only or hot standby clone. + +The applier can connect to any system that speaks HTTP and returns replication log +events in the expected format (see @INTREF{HttpReplicationLoggerFollow,format} and @ref +RefManualReplicationEventTypes). It is thus possible (though not within the scope of the +ArangoDB project) to implement data providers other than ArangoDB and still have an +ArangoDB applier fetch their data and apply it. + +As the replication applier does not get notified immediately when there are "new" +changes, it might take some time until the applier has fetched and applied the newest changes +from a logger server. Data modification operations might thus become visible on the +applying server later than on the server on which they originated. + +If the replication applier cannot connect to the data provider or the communication +fails for some reason, it will try to reconnect and fetch outstanding data. Until this +succeeds, the state of data on the replication applier might also be behind the state +of the data provider. + +There is exactly one replication applier present in an ArangoDB database. It is thus +not possible to have an applier collect data from multiple ArangoDB "master" instances. 
+ +Setting up Replication {#UserManualReplicationSetup} +==================================================== + +Setting up a working replication topology requires two ArangoDB instances: +- the replication logger server (_master_): this is the instance we'll replicate data from +- the replication applier server (_slave_): this instance will fetch data from the logger server + and apply all changes locally + +For the following example setup, we'll use the instance *tcp://localhost:8529* as the +logger server, and the instance *tcp://localhost:8530* as an applier. + +The goal is to have all data from *tcp://localhost:8529* replicated to the instance +*tcp://localhost:8530*. + +Setting up the Logger {#UserManualReplicationSetupLogger} +--------------------------------------------------------- + + + +Setting up the Applier {#UserManualReplicationSetupApplier} +----------------------------------------------------------- + + +Replication Overhead {#UserManualReplicationOverhead} +===================================================== + +Running the replication logger will make all data modification operations more +expensive, as the ArangoDB server needs to write the operation into the replication log. + +Additionally, replication appliers that connect to an ArangoDB server will cause some +extra work as incoming HTTP requests need to be processed and results be generated. + +Overall, turning on the replication logger will reduce throughput on an ArangoDB server +to some extent. If the replication feature is not required, the replication logger should +be turned off. 
diff --git a/Documentation/UserManual/UserManualReplicationTOC.md b/Documentation/UserManual/UserManualReplicationTOC.md new file mode 100644 index 0000000000..f04b667c41 --- /dev/null +++ b/Documentation/UserManual/UserManualReplicationTOC.md @@ -0,0 +1,12 @@ +TOC {#UserManualReplicationTOC} +=============================== + +- @ref UserManualReplication + - @ref UserManualReplicationIntro + - @ref UserManualReplicationComponents + - @ref UserManualReplicationLogger + - @ref UserManualReplicationApplier + - @ref UserManualReplicationSetup + - @ref UserManualReplicationSetupLogger + - @ref UserManualReplicationSetupApplier + - @ref UserManualReplicationOverhead