
Merge branch 'devel' of github.com:triAGENS/ArangoDB into devel

This commit is contained in:
Michael Hackstein 2013-07-29 16:06:29 +02:00
commit 8588b648ea
10 changed files with 204 additions and 46 deletions

View File

@ -7,7 +7,44 @@ HTTP Interface for Replication {#HttpReplication}
Replication {#HttpReplicationIntro}
===================================
This is an introduction to ArangoDB's Http replication interface.
This is an introduction to ArangoDB's HTTP replication interface.
The HTTP replication interface serves four main purposes:
- fetch initial data from a server (e.g. for an initial synchronisation of data, or backups)
- administer the replication logger (starting, stopping, querying state)
- fetch the changelog from a server (used for incremental synchronisation of changes)
- administer the replication applier (starting, stopping, configuring, querying state)
Replication Dump Commands {#HttpReplicationDumpCommands}
--------------------------------------------------------
The `inventory` method can be used to query an ArangoDB server's current
set of collections plus their indexes. Clients can use this method to get an
overview of which collections are present on the server. They can use this information
to either start a full or a partial synchronisation of data, e.g. to initiate a backup
or the incremental data synchronisation.
@anchor HttpReplicationInventory
@copydetails triagens::arango::RestReplicationHandler::handleCommandInventory
The `dump` method can be used to fetch data from a specific collection. As the
results of the dump command can be huge, it may not return all data from a collection
at once. Instead, the dump command may be called repeatedly by replication clients
until there is no more data to fetch. The dump command will not only return the
current documents in the collection, but also document updates and deletions.
To get to an identical state of data, replication clients should apply the individual
parts of the dump results in the same order as they are served to them.
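The repeated-fetch pattern described above can be sketched roughly as follows. This is an illustration only, not ArangoDB's client code: `fetch_chunk`, `apply_entry`, `make_fake_server` and the `(entries, has_more, last_tick)` return shape are assumptions made for this example. A real client would issue `GET /_api/replication/dump` requests in place of the fake server.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape of one dump response: the entries in this chunk, a flag
# telling the client whether to ask again, and the position of the last entry.
Chunk = Tuple[List[dict], bool, int]


def fetch_full_dump(fetch_chunk: Callable[[int], Chunk],
                    apply_entry: Callable[[dict], None]) -> int:
    """Fetch chunks until the server reports no more data; apply them in order."""
    position = 0
    applied = 0
    while True:
        entries, has_more, last_tick = fetch_chunk(position)
        for entry in entries:  # keep the order in which entries were served
            apply_entry(entry)
            applied += 1
        if entries:
            position = last_tick
        if not has_more:
            return applied


def make_fake_server(log: List[dict], chunk_size: int = 2) -> Callable[[int], Chunk]:
    """Stand-in for a real server so that the sketch runs without ArangoDB."""
    def fetch_chunk(after: int) -> Chunk:
        pending = [e for e in log if e["tick"] > after]
        chunk = pending[:chunk_size]
        last_tick = chunk[-1]["tick"] if chunk else after
        return chunk, len(pending) > len(chunk), last_tick
    return fetch_chunk


# A dump contains inserts, updates and deletions, so the order matters.
log = [
    {"tick": 1, "op": "insert", "key": "a", "value": 1},
    {"tick": 2, "op": "insert", "key": "b", "value": 2},
    {"tick": 3, "op": "update", "key": "a", "value": 10},
    {"tick": 4, "op": "remove", "key": "b"},
    {"tick": 5, "op": "insert", "key": "c", "value": 3},
]
state: Dict[str, int] = {}


def apply_entry(entry: dict) -> None:
    if entry["op"] == "remove":
        state.pop(entry["key"], None)
    else:
        state[entry["key"]] = entry["value"]


count = fetch_full_dump(make_fake_server(log), apply_entry)
print(count, state)
```

Applying the chunks out of order would, for instance, re-create document `b` after its removal, which is why the order in which the parts are served should be preserved.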
@anchor HttpReplicationDump
@copydetails triagens::arango::RestReplicationHandler::handleCommandDump
Replication Logger Commands {#HttpReplicationLoggerCommands}
------------------------------------------------------------
The logger commands allow starting, stopping, and fetching the current state of
the replication logger.
@anchor HttpReplicationLoggerStart
@copydetails triagens::arango::RestReplicationHandler::handleCommandLoggerStart
@ -20,19 +57,21 @@ This is an introduction to ArangoDB's Http replication interface.
@anchor HttpReplicationLoggerState
@copydetails triagens::arango::RestReplicationHandler::handleCommandLoggerState
@CLEARPAGE
To query the latest changes logged by the replication logger, the HTTP interface
also provides the `logger-follow` method.
This method should be used by replication clients to incrementally fetch updates
from an ArangoDB instance.
@anchor HttpReplicationLoggerFollow
@copydetails triagens::arango::RestReplicationHandler::handleCommandLoggerFollow
@CLEARPAGE
@anchor HttpReplicationInventory
@copydetails triagens::arango::RestReplicationHandler::handleCommandInventory
Replication Applier Commands {#HttpReplicationApplierCommands}
--------------------------------------------------------------
@CLEARPAGE
@anchor HttpReplicationDump
@copydetails triagens::arango::RestReplicationHandler::handleCommandDump
The applier commands allow remotely starting, stopping, and querying the state and
configuration of an ArangoDB server's replication applier.
@CLEARPAGE
@anchor HttpReplicationApplierGetConfig
@copydetails triagens::arango::RestReplicationHandler::handleCommandApplierGetConfig

View File

@ -3,12 +3,15 @@ TOC {#HttpReplicationTOC}
- @ref HttpReplication
- @ref HttpReplicationIntro
- @ref HttpReplicationDumpCommands
- @ref HttpReplicationInventory "GET /_api/replication/inventory"
- @ref HttpReplicationDump "GET /_api/replication/dump"
- @ref HttpReplicationLoggerCommands
- @ref HttpReplicationLoggerStart "PUT /_api/replication/logger-start"
- @ref HttpReplicationLoggerStop "PUT /_api/replication/logger-stop"
- @ref HttpReplicationLoggerState "GET /_api/replication/logger-state"
- @ref HttpReplicationLoggerFollow "GET /_api/replication/logger-follow"
- @ref HttpReplicationInventory "GET /_api/replication/inventory"
- @ref HttpReplicationDump "GET /_api/replication/dump"
- @ref HttpReplicationApplierCommands
- @ref HttpReplicationApplierGetConfig "GET /_api/replication/applier-config"
- @ref HttpReplicationApplierSetConfig "PUT /_api/replication/applier-config"
- @ref HttpReplicationApplierStart "PUT /_api/replication/applier-start"

View File

@ -152,6 +152,7 @@ WIKI = \
UserManualArangosh \
UserManualFoxx \
UserManualFoxxManager \
UserManualReplication \
UserManualWebInterface \
jsUnity

View File

@ -7,18 +7,11 @@ Replication Events{#RefManualReplication}
The replication logger in ArangoDB will log all events into the `_replication`
system collection. It will only log events when the logger is enabled.
Continuous Replication Log{#RefManualReplicationContinuous}
===========================================================
Replication log events are made available to replication clients via the API at
`/_api/replication/logger-follow`. This API can be called by clients to fetch
replication log events repeatedly.
The following sections describe in detail the structure of the log events
returned by this API.
Replication Event Types{#RefManualReplicationEventTypes}
--------------------------------------------------------
========================================================
The following replication event types will be logged by ArangoDB 1.4:
@ -53,7 +46,7 @@ value is a sequence number and is used by the replication applier to determine
whether a replication event was already processed.
Examples{#RefManualReplicationExamples}
---------------------------------------
=======================================
- 1000: the replication logger was stopped:
@ -440,9 +433,3 @@ event that is neither a document/edge operation nor a `transaction commit` event)
should abort the ongoing transaction and discard all buffered operations. It can then
consider the current transaction as failed.
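The buffering rule above can be illustrated with a small sketch. The event layout is an assumption made for this example: the `kind` values `"transaction start"`, `"operation"`, `"transaction commit"` and `"logger stop"` are hypothetical stand-ins for the numeric event types of the real log, and `TransactionBuffer` is not part of ArangoDB.

```python
from typing import Callable, List, Optional


class TransactionBuffer:
    """Buffers the operations of a transaction until its commit event arrives."""

    def __init__(self, apply_all: Callable[[List[dict]], None]) -> None:
        self._apply_all = apply_all
        self._buffer: Optional[List[dict]] = None  # None: no open transaction
        self.failed = 0

    def handle(self, event: dict) -> None:
        kind = event["kind"]  # hypothetical field, see the lead-in
        if self._buffer is None:
            if kind == "transaction start":
                self._buffer = []
            elif kind == "operation":
                self._apply_all([event])  # standalone operation
            # other events outside a transaction are ignored in this sketch
            return
        if kind == "operation":
            self._buffer.append(event)
        elif kind == "transaction commit":
            self._apply_all(self._buffer)  # apply all buffered operations at once
            self._buffer = None
        else:
            # Unexpected event: abort, discard the buffer, count a failure.
            # The unexpected event itself is not processed further here.
            self._buffer = None
            self.failed += 1


batches: List[List[str]] = []
buf = TransactionBuffer(lambda ops: batches.append([op["key"] for op in ops]))

events = [
    {"kind": "transaction start"},
    {"kind": "operation", "key": "a"},
    {"kind": "operation", "key": "b"},
    {"kind": "transaction commit"},
    {"kind": "transaction start"},
    {"kind": "operation", "key": "c"},
    {"kind": "logger stop"},  # neither an operation nor a commit
]
for event in events:
    buf.handle(event)

print(batches, buf.failed)
```

The first transaction is applied as one batch, while the operation on `c` is never applied because its transaction was interrupted before a commit event arrived.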
Collections{#RefManualReplicationCollections}
---------------------------------------------
The replication logger will only log events that affect user-defined collections. Any
events for system collections (collections with names that start with an underscore) are
not logged by the replication logger, and thus cannot be fetched from the continuous log.

View File

@ -2,8 +2,6 @@ TOC {#RefManualReplicationTOC}
====================================
- @ref RefManualReplication
- @ref RefManualReplicationContinuous
- @ref RefManualReplicationEventTypes
- @ref RefManualReplicationExamples
- @ref RefManualReplicationTransactions
- @ref RefManualReplicationCollections
- @ref RefManualReplicationEventTypes
- @ref RefManualReplicationExamples
- @ref RefManualReplicationTransactions

View File

@ -64,9 +64,9 @@ There is currently one application installed. It is called "aardvark" and it is
a system application. You can safely ignore system applications.
We are now going to install the hello world application. It is called
"hello-world" - no suprise there.
"hello-foxx" - no suprise there.
unix> foxx-manager install hallo-world /example
unix> foxx-manager install hello-foxx /example
Application app:hello-foxx:1.2.2 installed successfully at mount point /example
The second parameter `/example` is the mount path of the application. You should now
@ -87,7 +87,7 @@ command.
You can install the application again under a different mount path.
unix> foxx-manager install hallo-world /hello
unix> foxx-manager install hello-foxx /hello
Application app:hello-foxx:1.2.2 installed successfully at mount point /hello
You now have two separate instances of the same application. They are completely

View File

@ -4,7 +4,6 @@ Transactions {#Transactions}
@NAVIGATE_Transactions
@EMBEDTOC{TransactionsTOC}
Introduction {#TransactionsIntroduction}
========================================
@ -25,7 +24,6 @@ These *ACID* properties provide the following guarantees:
transaction durability is configurable in ArangoDB, as is the durability
on collection level.
Transaction invocation {#TransactionsInvocation}
================================================
@ -54,7 +52,6 @@ data retrieval and/or modification operations, and at the end automatically
commit the transaction. If an error occurs during transaction execution, the
transaction is automatically aborted, and all changes are rolled back.
Declaration of collections
==========================
@ -104,7 +101,6 @@ Even without specifying them, it is still possible to read from such collections
from within a transaction, but with relaxed isolation. Please refer to
@ref TransactionsLocking for more details.
Declaration of data modification and retrieval operations
=========================================================
@ -189,7 +185,6 @@ case, the user can return any legal Javascript value from the function:
}
});
Examples
========
@ -303,7 +298,6 @@ start. The following example using a cap constraint should illustrate that:
/* we now have these keys back: [ "key2", "key3", "key4" ] */
Cross-collection transactions
=============================
@ -359,7 +353,6 @@ transaction abort and roll back all changes in all collections:
db.c1.count(); /* 0 */
db.c2.count(); /* 0 */
Passing parameters to transactions {#TransactionsParameters}
============================================================
@ -391,7 +384,6 @@ Some example that uses collections:
}
});
Disallowed operations {#TransactionsDisallowedOperations}
=========================================================
@ -403,7 +395,6 @@ If an attempt is made to carry out any of these operations during a transaction,
ArangoDB will abort the transaction with error code `1653 (disallowed operation inside
transaction)`.
Locking and isolation {#TransactionsLocking}
============================================
@ -474,7 +465,6 @@ transaction. The total lock wait time may thus be much higher than the value of
To avoid both deadlocks and non-repeatable reads, all collections used in a
transaction should always be specified if known in advance.
Durability {#TransactionsDurability}
====================================
@ -549,7 +539,6 @@ synchronisation for multi-collection transactions in ArangoDB.
The disk sync speed of the system will thus be the most important factor for the
performance of multi-collection transactions.
Limitations {#TransactionsLimitations}
======================================
@ -588,4 +577,3 @@ It is legal to not declare read-only collections, but this should be avoided if
possible to reduce the probability of deadlocks and non-repeatable reads.
Please refer to @ref TransactionsLocking for more details.

View File

@ -17,6 +17,7 @@ ArangoDB's User Manual (@VERSION) {#UserManual}
@CHAPTER_REF{UserManualFoxxManager}
@CHAPTER_REF{UserManualFoxx}
@CHAPTER_REF{UserManualActions}
@CHAPTER_REF{UserManualReplication}
@CHAPTER_REF{Transactions}
@CHAPTER_REF{CommandLine}
@CHAPTER_REF{Glossary}

View File

@ -0,0 +1,129 @@
Replication {#UserManualReplication}
====================================
@NAVIGATE_UserManualReplication
@EMBEDTOC{UserManualReplicationTOC}
Introduction {#UserManualReplicationIntro}
==========================================
Starting with ArangoDB 1.4, ArangoDB comes with optional master-slave replication.
The replication is asynchronous and eventually consistent, meaning that slaves will
*pull* changes from the master and apply them locally. Data on a slave may be
behind the state of data on the master until the slave has fetched and applied all
changes.
Transactions are honored in replication, i.e. changes by a replicated transaction will
become visible on the slave atomically.
It is possible to connect multiple slaves to the same master. Slaves should be used as
read-only instances, though; otherwise, conflicts may occur that cannot be resolved
automatically in ArangoDB 1.4.
This is also the reason why master-master replication is not supported.
Components {#UserManualReplicationComponents}
=============================================
ArangoDB's replication consists of two main components, which can be used together or
separately: the *replication logger* and the *replication applier*.
Using both components on two ArangoDB servers provides master-slave replication between
the two, but there are also additional use cases.
Replication Logger {#UserManualReplicationLogger}
-------------------------------------------------
The purpose of the replication logger is to log all changes that modify data.
The replication logger will produce an ongoing stream of change events. That stream,
or specific parts of it, can be queried by clients via an HTTP API.
An example client for this is the ArangoDB replication applier.
The ArangoDB replication applier continuously queries the stream of change events
written by the replication logger. It will apply "new" changes locally to get to
the same state of data as the logger server.
External systems (e.g. indexers) could also incrementally query the log stream from
the replication logger. Using this approach, one could feed external systems with all
data modification operations done in ArangoDB.
The replication logger will write all change events to a system collection named
`_replication`. The events are thus persisted and will still be present after a server
restart or crash.
ArangoDB will only log changes if the replication logger is turned on. Should there be
any data modifications while the replication logger is turned off, these events will
be lost for replication.
The replication logger will mainly log events that affect user-defined collections.
Operations on ArangoDB's system collections (collections with names that start with
an underscore) are intentionally excluded from replication.
There is exactly one replication logger present in an ArangoDB database.
Replication Applier {#UserManualReplicationApplier}
---------------------------------------------------
The purpose of the replication applier is to read data from a remote stream of change
events from a data provider and apply them locally. The applier is thus using the
*pull* principle.
Normally, one would connect an ArangoDB replication applier to an ArangoDB replication
logger. This would make the applier fetch all data from the logger server incrementally.
The data on the applier thus will be a copy of the data on the logger server, and the
applier server can be used as a read-only or hot standby clone.
The applier can connect to any system that speaks HTTP and returns replication log
events in the expected format (see @INTREF{HttpReplicationLoggerFollow,format} and @ref
RefManualReplicationEventTypes). It is thus possible (though outside the scope of the
ArangoDB project) to implement data providers other than ArangoDB and still have an
ArangoDB applier fetch their data and apply it.
As the replication applier does not get notified immediately when there are "new"
changes, it might take some time until the applier has fetched and applied the newest
changes from a logger server. Data modification operations might thus become visible on
the applying server later than on the server on which they originated.
If the replication applier cannot connect to the data provider or the communication
fails for some reason, it will try to reconnect and fetch outstanding data. Until this
succeeds, the state of data on the replication applier might also be behind the state
of the data provider.
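As a rough illustration of the reconnect behaviour, the following sketch retries a failing fetch with a growing delay. The retry policy shown here (number of attempts, linear back-off) is purely an assumption for the example; this document does not specify how the applier schedules its reconnects. `fetch_with_retry` and `flaky_fetch` are hypothetical names.

```python
import time
from typing import Callable, List


def fetch_with_retry(fetch: Callable[[], List[dict]],
                     max_attempts: int = 5,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Call fetch() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(base_delay * attempt)  # wait a little longer each time
    raise ValueError("max_attempts must be at least 1")


# Simulated data provider that is unreachable for the first two attempts.
calls = {"n": 0}


def flaky_fetch() -> List[dict]:
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ConnectionError("data provider not reachable")
    return [{"tick": 42}]


delays: List[float] = []
result = fetch_with_retry(flaky_fetch, sleep=delays.append)  # record, do not wait
print(result, delays)
```

While the retries are in progress, the local data simply stays at its last applied state, which matches the "behind the data provider" situation described above.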
There is exactly one replication applier present in an ArangoDB database. It is thus
not possible to have an applier collect data from multiple ArangoDB "master" instances.
Setting up Replication {#UserManualReplicationSetup}
====================================================
Setting up a working replication topology requires two ArangoDB instances:
- the replication logger server (_master_): this is the instance we'll replicate data from
- the replication applier server (_slave_): this instance will fetch data from the logger server
and apply all changes locally
For the following example setup, we'll use the instance *tcp://localhost:8529* as the
logger server, and the instance *tcp://localhost:8530* as an applier.
The goal is to have all data from *tcp://localhost:8529* replicated to the instance
*tcp://localhost:8530*.
Setting up the Logger {#UserManualReplicationSetupLogger}
---------------------------------------------------------
Setting up the Applier {#UserManualReplicationSetupApplier}
-----------------------------------------------------------
Replication Overhead {#UserManualReplicationOverhead}
=====================================================
Running the replication logger will make all data modification operations more
expensive, as the ArangoDB server needs to write the operation into the replication log.
Additionally, replication appliers that connect to an ArangoDB server will cause some
extra work, as incoming HTTP requests need to be processed and results need to be generated.
Overall, turning on the replication logger will reduce throughput on an ArangoDB server
to some extent. If the replication feature is not required, the replication logger should
be turned off.

View File

@ -0,0 +1,12 @@
TOC {#UserManualReplicationTOC}
===============================
- @ref UserManualReplication
- @ref UserManualReplicationIntro
- @ref UserManualReplicationComponents
- @ref UserManualReplicationLogger
- @ref UserManualReplicationApplier
- @ref UserManualReplicationSetup
- @ref UserManualReplicationSetupLogger
- @ref UserManualReplicationSetupApplier
- @ref UserManualReplicationOverhead