
Add documentation for synchronous replication.

Max Neunhoeffer 2016-06-10 16:56:31 +02:00
parent 794a08dd86
commit 3af860b3e8
2 changed files with 55 additions and 0 deletions


@ -125,6 +125,19 @@ to the [naming conventions](../NamingConventions/README.md).
attribute and this can only be done efficiently if this is the
only shard key by delegating to the individual shards.
* *replicationFactor* (optional, default is 1): in a cluster, this
attribute determines how many copies of each shard are kept on
different DBServers. The value 1 means that only one copy (no
synchronous replication) is kept. A value of k means that
k-1 replicas are kept. Any two copies reside on different DBServers.
Replication between them is synchronous, that is, every write operation
to the "leader" copy will be replicated to all "follower" replicas,
before the write operation is reported successful.
If a server fails, this is detected automatically and one of the
servers holding copies takes over, usually without an error being
reported.
`db._create(collection-name, properties, type)`
Specifies the optional *type* of the collection, it can either be *document*


@ -44,3 +44,45 @@ use
This call will create a new collection called *collection-name*.
This method is a database method and is documented in detail at [Database Methods](DatabaseMethods.md#create)
!SUBSECTION Synchronous replication
Starting in ArangoDB 3.0, the distributed version offers synchronous
replication, which means that there is the option to replicate all data
automatically within the ArangoDB cluster. This is configured for sharded
collections on a per collection basis by specifying a "replication factor"
when the collection is created. A replication factor of k means that
altogether k copies of each shard are kept in the cluster on k different
servers, and are kept in sync. That is, every write operation is automatically
replicated on all copies.
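
As a minimal sketch of how a replication factor might be passed at creation
time, the helper below wraps `db._create` with a properties object. The helper
name `createReplicated` and the collection name `orders` are made up for
illustration, and `numberOfShards` is a collection property that this section
does not otherwise discuss.

```javascript
// Hypothetical helper: create a sharded collection whose shards are each
// kept in `replicationFactor` copies. `db` is the database object that
// arangosh predefines; it is passed in so the helper has no hidden globals.
function createReplicated(db, name, numberOfShards, replicationFactor) {
  return db._create(name, {
    numberOfShards: numberOfShards,
    replicationFactor: replicationFactor
  });
}

// In arangosh, connected to a cluster coordinator:
//   createReplicated(db, "orders", 4, 3);
// asks for 4 shards with 3 copies each (1 leader and 2 followers per shard).
```

Because the replication factor is given when the collection is created, each
collection can choose its own trade-off between resilience and write latency.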
This is organised using a leader/follower model. At all times, one of the
servers holding replicas for a shard is "the leader" and all others
are "followers". This configuration is held in the Agency (see
[Scalability](../../Scalability/README.md) for details of the ArangoDB
cluster architecture). Every write operation is sent to the leader
by one of the coordinators, and then replicated to all followers
before the operation is reported to have succeeded. The leader keeps
a record of which followers are currently in sync. In case of network
problems or a failure of a follower, the leader drops that follower
temporarily after 3 seconds so that service can resume. In due course,
the follower will automatically resynchronize with the leader to restore
resilience.
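
The write path described above can be illustrated with a toy, in-memory model
of a single shard. This is only a sketch and not ArangoDB's implementation:
the `Shard` type, its fields and the `reachable` flag (which stands in for the
3-second timeout) are all assumptions made for the example.

```javascript
// Toy model of one shard: a leader plus its followers. Not ArangoDB code.
function Shard(followerNames) {
  this.log = [];        // operations applied on the leader
  this.followers = {};  // name -> { log: [...], reachable: true/false }
  this.inSync = {};     // name -> true while the follower is in sync
  followerNames.forEach(function (name) {
    this.followers[name] = { log: [], reachable: true };
    this.inSync[name] = true;
  }, this);
}

// Apply a write on the leader, replicate it to every in-sync follower,
// and only then report success.
Shard.prototype.write = function (op) {
  this.log.push(op);
  Object.keys(this.inSync).forEach(function (name) {
    var follower = this.followers[name];
    if (follower.reachable) {
      follower.log.push(op);
    } else {
      // Stands in for the timeout: drop the follower so writes can continue.
      delete this.inSync[name];
    }
  }, this);
  return { ok: true, inSync: Object.keys(this.inSync) };
};

// A dropped follower copies the leader's log and rejoins the in-sync set.
Shard.prototype.resync = function (name) {
  var follower = this.followers[name];
  follower.reachable = true;
  follower.log = this.log.slice();
  this.inSync[name] = true;
};

var shard = new Shard(["B", "C"]);
shard.write("w1");
shard.followers.C.reachable = false;
var result = shard.write("w2");  // result.inSync is ["B"]
shard.resync("C");               // C's log is ["w1", "w2"] again
```

The point of the in-sync record is that a slow or failed follower delays
writes only until it is dropped, and it catches up later without blocking the
leader.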
If a leader fails, the cluster Agency automatically initiates a failover
routine after around 15 seconds, promoting one of the followers to
leader. The other followers (and the former leader, when it comes back)
automatically resynchronize with the new leader to restore resilience.
Usually, this whole failover procedure can be handled transparently
for the coordinator, so that the user code does not even see an error
message.
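
The promotion step can be sketched as a small function. This is a toy under a
stated assumption, not the Agency's actual routine: the servers holding a
shard are modelled as an array with the leader first, and the function name
`failover` and the server names are invented for the example.

```javascript
// Toy failover, NOT the Agency's actual routine. Assumption: `servers` lists
// the servers holding one shard, leader first, followers after it.
function failover(servers, failedServer) {
  if (servers[0] !== failedServer) {
    return servers.slice();  // the leader is fine, nothing to do
  }
  if (servers.length < 2) {
    throw new Error("no follower available to take over");
  }
  // Promote the first follower. The former leader is appended as a follower
  // and has to resynchronize with the new leader once it comes back.
  return servers.slice(1).concat([failedServer]);
}

var after = failover(["A", "B", "C"], "A");  // ["B", "C", "A"]
```

With a replication factor of 1 there is no follower to promote, which is the
case the error branch stands for.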
This fault tolerance comes at the cost of increased latency.
Each write operation needs an additional network roundtrip for the
synchronous replication to the followers, but the replication operations
to all followers happen concurrently. This is why the default replication
factor is 1, which means no replication.
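
To make the latency argument concrete, here is a small calculation with
made-up roundtrip times. Because the followers are contacted concurrently,
the leader waits roughly for the slowest follower rather than for the sum of
all roundtrips. Both function names are hypothetical, and the model ignores
everything except network roundtrips.

```javascript
// Extra write latency when all followers are contacted at the same time:
// the leader waits for the slowest follower.
function concurrentLatencyMs(roundtripsMs) {
  return roundtripsMs.length === 0 ? 0 : Math.max.apply(null, roundtripsMs);
}

// For contrast: the extra latency if followers were contacted one by one.
function sequentialLatencyMs(roundtripsMs) {
  return roundtripsMs.reduce(function (sum, ms) { return sum + ms; }, 0);
}

var roundtrips = [2, 5, 3];       // hypothetical roundtrips to 3 followers
concurrentLatencyMs(roundtrips);  // 5
sequentialLatencyMs(roundtrips);  // 10
```

With no followers (replication factor 1) the extra latency in this model is
zero, which matches the default of no replication.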
For details on how to switch on synchronous replication for a collection,
see the database method `db._create(collection-name)` in the section about
[Database Methods](DatabaseMethods.md#create).