Cluster Administration
This section includes information related to the administration of an ArangoDB Cluster.
For a general introduction to the ArangoDB Cluster, please refer to the Cluster chapter.
Enabling synchronous replication
For an introduction to Synchronous Replication in a Cluster, please refer to the Cluster Architecture section.
Synchronous replication can be enabled per collection. When creating a collection you may specify the number of replicas using the replicationFactor parameter. The default value is set to 1, which effectively disables synchronous replication among DBServers.
Whenever you specify a replicationFactor greater than 1 when creating a collection, synchronous replication will be activated for this collection. The Cluster will determine suitable leaders and followers for every requested shard (numberOfShards) within the Cluster.
Example:
127.0.0.1:8530@_system> db._create("test", {"replicationFactor": 3})
In the above case, any write operation will require 2 replicas to report success from now on.
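You can confirm the effective setting afterwards by inspecting the collection's properties in arangosh; the properties() call is standard arangosh API, and the printed value shown here is simply what the example above would yield:
127.0.0.1:8530@_system> db.test.properties().replicationFactor
3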
Preparing growth
You may create a collection with a higher replication factor than the number of available DBServers. When additional DBServers become available, the shards are automatically replicated to the newly available DBServers.
To create a collection with a higher replication factor than the number of available DBServers, set the option enforceReplicationFactor to false when creating the collection from the ArangoShell (the option is not available from the web interface), e.g.:
db._create("test", { replicationFactor: 4 }, { enforceReplicationFactor: false });
The default value for enforceReplicationFactor is true.
Note: multiple replicas of the same shard can never coexist on the same DBServer instance.
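A rough sketch of the same call via the HTTP API, assuming enforceReplicationFactor is passed as a query parameter of the collection creation endpoint (as in recent 3.x releases); <coord-ip:coord-port> is a placeholder:
curl -X POST 'http://<coord-ip:coord-port>/_api/collection?enforceReplicationFactor=false' -d '{"name": "test", "replicationFactor": 4}'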
Sharding
For an introduction to Sharding in a Cluster, please refer to the Cluster Architecture section.
The number of shards can be configured at collection creation time, e.g. via the web UI or the ArangoDB Shell:
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4});
To configure a custom hashing for another attribute (default is _key):
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4, "shardKeys": ["country"]});
The example above, where 'country' is used as the shard key, can be useful to keep the data of each country in one shard, which results in better performance for queries working on a per-country basis.
It is also possible to specify multiple shardKeys.
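For example, to shard on two attributes (the attribute names here are purely illustrative):
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4, "shardKeys": ["country", "city"]});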
Note, however, that if you change the shard keys from their default ["_key"], then finding a document in the collection by its primary key involves a request to every single shard. Furthermore, in this case one can no longer prescribe the primary key value of a new document but must use the automatically generated one. This latter restriction comes from the fact that ensuring uniqueness of the primary key would be very inefficient if the user could specify the primary key.
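The latter restriction can be observed directly in arangosh. The exact error code and message may vary between versions, but an insert with a user-supplied _key into a collection with custom shard keys is rejected along these lines:
127.0.0.1:8529@_system> db.sharded_collection.insert({"_key": "myKey", "country": "FR"});
JavaScript exception: ArangoError 1466: must not specify _key for this collection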
On which DBServer in a Cluster a particular shard is kept is undefined. There is no option to configure an affinity based on certain shard keys.
Unique indexes (hash, skiplist, persistent) on sharded collections are only allowed if the fields used to determine the shard key are also included in the list of attribute paths for the index:
| shardKeys | indexKeys | |
|---|---|---|
| a | a | allowed |
| a | b | not allowed |
| a | a, b | allowed |
| a, b | a | not allowed |
| a, b | b | not allowed |
| a, b | a, b | allowed |
| a, b | a, b, c | allowed |
| a, b, c | a, b | not allowed |
| a, b, c | a, b, c | allowed |
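For example, for a collection sharded with shardKeys ["country"] as above, a unique hash index that includes the shard key is accepted, while a unique index on another attribute alone is rejected (standard ensureIndex calls; 'zip' is just an illustrative attribute):
127.0.0.1:8529@_system> db.sharded_collection.ensureIndex({"type": "hash", "fields": ["country"], "unique": true});  // allowed
127.0.0.1:8529@_system> db.sharded_collection.ensureIndex({"type": "hash", "fields": ["zip"], "unique": true});      // not allowed, raises an error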
Moving/Rebalancing shards
A shard can be moved from one DBServer to another, and the entire shard distribution can be rebalanced using the corresponding buttons in the web UI.
Replacing/Removing a Coordinator
Coordinators are effectively stateless and can be replaced, added, and removed with no more consideration than the needs of the particular installation.
To remove a Coordinator, stop the Coordinator's instance by issuing kill -SIGTERM <pid>.
About 15 seconds later, the cluster UI on any other Coordinator will mark the Coordinator in question as failed. Almost simultaneously, a trash bin icon will appear to the right of the name of the Coordinator. Clicking that icon will remove the Coordinator from the coordinator registry.
Any new Coordinator instance that is informed of where to find any/all agents (--cluster.agency-endpoint <some agent endpoint>) will be integrated as a new Coordinator into the cluster. You may also just restart the Coordinator as before and it will reintegrate itself into the cluster.
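A minimal sketch of starting such a new Coordinator instance, assuming a manually deployed cluster; the endpoints, the data directory, and the --cluster.my-role value are placeholders to be adapted to your setup:
arangod --server.endpoint tcp://0.0.0.0:8530 \
        --cluster.my-address tcp://<coord-ip>:8530 \
        --cluster.my-role COORDINATOR \
        --cluster.agency-endpoint tcp://<agent-ip>:8531 \
        --database.directory coordinator-data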
Replacing/Removing a DBServer
DBServers are where the data of an ArangoDB cluster is stored. They do not expose a web UI and are not meant to be accessed by anything other than Coordinators (to perform client requests) or other DBServers (to uphold replication and resilience).
The clean way of removing a DBServer is to first relieve it of all its responsibilities for shards. This applies to followers as well as leaders of shards. The requirement for this operation is that no collection in any of the databases has a replicationFactor greater than or equal to the current number of DBServers minus one. Cleaning out DBServer004, for example, would work as follows, when issued to any Coordinator of the cluster:
curl <coord-ip:coord-port>/_admin/cluster/cleanOutServer -d '{"id":"DBServer004"}'
After the DBServer has been cleaned out, you will find a trash bin icon to the right of the name of the DBServer on any Coordinator's UI. Clicking on it will remove the DBServer in question from the cluster.
Firing up any DBServer from a clean data directory, specifying any of the agency endpoints, will integrate the new DBServer into the cluster.
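Analogously to the Coordinator sketch above, a new DBServer could be started roughly as follows (again assuming a manual deployment; older versions may expect PRIMARY instead of DBSERVER as the role value):
arangod --server.endpoint tcp://0.0.0.0:8629 \
        --cluster.my-address tcp://<dbserver-ip>:8629 \
        --cluster.my-role DBSERVER \
        --cluster.agency-endpoint tcp://<agent-ip>:8531 \
        --database.directory dbserver-data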
To distribute shards onto the new DBServer, click the Distribute Shards button at the bottom of the Shards page in every database.