Sharding
@NAVIGATE_Sharding @EMBEDTOC{ShardingTOC}
Introduction
Sharding allows you to use multiple machines to run a cluster of ArangoDB instances that together appear to be a single database. This enables you to store much more data, since ArangoDB distributes the data automatically to the different servers.
In a cluster there are essentially two types of processes: "DBservers" and "coordinators". The former actually store the data, the latter expose the database to the outside world. The clients talk to the coordinators exactly as they would talk to a single ArangoDB instance via the REST interface. The coordinators know about the configuration of the cluster and automatically forward the incoming requests to the right DBservers.
As a central highly available service to hold the cluster configuration and to synchronise reconfiguration and failover operations we currently use an external program called etcd. It provides a hierarchical key value store with strong consistency and reliability promises.
All this is admittedly a relatively complicated setup and involves a lot of steps for startup and shutdown of clusters. Therefore we have created convenience functionality to plan, set up, start and shut down clusters.
The whole process works in two phases, first the "planning" phase and then the "running" phase. In the planning phase it is decided which processes with which roles run on which machine, which ports they use, where the central agency resides and what ports its agents use. The result of the planning phase is a "cluster plan", which is just a relatively big data structure in JSON format. You can then use this cluster plan to start up, shut down, check and clean up your cluster.
This latter phase uses so-called "dispatchers". A dispatcher is yet another ArangoDB instance and you have to install exactly one such instance on every machine that will take part in your cluster. No special configuration whatsoever is needed and you can organise authentication exactly as you would in a normal ArangoDB instance.
However, you can use any of these dispatchers to plan and start your cluster. In the planning phase, you have to tell the planner about all dispatchers in your cluster and it will automatically distribute your agency, DBserver and coordinator processes amongst the dispatchers. The result is the cluster plan, which you feed into the kickstarter. The kickstarter is a program that actually uses the dispatchers to manipulate the processes in your cluster. It runs on one of the dispatchers, analyses the cluster plan, executes all actions for which it is itself responsible, and forwards all other actions to the corresponding dispatchers. This is possible because the cluster plan incorporates the information about all dispatchers.
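Expressed in code, the whole workflow therefore boils down to a handful of calls on the Planner and Kickstarter objects. The following is only a condensed preview using the same options as in the walkthrough below; each step is explained in detail in the next section:
arangodb> var cluster = require("org/arangodb/cluster");
arangodb> var plan = new cluster.Planner({numberOfDBservers:3, numberOfCoordinators:2}).getPlan();
arangodb> var k = new cluster.Kickstarter(plan);   // runs on one of the dispatchers
arangodb> k.launch();     // start all processes described in the plan
arangodb> k.shutdown();   // and stop them again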
How to try it out
In this text we assume that you are working with a standard installation of ArangoDB with version at least 2.0. This means that everything is compiled for cluster operation and that etcd is compiled and the executable is installed in the bin directory of your ArangoDB installation.
We will first plan and launch a cluster, such that all your servers run on the local machine.
Start up a regular ArangoDB, either in console mode or connect to it with the Arango shell arangosh. Then you can ask it to plan a cluster for you:
arangodb> var Planner = require("org/arangodb/cluster").Planner;
arangodb> p = new Planner({numberOfDBservers:3, numberOfCoordinators:2});
[object Object]
If you are curious you can look at the plan of your cluster:
arangodb> p.getPlan();
will show you a huge JSON document. More interestingly, some further components tell you more about the layout of your cluster:
arangodb> p.DBservers;
[
  {
    "id" : "Pavel",
    "dispatcher" : "me",
    "port" : 8629
  },
  {
    "id" : "Perry",
    "dispatcher" : "me",
    "port" : 8630
  },
  {
    "id" : "Pancho",
    "dispatcher" : "me",
    "port" : 8631
  }
]
arangodb> p.coordinators;
[
  {
    "id" : "Claus",
    "dispatcher" : "me",
    "port" : 8530
  },
  {
    "id" : "Chantalle",
    "dispatcher" : "me",
    "port" : 8531
  }
]
This tells you, for example, the ports on which your ArangoDB processes will run. We will need port 8530 (or whatever appears on your machine) for the coordinators below.
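If you prefer to pick up the coordinator endpoint programmatically instead of reading it off the output above, a minimal sketch (assuming, as in this local example, that the coordinators run on localhost) is:
arangodb> var coordPort = p.coordinators[0].port;               // 8530 in the example above
arangodb> var coordEndpoint = "tcp://localhost:" + coordPort;   // endpoint to use with arangosh below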
More interesting is that such a cluster plan document can be used to start up the cluster conveniently using a Kickstarter object. Please note that the launch method of the kickstarter shown below initialises all data directories and log files, so if you have previously used the same cluster plan you will lose all your data. Use the relaunch method described below instead in that case.
arangodb> var Kickstarter = require("org/arangodb/cluster").Kickstarter;
arangodb> k = new Kickstarter(p.getPlan());
arangodb> k.launch();
is all you have to do to fire up your first cluster. You will see some output, which you can safely ignore (as long as no error happens).
From then on you can contact one of the coordinators (probably from another shell window) and use the cluster as if it were a single ArangoDB instance (use the port number from above instead of 8530 if you got a different one):
$ arangosh --server.endpoint tcp://localhost:8530
[... some output omitted]
arangosh [_system]> db._listDatabases();
[
  "_system"
]
This lists, for example, the cluster-wide databases. Now let us create a sharded collection; we only have to specify the number of shards to use in addition to the usual command. The shards are automatically distributed amongst your DBservers:
arangosh [_system]> example = db._create("example",{numberOfShards:6});
[ArangoCollection 1000001, "example" (type document, status loaded)]
arangosh [_system]> x = example.save({"name":"Hans", "age":44});
{
  "error" : false,
  "_id" : "example/1000008",
  "_rev" : "13460426",
  "_key" : "1000008"
}
arangosh [_system]> example.document(x._key);
{
  "age" : 44,
  "name" : "Hans",
  "_id" : "example/1000008",
  "_rev" : "13460426",
  "_key" : "1000008"
}
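To convince yourself that the collection is really sharded, you can ask the coordinator for its document count and its properties. This is only a sketch; the exact set of attributes returned by properties() may vary between versions:
arangosh [_system]> example.count();        // 1, since we stored a single document
arangosh [_system]> example.properties();   // should report the numberOfShards chosen above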
You can shut down your cluster by using the following Kickstarter method (in the original ArangoDB console):
arangodb> k.shutdown();
If you want to start your cluster again without losing the data you have previously stored in it, you can use the relaunch method in exactly the same way as you previously used the launch method:
arangodb> k.relaunch();
You can now check whether or not all your cluster processes are still running by issuing:
arangodb> k.isHealthy();
which will show you the status of all the processes in the cluster. You should see "RUNNING" there in all relevant places.
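If you want to check this programmatically rather than by eye, one crude possibility is to serialise the result and look for the string "RUNNING". This is only a sketch, since the exact structure of the object returned by isHealthy is not described here:
arangodb> var health = k.isHealthy();
arangodb> JSON.stringify(health).indexOf("RUNNING") !== -1;   // should evaluate to true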
Finally, to clean up the whole cluster (losing all the data stored in it), do:
arangodb> k.shutdown();
arangodb> k.cleanup();
We conclude this section with another example using two machines, that is, two different dispatchers. We start from scratch using two machines, running on the network addresses tcp://192.168.173.78:8529 and tcp://192.168.173.13:6789, say. Both need to have a regular ArangoDB instance installed and running; please make sure that both bind to all network devices, such that they can talk to each other.
arangodb> var Planner = require("org/arangodb/cluster").Planner;
arangodb> var p = new Planner({
dispatchers: {"me":{"endpoint":"tcp://192.168.173.78:8529"},
"theother":{"endpoint":"tcp://192.168.173.13:6789"}},
"numberOfCoordinators":2, "numberOfDBservers": 2});
With this you create a cluster plan involving two machines. The planner will put one DBserver and one coordinator on each machine. You can now launch this cluster exactly as above:
arangodb> var Kickstarter = require("org/arangodb/cluster").Kickstarter;
arangodb> k = new Kickstarter(p.getPlan());
arangodb> k.launch();
Likewise, the methods shutdown, relaunch, isHealthy and cleanup work exactly as in the single server case.
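For example, the same life cycle as in the single machine walkthrough applies unchanged to the two machine cluster:
arangodb> k.shutdown();    // stop all processes on both machines
arangodb> k.relaunch();    // start them again without losing data
arangodb> k.isHealthy();   // all relevant places should report "RUNNING"
arangodb> k.shutdown();
arangodb> k.cleanup();     // wipe the data directories and log files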
See @ref JSModuleCluster "the corresponding chapter of the reference manual" for detailed information about the Planner and Kickstarter classes.
Status of the implementation
At this stage all the basic CRUD operations for single documents or edges and all simple queries are implemented. AQL queries work, but not yet with the best possible performance.
Creating collections, dropping collections, creating databases and dropping databases work, however these operations are relatively slow at this stage. This is mostly due to the etcd program we are currently using. We will probably see a speedup in a later release. The basic CRUD operations already have relatively good performance; the simple queries are in need of further optimisation as well.
@NAVIGATE_Sharding