diff --git a/Documentation/Books/Manual/Scalability/Architecture.mdpp b/Documentation/Books/Manual/Scalability/Architecture.mdpp
index b5a2492fcf..e660837584 100644
--- a/Documentation/Books/Manual/Scalability/Architecture.mdpp
+++ b/Documentation/Books/Manual/Scalability/Architecture.mdpp
@@ -18,18 +18,13 @@ An ArangoDB cluster consists of a number of ArangoDB instances
 which talk to each other over the network. They play different
 roles, which will be explained in detail below. The current
 configuration of the cluster is held in the "Agency", which is a
 highly-available
-resilient key/value store based on an odd number of ArangoDB instances.
-
-!SUBSUBSECTION Cluster ID
-
-Every non-Agency ArangoDB instance in a cluster is assigned a unique
-ID during its startup. Using its ID a node is identifiable
-throughout the cluster. All cluster operations will communicate
-via this ID.
+resilient key/value store based on an odd number of ArangoDB instances
+running the [Raft Consensus Protocol](https://raft.github.io/).
 
 For the various instances in an ArangoDB cluster there are 4 distinct
 roles: Agents, Coordinators, Primary and Secondary DBservers. In the
-following sections we will shed light on each of them.
+following sections we will shed light on each of them. Note that the
+tasks for all roles run the same binary from the same Docker image.
 
 !SUBSUBSECTION Agents
 
@@ -70,13 +65,21 @@ asked by a coordinator.
 
 !SUBSUBSECTION Secondaries
 
-Secondary DBservers are asynchronous replicas of primaries. For each
-primary, there can be one ore more secondaries. Since the replication
-works asynchronously (eventual consistency), the replication does not
-impede the performance of the primaries. On the other hand, their
-replica of the data can be slightly out of date. The secondaries are
-perfectly suitable for backups as they don't interfere with the normal
-cluster operation.
+Secondary DBservers are asynchronous replicas of primaries. If one is
+using only synchronous replication, one does not need secondaries at all.
+For each primary, there can be one or more secondaries. Since the
+replication works asynchronously (eventual consistency), the replication
+does not impede the performance of the primaries. On the other hand,
+their replica of the data can be slightly out of date. The secondaries
+are perfectly suitable for backups as they don't interfere with the
+normal cluster operation.
+
+!SUBSUBSECTION Cluster ID
+
+Every non-Agency ArangoDB instance in a cluster is assigned a unique
+ID during its startup. Using its ID a node is identifiable
+throughout the cluster. All cluster operations will communicate
+via this ID.
 
 !SUBSECTION Sharding
 
@@ -205,7 +208,7 @@ modern microservice architectures of applications. With the
 [Foxx services](../Foxx/README.md) it is very easy to deploy a data
 centric microservice within an ArangoDB cluster.
 
-Alternatively, one can deploy multiple instances of ArangoDB within the
+In addition, one can deploy multiple instances of ArangoDB within the
 same project. One part of the project might need a scalable document
 store, another might need a graph database, and yet another might need
 the full power of a multi-model database actually mixing the various
@@ -221,7 +224,7 @@ capabilities in this direction.
 !SUBSECTION Apache Mesos integration
 
 For the distributed setup, we use the Apache Mesos infrastructure by default.
-ArangoDB is a fully certified package for the Mesosphere DC/OS and can thus
+ArangoDB is a fully certified package for DC/OS and can thus
 be deployed essentially with a few mouse clicks or a single command, once
 you have an existing DC/OS cluster. But even on a plain Apache Mesos
 cluster one can deploy ArangoDB via Marathon with a single API call and some JSON
@@ -268,3 +271,5 @@ even further you can install a reverse proxy like haproxy or nginx in
 front of the coordinators (that will also allow easy access from the
 application).
 
+Authentication in the cluster will be added soon after the initial 3.0
+release.
diff --git a/Documentation/Books/Manual/Scalability/DataModels.mdpp b/Documentation/Books/Manual/Scalability/DataModels.mdpp
index a55afc44a7..cc94b0da4c 100644
--- a/Documentation/Books/Manual/Scalability/DataModels.mdpp
+++ b/Documentation/Books/Manual/Scalability/DataModels.mdpp
@@ -17,7 +17,7 @@ primary key and all these operations scale linearly. If the sharding
 is done using different shard keys, then a lookup of a single key
 involves asking all shards and thus does not scale linearly.
 
-!SUBSECTION document store
+!SUBSECTION Document store
 
 For the document store case even in the presence of secondary indexes
 essentially the same arguments apply, since an index for a sharded
@@ -26,10 +26,51 @@ single document operations still scale linearly with the size of the
 cluster, unless a special sharding configuration makes lookups or
 write operations more expensive.
 
-!SUBSECTION complex queries and joins
+For a deeper analysis of this topic see
+[this blog post](https://mesosphere.com/blog/2015/11/30/arangodb-benchmark-dcos/),
+in which good linear scalability of ArangoDB for single document operations
+is demonstrated.
 
-TODO
-
-!SUBSECTION graph database
+!SUBSECTION Complex queries and joins
 
-TODO
+The AQL query language allows complex queries that use multiple
+collections and secondary indexes, as well as joins. In particular
+with the latter, scaling can be a challenge: if the data to be
+joined resides on different machines, a lot of communication
+has to happen. The AQL query execution engine organizes a data
+pipeline across the cluster to put together the results in the
+most efficient way. The query optimizer is aware of the cluster
+structure and knows what data is where and how it is indexed.
+Therefore, it can arrive at an informed decision about which parts
+of the query ought to run where in the cluster.
+
+Nevertheless, for certain complicated joins, there are limits as
+to what can be achieved. A very important case that can be
+optimized relatively easily is when one of the collections involved
+in the join is small enough that it is possible to replicate its
+data to all machines. We call such a collection a
+"satellite collection". Due to the replication, a join involving
+such a collection can be executed locally without too much
+communication overhead.
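+
+As a purely illustrative sketch (the collection and attribute names are
+made up), such a join could look like this in arangosh; if `customers`
+is small, replicating it to every server as described above allows each
+shard of `orders` to compute its part of the join locally:
+
+```js
+// Purely illustrative sketch - collection and attribute names are made up.
+// A join between a large, sharded "orders" collection and a small
+// "customers" lookup collection:
+db._query(
+  "FOR o IN orders " +
+  "  FOR c IN customers " +
+  "    FILTER o.customer == c._key " +
+  "    RETURN { order: o._key, customer: c.name }"
+).toArray();
+```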
+
+!SUBSECTION Graph database
+
+Graph databases are particularly good at queries on graphs that involve
+paths in the graph of an a priori unknown length. For example, finding
+the shortest path between two vertices in a graph, or finding all
+paths that match a certain pattern starting at a given vertex, are
+such examples.
+
+However, if the vertices and edges along the occurring paths are
+distributed across the cluster, then a lot of communication is
+necessary between nodes, and performance suffers. To achieve good
+performance at scale, it is therefore necessary to get the
+distribution of the graph data across the shards in the cluster
+right. Most of the time, the application developers and users of
+ArangoDB know best how their graphs are structured. Therefore,
+ArangoDB allows users to specify according to which attributes
+the graph data is sharded. A useful first step is usually to make
+sure that the edges originating at a vertex reside on the same
+cluster node as the vertex.
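+
+As a purely illustrative sketch of what this can look like in arangosh
+(the collection names, the `region` attribute and the shard counts are
+made up, and the exact collection options may differ between ArangoDB
+versions), one could shard vertices and edges by a common attribute:
+
+```js
+// Purely illustrative sketch - names, numbers and attributes are made up.
+// Shard the vertices by a domain attribute instead of the default _key:
+db._create("people", { numberOfShards: 8, shardKeys: ["region"] });
+
+// Shard the edges by the same attribute and place their shards on the
+// same DBservers as the corresponding "people" shards, so that an edge
+// tends to live next to its source vertex:
+db._createEdgeCollection("knows", {
+  numberOfShards: 8,
+  shardKeys: ["region"],
+  distributeShardsLike: "people"
+});
+
+// Store the region of the source vertex on each edge:
+var alice = db.people.insert({ name: "Alice", region: "emea" });
+var bob   = db.people.insert({ name: "Bob",   region: "emea" });
+db.knows.insert({ _from: alice._id, _to: bob._id, region: "emea" });
+```
+
+With such a layout, traversals that stay within one `region` can be
+answered largely by a single DBserver.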
diff --git a/Documentation/Books/Manual/Scalability/README.mdpp b/Documentation/Books/Manual/Scalability/README.mdpp
index b9882eee44..2e2782d4a1 100644
--- a/Documentation/Books/Manual/Scalability/README.mdpp
+++ b/Documentation/Books/Manual/Scalability/README.mdpp
@@ -8,8 +8,12 @@ resilience by means of replication and automatic failover. Furthermore,
 one can build systems that scale their capacity dynamically up and
 down automatically according to demand.
 
-Obviously, one can also scale ArangoDB vertically, that is, by using
-ever larger servers. However, this has the disadvantage that the
+One can also scale ArangoDB vertically, that is, by using
+ever larger servers. There is no built-in limitation in ArangoDB;
+for example, the server will automatically use more threads if
+more CPUs are present.
+
+However, scaling vertically has the disadvantage that the
 costs grow faster than linear with the size of the server, and none
 of the resilience and dynamical capabilities can be achieved in this
 way.