mirror of https://gitee.com/bigwinds/arangodb
!CHAPTER Cluster Scalability

ArangoDB is designed as a distributed multi-model database. This chapter gives a short outline of the cluster architecture.
!SUBSECTION Cluster ID

Every node in a cluster is assigned a uniquely generated ID during startup. This ID identifies the node throughout the cluster, and all cluster operations address nodes by it.
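As a rough illustration of addressing nodes by ID, the following Python sketch gives each node an ID at startup and resolves endpoints through a registry. Everything in it is an assumption made for illustration: the names `ClusterNode` and `NodeRegistry`, the endpoints, and the use of `uuid4` (the text only says the ID is uniquely generated, not how).

```python
import uuid


class ClusterNode:
    """A cluster member with an ID generated once at startup (hypothetical)."""

    def __init__(self, role, endpoint):
        self.role = role
        self.endpoint = endpoint
        # uuid4 is an assumed ID scheme; the text only says "uniquely generated".
        self.node_id = f"{role}-{uuid.uuid4()}"


class NodeRegistry:
    """Maps node IDs to endpoints so peers can address each other by ID."""

    def __init__(self):
        self._endpoints = {}

    def register(self, node):
        if node.node_id in self._endpoints:
            raise ValueError(f"duplicate node ID: {node.node_id}")
        self._endpoints[node.node_id] = node.endpoint

    def endpoint_of(self, node_id):
        return self._endpoints[node_id]


registry = NodeRegistry()
coordinator = ClusterNode("coordinator", "tcp://10.0.0.1:8529")
primary = ClusterNode("primary", "tcp://10.0.0.2:8629")
for node in (coordinator, primary):
    registry.register(node)

# Peers only need the ID; the registry resolves it to an endpoint.
print(registry.endpoint_of(primary.node_id))
```

The point of the indirection is that a node can change its network address without the rest of the cluster having to change how it refers to that node.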
!SUBSECTION Roles in an ArangoDB cluster

In an ArangoDB cluster there are four distinct roles: Agents, Coordinators, Primaries and Secondaries. The following sections describe each of them.
!SUBSUBSECTION Agents

One or more Agents form the Agency of an ArangoDB cluster. The Agency is the central place to store the cluster configuration; without it, none of the other components can operate. Although generally invisible from the outside, it is the heart of the cluster. Fault tolerance is therefore a must for the Agency. To achieve it, the Agents use the [Raft Consensus Algorithm](https://raft.github.io/), which formally guarantees conflict-free configuration management within the ArangoDB cluster.

At its core, the Agency manages a large configuration tree and supports transactional read and write operations on this tree.
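To make the idea of transactional operations on a configuration tree more concrete, here is a minimal single-process sketch in Python. It is not the Agency's API: the class `ConfigTree`, its methods and the path names are hypothetical, and consensus between Agents (Raft) is left out entirely. It only shows a write transaction guarded by preconditions that is applied completely or not at all.

```python
import copy


class ConfigTree:
    """Toy in-memory configuration tree (hypothetical, not the Agency API)."""

    def __init__(self):
        self._root = {}

    def read(self, path):
        """Return a copy of the value at a slash-separated path, or None."""
        node = self._root
        for part in path.strip("/").split("/"):
            if not isinstance(node, dict) or part not in node:
                return None
            node = node[part]
        return copy.deepcopy(node)

    def transact(self, writes, preconditions=None):
        """Apply all writes together if every precondition holds.

        Both arguments map paths to values. Returns True when the
        transaction was applied and False when it was rejected.
        """
        for path, expected in (preconditions or {}).items():
            if self.read(path) != expected:
                return False
        # Stage the changes on a copy and swap it in at the end.
        staged = copy.deepcopy(self._root)
        for path, value in writes.items():
            parts = path.strip("/").split("/")
            node = staged
            for part in parts[:-1]:
                child = node.get(part)
                if not isinstance(child, dict):
                    child = {}
                    node[part] = child
                node = child
            node[parts[-1]] = copy.deepcopy(value)
        self._root = staged
        return True


tree = ConfigTree()
tree.transact({"/plan/version": 1, "/plan/coordinators/c1": "tcp://10.0.0.1:8529"})
# Compare-and-swap: bump the version only if it is still 1.
applied = tree.transact({"/plan/version": 2}, preconditions={"/plan/version": 1})
# The same precondition no longer holds, so this write is rejected.
stale = tree.transact({"/plan/version": 3}, preconditions={"/plan/version": 1})
print(applied, stale, tree.read("/plan/version"))  # True False 2
```

In a real Agency the decision to accept or reject a transaction would additionally have to be agreed on by the Agents, which is the part Raft provides.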
!SUBSUBSECTION Coordinators

Coordinators should be accessible from the outside; they are the nodes that clients talk to. They coordinate cluster tasks such as executing queries and running Foxx applications. They know where the data is stored and optimize where to run user-supplied queries or parts thereof.
!SUBSUBSECTION Primaries

Primaries are the nodes that actually host the data. They host shards of data, and with synchronous replication a Primary may be either leader or follower for a given shard.

Primaries should not be accessed from the outside, only indirectly through the Coordinators. They may also execute queries, in part or as a whole, when asked to by a Coordinator.
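The per-shard leader and follower roles can be sketched as follows. This is a simplified model under stated assumptions, not ArangoDB's replication protocol: `Primary` and `ReplicatedShard` are hypothetical names, and "synchronous" is modelled simply as applying a write to the leader and all followers before the write returns.

```python
class Primary:
    """A data node holding copies of one or more shards (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.shards = {}  # shard id -> {document key: document}

    def apply(self, shard_id, key, document):
        self.shards.setdefault(shard_id, {})[key] = document


class ReplicatedShard:
    """One shard with a leader and followers that are written in lockstep."""

    def __init__(self, shard_id, leader, followers):
        self.shard_id = shard_id
        self.leader = leader
        self.followers = list(followers)

    def write(self, key, document):
        # The write only returns after every copy has applied it.
        self.leader.apply(self.shard_id, key, document)
        for follower in self.followers:
            follower.apply(self.shard_id, key, document)
        return 1 + len(self.followers)  # number of copies written


p1, p2 = Primary("p1"), Primary("p2")
# The same Primary is leader for one shard and follower for another.
s1 = ReplicatedShard("s1", leader=p1, followers=[p2])
s2 = ReplicatedShard("s2", leader=p2, followers=[p1])
copies = s1.write("a", {"v": 1})
s2.write("b", {"v": 2})
print(copies, p1.shards == p2.shards)  # 2 True
```

Because the role is assigned per shard rather than per node, leadership (and with it the write load) can be spread over all Primaries.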
!SUBSUBSECTION Secondaries

Secondaries are asynchronous followers of Primaries. They are well suited for backups because they do not interfere with normal cluster operation.
!SUBSECTION Sharding

Using the roles outlined above, an ArangoDB cluster can distribute data in so-called shards across multiple Primaries. From the outside this process is fully transparent, and it achieves the goals of what other systems call "master-master replication". In an ArangoDB cluster you can talk to any Coordinator; whenever you read or write data, it automatically figures out where the data is stored (read) or is to be stored (write). The information about the shards is shared among the Coordinators via the Agency.
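The routing step a Coordinator performs can be illustrated with a small Python sketch. The hashing scheme, the `Coordinator` class and the shard and Primary names below are assumptions chosen for illustration; the text does not specify how ArangoDB maps a document to a shard. The shared `shard_map` stands in for the information the Coordinators obtain from the Agency.

```python
import hashlib


def shard_for(key, shard_ids):
    """Pick a shard for a document key using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shard_ids[int.from_bytes(digest[:8], "big") % len(shard_ids)]


class Coordinator:
    """Routes reads and writes to the Primary responsible for a key."""

    def __init__(self, shard_map, primaries):
        self.shard_map = shard_map  # shard id -> Primary name
        self.primaries = primaries  # Primary name -> {shard id: {key: doc}}
        self.shard_ids = sorted(shard_map)

    def _locate(self, key):
        shard = shard_for(key, self.shard_ids)
        return shard, self.primaries[self.shard_map[shard]]

    def write(self, key, document):
        shard, store = self._locate(key)
        store.setdefault(shard, {})[key] = document
        return shard

    def read(self, key):
        shard, store = self._locate(key)
        return store.get(shard, {}).get(key)


shard_map = {"s1": "primaryA", "s2": "primaryB", "s3": "primaryA"}
primaries = {"primaryA": {}, "primaryB": {}}
c1 = Coordinator(shard_map, primaries)
c2 = Coordinator(shard_map, primaries)

# Write through one Coordinator, read through another.
written_to = c1.write("user/42", {"name": "Ada"})
print(written_to, c2.read("user/42"))
```

Since both Coordinators work from the same shard map and the same deterministic hash, a client gets the same answer no matter which Coordinator it talks to.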
!SUBSECTION Authentication

As of version 3.0, ArangoDB authentication is **NOT** supported within a cluster. You **HAVE** to secure your cluster properly against the outside. Most setups will have a secured data center anyway, with ArangoDB accessed from the outside only via an application. Only the Coordinators need to be made available to this application. To isolate the cluster even further, you can install a reverse proxy such as haproxy or nginx in front of the Coordinators (which also allows easy access from the application).