
Doc - Replication Refactor - Part 1 (#4555)

Next steps after DC2DC and Cluster doc improvements:

- We refactor replication sections and make a more intuitive separation between Master/Slave and the new Active Failover in 3.3
- We create corresponding sections for Master/Slave and Active Failover in the Administration and Deployment chapters, as well as in the Scalability chapter, where these "modes" are introduced
- We touch and improve the "Architecture" chapter as well, where some architecture info has to be placed
- We reorganize the TOC into a more "logical" order:
-- Deployment
-- Administration
-- Security
-- Monitoring
-- Troubleshooting
- We add parts to the TOC
- We add a TOC per page, using the page-toc plugin
- We also put the "Scalability" and "Architecture" chapters close together, as a preliminary step for further improvements / aggregation
- We improve swagger

Internal Ref:
- https://github.com/arangodb/planning/issues/1692
- https://github.com/arangodb/planning/issues/1655
- https://github.com/arangodb/planning/issues/1858
- https://github.com/arangodb/planning/issues/973 (partial fix)
- https://github.com/arangodb/planning/issues/1498 (partial fix)
sleto-it 2018-02-28 12:23:19 +01:00 committed by GitHub
parent ab0cb34398
commit 0ba532b16a
360 changed files with 5549 additions and 3115 deletions

View File

@ -227,7 +227,7 @@ saves even more precious CPU cycles and gives the optimizer more alternatives.
Index usage
-----------
Especially on joins you should [make sure indices can be used to speed up your query.](../../ExecutionAndPerformance/ExplainingQueries.md)
Especially on joins you should [make sure indices can be used to speed up your query.](../ExecutionAndPerformance/ExplainingQueries.md)
Please note that sparse indices don't qualify for joins:
In joins you typically would also want to join documents not containing the property

View File

@ -1,6 +1,7 @@
# Summary
* [Introduction](README.md)
* [Tutorial](Tutorial/README.md)
* [Basic CRUD](Tutorial/CRUD.md)
* [Matching documents](Tutorial/Filter.md)

View File

@ -15,7 +15,7 @@
"ga",
"callouts@git+https://github.com/Simran-B/gitbook-plugin-callouts.git",
"edit-link",
"page-toc",
"page-toc@git+https://github.com/Simran-B/gitbook-plugin-page-toc.git",
"localized-footer"
],
"pdf": {

View File

@ -19,7 +19,7 @@ div.example_show_button {
background-color: rgba(240,240,0,0.4);
}
.book .book-body li:last-child {
.book .book-body section > ul li:last-child {
margin-bottom: 0.85em;
}

View File

@ -1,4 +1,5 @@
# Summary
* [Introduction](README.md)
* [Modelling Document Inheritance](DocumentInheritance.md)
* [Accessing Shapes Data](AccessingShapesData.md)
* [AQL](AQL/README.md)

View File

@ -15,7 +15,7 @@
"ga",
"callouts@git+https://github.com/Simran-B/gitbook-plugin-callouts.git",
"edit-link",
"page-toc",
"page-toc@git+https://github.com/Simran-B/gitbook-plugin-page-toc.git",
"localized-footer"
],
"pdf": {

View File

@ -19,7 +19,7 @@ div.example_show_button {
background-color: rgba(240,240,0,0.4);
}
.book .book-body li:last-child {
.book .book-body section > ul li:last-child {
margin-bottom: 0.85em;
}

View File

@ -7,36 +7,36 @@ monitoring of the server.
<!-- lib/Admin/RestAdminLogHandler.cpp -->
@startDocuBlock JSF_get_admin_log
@startDocuBlock get_admin_log
@startDocuBlock JSF_get_admin_loglevel
@startDocuBlock get_admin_loglevel
@startDocuBlock JSF_put_admin_loglevel
@startDocuBlock put_admin_loglevel
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_get_admin_routing_reloads
@startDocuBlock get_admin_routing_reloads
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_get_admin_statistics
@startDocuBlock get_admin_statistics
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_get_admin_statistics_description
@startDocuBlock get_admin_statistics_description
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_get_admin_server_role
@startDocuBlock get_admin_server_role
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_get_admin_server_id
@startDocuBlock get_admin_server_id
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_statistics_GET
@startDocuBlock get_cluster_statistics

View File

@ -12,7 +12,7 @@ inspect it and return meta information about it.
<!-- js/actions/api-explain.js -->
@startDocuBlock JSF_post_api_explain
@startDocuBlock post_api_explain
@startDocuBlock PostApiQueryProperties

View File

@ -2,10 +2,10 @@ Accessing Cursors via HTTP
==========================
<!-- js/actions/api-cursor.js -->
@startDocuBlock JSF_post_api_cursor
@startDocuBlock post_api_cursor
<!-- js/actions/api-cursor.js -->
@startDocuBlock JSF_post_api_cursor_identifier
@startDocuBlock post_api_cursor_identifier
<!-- js/actions/api-cursor.js -->
@startDocuBlock JSF_post_api_cursor_delete
@startDocuBlock post_api_cursor_delete

View File

@ -1,8 +1,8 @@
HTTP Interface for AQL User Functions Management
================================================
### AQL User Functions Management
AQL User Functions Management
-----------------------------
This is an introduction to ArangoDB's HTTP interface for managing AQL
user functions. AQL user functions are a means to extend the functionality
of ArangoDB's query language (AQL) with user-defined JavaScript code.
@ -18,10 +18,10 @@ system collection *_aqlfunctions*. Documents in this collection should not
be accessed directly, but only via the dedicated interfaces.
<!-- js/actions/api-aqlfunction.js -->
@startDocuBlock JSF_post_api_aqlfunction
@startDocuBlock post_api_aqlfunction
<!-- js/actions/api-aqlfunction.js -->
@startDocuBlock JSF_delete_api_aqlfunction
@startDocuBlock delete_api_aqlfunction
<!-- js/actions/api-aqlfunction.js -->
@startDocuBlock JSF_get_api_aqlfunction
@startDocuBlock get_api_aqlfunction

View File

@ -122,9 +122,9 @@ rejected instantly in the same way as a "regular", non-queued request.
Managing Async Results via HTTP
-------------------------------
@startDocuBlock JSF_job_fetch_result
@startDocuBlock JSF_job_cancel
@startDocuBlock JSF_job_delete
@startDocuBlock JSF_job_getStatusById
@startDocuBlock JSF_job_getByType
@startDocuBlock job_fetch_result
@startDocuBlock job_cancel
@startDocuBlock job_delete
@startDocuBlock job_getStatusById
@startDocuBlock job_getByType

View File

@ -209,4 +209,4 @@ part operations of a batch. When doing so, any other database name used
in a batch part will be ignored.
@startDocuBlock JSF_batch_processing
@startDocuBlock batch_processing

View File

@ -42,5 +42,5 @@ returned.
@startDocuBlock JSF_import_document
@startDocuBlock JSF_import_json
@startDocuBlock import_document
@startDocuBlock import_json

View File

@ -3,10 +3,10 @@ Creating and Deleting Collections
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_post_api_collection
@startDocuBlock post_api_collection
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_delete_api_collection
@startDocuBlock delete_api_collection
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_truncate
@startDocuBlock put_api_collection_truncate

View File

@ -2,22 +2,22 @@ Getting Information about a Collection
======================================
<!-- js/actions/api-collection.js -->
@startDocuBlock JSA_get_api_collection_name
@startDocuBlock get_api_collection_name
<!-- js/actions/api-collection.js -->
@startDocuBlock JSA_get_api_collection_properties
@startDocuBlock get_api_collection_properties
<!-- js/actions/api-collection.js -->
@startDocuBlock JSA_get_api_collection_count
@startDocuBlock get_api_collection_count
<!-- js/actions/api-collection.js -->
@startDocuBlock JSA_get_api_collection_figures
@startDocuBlock get_api_collection_figures
<!-- js/actions/api-collection.js -->
@startDocuBlock JSA_get_api_collection_revision
@startDocuBlock get_api_collection_revision
<!-- js/actions/api-collection.js -->
@startDocuBlock JSA_get_api_collection_checksum
@startDocuBlock get_api_collection_checksum
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_get_api_collections
@startDocuBlock get_api_collections

View File

@ -2,19 +2,19 @@ Modifying a Collection
======================
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_load
@startDocuBlock put_api_collection_load
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_unload
@startDocuBlock put_api_collection_unload
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_load_indexes_into_memory
@startDocuBlock put_api_collection_load_indexes_into_memory
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_properties
@startDocuBlock put_api_collection_properties
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_rename
@startDocuBlock put_api_collection_rename
<!-- js/actions/api-collection.js -->
@startDocuBlock JSF_put_api_collection_rotate
@startDocuBlock put_api_collection_rotate

View File

@ -15,16 +15,16 @@ Managing Databases using HTTP
-----------------------------
<!-- js/actions/api-database.js -->
@startDocuBlock JSF_get_api_database_current
@startDocuBlock get_api_database_current
<!-- js/actions/api-database.js -->
@startDocuBlock JSF_get_api_database_user
@startDocuBlock get_api_database_user
<!-- js/actions/api-database.js -->
@startDocuBlock JSF_get_api_database_list
@startDocuBlock get_api_database_list
<!-- js/actions/api-database.js -->
@startDocuBlock JSF_get_api_database_new
@startDocuBlock get_api_database_new
<!-- js/actions/api-database.js -->
@startDocuBlock JSF_get_api_database_delete
@startDocuBlock get_api_database_delete

View File

@ -2,7 +2,7 @@ Working with Documents using REST
=================================
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_READ
@startDocuBlock get_read_document
#### Changes in 3.0 from 2.8:
@ -10,7 +10,7 @@ The *rev* query parameter has been withdrawn. The same effect can be
achieved with the *If-Match* HTTP header.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_READ_HEAD
@startDocuBlock head_read_document_header
#### Changes in 3.0 from 2.8:
@ -18,7 +18,7 @@ The *rev* query parameter has been withdrawn. The same effect can be
achieved with the *If-Match* HTTP header.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_READ_ALL
@startDocuBlock put_read_all_documents
#### Changes in 3.0 from 2.8:
@ -27,7 +27,7 @@ way with the URL path */_api/document* and the required query parameter
*collection* still works.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_CREATE
@startDocuBlock post_create_document
#### Changes in 3.0 from 2.8:
@ -38,7 +38,7 @@ with one operation is new and the query parameter *returnNew* has been added.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_REPLACE
@startDocuBlock put_replace_document
#### Changes in 3.0 from 2.8:
@ -62,14 +62,14 @@ way with the URL path */_api/document* and the required query parameter
*collection* still works.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_REPLACE_MULTI
@startDocuBlock put_replace_document_MULTI
#### Changes in 3.0 from 2.8:
The multi document version is new in 3.0.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_UPDATE
@startDocuBlock patch_update_document
#### Changes in 3.0 from 2.8:
@ -93,14 +93,14 @@ way with the URL path */_api/document* and the required query parameter
*collection* still works.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_UPDATE_MULTI
@startDocuBlock patch_update_document_MULTI
#### Changes in 3.0 from 2.8:
The multi document version is new in 3.0.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_DELETE
@startDocuBlock delete_remove_document
#### Changes in 3.0 from 2.8:
@ -117,7 +117,7 @@ combination of *If-Match* given and *policy=last* no longer works, but can
easily be achieved by leaving out the *If-Match* header.
<!-- arangod/RestHandler/RestDocumentHandler.cpp -->
@startDocuBlock REST_DOCUMENT_DELETE_MULTI
@startDocuBlock delete_remove_document_MULTI
#### Changes in 3.0 from 2.8:

View File

@ -13,4 +13,4 @@ Use the general document
for create/read/update/delete.
<!-- Rest/Graph edges -->
@startDocuBlock API_EDGE_READINOUTBOUND
@startDocuBlock get_read_in_out_edges

View File

@ -19,7 +19,7 @@ Asking about Endpoints via HTTP
---------------------------
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_get_api_cluster_endpoints
@startDocuBlock get_api_cluster_endpoints
<!-- arangod/RestHandler/RestEndpointHandler.h -->
@startDocuBlock JSF_get_api_endpoint
@startDocuBlock get_api_endpoint

View File

@ -1,4 +1,4 @@
HTTP Interface for Exporting Documents
======================================
@startDocuBlock JSF_post_api_export
@startDocuBlock post_api_export

View File

@ -6,16 +6,16 @@ of the [graph module](../../Manual/Graphs/index.html) on the [knows graph](../..
![Social Example Graph](../../Manual/Graphs/knows_graph.png)
@startDocuBlock JSF_general_graph_edge_create_http_examples
@startDocuBlock general_graph_edge_create_http_examples
@startDocuBlock JSF_general_graph_edge_get_http_examples
@startDocuBlock general_graph_edge_get_http_examples
Examples will explain the API on the [social graph](../../Manual/Graphs/index.html#the-social-graph):
![Social Example Graph](../../Manual/Graphs/social_graph.png)
@startDocuBlock JSF_general_graph_edge_modify_http_examples
@startDocuBlock general_graph_edge_modify_http_examples
@startDocuBlock JSF_general_graph_edge_replace_http_examples
@startDocuBlock general_graph_edge_replace_http_examples
@startDocuBlock JSF_general_graph_edge_delete_http_examples
@startDocuBlock general_graph_edge_delete_http_examples

View File

@ -6,14 +6,14 @@ Examples will explain the REST API on the [social graph](../../Manual/Graphs/ind
![Social Example Graph](../../Manual/Graphs/social_graph.png)
@startDocuBlock JSF_general_graph_list_http_examples
@startDocuBlock JSF_general_graph_create_http_examples
@startDocuBlock JSF_general_graph_get_http_examples
@startDocuBlock JSF_general_graph_drop_http_examples
@startDocuBlock JSF_general_graph_list_vertex_http_examples
@startDocuBlock JSF_general_graph_vertex_collection_add_http_examples
@startDocuBlock JSF_general_graph_vertex_collection_remove_http_examples
@startDocuBlock JSF_general_graph_list_edge_http_examples
@startDocuBlock JSF_general_graph_edge_definition_add_http_examples
@startDocuBlock JSF_general_graph_edge_definition_modify_http_examples
@startDocuBlock JSF_general_graph_edge_definition_remove_http_examples
@startDocuBlock general_graph_list_http_examples
@startDocuBlock general_graph_create_http_examples
@startDocuBlock general_graph_get_http_examples
@startDocuBlock general_graph_drop_http_examples
@startDocuBlock general_graph_list_vertex_http_examples
@startDocuBlock general_graph_vertex_collection_add_http_examples
@startDocuBlock general_graph_vertex_collection_remove_http_examples
@startDocuBlock general_graph_list_edge_http_examples
@startDocuBlock general_graph_edge_definition_add_http_examples
@startDocuBlock general_graph_edge_definition_modify_http_examples
@startDocuBlock general_graph_edge_definition_remove_http_examples

View File

@ -6,8 +6,8 @@ on the [social graph](../../Manual/Graphs/index.html#the-social-graph):
![Social Example Graph](../../Manual/Graphs/social_graph.png)
@startDocuBlock JSF_general_graph_vertex_create_http_examples
@startDocuBlock JSF_general_graph_vertex_get_http_examples
@startDocuBlock JSF_general_graph_vertex_modify_http_examples
@startDocuBlock JSF_general_graph_vertex_replace_http_examples
@startDocuBlock JSF_general_graph_vertex_delete_http_examples
@startDocuBlock general_graph_vertex_create_http_examples
@startDocuBlock general_graph_vertex_get_http_examples
@startDocuBlock general_graph_vertex_modify_http_examples
@startDocuBlock general_graph_vertex_replace_http_examples
@startDocuBlock general_graph_vertex_delete_http_examples

View File

@ -5,7 +5,7 @@ If a [fulltext index](../../Manual/Appendix/Glossary.html#fulltext-index) exists
/_api/simple/fulltext will use this index to execute the specified fulltext query.
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index_fulltext
@startDocuBlock post_api_index_fulltext
<!-- js/actions/api-index.js -->
@startDocuBlock JSA_put_api_simple_fulltext
@startDocuBlock put_api_simple_fulltext

View File

@ -2,10 +2,10 @@ Working with Geo Indexes
========================
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index_geo
@startDocuBlock post_api_index_geo
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_near
@startDocuBlock put_api_simple_near
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_within
@startDocuBlock put_api_simple_within

View File

@ -4,10 +4,10 @@ Working with Hash Indexes
If a suitable hash index exists, then */_api/simple/by-example* will use this index to execute a query-by-example.
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index_hash
@startDocuBlock post_api_index_hash
<!-- js/actions/api-index.js -->
@startDocuBlock JSA_put_api_simple_by_example
@startDocuBlock put_api_simple_by_example
<!-- js/actions/api-index.js -->
@startDocuBlock JSA_put_api_simple_first_example
@startDocuBlock put_api_simple_first_example

View File

@ -5,4 +5,4 @@ If a suitable persistent index exists, then /_api/simple/range and other operati
will use this index to execute queries.
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index_persistent
@startDocuBlock post_api_index_persistent

View File

@ -5,4 +5,4 @@ If a suitable skip-list index exists, then /_api/simple/range and other operatio
will use this index to execute queries.
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index_skiplist
@startDocuBlock post_api_index_skiplist

View File

@ -2,13 +2,13 @@ Working with Indexes using HTTP
===============================
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_get_api_reads_index
@startDocuBlock get_api_reads_index
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index
@startDocuBlock post_api_index
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_post_api_index_delete
@startDocuBlock post_api_index_delete
<!-- js/actions/api-index.js -->
@startDocuBlock JSF_get_api_index
@startDocuBlock get_api_index

View File

@ -4,31 +4,34 @@ HTTP Interface for Miscellaneous functions
This is an overview of ArangoDB's HTTP interface for miscellaneous functions.
<!-- lib/Admin/RestVersionHandler.cpp -->
@startDocuBlock JSF_get_api_return
@startDocuBlock get_api_return
<!-- lib/Admin/RestEngineHandler.cpp -->
@startDocuBlock JSF_get_engine
@startDocuBlock get_engine
<!-- ljs/actions/api-system.js -->
@startDocuBlock JSF_put_admin_wal_flush
@startDocuBlock put_admin_wal_flush
<!-- ljs/actions/api-system.js -->
@startDocuBlock JSF_get_admin_wal_properties
@startDocuBlock get_admin_wal_properties
<!-- ljs/actions/api-system.js -->
@startDocuBlock JSF_put_admin_wal_properties
@startDocuBlock put_admin_wal_properties
<!-- ljs/actions/api-system.js -->
@startDocuBlock JSF_get_admin_wal_transactions
@startDocuBlock get_admin_wal_transactions
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_get_admin_time
@startDocuBlock get_admin_time
@startDocuBlock JSF_get_admin_database_version
<!-- js/actions/api-system.js -->
@startDocuBlock post_admin_echo
@startDocuBlock get_admin_database_version
<!-- lib/Admin/RestShutdownHandler.cpp -->
@startDocuBlock JSF_get_api_initiate
@startDocuBlock delete_api_shutdown
<!-- js/actions/api-system.js -->
@startDocuBlock JSF_post_admin_execute
@startDocuBlock post_admin_execute

View File

@ -2,4 +2,4 @@ Other Replication Commands
==========================
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_serverID
@startDocuBlock put_api_replication_serverID

View File

@ -5,20 +5,20 @@ The applier commands allow to remotely start, stop, and query the state and
configuration of an ArangoDB database's replication applier.
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_applier
@startDocuBlock put_api_replication_applier
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_applier_adjust
@startDocuBlock put_api_replication_applier_adjust
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_applier_start
@startDocuBlock put_api_replication_applier_start
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_applier_stop
@startDocuBlock put_api_replication_applier_stop
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_get_api_replication_applier_state
@startDocuBlock get_api_replication_applier_state
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_makeSlave
@startDocuBlock put_api_replication_makeSlave

View File

@ -8,17 +8,17 @@ to either start a full or a partial synchronization of data, e.g. to initiate a
or the incremental data synchronization.
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_inventory
@startDocuBlock put_api_replication_inventory
The *batch* method will create a snapshot of the current state that then can be
dumped. A batchId is required when using the dump api with rocksdb.
@startDocuBlock JSF_post_batch_replication
@startDocuBlock post_batch_replication
@startDocuBlock JSF_delete_batch_replication
@startDocuBlock delete_batch_replication
@startDocuBlock JSF_put_batch_replication
@startDocuBlock put_batch_replication
The *dump* method can be used to fetch data from a specific collection. As the
@ -36,11 +36,11 @@ To get to an identical state of data, replication clients should apply the indiv
parts of the dump results in the same order as they are provided.
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_get_api_replication_dump
@startDocuBlock get_api_replication_dump
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_put_api_replication_synchronize
@startDocuBlock put_api_replication_synchronize
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_get_api_replication_cluster_inventory
@startDocuBlock get_api_replication_cluster_inventory

View File

@ -11,7 +11,7 @@ of the logger and to fetch the latest changes written by the logger. The operati
will return the state and data from the write-ahead log.
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_get_api_replication_logger_return_state
@startDocuBlock get_api_replication_logger_return_state
To query the latest changes logged by the replication logger, the HTTP interface
also provides the `logger-follow` method.
@ -20,14 +20,14 @@ This method should be used by replication clients to incrementally fetch updates
from an ArangoDB database.
<!-- arangod/RestHandler/RestReplicationHandler.cpp -->
@startDocuBlock JSF_get_api_replication_logger_returns
@startDocuBlock get_api_replication_logger_returns
To check what range of changes is available (identified by tick values), the HTTP
interface provides the methods `logger-first-tick` and `logger-tick-ranges`.
Replication clients can use the methods to determine if certain data (identified
by a tick *date*) is still available on the master.
@startDocuBlock JSF_get_api_replication_logger_first_tick
@startDocuBlock get_api_replication_logger_first_tick
@startDocuBlock JSF_get_api_replication_logger_tick_ranges
@startDocuBlock get_api_replication_logger_tick_ranges

View File

@ -1,5 +1,6 @@
# Summary
* [Introduction](README.md)
* [General HTTP Handling](General/README.md)
* [HTTP Interface](Api/README.md)
* [Databases](Database/README.md)

View File

@ -5,34 +5,34 @@ Sharding only should be used by developers!
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_test_GET
@startDocuBlock get_cluster_test
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_test_POST
@startDocuBlock post_cluster_test
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_test_PUT
@startDocuBlock put_cluster_test
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_test_DELETE
@startDocuBlock delete_cluster_test
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_test_PATCH
@startDocuBlock patch_cluster_test
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_test_HEAD
@startDocuBlock head_cluster_test
<!-- js/actions/api-cluster.js -->
@startDocuBlock JSF_cluster_check_port_GET
@startDocuBlock get_cluster_check_port

View File

@ -29,43 +29,43 @@ be used with the cursor API to fetch any outstanding results from the server and
dispose the server-side cursor afterwards.
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_all
@startDocuBlock put_api_simple_all
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_by_example
@startDocuBlock put_api_simple_by_example
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_first_example
@startDocuBlock put_api_simple_first_example
<!-- arangod/RestHandler/RestSimpleHandler.cpp -->
@startDocuBlock RestLookupByKeys
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_any
@startDocuBlock put_api_simple_any
<!-- arangod/RestHandler/RestSimpleHandler.cpp -->
@startDocuBlock RestRemoveByKeys
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_remove_by_example
@startDocuBlock put_api_simple_remove_by_example
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_replace_by_example
@startDocuBlock put_api_simple_replace_by_example
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_update_by_example
@startDocuBlock put_api_simple_update_by_example
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_range
@startDocuBlock put_api_simple_range
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_near
@startDocuBlock put_api_simple_near
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_within
@startDocuBlock put_api_simple_within
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_within_rectangle
@startDocuBlock put_api_simple_within_rectangle
<!-- js/actions/api-simple.js -->
@startDocuBlock JSA_put_api_simple_fulltext
@startDocuBlock put_api_simple_fulltext

View File

@ -5,12 +5,12 @@ Following you have ArangoDB's HTTP Interface for Tasks.
There are also some examples provided for every API action.
@startDocuBlock JSF_get_api_tasks_all
@startDocuBlock get_api_tasks_all
@startDocuBlock JSF_get_api_tasks
@startDocuBlock get_api_tasks
@startDocuBlock JSF_post_api_new_tasks
@startDocuBlock post_api_new_tasks
@startDocuBlock JSF_put_api_new_tasks
@startDocuBlock put_api_new_tasks
@startDocuBlock JSF_delete_api_tasks
@startDocuBlock delete_api_tasks

View File

@ -20,4 +20,4 @@ For a more detailed description of how transactions work in ArangoDB please
refer to [Transactions](../../Manual/Transactions/index.html).
<!-- js/actions/api-transaction.js -->
@startDocuBlock JSF_post_api_transaction
@startDocuBlock post_api_transaction

View File

@ -18,7 +18,7 @@ are offered.
Executing Traversals via HTTP
-----------------------------
@startDocuBlock JSF_HTTP_API_TRAVERSAL
@startDocuBlock HTTP_API_TRAVERSAL
All examples were using this graph:

View File

@ -2,10 +2,10 @@ Creating and Modifying an ArangoSearch View
===========================================
<!-- js/actions/api-view.js -->
@startDocuBlock JSF_post_api_view_iresearch
@startDocuBlock post_api_view_iresearch
<!-- js/actions/api-view.js -->
@startDocuBlock JSF_put_api_view_properties_iresearch
@startDocuBlock put_api_view_properties_iresearch
<!-- js/actions/api-view.js -->
@startDocuBlock JSF_patch_api_view_properties_iresearch
@startDocuBlock patch_api_view_properties_iresearch

View File

@ -2,5 +2,5 @@ Deleting Views
===========================
<!-- js/actions/api-view.js -->
@startDocuBlock JSF_delete_api_view
@startDocuBlock delete_api_view

View File

@ -2,10 +2,10 @@ Getting Information about a View
================================
<!-- js/actions/api-view.js -->
@startDocuBlock JSA_get_api_view_name
@startDocuBlock get_api_view_name
<!-- js/actions/api-view.js -->
@startDocuBlock JSA_get_api_view_properties
@startDocuBlock get_api_view_properties
<!-- js/actions/api-view.js -->
@startDocuBlock JSF_get_api_views
@startDocuBlock get_api_views

View File

@ -2,4 +2,4 @@ Modifying a View
========================================
<!-- js/actions/api-view.js -->
@startDocuBlock JSF_put_api_view_rename
@startDocuBlock put_api_view_rename

View File

@ -15,7 +15,7 @@
"ga",
"callouts@git+https://github.com/Simran-B/gitbook-plugin-callouts.git",
"edit-link",
"page-toc",
"page-toc@git+https://github.com/Simran-B/gitbook-plugin-page-toc.git",
"localized-footer"
],
"pdf": {

View File

@ -19,7 +19,7 @@ div.example_show_button {
background-color: rgba(240,240,0,0.4);
}
.book .book-body li:last-child {
.book .book-body section > ul li:last-child {
margin-bottom: 0.85em;
}

View File

@ -0,0 +1,70 @@
Active Failover Administration
==============================
The _Active Failover_ setup requires almost no manual administration.
You may still need to replace, upgrade or remove individual nodes
in an _Active Failover_ setup.
Determining the current _Leader_
--------------------------------
It is possible to determine the _leader_ by asking any of the involved single-server
instances. Just send a request to the `/_api/cluster/endpoints` REST API.
```bash
curl http://server.domain.org:8530/_api/cluster/endpoints
{
"error": false,
"code": 200,
"endpoints": [
{
"endpoint": "tcp://[::1]:8530"
},
{
"endpoint": "tcp://[::1]:8531"
}
]
}
```
This API will return all available endpoints; the first endpoint is defined to
be the current _Leader_. The API is always available and will not be blocked
with an `HTTP/1.1 503 Service Unavailable` response on a _Follower_.
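For instance, a client could pick the _Leader_ programmatically. The following _arangosh_
snippet is only a sketch, assuming an authenticated `arango` connection to any of the
involved single-server instances:
```js
// Sketch: ask any instance for the endpoint list and treat the first
// entry as the current Leader.
var result = arango.GET("/_api/cluster/endpoints");
if (!result.error) {
  var leaderEndpoint = result.endpoints[0].endpoint;
  print("Current Leader: " + leaderEndpoint);
}
```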
Upgrading / Replacing / Removing a _Leader_
-------------------------------------------
A _Leader_ is the active server which can receive all read and write operations
in an _Active-Failover_ setup.
Upgrading or removing a _Leader_ can be a little tricky, because as soon as you
stop the leader's process you will trigger a failover. This may well be intended,
but you will probably want to halt all writes to the _leader_ for a certain
amount of time to allow the _follower_ to catch up on all operations.
After you have ensured that the _follower_ is sufficiently caught up, you can
stop the _leader_ process via the shutdown API or by sending a `SIGTERM` signal
to the process (i.e. `kill <process-id>`). This will trigger an orderly shutdown,
and should trigger an immediate switch to the _follower_. If your client drivers
are configured correctly, you should notice almost no interruption in your
applications.
Once you have upgraded the local server via the `--database.auto-upgrade` option,
you can add it again to the _Active Failover_ setup. The server will resync automatically
with the new _Leader_ and become a _Follower_.
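Sketched from _arangosh_, and only as an illustration (the exact restart command depends
on your installation and startup method), the procedure could look like this:
```js
// Sketch: trigger an orderly shutdown of the current Leader via the
// shutdown API (assumes arangosh is connected to the Leader).
arango.DELETE("/_admin/shutdown");
// The Follower now takes over. Upgrade the stopped server on the OS level,
// e.g. by starting it once with `--database.auto-upgrade true`, then start
// it normally again; it will resync and rejoin as a Follower.
```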
Upgrading / Replacing / Removing a _Follower_
---------------------------------------------
A _Follower_ is the passive server which tries to mirror all the data stored in
the _Leader_.
To upgrade a _follower_ you only need to stop the process and start it
with `--database.auto-upgrade`. The server process will automatically resync
with the _Leader_ after a restart.
The clean way of removing a _Follower_ is to first start a replacement _Follower_
(otherwise you will lose resiliency). To start a _Follower_, please have a look
at our [deployment guide](../../Deployment/ActiveFailover/README.md).
After you have your replacement ready you can just kill the process and remove it.

View File

@ -1,13 +1,118 @@
Cluster administration
Cluster Administration
======================
This Section includes information related to the administration of an ArangoDB Cluster.
This _Section_ includes information related to the administration of an ArangoDB Cluster.
For a general introduction to the ArangoDB Cluster, please refer to the Cluster [chapter](../../Scalability/Cluster/README.md).
For a general introduction to the ArangoDB Cluster, please refer to the
Cluster [chapter](../../Scalability/Cluster/README.md).
Replacing/Removing a Coordinator
Enabling synchronous replication
--------------------------------
For an introduction about _Synchronous Replication_ in Cluster, please refer
to the [_Cluster Architecture_](../../Scalability/Cluster/Architecture.md#synchronous-replication) section.
Synchronous replication can be enabled per _collection_. When creating a
_collection_ you may specify the number of _replicas_ using the
*replicationFactor* parameter. The default value is set to `1` which
effectively *disables* synchronous replication among _DBServers_.
Whenever you specify a _replicationFactor_ greater than 1 when creating a
collection, synchronous replication will be activated for this collection.
The Cluster will determine suitable _leaders_ and _followers_ for every
requested _shard_ (_numberOfShards_) within the Cluster.
Example:
```
127.0.0.1:8530@_system> db._create("test", {"replicationFactor": 3})
```
In the above case, any write operation will require 2 replicas to
report success from now on.
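To verify the effective settings of an existing _collection_, you can inspect its
properties from _arangosh_. This is a minimal sketch; in a Cluster the returned object
includes *replicationFactor* and *numberOfShards*:
```js
// Sketch: inspect the replication settings of the collection created above.
var props = db.test.properties();
print("replicationFactor: " + props.replicationFactor);
print("numberOfShards: " + props.numberOfShards);
```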
Preparing growth
----------------
You may create a _collection_ with a higher _replication factor_ than the number of
available _DBServers_. When additional _DBServers_ become available,
the _shards_ are automatically replicated to the newly available _DBServers_.
To create a _collection_ with a higher _replication factor_ than the number of
available _DBServers_, please set the option _enforceReplicationFactor_ to _false_
when creating the collection from _ArangoShell_ (the option is not available
from the web interface), e.g.:
```
db._create("test", { replicationFactor: 4 }, { enforceReplicationFactor: false });
```
The default value for _enforceReplicationFactor_ is true.
**Note:** multiple _replicas_ of the same _shard_ can never coexist on the same
_DBServer_ instance.
Sharding
--------
For an introduction about _Sharding_ in Cluster, please refer to the
[_Cluster Architecture_](../../Scalability/Cluster/Architecture.md#sharding) section.
The number of _shards_ can be configured at _collection_ creation time, e.g. via the UI
or the _ArangoDB Shell_:
```
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4});
```
To configure custom _hashing_ based on another attribute (the default is __key_):
```
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4, "shardKeys": ["country"]});
```
The example above, where 'country' has been used as _shardKeys_, can be useful
to keep the data of every country in one shard, which would result in better
performance for queries working on a per-country basis.
It is also possible to specify multiple `shardKeys`.
Note however that if you change the shard keys from their default `["_key"]`, then finding
a document in the collection by its primary key involves a request to
every single shard. Furthermore, in this case one can no longer prescribe
the primary key value of a new document but must use the automatically
generated one. This latter restriction comes from the fact that ensuring
uniqueness of the primary key would be very inefficient if the user
could specify the primary key.
On which DBServer in a Cluster a particular _shard_ is kept is undefined.
There is no option to configure an affinity based on certain _shard_ keys.
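As a small sketch of the restriction described above (collection and attribute names
are only illustrative): with a custom shard key, the document key is left to the server,
and lookups by `_key` have to contact every shard:
```js
// Sketch: collection sharded by "country"; the primary key (_key) must be
// generated by the server when custom shard keys are used.
var coll = db._create("customers", { numberOfShards: 4, shardKeys: ["country"] });
var doc = coll.insert({ name: "Jane", country: "DE" });
print(doc._key);   // automatically generated key
// Looking up this document by its _key now involves a request to every shard.
```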
Unique indexes (hash, skiplist, persistent) on sharded collections are
only allowed if the fields used to determine the shard key are also
included in the list of attribute paths for the index:
| shardKeys | indexKeys | |
|----------:|----------:|------------:|
| a | a | allowed |
| a | b | not allowed |
| a | a, b | allowed |
| a, b | a | not allowed |
| a, b | b | not allowed |
| a, b | a, b | allowed |
| a, b | a, b, c | allowed |
| a, b, c | a, b | not allowed |
| a, b, c | a, b, c | allowed |
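For example, following the table above (names are illustrative), a unique hash index is
allowed when its indexed fields contain the shard key:
```js
// Sketch: shardKeys = ["a"], indexKeys = ["a", "b"]  ->  allowed
var c = db._create("uniqueDemo", { numberOfShards: 4, shardKeys: ["a"] });
c.ensureIndex({ type: "hash", fields: ["a", "b"], unique: true });
// A unique index on ["b"] alone would be rejected for this collection.
```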
Moving/Rebalancing _shards_
---------------------------
A _shard_ can be moved from one _DBServer_ to another, and the entire shard distribution
can be rebalanced using the corresponding buttons in the web [UI](../WebInterface/Cluster.md).
Replacing/Removing a _Coordinator_
----------------------------------
_Coordinators_ are effectively stateless and can be replaced, added and
removed without more consideration than meeting the necessities of the
particular installation.
@ -26,8 +131,8 @@ integrated as a new _Coordinator_ into the cluster. You may also just
restart the _Coordinator_ as before and it will reintegrate itself into
the cluster.
Replacing/Removing a DBServer
-----------------------------
Replacing/Removing a _DBServer_
-------------------------------
_DBServers_ are where the data of an ArangoDB cluster is stored. They
do not publish a web UI and are not meant to be accessed by any other

View File

@ -1,22 +1,22 @@
Per-Database Setup
==================
This page describes the replication process based on a specific database within an ArangoDB instance.
This page describes the master/slave replication process based on a specific database within an ArangoDB instance.
That means that only the specified database will be replicated.
Setting up a working master-slave replication requires two ArangoDB instances:
* **master**: this is the instance that all data-modification operations should be directed to
* **slave**: on this instance, we'll start a replication applier, and this will fetch data from the
master database's write-ahead log and apply its operations locally
* **master**: this is the instance where all data-modification operations should be directed to
* **slave**: this is the instance that replicates the data from the master. We will start a _replication applier_ on it, and it will fetch data from the
master database's _write-ahead log_ and apply its operations locally
For the following example setup, we'll use the instance *tcp://master.domain.org:8529* as the
master, and the instance *tcp://slave.domain.org:8530* as a slave.
For the following example setup, we will use the instance *tcp://master.domain.org:8529* as the
_master_, and the instance *tcp://slave.domain.org:8530* as a _slave_.
The goal is to have all data from the database *_system* on master *tcp://master.domain.org:8529*
be replicated to the database *_system* on the slave *tcp://slave.domain.org:8530*.
The goal is to have all data from the database *_system* on _master_ *tcp://master.domain.org:8529*
be replicated to the database *_system* on the _slave_ *tcp://slave.domain.org:8530*.
On the **master**, nothing special needs to be done, as all write operations will automatically be
logged in the master's write-ahead log (WAL).
On the _master_, nothing special needs to be done, as all write operations will automatically be
logged in the master's _write-ahead log_ (WAL).
All-in-one setup
----------------

View File

@ -0,0 +1,8 @@
Master/Slave Administration
===========================
This _Chapter_ includes information related to the administration of a Master/Slave
environment.
For a general introduction to the ArangoDB Master/Slave environment, please refer
to the Master/Slave [chapter](../../Scalability/MasterSlave/README.md).

View File

@ -1,180 +1,33 @@
Components
==========
_Replication applier_
=====================
Replication configuration
-------------------------
Replication Logger
------------------
The replication is turned off by default. In order to create a master-slave setup,
the so-called _replication applier_ needs to be enabled on the _slave_ databases.
### Purpose
Replication is configured on a per-database level or, starting from 3.3.0, at server level.
The replication logger will write all data-modification operations into the write-ahead log.
This log may then be read by clients to replay any data modification on a different server.
The _replication applier_ on the _slave_ can be used to perform a one-time synchronization
with the _master_ (and then stop), or to perform an ongoing replication of changes. To
resume replication on _slave_ restart, the *autoStart* attribute of the replication
applier must be set to *true*.
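A minimal sketch of such an applier configuration from _arangosh_ (endpoint and
credentials are placeholders) could look like this:
```js
// Sketch: configure the replication applier of the current database on the
// slave so that it resumes automatically after a restart (autoStart: true).
var replication = require("@arangodb/replication");
replication.applier.properties({
  endpoint: "tcp://master.domain.org:8529",   // placeholder master endpoint
  username: "root",                           // placeholder credentials
  password: "",
  autoStart: true
});
```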
_setupReplication_ Command
--------------------------
### Checking the state
To copy the initial data from the _master_ to the _slave_ and start the
continuous replication, there is an all-in-one command *setupReplication*.
To query the current state of the logger, use the *state* command:
require("@arangodb/replication").logger.state();
The result might look like this:
```js
{
"state" : {
"running" : true,
"lastLogTick" : "133322013",
"totalEvents" : 16,
"time" : "2014-07-06T12:58:11Z"
},
"server" : {
"version" : "2.2.0-devel",
"serverId" : "40897075811372"
},
"clients" : {
}
}
```
The *running* attribute will always be true. In earlier versions of ArangoDB the replication was optional and this could have been *false*.
The *totalEvents* attribute indicates how many log events have been logged since the start
of the ArangoDB server. Finally, the *lastLogTick* value indicates the id of the last
operation that was written to the server's write-ahead log. It can be used to determine whether new
operations were logged, and is also used by the replication applier for incremental
fetching of data.
**Note**: The replication logger state can also be queried via the
[HTTP API](../../../../HTTP/Replications/index.html).
To query which data ranges are still available for replication clients to fetch,
the logger provides the *firstTick* and *tickRanges* functions:
require("@arangodb/replication").logger.firstTick();
This will return the minimum tick value that the server can provide to replication
clients via its replication APIs. The *tickRanges* function returns the minimum and
maximum tick values per logfile:
require("@arangodb/replication").logger.tickRanges();
Replication Applier
-------------------
### Purpose
The purpose of the replication applier is to read data from a master database's event log,
and apply them locally. The applier will check the master database for new operations periodically.
It will perform an incremental synchronization, i.e. only asking the master for operations
that occurred after the last synchronization.
The replication applier does not get notified by the master database when there are "new"
operations available, but instead uses the pull principle. It might thus take some time (the
so-called *replication lag*) before an operation from the master database gets shipped to and
applied in a slave database.
The replication applier of a database is run in a separate thread. It may encounter problems
when an operation from the master cannot be applied safely, or when the connection to the master
database goes down (network outage, master database is down or unavailable etc.). In this case,
the database's replication applier thread might terminate itself. It is then up to the
administrator to fix the problem and restart the database's replication applier.
If the replication applier cannot connect to the master database, or the communication fails at
some point during the synchronization, the replication applier will try to reconnect to
the master database. It will give up reconnecting only after a configurable amount of connection
attempts.
The replication applier state is queryable at any time by using the *state* command of the
applier. This will return the state of the applier of the current database:
```js
require("@arangodb/replication").applier.state();
```
The result might look like this:
```js
{
"state" : {
"running" : true,
"lastAppliedContinuousTick" : "152786205",
"lastProcessedContinuousTick" : "152786205",
"lastAvailableContinuousTick" : "152786205",
"progress" : {
"time" : "2014-07-06T13:04:57Z",
"message" : "fetching master log from offset 152786205",
"failedConnects" : 0
},
"totalRequests" : 38,
"totalFailedConnects" : 0,
"totalEvents" : 1,
"lastError" : {
"errorNum" : 0
},
"time" : "2014-07-06T13:04:57Z"
},
"server" : {
"version" : "2.2.0-devel",
"serverId" : "210189384542896"
},
"endpoint" : "tcp://master.example.org:8529",
"database" : "_system"
}
```
The *running* attribute indicates whether the replication applier of the current database
is currently running and polling the server at *endpoint* for new events.
The *progress.failedConnects* attribute shows how many failed connection attempts the replication
applier currently has encountered in a row. In contrast, the *totalFailedConnects* attribute
indicates how many failed connection attempts the applier has made in total. The
*totalRequests* attribute shows how many requests the applier has sent to the master database
in total. The *totalEvents* attribute shows how many log events the applier has read from the
master.
The *progress.message* sub-attribute provides a brief hint of what the applier currently does
(if it is running). The *lastError* attribute also has an optional *errorMessage* sub-attribute,
showing the latest error message. The *errorNum* sub-attribute of the *lastError* attribute can be
used by clients to programmatically check for errors. It should be *0* if there is no error, and
it should be non-zero if the applier terminated itself due to a problem.
Here is an example of the state after the replication applier terminated itself due to
(repeated) connection problems:
```js
{
"state" : {
"running" : false,
"progress" : {
"time" : "2014-07-06T13:14:37Z",
"message" : "applier stopped",
"failedConnects" : 6
},
"totalRequests" : 79,
"totalFailedConnects" : 11,
"totalEvents" : 0,
"lastError" : {
"time" : "2014-07-06T13:09:41Z",
"errorMessage" : "could not connect to master at tcp://master.example.org:8529: Could not connect to 'tcp:/...",
"errorNum" : 1400
},
...
}
}
```
**Note**: the state of a database's replication applier is queryable via the HTTP API, too.
Please refer to [HTTP Interface for Replication](../../../../HTTP/Replications/index.html) for more details.
### All-in-one setup
To copy the initial data from the **slave** to the master and start the
continuous replication, there is an all-in-one command *setupReplication*:
From _ArangoSH_:
```js
require("@arangodb/replication").setupReplication(configuration);
```
The following example demonstrates how to use the command for setting up replication
for the *_system* database. Note that it should be run on the slave and not the master:
for the *_system* database. Note that it should be run on the _slave_ and not the _master_:
```js
db._useDatabase("_system");
@ -193,17 +46,17 @@ The command will return when the initial synchronization is finished and the con
is started, or in case the initial synchronization has failed.
If the initial synchronization is successful, the command will store the given configuration on
the slave. It also configures the continuous replication to start automatically if the slave is
the _slave_. It also configures the continuous replication to start automatically if the slave is
restarted, i.e. *autoStart* is set to *true*.
If the command is run while the slave's replication applier is already running, it will first
stop the running applier, drop its configuration and do a resynchronization of data with the
master. It will then use the provided configration, overwriting any previously existing replication
_master_. It will then use the provided configuration, overwriting any previously existing replication
configuration on the slave.
### Starting and Stopping
### Starting and Stopping the _replication applier_
To manually start and stop the applier in the current database, the *start* and *stop* commands
To manually start and stop the _replication applier_ in the current database, the *start* and *stop* commands
can be used like this:
```js
@ -211,7 +64,7 @@ require("@arangodb/replication").applier.start(<tick>);
require("@arangodb/replication").applier.stop();
```
**Note**: Starting a replication applier without setting up an initial configuration will
**Note**: Starting a _replication applier_ without setting up an initial configuration will
fail. The replication applier will look for its configuration in a file named
*REPLICATION-APPLIER-CONFIG* in the current database's directory. If the file is not present,
ArangoDB will use some default configuration, but it cannot guess the endpoint (the address
@ -239,9 +92,9 @@ too. Thus stopping the replication applier on the slave manually should only be
is certainty that there are no ongoing transactions on the master.
### Configuration
### _Replication applier_ Configuration
To configure the replication applier of a specific database, use the *properties* command. Using
To configure the _replication applier_ of a specific database, use the *properties* command. Using
it without any arguments will return the applier's current configuration:
```js

View File

@ -4,19 +4,21 @@ Server-level Setup
This page describes the replication process based on a complete ArangoDB instance. That means that
all included databases will be replicated.
Setting up a working master-slave replication requires two ArangoDB instances:
* **master**: this is the instance that all data-modification operations should be directed to
* **slave**: on this instance, we'll start a replication applier, and this will fetch data from the
master database's write-ahead log and apply its operations locally
For the following example setup, we'll use the instance *tcp://master.domain.org:8529* as the
master, and the instance *tcp://slave.domain.org:8530* as a slave.
**Note:** Server-level Setup is available only from version 3.3.0.
The goal is to have all data of all databases on master *tcp://master.domain.org:8529*
be replicated to the slave instance *tcp://slave.domain.org:8530*.
Setting up a working master-slave replication requires two ArangoDB instances:
* **master**: this is the instance where all data-modification operations should be directed to
* **slave**: this is the instance that replicates the data from the master. We will start a _replication applier_ on it, and it will fetch data from the
master _write-ahead log_ and apply its operations locally
For the following example setup, we will use the instance *tcp://master.domain.org:8529* as the
_master_, and the instance *tcp://slave.domain.org:8530* as a _slave_.
The goal is to have all data of all databases on _master_ *tcp://master.domain.org:8529*
be replicated to the _slave_ instance *tcp://slave.domain.org:8530*.
On the **master**, nothing special needs to be done, as all write operations will automatically be
logged in the master's write-ahead log (WAL).
logged in the master's _write-ahead log_ (WAL).
All-in-one setup
----------------
@ -56,7 +58,7 @@ configuration on the **slave**.
Stopping synchronization
-----------------------
------------------------
The initial synchronization and continuous replication applier can also be started separately.
To start replication on the **slave**, make sure there currently is no replication applier running.

View File

@ -0,0 +1,2 @@
Setting up Replication in a _Master/Slave_ environment
======================================================

View File

@ -1,7 +1,7 @@
Syncing Collections
===================
In order to synchronize data for a single collection from a master to a slave instance, there
In order to synchronize data for a single collection from a _master_ to a _slave_ instance, there
is the *syncCollection* function:
It will fetch all documents of the specified collection from the master database and store

View File

@ -1,69 +0,0 @@
Asynchronous replication
========================
Asynchronous replication works by logging every data modification on a *master* and replaying these events on a number of *slaves*.
Transactions are honored in replication, i.e. transactional write operations will
become visible on slaves atomically.
As all write operations will be logged to a master database's write-ahead log, the
replication in ArangoDB currently cannot be used for write-scaling. The main purposes
of the replication in current ArangoDB are to provide read-scalability and "hot backups"
of specific databases.
It is possible to connect multiple slave databases to the same master database. Slave
databases should be used as read-only instances, and no user-initiated write operations
should be carried out on them. Otherwise data conflicts may occur that cannot be solved
automatically, and that will make the replication stop.
In an asynchronous replication scenario slaves will *pull* changes
from the master database. Slaves need to know to which master database they should
connect to, but a master database is not aware of the slaves that replicate from it.
When the network connection between the master database and a slave goes down, write
operations on the master can continue normally. When the network is up again, slaves
can reconnect to the master database and transfer the remaining changes. This will
happen automatically provided slaves are configured appropriately.
Replication lag
---------------
In this setup, write operations are applied first in the master database, and applied
in the slave database(s) afterwards.
For example, let's assume a write operation is executed in the master database
at point in time t0. To make a slave database apply the same operation, it must first
fetch the write operation's data from master database's write-ahead log, then parse it and
apply it locally. This will happen at some point in time after t0, let's say t1.
The difference between t1 and t0 is called the *replication lag*, and it is unavoidable
in asynchronous replication. The amount of replication lag depends on many factors, a
few of which are:
* the network capacity between the slaves and the master
* the load of the master and the slaves
* the frequency in which slaves poll the master for updates
Between t0 and t1, the state of data on the master is newer than the state of data
on the slave(s). At point in time t1, the state of data on the master and slave(s)
is consistent again (provided no new data modifications happened on the master in
between). Thus, the replication will lead to an *eventually consistent* state of data.
Replication configuration
-------------------------
The replication is turned off by default. In order to create a master-slave setup,
the so-called *replication applier* needs to be enabled on the slave databases.
Replication is configured on a per-database level. If multiple database are to be
replicated, the replication must be set up individually per database.
The replication applier on the slave can be used to perform a one-time synchronization
with the master (and then stop), or to perform an ongoing replication of changes. To
resume replication on slave restart, the *autoStart* attribute of the replication
applier must be set to *true*.
Replication overhead
--------------------
As the master servers are logging any write operation in the write-ahead-log anyway replication doesn't cause any extra overhead on the master. However it will of course cause some overhead for the master to serve incoming read requests of the slaves. Returning the requested data is however a trivial task for the master and should not result in a notable performance degration in production.

View File

@ -1,59 +0,0 @@
Introduction to Replication
===========================
Replication allows you to *replicate* data onto another machine. It
forms the base of all disaster recovery and failover features ArangoDB
offers.
ArangoDB offers **asynchronous** and **synchronous** replication,
depending on which type of arangodb deployment you are using.
Since ArangoDB 3.2 the *synchronous replication* replication is the *only* replication
type used in a cluster whereas the asynchronous replication is only available between
single-server nodes. Future versions of ArangoDB may reintroduce asynchronous
replication for the cluster.
We will describe pros and cons of each of them in the following
sections.
Asynchronous replication
------------------------
In ArangoDB any write operation will be logged to the write-ahead
log. When using Asynchronous replication slaves will connect to a
master and apply all the events from the log in the same order
locally. After that, they will have the same state of data as the
master database.
Synchronous replication
-----------------------
Synchronous replication only works within a cluster and is typically
used for mission critical data which must be accessible at all
times. Synchronous replication generally stores a copy of a shard's
data on another db server and keeps it in sync. Essentially, when storing
data after enabling synchronous replication the cluster will wait for
all replicas to write all the data before greenlighting the write
operation to the client. This will naturally increase the latency a
bit, since one more network hop is needed for each write. However, it
will enable the cluster to immediately fail over to a replica whenever
an outage has been detected, without losing any committed data, and
mostly without even signaling an error condition to the client.
Synchronous replication is organized such that every shard has a
leader and `r-1` followers, where `r` denotes the replication
factor. The number of followers can be controlled using the
`replicationFactor` parameter whenever you create a collection; the
`replicationFactor` parameter is the total number of copies being
kept, that is, one plus the number of followers.
Satellite collections
---------------------
Satellite collections are synchronously replicated collections having a dynamic replicationFactor.
They will replicate all data to all database servers allowing the database servers to join data
locally instead of doing heavy network operations.
Satellite collections are an enterprise only feature.

View File

@ -1,32 +0,0 @@
Configuration
=============
Requirements
------------
Synchronous replication requires an operational ArangoDB cluster.
Enabling synchronous replication
--------------------------------
Synchronous replication can be enabled per collection. When creating a
collection you may specify the number of replicas using the
*replicationFactor* parameter. The default value is set to `1` which
effectively *disables* synchronous replication.
Example:
127.0.0.1:8530@_system> db._create("test", {"replicationFactor": 3})
In the above case, any write operation will require 2 replicas to
report success from now on.
Preparing growth
----------------
You may create a collection with a higher replication factor than
the number of currently available DBServers. When additional DBServers become
available, the shards are automatically replicated to the newly available machines.
Multiple replicas of the same shard can never coexist on the same db
server instance.
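For example (collection name and numbers are illustrative), a collection can be created with more replicas than there are DBServers at that moment; the missing copies are created as soon as enough servers have joined:
```
127.0.0.1:8530@_system> db._create("prepared_for_growth", {"numberOfShards": 4, "replicationFactor": 3})
```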

View File

@ -1,77 +0,0 @@
Implementation
==============
Architecture inside the cluster
-------------------------------
Synchronous replication can be configured per collection via the property *replicationFactor*. Synchronous replication requires a cluster to operate.
Whenever you specify a *replicationFactor* greater than 1 when creating a collection, synchronous replication will be activated for this collection. The cluster will determine suitable *leaders* and *followers* for every requested shard (*numberOfShards*) within the cluster. When requesting data of a shard only the current leader will be asked whereas followers will only keep their copy in sync. This is due to the current implementation of transactions.
Using *synchronous replication* alone will guarantee consistency and high availability at the cost of reduced performance: write requests will have a higher latency (due to every write request having to be executed on the followers) and read requests will not scale out as only the leader is being asked.
In a cluster synchronous replication will be managed by the *coordinators* for the client. The data will always be stored on *primaries*.
The following example will give you an idea of how synchronous operation has been implemented in ArangoDB.
1. Connect to a coordinator via arangosh
2. Create a collection
127.0.0.1:8530@_system> db._create("test", {"replicationFactor": 2})
3. The coordinator will figure out a *leader* and 1 *follower* and create 1 *shard* (as this is the default)
4. Insert data
127.0.0.1:8530@_system> db.test.insert({"replication": "😎"})
5. The coordinator will write the data to the leader, which in turn will
replicate it to the follower.
6. Only when both were successful is the result reported to be successful:
{
"_id" : "test/7987",
"_key" : "7987",
"_rev" : "7987"
}
When a follower fails, the leader will give up on it after 3 seconds
and proceed with the operation. As soon as the follower (or the network
connection to the leader) is back up, the two will resynchronize and
synchronous replication is resumed. This all happens transparently
to the client.
The current implementation of ArangoDB does not allow changing the replicationFactor later. This is subject to change. In the meantime the only way is to dump and restore the collection. [See the cookbook recipe about migrating](../../../../Cookbook/Administration/Migrate2.8to3.0.html#controling-the-number-of-shards-and-the-replication-factor).
Automatic failover
------------------
Whenever the leader of a shard fails and there is a query trying to access data of that shard, the coordinator will continue trying to contact the leader until it times out.
The internal cluster supervision running on the agency will check cluster health every few seconds and will take action if there is no heartbeat from a server for 15 seconds.
If the leader doesn't come back in time the supervision will reorganize the cluster by promoting for each shard a follower that is in sync with its leader to be the new leader.
From then on the coordinators will contact the new leader.
The process is best outlined using an example:
1. The leader of a shard (let's name it DBServer001) is going down.
2. A coordinator is asked to return a document:
127.0.0.1:8530@_system> db.test.document("100069")
3. The coordinator determines which server is responsible for this document and finds DBServer001
4. The coordinator tries to contact DBServer001 and times out because it is not reachable.
5. After a short while the supervision (running in parallel on the agency) will see that heartbeats from DBServer001 are not coming in
6. The supervision promotes one of the followers (say DBServer002) that is in sync to be leader and makes DBServer001 a follower.
7. As the coordinator continues trying to fetch the document it will see that the leader changed to DBServer002
8. The coordinator tries to contact the new leader (DBServer002) and returns the result:
{
"_key" : "100069",
"_id" : "test/100069",
"_rev" : "513",
"replication" : "😎"
}
9. After a while the supervision declares DBServer001 to be completely dead.
10. A new follower is determined from the pool of DBservers.
11. The new follower syncs its data from the leader and order is restored.
Please note that there may still be timeouts. Depending on when exactly the request has been made (with regard to the supervision) and depending on the time needed to reconfigure the cluster, the coordinator might fail with a timeout error.

View File

@ -1,4 +0,0 @@
Synchronous Replication
=======================
At its core, synchronous replication will replicate write operations to multiple hosts. This feature is only available when operating ArangoDB in a cluster. Whenever a coordinator executes a synchronously replicated write operation, it will only be reported to be successful if it was carried out on all replicas. In contrast to multi-master replication setups known from other systems, ArangoDB's synchronous operation guarantees a consistent state across the cluster.

View File

@ -126,7 +126,7 @@ depending on how much data has to be synced. When doing a join involving the sat
you can specify how long the DBServer is allowed to wait for sync until the query
is being aborted.
Check [Accessing Cursors](../../../../HTTP/AqlQueryCursor/AccessingCursors.html)
Check [Accessing Cursors](../../HTTP/AqlQueryCursor/AccessingCursors.html)
for details.
During network failure there is also a minimal chance that a query was properly

View File

@ -1,57 +0,0 @@
Sharding
========
ArangoDB organizes its collection data in shards. Sharding
allows you to use multiple machines to run a cluster of ArangoDB
instances that together constitute a single database. This enables
you to store much more data, since ArangoDB distributes the data
automatically to the different servers. In many situations one can
also reap a benefit in data throughput, again because the load can
be distributed to multiple machines.
Shards are configured per collection, so multiple shards of data form
the collection as a whole. To determine in which shard the data is to
be stored, ArangoDB performs a hash across the values. By default this
hash is created from the `_key` attribute.
To configure the number of shards:
```
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4});
```
To configure the hashing for another attribute:
```
127.0.0.1:8529@_system> db._create("sharded_collection", {"numberOfShards": 4, "shardKeys": ["country"]});
```
This would be useful to keep data of every country in one shard, which
would result in better performance for queries working on a per-country
basis. You can also specify multiple `shardKeys`. Note however that if
you change the shard keys from their default `["_key"]`, then finding
a document in the collection by its primary key involves a request to
every single shard. Furthermore, in this case one can no longer prescribe
the primary key value of a new document but must use the automatically
generated one. This latter restriction comes from the fact that ensuring
uniqueness of the primary key would be very inefficient if the user
could specify the primary key.
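To illustrate this restriction (collection and attribute names are made up for this example), documents in a collection with custom shard keys cannot be inserted with a user-defined `_key`:
```
127.0.0.1:8529@_system> db._create("customers", {"numberOfShards": 4, "shardKeys": ["country"]});
127.0.0.1:8529@_system> db.customers.insert({"country": "de"});                  // ok, _key is generated automatically
127.0.0.1:8529@_system> db.customers.insert({"_key": "k1", "country": "de"});    // rejected, user-defined keys are not allowed here
```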
On which node in a cluster a particular shard is kept is undefined.
There is no option to configure an affinity based on certain shard keys.
Unique indexes (hash, skiplist, persistent) on sharded collections are
only allowed if the fields used to determine the shard key are also
included in the list of attribute paths for the index:
| shardKeys | indexKeys | |
|----------:|----------:|-------:|
| a | a | ok |
| a | b | not ok |
| a | a, b | ok |
| a, b | a | not ok |
| a, b | b | not ok |
| a, b | a, b | ok |
| a, b | a, b, c | ok |
| a, b, c | a, b | not ok |
| a, b, c | a, b, c | ok |
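To illustrate the table above (collection and attribute names are hypothetical), a unique hash index is only accepted if its attribute paths include the shard key:
```
127.0.0.1:8529@_system> db._create("users", {"numberOfShards": 4, "shardKeys": ["country"]});
127.0.0.1:8529@_system> db.users.ensureIndex({"type": "hash", "fields": ["country", "email"], "unique": true});  // ok
127.0.0.1:8529@_system> db.users.ensureIndex({"type": "hash", "fields": ["email"], "unique": true});             // not ok, rejected
```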

View File

@ -0,0 +1,105 @@
Replication
===========
Replication allows you to *replicate* data onto another machine. It
forms the base of all disaster recovery and failover features ArangoDB
offers.
ArangoDB offers **synchronous** and **asynchronous** replication.
Synchronous replication is used between the _DBServers_ of an ArangoDB
Cluster.
Asynchronous replication is used:
- when ArangoDB is operating in _Master/Slave_ or _Active Failover_ modes
- between multiple Data Centers (inside the same Data Center replication is
synchronous)
For more information on the ArangoDB Server _modes_ please refer to the
[_Server Modes_](../../Architecture/ServerModes.md) section.
Synchronous replication
-----------------------
Synchronous replication only works within an ArangoDB Cluster and is typically
used for mission critical data which must be accessible at all
times. Synchronous replication generally stores a copy of a shard's
data on another DBServer and keeps it in sync. Essentially, when storing
data after enabling synchronous replication the Cluster will wait for
all replicas to write all the data before greenlighting the write
operation to the client. This will naturally increase the latency a
bit, since one more network hop is needed for each write. However, it
will enable the cluster to immediately fail over to a replica whenever
an outage has been detected, without losing any committed data, and
mostly without even signaling an error condition to the client.
Synchronous replication is organized such that every _shard_ has a
_leader_ and `r-1` _followers_, where `r` denotes the replication
factor. The number of _followers_ can be controlled using the
`replicationFactor` parameter whenever you create a _collection_; the
`replicationFactor` parameter is the total number of copies being
kept, that is, one plus the number of _followers_.
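For example (collection name is a placeholder), the following creates a collection whose shards each get one _leader_ and one _follower_, so every write is acknowledged only after both copies have stored it:
```
127.0.0.1:8530@_system> db._create("orders", {"numberOfShards": 3, "replicationFactor": 2})
```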
Asynchronous replication
------------------------
In ArangoDB any write operation is logged in the _write-ahead
log_.
When using asynchronous replication _slaves_ (or _followers_)
connect to a _master_ (or _leader_) and apply locally all the events from
the master log in the same order. As a result the _slaves_ (_followers_)
will have the same state of data as the _master_ (_leader_).
_Slaves_ (_followers_) are only eventually consistent with the _master_ (_leader_).
Transactions are honored in replication, i.e. transactional write operations will
become visible on _slaves_ atomically.
As all write operations will be logged to a master database's _write-ahead log_, the
replication in ArangoDB currently cannot be used for write-scaling. The main purposes
of the replication in current ArangoDB are to provide read-scalability and "hot backups"
of specific databases.
It is possible to connect multiple _slaves_ to the same _master_. _Slaves_ should be used
as read-only instances, and no user-initiated write operations
should be carried out on them. Otherwise data conflicts may occur that cannot be solved
automatically, and that will make the replication stop.
In an asynchronous replication scenario _slaves_ will _pull_ changes
from the _master_. _Slaves_ need to know which _master_ they should
connect to, but a _master_ is not aware of the _slaves_ that replicate from it.
When the network connection between the _master_ and a _slave_ goes down, write
operations on the master can continue normally. When the network is up again, _slaves_
can reconnect to the _master_ and transfer the remaining changes. This will
happen automatically provided _slaves_ are configured appropriately.
Before 3.3.0 asynchronous replication was configured per database. Starting with 3.3.0 it is possible
to set up global (server-level) replication.
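As a sketch of the global (server-level) setup, replication can be configured once on the _slave_ using the `setupReplicationGlobal()` helper of the `@arangodb/replication` module described in the server-level setup section (endpoint and credentials below are placeholders):
```js
// run once on the slave; all current and future databases will be replicated
require("@arangodb/replication").setupReplicationGlobal({
  endpoint: "tcp://master.example.org:8529",  // placeholder master endpoint
  username: "replicator",                     // placeholder credentials
  password: "secret",
  autoStart: true
});
```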
### Replication lag
As described above, write operations are applied first in the _master_, and then applied
in the _slaves_.
For example, let's assume a write operation is executed in the _master_
at point in time _t0_. To make a _slave_ apply the same operation, it must first
fetch the write operation's data from the _master_'s write-ahead log, then parse it and
apply it locally. This will happen at some point in time after _t0_, let's say _t1_.
The difference between _t1_ and _t0_ is called the _replication lag_, and it is unavoidable
in asynchronous replication. The amount of replication _lag_ depends on many factors, a
few of which are:
* the network capacity between the _slaves_ and the _master_
* the load of the _master_ and the _slaves_
* the frequency in which _slaves_ poll the _master_ for updates
Between _t0_ and _t1_, the state of data on the _master_ is newer than the state of data
on the _slaves_. At point in time _t1_, the state of data on the _master_ and _slaves_
is consistent again (provided no new data modifications happened on the _master_ in
between). Thus, the replication will lead to an _eventually consistent_ state of data.
### Replication overhead
As the _master_ servers log any write operation in the _write-ahead log_ anyway, replication doesn't cause any extra overhead on the _master_. However, it will of course cause some overhead for the _master_ to serve incoming read requests from the _slaves_. Returning the requested data is, however, a trivial task for the _master_ and should not result in a notable performance degradation in production.

View File

@ -0,0 +1,9 @@
# Server Modes
ArangoDB can operate in several _modes_:
- Single Server
- [Master/Slave](../Scalability/MasterSlave/README.md)
- [Active Failover](../Scalability/ActiveFailover/README.md)
- [Cluster](../Scalability/Cluster/README.md)
- [Multiple Data Centers](../Scalability/DC2DC/README.md)

View File

@ -1,9 +1,7 @@
Write-ahead log
===============
The Write-ahead log is part of the MMFiles storage engine; This doesn't apply to your
ArangoDB if you are running with the [RocksDB](../Administration/Configuration/RocksDB.md)
storage engine.
Both storage engines use a form of write ahead logging (WAL).
Starting with version 2.2 ArangoDB stores all data-modification operation in
its write-ahead log. The write-ahead log is a sequence of append-only files containing
all the write operations that were executed on the server.
@ -12,9 +10,10 @@ It is used to run data recovery after a server crash, and can also be used in
a replication setup when slaves need to replay the same sequence of operations as
on the master.
### MMFiles WAL Details
By default, each write-ahead logfile is 32 MiB in size. This size is configurable via the
option *--wal.logfile-size*.
When a write-ahead logfile is full, it is set to read-only, and following operations will
be written into the next write-ahead logfile. By default, ArangoDB will reserve some
spare logfiles in the background so switching logfiles should be fast. How many reserve
@ -38,3 +37,21 @@ them if required. How many collected logfiles will be kept before they get delet
configurable via the option *--wal.historic-logfiles*.
For all write-ahead log configuration options, please refer to the page [Write-ahead log options](../Administration/Configuration/Wal.md).
### RocksDB WAL Details
The options mentioned above only apply to MMFiles. The WAL in the RocksDB storage engine
works slightly differently.
_Note:_ In RocksDB the WAL options are all prefixed with
`--rocksdb.*`. The `--wal.*` options have no effect.
The individual RocksDB WAL files are about 64 MiB in size by default. Their size is always proportional
to the value specified via `--rocksdb.write-buffer-size`. The value specifies the amount of
data to build up in memory (backed by the unsorted WAL on disk) before converting it to a sorted on-disk file.
Larger values can increase performance, especially during bulk loads. Up to `--rocksdb.max-write-buffer-number`
write buffers may be held in memory at the same time, so you may wish to adjust this parameter to control memory usage. A larger write buffer will result in a longer recovery time the next time the database is opened.
The RocksDB WAL only contains committed transactions. This means you will never see partial transactions
in the replication log, but it also means transactions are tracked completely in-memory. In practice
this causes RocksDB transaction sizes to be limited; for more information see the [RocksDB Configuration](../Administration/Configuration/RocksDB.md).

View File

@ -6,8 +6,8 @@ which need to be set up when a database is created. This will make the creation
of a database take a while.
Replication is either configured on a
[per-database level](../../Administration/Replication/Asynchronous/DatabaseSetup.md)
or on [server level](../../Administration/Replication/Asynchronous/ServerLevelSetup.md).
[per-database level](../../Administration/MasterSlave/DatabaseSetup.md)
or on [server level](../../Administration/MasterSlave/ServerLevelSetup.md).
In a per-database setup, any replication logging or applying for a new database
must be configured explicitly after a new database has been created, whereas all
databases are automatically replicated in case of the server-level setup using the global replication applier.

View File

@ -0,0 +1,117 @@
Active Failover Deployment
==========================
This _Section_ describes how to deploy an _Active Failover_ environment.
For a general introduction to _Active Failover_, please refer to the
[Active Failover](../../Scalability/ActiveFailover/README.md) chapter.
As usual there are two main ways to start an _Active Failover_ setup:
either [manually](README.md#starting-manually) or using the [_ArangoDB Starter_](README.md#using-the-arangodb-starter)
(possibly in conjunction with Docker).
Starting Manually
-----------------
We are going to start two single server instances and one _Agency_.
First we need to start the _Agency_:
```bash
arangod \
--agency.activate true \
--agency.endpoint tcp://agency.domain.org:4001 \
--agency.my-address tcp://agency.domain.org:4001 \
--agency.pool-size 1 \
--agency.size 1 \
--database.directory dbdir/data4001 \
--javascript.v8-contexts 1 \
--server.endpoint tcp://agency.domain.org:4001 \
--server.statistics false \
--server.threads 16 \
--log.file dbdir/4001.log \
--log.level INFO \
| tee dbdir/4001.stdout 2>&1 &
```
Next we are going to start the _leader_ (wait until this server is fully started):
```bash
arangod \
--database.directory dbdir/data8530 \
--cluster.agency-endpoint tcp://agency.domain.org:4001 \
--cluster.my-address tcp://leader.domain.org:4001 \
--server.endpoint tcp://leader.domain.org:4001 \
--cluster.my-role SINGLE \
--replication.active-failover true \
--log.file dbdir/8530.log \
--server.statistics true \
--server.threads 5 \
| tee dbdir/8530.stdout 2>&1 &
```
After the _leader_ server is fully started, you can add additional _followers_,
with basically the same startup parameters (except for their address and database directory):
```bash
arangod \
--database.directory dbdir/data8531 \
--cluster.agency-endpoint tcp://agency.domain.org:4001 \
--cluster.my-address tcp://follower.domain.org:4001 \
--server.endpoint tcp://follower.domain.org:4001 \
--cluster.my-role SINGLE \
--replication.active-failover true \
--log.file dbdir/8531.log \
--server.statistics true \
--server.threads 5 \
| tee dbdir/8531.stdout 2>&1 &
```
Using the ArangoDB Starter
--------------------------
If you want to start a resilient single database server, use `--starter.mode=resilientsingle`.
In this mode a 3-machine _Agency_ is started, as well as 2 single servers that perform
asynchronous replication and failover, if needed.
```bash
arangodb --starter.mode=resilientsingle --starter.join A,B,C
```
Run this on machine A, B & C.
The _Starter_ will decide on which 2 machines to run a single server instance.
To override this decision (only valid while bootstrapping), add a
`--cluster.start-single=false` to the machine where the single server
instance should NOT be scheduled.
### Starting a resilient single server pair in Docker
If you want to start a resilient single database server running in docker containers,
use the normal docker arguments, combined with `--starter.mode=resilientsingle`.
```bash
export IP=<IP of docker host>
docker volume create arangodb
docker run -it --name=adb --rm -p 8528:8528 \
-v arangodb:/data \
-v /var/run/docker.sock:/var/run/docker.sock \
arangodb/arangodb-starter \
--starter.address=$IP \
--starter.mode=resilientsingle \
--starter.join=A,B,C
```
Run this on machine A, B & C.
The starter will decide on which 2 machines to run a single server instance.
To override this decision (only valid while bootstrapping), add a
`--cluster.start-single=false` to the machine where the single server
instance should NOT be scheduled.
### Starting a local test resilient single server pair
If you want to start a local resilient server pair quickly, use the `--starter.local` flag.
It will start all servers within the context of a single starter process.
```bash
arangodb --starter.local --starter.mode=resilientsingle
```
Note: When you restart the _Starter_, it remembers the original `--starter.local` flag.

View File

@ -1,6 +1,10 @@
Advanced Topics
---------------
In contrast to the other topics in this chapter that strive to get you simply set up in prepared environments, The following chapters describe whats going on under the hood in details, the components of ArangoDB Clusters, and how they're put together:
===============
In contrast to the other topics in this chapter that strive to get you simply set
up in prepared environments, the following chapters describe what's going on under
the hood in detail, the components of ArangoDB Clusters, and how they are put together:
- [Running a local test setup](Local.md)
- [Running a distributed setup](Distributed.md)
- [Running in Docker](Docker.md)

View File

@ -1 +1,8 @@
# Cluster
Cluster Deployment
==================
This _Chapter_ describes how to deploy an _ArangoDB Cluster_.
For a general introduction to the _ArangoDB Cluster_, please refer to the [Cluster](../../Scalability/Cluster/README.md) chapter.
...

View File

@ -1,5 +1,5 @@
Launching an ArangoDB cluster on multiple machines
--------------------------------------------------
==================================================
Essentially, one can use the method from [the previous
section](Local.md) to start an ArangoDB cluster on multiple machines as

View File

@ -7,7 +7,7 @@ However starting a cluster manually is possible and is a very easy method to get
The easiest way to start a local cluster for testing purposes is to run `scripts/startLocalCluster.sh` from a clone of the [source repository](https://github.com/ArangoDB/ArangoDB) after compiling ArangoDB from source (see instructions in the file `README_maintainers.md` in the repository). This will start 1 Agency, 2 DBServers and 1 Coordinator. To stop the cluster issue `scripts/stopLocalCluster.sh`.
This section will discuss the required parameters for every role in an ArangoDB cluster. Be sure to read the [Architecture](../Scalability/Architecture.md) documentation to get a basic understanding of the underlying architecture and the involved roles in an ArangoDB cluster.
This section will discuss the required parameters for every role in an ArangoDB cluster. Be sure to read the [Architecture](../Scalability/Cluster/Architecture.md) documentation to get a basic understanding of the underlying architecture and the involved roles in an ArangoDB cluster.
In the following sections we will go through the relevant options per role.

View File

@ -0,0 +1,34 @@
Master/Slave Deployment
=======================
This _Section_ describes how to deploy a _Master/Slave_ environment.
For a general introduction to _Master/Slave_ in ArangoDB, please refer to the
[Master/Slave](../../Scalability/MasterSlave/README.md) chapter.
Setting up a working _Master/Slave_ replication requires at least two ArangoDB
instances:
1. *master:* this is the instance where all data-modification operations should
be directed to.
1. *slave:* this is the instance that replicates, in an asynchronous way, the data
from the _master_. For the replication to happen, a _replication applier_ has to
be started on the slave. The _replication applier_ will fetch data from the _master_'s
_write-ahead log_ and apply its operations locally. One or more slaves can replicate
from the same master.
Generally, one deploys the _master_ on a machine and each _slave_ on an additional,
separate, machine (one per _slave_). In case the _master_ and the _slaves_ are
running on the same machine (tests only), please make sure you use different ports
(and data directories) for the _master_ and the _slaves_.
Please install the _master_ and the _slaves_ as if they were separate
[single instances](../SingleInstance/README.md). There are no specific differences,
at this stage, between a _master_, a _slave_ and a _single instance_.
Once the ArangoDB _master_ and _slaves_ have been deployed, the replication has
to be started on each of the available _slaves_. This can be done at database level,
or globally.
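As a minimal sketch of the database-level variant (endpoint and credentials are placeholders; see the _Section_ linked below for the full procedure), the `setupReplication()` helper of the `@arangodb/replication` module performs the initial synchronization and then keeps the slave's current database updated:
```js
// run on the slave, in the database that should mirror the master's database
require("@arangodb/replication").setupReplication({
  endpoint: "tcp://master.example.org:8529",  // placeholder master endpoint
  username: "replicator",                     // placeholder credentials
  password: "secret",
  verbose: false,
  includeSystem: false,
  incremental: true,
  autoResync: true
});
```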
For further information on how to set up the replication in a _master/slave_ environment,
please refer to [this](../../Administration/MasterSlave/SettingUp.md) _Section_.

View File

@ -1,20 +1,10 @@
Deployment
==========
In this chapter we describe various possibilities to deploy ArangoDB.
In particular for the cluster mode, there are different ways
and we want to highlight their advantages and disadvantages.
We even document in detail, how to set up a cluster by simply starting
various ArangoDB processes on different machines, either directly
or using Docker containers.
- [Single instance](Single.md)
- [Cluster: DC/OS, Apache Mesos and Marathon](Mesos.md)
- [Cluster: Generic & Docker](ArangoDBStarter.md)
- [Multiple Datacenters](DC2DC.md)
- [Advanced Topics](Advanced.md)
- [Cluster: Test setup on a local machine](Local.md)
- [Cluster: Starting processes on different machines](Distributed.md)
- [Cluster: Launching an ArangoDB cluster using Docker containers](Docker.md)
- [Agency](Agency.md)
This _Chapter_ describes various possibilities to deploy ArangoDB:
- [Single instance](SingleInstance/README.md)
- [Master/Slave](MasterSlave/README.md)
- [Active Failover](ActiveFailover/README.md)
- [Cluster](Cluster/README.md)
- [Multiple Datacenters](DC2DC.md)

View File

@ -1,5 +1,5 @@
Single instance deployment
--------------------------
Single Instance Deployment
==========================
The latest official builds of ArangoDB for all supported operating systems may be obtained from https://www.arangodb.com/download/.

View File

@ -8,7 +8,7 @@ of SQL (Structured Query Language).
ArangoDB's query language is called AQL. There are some similarities between both
languages despite the different data models of the database systems. The most
notable difference is probably the concept of loops in AQL, which makes it feel
more like a programming language. It suites the schema-less model more natural
more like a programming language. It suits the schema-less model more naturally
and makes the query language very powerful while remaining easy to read and write.
To get started with AQL, have a look at our detailed

View File

@ -56,7 +56,7 @@ execute
shell> arango-secure-installation
```
This will asked for a root password and sets this password.
This will ask for a root password and set it.
Web interface
-------------
@ -579,31 +579,3 @@ If you want to write more AQL queries right now, have a look here:
- [High-level operations](../../AQL/Operations/index.html): detailed descriptions
of `FOR`, `FILTER` and more operations not shown in this introduction
- [Functions](../../AQL/Functions/index.html): a reference of all provided functions
ArangoDB programs
-----------------
The ArangoDB package comes with the following programs:
- `arangod`: The [ArangoDB database daemon](../Administration/Configuration/GeneralArangod.md).
This server program is intended to run as a daemon process and to serve the
various clients connection to the server via TCP / HTTP.
- `arangosh`: The [ArangoDB shell](../Administration/Arangosh/README.md).
A client that implements a read-eval-print loop (REPL) and provides functions
to access and administrate the ArangoDB server.
- `arangoimport`: A [bulk importer](../Administration/Arangoimport.md) for the
ArangoDB server. It supports JSON and CSV.
- `arangodump`: A tool to [create backups](../Administration/Arangodump.md)
of an ArangoDB database in JSON format.
- `arangorestore`: A tool to [load data of a backup](../Administration/Arangorestore.md)
back into an ArangoDB database.
- `arango-dfdb`: A [datafile debugger](../Troubleshooting/DatafileDebugger.md) for
ArangoDB. It is primarily intended to be used during development of ArangoDB.
- `arangobench`: A [benchmark and test tool](../Troubleshooting/Arangobench.md).
It can be used for performance and server function testing.

View File

@ -13,7 +13,7 @@ Examples will explain the API on the [the city graph](../README.md#the-city-grap
Definition of examples
----------------------
@startDocuBlock JSF_general_graph_example_description
@startDocuBlock general_graph_example_description
Get vertices from edges.
------------------------

View File

@ -18,7 +18,7 @@ Version 3.3
**All Editions**
- [**Server-level Replication**](Administration/Replication/Asynchronous/ServerLevelSetup.md):
- [**Server-level Replication**](Administration/MasterSlave/ServerLevelSetup.md):
In addition to per-database replication, there is now an additional
`globalApplier`. Start the global replication on the slave once and all
current and future databases will be replicated from the master to the
@ -53,7 +53,7 @@ Version 3.2
further security and scalability features to ArangoDB Enterprise like
[LDAP integration](Administration/Configuration/Ldap.md),
[Encryption at Rest](Administration/Encryption/README.md), and the brand new
[Satellite Collections](Administration/Replication/Synchronous/Satellites.md).
[Satellite Collections](Administration/Satellites.md).
Also see [What's New in 3.2](ReleaseNotes/NewFeatures32.md).
@ -75,7 +75,7 @@ Also see [What's New in 3.1](ReleaseNotes/NewFeatures31.md).
Version 3.0
-----------
- [**self-organizing cluster**](Scalability/Architecture.md) with
- [**self-organizing cluster**](Scalability/Cluster/Architecture.md) with
synchronous replication, master/master setup, shared nothing
architecture, cluster management agency.

View File

@ -1 +1,54 @@
# ArangoDB Programs
The full ArangoDB package comes with the following programs:
- `arangod`: [ArangoDB server](../Administration/Configuration/GeneralArangod.md).
This server program is intended to run as a daemon process / service to serve the
various client connections to the server via TCP / HTTP. It also provides a
[web interface](../Administration/WebInterface/README.md).
- `arangosh`: [ArangoDB shell](../Administration/Arangosh/README.md).
A client that implements a read-eval-print loop (REPL) and provides functions
to access and administrate the ArangoDB server.
- `arangoimport`: [Bulk importer](../Administration/Arangoimport.md) for the
ArangoDB server. It supports JSON and CSV.
- `arangoexport`: [Bulk exporter](../Administration/Arangoexport.md) for the
ArangoDB server. It supports JSON, CSV and XML.
- `arangodump`: Tool to [create backups](../Administration/Arangodump.md)
of an ArangoDB database in JSON format.
- `arangorestore`: Tool to [load data of a backup](../Administration/Arangorestore.md)
back into an ArangoDB database.
- `arango-dfdb`: [Datafile debugger](../Troubleshooting/DatafileDebugger.md) for
ArangoDB (MMFiles storage engine only). It is primarily intended to be used
during development of ArangoDB.
- `arangobench`: [Benchmark and test tool](../Troubleshooting/Arangobench.md).
It can be used for performance and server function testing.
<!--
- `arangovpack`: ???
- `foxx-manager`: ???
- `arango-init-database`: ???
- `arango-secure-installation`: ???
ArangoDB starter (not included?)
-->
The client package comes with a subset of programs:
- arangosh
- arangoimport
- arangoexport
- arangodump
- arangorestore
- arangobench
- arangovpack
- foxx-manager

View File

@ -19,7 +19,7 @@ The documentation is organized in four handbooks:
that is used to communicate with clients. In general, the HTTP handbook will be
of interest to driver developers. If you use any of the existing drivers for
the language of your choice, you can skip this handbook.
- Our [cookbook](../cookbook/index.html) with recipes for specific problems and
- Our [Cookbook](../Cookbook/index.html) with recipes for specific problems and
solutions.
Features are illustrated with interactive usage examples; you can cut'n'paste them

View File

@ -116,7 +116,7 @@ executes the query, locally. With this approach, network hops during join
operations on sharded collections can be avoided and response times can be close to
that of a single instance.
[Satellite collections](../Administration/Replication/Synchronous/Satellites.md)
[Satellite collections](../Administration/Satellites.md)
are available in the *Enterprise* edition.

View File

@ -93,7 +93,7 @@ created on the master, one needed to take action on the slave to ensure that dat
for that database got actually replicated. Replication on the slave also was not
aware of when a database was dropped on the master.
3.3 adds [server-level replication](../Administration/Replication/Asynchronous/ServerLevelSetup.md),
3.3 adds [server-level replication](../Administration/MasterSlave/ServerLevelSetup.md),
which will replicate the current and future databases from the master to the
slave automatically after the initial setup.

View File

@ -1,11 +1,11 @@
#
# Summary
#
# * [First Steps](FirstSteps/README.md) #TODO
# * [Getting Familiar](FirstSteps/GettingFamiliar.md) #TODO
* [Introduction](README.md)
## GETTING FAMILIAR
* [Getting Started](GettingStarted/README.md)
# move to administration (command line options)?
# * [Install and run the server](FirstSteps/Arangod.md) #TODO
* [Installing](GettingStarted/Installing/README.md)
* [Linux](GettingStarted/Installing/Linux.md)
* [Mac OS X](GettingStarted/Installing/MacOSX.md)
@ -18,23 +18,8 @@
* [ArangoDB Starter](GettingStarted/Starter/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter Replication](GettingStarted/DC2DC/README.md)
# * [Coming from MongoDB](GettingStarted/ComingFromMongoDb.md) #TODO
#
* [Highlights](Highlights.md)
#
* [Scalability](Scalability/README.md)
* [Cluster](Scalability/Cluster/README.md)
* [Architecture](Scalability/Architecture.md)
* [Data models](Scalability/DataModels.md)
* [Limitations](Scalability/Limitations.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Scalability/DC2DC/README.md)
* [Introduction](Scalability/DC2DC/Introduction.md)
* [Applicability](Scalability/DC2DC/Applicability.md)
* [Requirements](Scalability/DC2DC/Requirements.md)
#
* [Data models & modeling](DataModeling/README.md)
# * [Collections](FirstSteps/CollectionsAndDocuments.md) #TODO
* [Concepts](DataModeling/Concepts.md)
* [Databases](DataModeling/Databases/README.md)
* [Working with Databases](DataModeling/Databases/WorkingWith.md)
@ -55,8 +40,14 @@
* [Collection Names](DataModeling/NamingConventions/CollectionNames.md)
* [Document Keys](DataModeling/NamingConventions/DocumentKeys.md)
* [Attribute Names](DataModeling/NamingConventions/AttributeNames.md)
# * [Modeling Relationships](DataModeling/ModelingRelationships.md)
#
* [ArangoDB Programs](Programs/README.md)
# https://@github.com//arangodb-helper/arangodb.git;arangodb;docs/Manual;;/
* [ArangoDB Starter](Programs/Starter/README.md)
* [Options](Programs/Starter/options.md)
* [Security](Programs/Starter/security.md)
## CORE TOPICS
* [Indexing](Indexing/README.md)
* [Index Basics](Indexing/IndexBasics.md)
* [Which index to use when](Indexing/WhichIndex.md)
@ -68,7 +59,12 @@
* [Fulltext Indexes](Indexing/Fulltext.md)
* [Geo Indexes](Indexing/Geo.md)
* [Vertex Centric Indexes](Indexing/VertexCentric.md)
#
* [Transactions](Transactions/README.md)
* [Transaction invocation](Transactions/TransactionInvocation.md)
* [Passing parameters](Transactions/Passing.md)
* [Locking and isolation](Transactions/LockingAndIsolation.md)
* [Durability](Transactions/Durability.md)
* [Limitations](Transactions/Limitations.md)
* [Graphs](Graphs/README.md)
* [General Graphs](Graphs/GeneralGraphs/README.md)
* [Graph Management](Graphs/GeneralGraphs/Management.md)
@ -80,11 +76,17 @@
* [Example Data](Graphs/Traversals/ExampleData.md)
* [Working with Edges](Graphs/Edges/README.md)
* [Pregel](Graphs/Pregel/README.md)
#
* [Views](Views/README.md)
* [ArangoSearch](Views/ArangoSearch.md)
* [Analyzers](Views/ArangoSearch/Analyzers.md)
## ADVANCED TOPICS
* [Architecture](Architecture/README.md)
* [Modes](Architecture/ServerModes.md)
* [Replication](Architecture/Replication/README.md)
* [Write-ahead log](Architecture/WriteAheadLog.md)
* [Storage Engines](Architecture/StorageEngines.md)
* [Foxx Microservices](Foxx/README.md)
* [At a glance](Foxx/AtAGlance.md)
* [Getting started](Foxx/GettingStarted.md)
@ -129,18 +131,30 @@
* [Related modules](Foxx/Modules.md)
* [Authentication](Foxx/Auth.md)
* [OAuth 1.0a](Foxx/OAuth1.md)
* [OAuth 2.0](Foxx/OAuth2.md)
* [Transactions](Transactions/README.md)
* [Transaction invocation](Transactions/TransactionInvocation.md)
* [Passing parameters](Transactions/Passing.md)
* [Locking and isolation](Transactions/LockingAndIsolation.md)
* [Durability](Transactions/Durability.md)
* [Limitations](Transactions/Limitations.md)
#
# Use cases / Advanced usage / Best practice (?)
#
* [OAuth 2.0](Foxx/OAuth2.md)
* [Scalability](Scalability/README.md)
* [Master/Slave](Scalability/MasterSlave/README.md)
* [Architecture](Scalability/MasterSlave/Architecture.md)
* [Limitations](Scalability/MasterSlave/Limitations.md)
* [Active Failover](Scalability/ActiveFailover/README.md)
* [Architecture](Scalability/ActiveFailover/Architecture.md)
* [Limitations](Scalability/ActiveFailover/Limitations.md)
* [Cluster](Scalability/Cluster/README.md)
* [Architecture](Scalability/Cluster/Architecture.md)
* [Data models](Scalability/Cluster/DataModels.md)
* [Limitations](Scalability/Cluster/Limitations.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Scalability/DC2DC/README.md)
* [Introduction](Scalability/DC2DC/Introduction.md)
* [Applicability](Scalability/DC2DC/Applicability.md)
* [Requirements](Scalability/DC2DC/Requirements.md)
## OPERATIONS
* [Deployment](Deployment/README.md)
* [Single instance](Deployment/Single.md)
* [Single instance](Deployment/SingleInstance/README.md)
* [Master/Slave](Deployment/MasterSlave/README.md)
* [Active Failover](Deployment/ActiveFailover/README.md)
* [Cluster](Deployment/Cluster/README.md)
* [Cluster: Mesos, DC/OS](Deployment/Mesos.md)
* [Cluster: Generic & Docker](Deployment/ArangoDBStarter.md)
@ -156,13 +170,6 @@
* [ArangoSync Master](Deployment/DC2DC/ArangoSyncMaster.md)
* [ArangoSync Workers](Deployment/DC2DC/ArangoSyncWorkers.md)
* [Prometheus & Grafana](Deployment/DC2DC/PrometheusGrafana.md)
#
* [ArangoDB Programs](Programs/README.md)
# https://@github.com//arangodb-helper/arangodb.git;arangodb;docs/Manual;;/
* [ArangoDB Starter](Programs/Starter/README.md)
* [Options](Programs/Starter/options.md)
* [Security](Programs/Starter/security.md)
#
* [Administration](Administration/README.md)
* [Web Interface](Administration/WebInterface/README.md)
* [Dashboard](Administration/WebInterface/Dashboard.md)
@ -177,7 +184,6 @@
* [ArangoDB Shell](Administration/Arangosh/README.md)
* [Shell Output](Administration/Arangosh/Output.md)
* [Configuration](Administration/Arangosh/Configuration.md)
# relocate file?
* [Details](GettingStarted/Arangosh.md)
* [Arangoimport](Administration/Arangoimport.md)
* [Arangodump](Administration/Arangodump.md)
@ -202,24 +208,18 @@
* [Encryption](Administration/Encryption/README.md)
* [Auditing](Administration/Auditing/README.md)
* [Configuration](Administration/Auditing/AuditConfiguration.md)
* [Events](Administration/Auditing/AuditEvents.md)
* [Replication](Administration/Replication/README.md)
* [Asynchronous Replication](Administration/Replication/Asynchronous/README.md)
* [Components](Administration/Replication/Asynchronous/Components.md)
* [Per-Database Setup](Administration/Replication/Asynchronous/DatabaseSetup.md)
* [Server-Level Setup](Administration/Replication/Asynchronous/ServerLevelSetup.md)
* [Syncing Collections](Administration/Replication/Asynchronous/SyncingCollections.md)
* [Replication Limitations](Administration/Replication/Asynchronous/Limitations.md)
* [Synchronous Replication](Administration/Replication/Synchronous/README.md)
* [Implementation](Administration/Replication/Synchronous/Implementation.md)
* [Configuration](Administration/Replication/Synchronous/Configuration.md)
* [Satellite Collections](Administration/Replication/Synchronous/Satellites.md)
* [Events](Administration/Auditing/AuditEvents.md)
* [Satellite Collections](Administration/Satellites.md)
* [Master/Slave](Administration/MasterSlave/README.md)
* [Setting up](Administration/MasterSlave/SettingUp.md)
* [Replication Applier](Administration/MasterSlave/ReplicationApplier.md)
* [Per-Database Setup](Administration/MasterSlave/DatabaseSetup.md)
* [Server-Level Setup](Administration/MasterSlave/ServerLevelSetup.md)
* [Syncing Collections](Administration/MasterSlave/SyncingCollections.md)
* [Active Failover](Administration/ActiveFailover/README.md)
* [Cluster](Administration/Cluster/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Administration/DC2DC/README.md)
* [Sharding](Administration/Sharding/README.md)
# * [Authentication](Administration/Sharding/Authentication.md)
# * [Firewall setup](Administration/Sharding/FirewallSetup.md)
* [Upgrading](Administration/Upgrading/README.md)
* [Upgrading to 3.3](Administration/Upgrading/Upgrading33.md)
* [Upgrading to 3.2](Administration/Upgrading/Upgrading32.md)
@ -231,7 +231,12 @@
* [Upgrading to 2.4](Administration/Upgrading/Upgrading24.md)
* [Upgrading to 2.3](Administration/Upgrading/Upgrading23.md)
* [Upgrading to 2.2](Administration/Upgrading/Upgrading22.md)
#
* [Security](Security/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Security/DC2DC/README.md)
* [Monitoring](Monitoring/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Monitoring/DC2DC/README.md)
* [Troubleshooting](Troubleshooting/README.md)
* [arangod](Troubleshooting/Arangod.md)
* [Emergency Console](Troubleshooting/EmergencyConsole.md)
@ -239,21 +244,10 @@
* [Arangobench](Troubleshooting/Arangobench.md)
* [Cluster](Troubleshooting/Cluster/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Troubleshooting/DC2DC/README.md)
#
* [Monitoring](Monitoring/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Monitoring/DC2DC/README.md)
#
* [Security](Security/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Security/DC2DC/README.md)
#
* [Architecture](Architecture/README.md)
* [Write-ahead log](Architecture/WriteAheadLog.md)
* [Storage Engines](Architecture/StorageEngines.md)
# * [Server Internals](Architecture/ServerInternals.md)
#
* [Datacenter to datacenter replication](Troubleshooting/DC2DC/README.md)
---
* [Release notes](ReleaseNotes/README.md)
* [Incompatible changes in 3.4](ReleaseNotes/UpgradingChanges34.md)
* [Whats New in 3.3](ReleaseNotes/NewFeatures33.md)
@ -279,7 +273,6 @@
* [Incompatible changes in 2.3](ReleaseNotes/UpgradingChanges23.md)
* [Whats New in 2.2](ReleaseNotes/NewFeatures22.md)
* [Whats New in 2.1](ReleaseNotes/NewFeatures21.md)
#
* [Appendix](Appendix/README.md)
* [References](Appendix/References/README.md)
* [db](Appendix/References/DBObject.md)
@ -305,6 +298,5 @@
* [Delivering HTML Pages](Appendix/Deprecated/Actions/HtmlExample.md)
* [Json Objects](Appendix/Deprecated/Actions/JsonExample.md)
* [Modifying](Appendix/Deprecated/Actions/Modifying.md)
# Link to here from arangosh, actions, foxx, transactions
* [Error codes and meanings](Appendix/ErrorCodes.md)
* [Glossary](Appendix/Glossary.md)

View File

@ -0,0 +1,46 @@
Active Failover Architecture
============================
Consider the case for two *arangod* instances:
![Simple Leader / Follower setup, with a single node agency](leader-follower.png)
Two servers are connected via server-wide asynchronous replication. One of the servers is
elected _Leader_, and the other one is made a _Follower_ automatically. At startup,
the two servers race for the leadership position. This happens through the Agency
locking mechanisms (which means the Agency needs to be available at server start).
You can control which server will become _Leader_ by starting it earlier than
the other server instances.
The _Follower_ will automatically start
replication from the master for all available databases, using the server-level
replication introduced in 3.3.
When the master goes down, this is automatically detected by an agency
instance, which is also started in this mode. This instance will make the
previous follower stop its replication and make it the new leader.
The follower will deny all read and write requests from client
applications. Only the replication itself is allowed to access the follower's data
until the follower becomes a new leader.
When sending a request to read or write data on a follower, the follower will
always respond with `HTTP 503 (Service unavailable)` and provide the address of
the current leader. Client applications and drivers can use this information to
then make a follow-up request to the proper leader:
```
HTTP/1.1 503 Service Unavailable
X-Arango-Endpoint: http://[::1]:8531
....
```
Client applications can also detect who the current leader and the followers
are by calling the `/_api/cluster/endpoints` REST API. This API is accessible
on leaders and followers alike.
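For instance, from _arangosh_ this can be checked with a plain HTTP call (a sketch; the response is assumed to contain an `endpoints` array):
```js
// works regardless of whether arangosh is connected to the leader or a follower
var result = arango.GET("/_api/cluster/endpoints");
result.endpoints;   // list of the endpoints of this Active Failover setup
```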
The ArangoDB starter supports starting two servers with asynchronous
replication and failover out of the box.
The arangojs driver for JavaScript, the Go driver, the Java driver and
the PHP driver support active failover in case the currently accessed server endpoint
responds with `HTTP 503`.

View File

@ -0,0 +1,12 @@
Active Failover Limitations
===========================
The _Active Failover_ setup in ArangoDB has a few limitations. Some of these limitations
may be removed in later versions of ArangoDB:
- Even though it is already possible to have several _followers_ of the same _leader_,
currently only one _follower_ is officially supported
- Should you add more than one _follower_, be aware that during a _failover_ situation
there is no preference for _followers_ which are more up to date with the failed _leader_.
- At the moment it is not possible to read from _followers_
- All requests will be redirected to the _leader_

View File

@ -0,0 +1,32 @@
Active Failover
===============
This _Chapter_ introduces ArangoDB's _Active Failover_ environment.
An active failover is defined as:
- One ArangoDB Single-Server instance which is readable and writable by clients, called **Leader**
- An ArangoDB Single-Server instance which is passive and neither readable nor writable by clients, called **Follower**
- At least one Agency node, acting as a "witness" to determine which server becomes the leader in a failure situation
![Simple Leader / Follower setup, with a single node agency](leader-follower.png)
The advantage compared to a traditional Master-Slave setup is that there is an active third party
which observes and supervises all involved server processes. _Follower_ instances can rely on the
agency to determine the correct _leader_ server. This setup is made **resilient** by the fact
that all our official ArangoDB drivers can now automatically determine the correct _leader_ server
and redirect requests appropriately. Furthermore, Foxx services also automatically perform
a failover: should your _leader_ instance (which is also the _Foxxmaster_) fail, the newly elected
_leader_ will reinstall all Foxx services and resume executing queued [Foxx tasks](../../Foxx/Scripts.md).
[Database users](../../Administration/ManagingUsers/README.md) which were created on the _leader_ will also be valid on the newly elected _leader_ (provided that they were already synchronized).
For further information about _Active Failover_ in ArangoDB, please refer to the following sections:
- [Active Failover Deployment](../../Deployment/ActiveFailover/README.md)
- [Active Failover Administration](../../Administration/ActiveFailover/README.md)
**Note:** _Asynchronous Failover_, _Resilient Single_, _Active-Passive_ or _Hot
Standby_ are other terms that have been used to define the _Active Failover_ environment.
Starting from version 3.3, _Active Failover_ is the preferred term to identify such
an environment.

Binary file not shown.

After

Width:  |  Height:  |  Size: 59 KiB

View File

@ -1,269 +0,0 @@
Architecture
============
The cluster architecture of ArangoDB is a CP master/master model with no
single point of failure. With "CP" we mean that in the presence of a
network partition, the database prefers internal consistency over
availability. With "master/master" we mean that clients can send their
requests to an arbitrary node, and experience the same view on the
database regardless. "No single point of failure" means that the cluster
can continue to serve requests, even if one machine fails completely.
In this way, ArangoDB has been designed as a distributed multi-model
database. This section gives a short outline on the cluster architecture and
how the above features and capabilities are achieved.
Structure of an ArangoDB cluster
--------------------------------
An ArangoDB cluster consists of a number of ArangoDB instances
which talk to each other over the network. They play different roles,
which will be explained in detail below. The current configuration
of the cluster is held in the "Agency", which is a highly-available
resilient key/value store based on an odd number of ArangoDB instances
running [Raft Consensus Protocol](https://raft.github.io/).
For the various instances in an ArangoDB cluster there are 4 distinct
roles: Agents, Coordinators, Primary and Secondary DBservers. In the
following sections we will shed light on each of them. Note that the
tasks for all roles run the same binary from the same Docker image.
### Agents
One or multiple Agents form the Agency in an ArangoDB cluster. The
Agency is the central place to store the configuration in a cluster. It
performs leader elections and provides other synchronization services for
the whole cluster. Without the Agency none of the other components can
operate.
While generally invisible to the outside it is the heart of the
cluster. As such, fault tolerance is of course a must have for the
Agency. To achieve that the Agents are using the [Raft Consensus
Algorithm](https://raft.github.io/). The algorithm formally guarantees
conflict free configuration management within the ArangoDB cluster.
At its core the Agency manages a big configuration tree. It supports
transactional read and write operations on this tree, and other servers
can subscribe to HTTP callbacks for all changes to the tree.
### Coordinators
Coordinators should be accessible from the outside. These are the ones
the clients talk to. They will coordinate cluster tasks like
executing queries and running Foxx services. They know where the
data is stored and will optimize where to run user supplied queries or
parts thereof. Coordinators are stateless and can thus easily be shut down
and restarted as needed.
### Primary DBservers
Primary DBservers are the ones where the data is actually hosted. They
host shards of data and using synchronous replication a primary may
either be leader or follower for a shard.
They should not be accessed from the outside but indirectly through the
coordinators. They may also execute queries in part or as a whole when
asked by a coordinator.
### Secondaries
Secondary DBservers are asynchronous replicas of primaries. If one is
using only synchronous replication, one does not need secondaries at all.
For each primary, there can be one or more secondaries. Since the
replication works asynchronously (eventual consistency), the replication
does not impede the performance of the primaries. On the other hand,
their replica of the data can be slightly out of date. The secondaries
are perfectly suitable for backups as they don't interfere with the
normal cluster operation.
### Cluster ID
Every non-Agency ArangoDB instance in a cluster is assigned a unique
ID during its startup. Using its ID a node is identifiable
throughout the cluster. All cluster operations will communicate
via this ID.
Sharding
--------
Using the roles outlined above an ArangoDB cluster is able to distribute
data in so called shards across multiple primaries. From the outside
this process is fully transparent and as such we achieve the goals of
what other systems call "master-master replication". In an ArangoDB
cluster you talk to any coordinator and whenever you read or write data
it will automatically figure out where the data is stored (read) or to
be stored (write). The information about the shards is shared across the
coordinators using the Agency.
Also see [Sharding](../Administration/Sharding/README.md) in the
Administration chapter.
Many sensible configurations
----------------------------
This architecture is very flexible and thus allows many configurations,
which are suitable for different usage scenarios:
1. The default configuration is to run exactly one coordinator and
one primary DBserver on each machine. This achieves the classical
master/master setup, since there is a perfect symmetry between the
different nodes, clients can equally well talk to any one of the
coordinators and all expose the same view to the data store.
2. One can deploy more coordinators than DBservers. This is a sensible
approach if one needs a lot of CPU power for the Foxx services,
because they run on the coordinators.
3. One can deploy more DBservers than coordinators if more data capacity
is needed and the query performance is the lesser bottleneck
4. One can deploy a coordinator on each machine where an application
server (e.g. a node.js server) runs, and the Agents and DBservers
on a separate set of machines elsewhere. This avoids a network hop
between the application server and the database and thus decreases
latency. Essentially, this moves some of the database distribution
logic to the machine where the client runs.
These four shall suffice for now. The important piece of information here
is that the coordinator layer can be scaled and deployed independently
from the DBserver layer.
Replication
-----------
ArangoDB offers two ways of data replication within a cluster, synchronous
and asynchronous. In this section we explain some details and highlight
the advantages and disadvantages respectively.
### Synchronous replication with automatic fail-over
Synchronous replication works on a per-shard basis. One configures for
each collection, how many copies of each shard are kept in the cluster.
At any given time, one of the copies is declared to be the "leader" and
all other replicas are "followers". Write operations for this shard
are always sent to the DBserver which happens to hold the leader copy,
which in turn replicates the changes to all followers before the operation
is considered to be done and reported back to the coordinator.
Read operations are all served by the server holding the leader copy,
which makes it possible to provide snapshot semantics for complex transactions.
If a DBserver fails that holds a follower copy of a shard, then the leader
can no longer synchronize its changes to that follower. After a short timeout
(3 seconds), the leader gives up on the follower, declares it to be
out of sync, and continues service without the follower. When the server
with the follower copy comes back, it automatically resynchronizes its
data with the leader and synchronous replication is restored.
If a DBserver fails that holds a leader copy of a shard, then the leader
can no longer serve any requests. It will no longer send a heartbeat to
the Agency. Therefore, a supervision process running in the Raft leader
of the Agency, can take the necessary action (after 15 seconds of missing
heartbeats), namely to promote one of the servers that hold in-sync
replicas of the shard to leader for that shard. This involves a
reconfiguration in the Agency and leads to the fact that coordinators
now contact a different DBserver for requests to this shard. Service
resumes. The other surviving replicas automatically resynchronize their
data with the new leader. When the DBserver with the original leader
copy comes back, it notices that it now holds a follower replica,
resynchronizes its data with the new leader and order is restored.
All shard data synchronizations are done in an incremental way, such that
resynchronizations are quick. This technology allows moving shards
(follower and leader ones) between DBservers without service interruptions.
Therefore, an ArangoDB cluster can move all the data on a specific DBserver
to other DBservers and then shut down that server in a controlled way.
This allows scaling down an ArangoDB cluster without service interruption,
loss of fault tolerance or data loss. Furthermore, one can re-balance the
distribution of the shards, either manually or automatically.
All these operations can be triggered via a REST/JSON API or via the
graphical web UI. All fail-over operations are completely handled within
the ArangoDB cluster.
Obviously, synchronous replication involves a certain increased latency for
write operations, simply because there is one more network hop within the
cluster for every request. Therefore the user can set the replication factor
to 1, which means that only one copy of each shard is kept, thereby
switching off synchronous replication. This is a suitable setting for
less important or easily recoverable data for which low latency write
operations matter.
### Asynchronous replication with automatic fail-over
Asynchronous replication works differently, in that it is organized
using primary and secondary DBservers. Each secondary server replicates
all the data held on a primary by polling in an asynchronous way. This
process has very little impact on the performance of the primary. The
disadvantage is that there is a delay between the confirmation of a
write operation that is sent to the client and the actual replication of
the data. If the master server fails during this delay, then committed
and confirmed data can be lost.
Nevertheless, we also offer automatic fail-over with this setup. Contrary
to the synchronous case, here the fail-over management is done from outside
the ArangoDB cluster. In a future version we might move this management
into the supervision process in the Agency, but as of now, the management
is done via the Mesos framework scheduler for ArangoDB (see below).
The granularity of the replication is a whole ArangoDB instance with
all data that resides on that instance, which means that
you need twice as many instances as without asynchronous replication.
Synchronous replication is more flexible in that respect, you can have
smaller and larger instances, and if one fails, the data can be rebalanced
across the remaining ones.
Microservices and zero administration
------------------------------------
The design and capabilities of ArangoDB are geared towards usage in
modern microservice architectures of applications. With the
[Foxx services](../Foxx/README.md) it is very easy to deploy a data
centric microservice within an ArangoDB cluster.
In addition, one can deploy multiple instances of ArangoDB within the
same project. One part of the project might need a scalable document
store, another might need a graph database, and yet another might need
the full power of a multi-model database actually mixing the various
data models. There are enormous efficiency benefits to be reaped by
being able to use a single technology for various roles in a project.
To simplify the life of the devops in such a scenario we try as much as
possible to use a zero administration approach for ArangoDB. A running
ArangoDB cluster is resilient against failures and essentially repairs
itself in case of temporary failures. See the next section for further
capabilities in this direction.
Apache Mesos integration
------------------------
For the distributed setup, we use the Apache Mesos infrastructure by default.
ArangoDB is a fully certified package for DC/OS and can thus
be deployed essentially with a few mouse clicks or a single command, once
you have an existing DC/OS cluster. But even on a plain Apache Mesos cluster
one can deploy ArangoDB via Marathon with a single API call and some JSON
configuration.
The advantage of this approach is that we can not only implement the
initial deployment, but also the later management of automatic
replacement of failed instances and the scaling of the ArangoDB cluster
(triggered manually or even automatically). Since all manipulations are
either via the graphical web UI or via JSON/REST calls, one can even
implement auto-scaling very easily.
A DC/OS cluster is a very natural environment to deploy microservice
architectures, since it is so convenient to deploy various services,
including potentially multiple ArangoDB cluster instances within the
same DC/OS cluster. The built-in service discovery makes it extremely
simple to connect the various microservices and Mesos automatically
takes care of the distribution and deployment of the various tasks.
See the [Deployment](../Deployment/README.md) chapter and its subsections
for instructions.
It is possible to deploy an ArangoDB cluster by simply launching a bunch of
Docker containers with the right command line options to link them up,
or even on a single machine starting multiple ArangoDB processes. In that
case, synchronous replication will work within the deployed ArangoDB cluster,
and automatic fail-over in the sense that the duties of a failed server will
automatically be assigned to another, surviving one. However, since the
ArangoDB cluster cannot within itself launch additional instances, replacement
of failed nodes is not automatic and scaling up and down has to be managed
manually. This is why we do not recommend this setup for production
deployment.

View File

@ -0,0 +1,327 @@
Cluster Architecture
====================
The cluster architecture of ArangoDB is a _CP_ master/master model with no
single point of failure. With "CP" we mean that in the presence of a
network partition, the database prefers internal consistency over
availability. With "master/master" we mean that clients can send their
requests to an arbitrary node, and experience the same view on the
database regardless. "No single point of failure" means that the cluster
can continue to serve requests, even if one machine fails completely.
In this way, ArangoDB has been designed as a distributed multi-model
database. This section gives a short outline on the cluster architecture and
how the above features and capabilities are achieved.
Structure of an ArangoDB Cluster
--------------------------------
An ArangoDB Cluster consists of a number of ArangoDB instances
which talk to each other over the network. They play different roles,
which will be explained in detail below. The current configuration
of the Cluster is held in the _Agency_, which is a highly-available
resilient key/value store based on an odd number of ArangoDB instances
running [Raft Consensus Protocol](https://raft.github.io/).
For the various instances in an ArangoDB Cluster there are 3 distinct
roles:
- _Agents_
- _Coordinators_
- _DBServers_.
In the following sections we will shed light on each of them.
### Agents
One or multiple _Agents_ form the _Agency_ in an ArangoDB Cluster. The
_Agency_ is the central place to store the configuration in a Cluster. It
performs leader elections and provides other synchronization services for
the whole Cluster. Without the _Agency_ none of the other components can
operate.
While generally invisible to the outside, the _Agency_ is the heart of the
Cluster. As such, fault tolerance is of course a must-have for the
_Agency_. To achieve this, the _Agents_ use the [Raft Consensus
Algorithm](https://raft.github.io/). The algorithm formally guarantees
conflict free configuration management within the ArangoDB Cluster.
At its core the _Agency_ manages a big configuration tree. It supports
transactional read and write operations on this tree, and other servers
can subscribe to HTTP callbacks for all changes to the tree.
### Coordinators
_Coordinators_ should be accessible from the outside. These are the ones
the clients talk to. They will coordinate cluster tasks like
executing queries and running Foxx services. They know where the
data is stored and will optimize where to run user supplied queries or
parts thereof. _Coordinators_ are stateless and can thus easily be shut down
and restarted as needed.
### DBServers
_DBServers_ are the ones where the data is actually hosted. They
host _shards_ of data and, using synchronous replication, a _DBServer_ may
either be _leader_ or _follower_ for a _shard_.
They should not be accessed from the outside but indirectly through the
_Coordinators_. They may also execute queries in part or as a whole when
asked by a _Coordinator_.
Many sensible configurations
----------------------------
This architecture is very flexible and thus allows many configurations,
which are suitable for different usage scenarios:
1. The default configuration is to run exactly one _Coordinator_ and
one _DBServer_ on each machine. This achieves the classical
master/master setup, since there is a perfect symmetry between the
different nodes, clients can equally well talk to any one of the
_Coordinators_ and all expose the same view to the data store. _Agents_
can run on separate, less powerful machines.
2. One can deploy more _Coordinators_ than _DBServers_. This is a sensible
approach if one needs a lot of CPU power for the Foxx services,
because they run on the _Coordinators_.
3. One can deploy more _DBServers_ than _Coordinators_ if more data capacity
is needed and the query performance is the lesser bottleneck.
4. One can deploy a _Coordinator_ on each machine where an application
server (e.g. a node.js server) runs, and the _Agents_ and _DBServers_
on a separate set of machines elsewhere. This avoids a network hop
between the application server and the database and thus decreases
latency. Essentially, this moves some of the database distribution
logic to the machine where the client runs.
As you can see, the _Coordinator_ layer can be scaled and deployed independently
from the _DBServer_ layer.
Cluster ID
----------
Every non-Agency ArangoDB instance in a Cluster is assigned a unique
ID during its startup. Using its ID a node is identifiable
throughout the Cluster. All cluster operations will communicate
via this ID.
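
To see the ID of the instance arangosh is currently connected to, a call like the
following can be used; note that the endpoint shown here is an assumption and only
answers on instances that are part of a Cluster:

```js
// Query the connected Coordinator or DBServer for its unique cluster ID.
// The /_admin/server/id endpoint is assumed; it is not available on
// single-server instances.
var result = arango.GET("/_admin/server/id");
print(result.id); // prints the node's unique ID
```
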
Sharding
--------
Using the roles outlined above, an ArangoDB Cluster is able to distribute
data in so-called _shards_ across multiple _DBServers_. From the outside
this process is fully transparent and as such we achieve the goals of
what other systems call "master-master replication".
In an ArangoDB Cluster you talk to any _Coordinator_ and whenever you read or write data
it will automatically figure out where the data is stored (read) or to
be stored (write). The information about the _shards_ is shared across the
_Coordinators_ using the _Agency_.
ArangoDB organizes its collection data in _shards_. Sharding
allows to use multiple machines to run a cluster of ArangoDB
instances that together constitute a single database. This enables
you to store much more data, since ArangoDB distributes the data
automatically to the different servers. In many situations one can
also reap a benefit in data throughput, again because the load can
be distributed to multiple machines.
_Shards_ are configured per _collection_, so multiple _shards_ of data form
the _collection_ as a whole. To determine in which _shard_ a document is to
be stored, ArangoDB computes a hash of the value(s) of the shard key attribute(s).
By default, this hash is computed from the document *_key* attribute.
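
The following arangosh snippet, to be run against a _Coordinator_, sketches how a
sharded collection could be created; the collection name and the number of _shards_
are arbitrary examples, and *shardKeys* is spelled out only for clarity, as it
defaults to the document key anyway:

```js
// Create a collection that is split into 4 shards, distributed by _key.
// "products" and the shard count are illustrative values only.
db._create("products", {
  numberOfShards: 4,
  shardKeys: ["_key"]
});
```
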
For further information, please refer to the
[_Cluster Administration_ ](../../Administration/Cluster/README.md#sharding) section.
Synchronous replication
-----------------------
In an ArangoDB Cluster, the replication among the data stored by the _DBServers_
is synchronous.
Synchronous replication works on a per-shard basis. Using the option _replicationFactor_,
one configures for each _collection_ how many copies of each _shard_ are kept in the Cluster.
At any given time, one of the copies is declared to be the _leader_ and
all other replicas are _followers_. Write operations for this _shard_
are always sent to the _DBServer_ which happens to hold the _leader_ copy,
which in turn replicates the changes to all _followers_ before the operation
is considered to be done and reported back to the _Coordinator_.
Read operations are all served by the server holding the _leader_ copy,
which makes it possible to provide snapshot semantics for complex transactions.
Using synchronous replication alone will guarantee consistency and high availability
at the cost of reduced performance: write requests will have a higher latency
(because every write request has to be executed on the _followers_ as well) and
read requests will not scale out, as only the _leader_ is being asked.
In a Cluster, synchronous replication will be managed by the _Coordinators_ for the client.
The data will always be stored on the _DBServers_.
The following example will give you an idea of how synchronous operation
has been implemented in ArangoDB Cluster:
1. Connect to a coordinator via arangosh
2. Create a collection
127.0.0.1:8530@_system> db._create("test", {"replicationFactor": 2})
3. The _Coordinator_ will figure out a *leader* and one *follower* and create one *shard* (as this is the default)
4. Insert data
127.0.0.1:8530@_system> db.test.insert({"replication": "😎"})
5. The _Coordinator_ will write the data to the _leader_, which in turn will
replicate it to the _follower_.
6. Only when both operations were successful is the result reported back as successful:
```json
{
"_id" : "test/7987",
"_key" : "7987",
"_rev" : "7987"
}
```
When a _follower_ fails, the _leader_ will give up on it after 3 seconds
and proceed with the operation. As soon as the _follower_ (or the network
connection to the _leader_) is back up, the two will resynchronize and
synchronous replication is resumed. All of this happens transparently
to the client.
Automatic failover
------------------
If a _DBServer_ that holds a _follower_ copy of a _shard_ fails, then the _leader_
can no longer synchronize its changes to that _follower_. After a short timeout
(3 seconds), the _leader_ gives up on the _follower_, declares it to be
out of sync, and continues service without the _follower_. When the server
with the _follower_ copy comes back, it automatically resynchronizes its
data with the _leader_ and synchronous replication is restored.
If a _DBServer_ that holds a _leader_ copy of a _shard_ fails, then the _leader_
can no longer serve any requests. It will no longer send a heartbeat to
the _Agency_. Therefore, a _supervision_ process running in the Raft leader
of the _Agency_ can take the necessary action (after 15 seconds of missing
heartbeats), namely to promote one of the servers that hold in-sync
replicas of the shard to _leader_ for that shard. This involves a
reconfiguration in the _Agency_ and means that _Coordinators_
now contact a different _DBServer_ for requests to this shard. Service
resumes. The other surviving replicas automatically resynchronize their
data with the new _leader_. When the _DBServer_ with the original _leader_
copy comes back, it notices that it now holds a _follower_ replica,
resynchronizes its data with the new _leader_, and order is restored.
The following example will give you an idea of how failover
has been implemented in ArangoDB Cluster:
1. The _leader_ of a _shard_ (let's name it _DBServer001_) goes down.
2. A _Coordinator_ is asked to return a document:
127.0.0.1:8530@_system> db.test.document("100069")
3. The _Coordinator_ determines which server is responsible for this document and finds _DBServer001_.
4. The _Coordinator_ tries to contact _DBServer001_ and times out because it is not reachable.
5. After a short while the _supervision_ (running in parallel on the _Agency_) will see that _heartbeats_ from _DBServer001_ are not coming in.
6. The _supervision_ promotes one of the _followers_ (say _DBServer002_) that is in sync to be _leader_ and makes _DBServer001_ a _follower_.
7. As the _Coordinator_ continues trying to fetch the document, it will see that the _leader_ changed to _DBServer002_.
8. The _Coordinator_ contacts the new _leader_ (_DBServer002_) and returns the result:
```json
{
"_key" : "100069",
"_id" : "test/100069",
"_rev" : "513",
"replication" : "😎"
}
```
9. After a while the _supervision_ declares _DBServer001_ to be completely dead.
10. A new _follower_ is determined from the pool of _DBServers_.
11. The new _follower_ syncs its data from the _leader_ and order is restored.
Please note that there may still be timeouts. Depending on when exactly
the request was made (with regard to the _supervision_) and on the time
needed to reconfigure the Cluster, the _Coordinator_ might fail the request
with a timeout error.
Shard movement and resynchronization
------------------------------------
All _shard_ data synchronizations are done in an incremental way, such that
resynchronizations are quick. This technology allows moving _shards_
(_follower_ and _leader_ ones) between _DBServers_ without service interruptions.
Therefore, an ArangoDB Cluster can move all the data on a specific _DBServer_
to other _DBServers_ and then shut down that server in a controlled way.
This allows scaling down an ArangoDB Cluster without service interruption,
loss of fault tolerance or data loss. Furthermore, one can re-balance the
distribution of the _shards_, either manually or automatically.
All these operations can be triggered via a REST/JSON API or via the
graphical web UI. All fail-over operations are completely handled within
the ArangoDB Cluster.
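
As a rough sketch of the REST approach, a _shard_ could be moved from one _DBServer_
to another with a call like the one below, issued from arangosh against a
_Coordinator_. The endpoint and payload are assumptions based on what the web UI
uses, and all identifiers are placeholders; please verify the exact API against the
Cluster administration documentation for your version:

```js
// Hypothetical sketch: ask the cluster to move one shard to another DBServer.
// Endpoint, payload layout and all IDs below are placeholders/assumptions.
var body = {
  database:   "_system",
  collection: "test",
  shard:      "s100001",
  fromServer: "DBServer001",
  toServer:   "DBServer002"
};
arango.POST("/_admin/cluster/moveShard", JSON.stringify(body));
```
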
Obviously, synchronous replication involves a certain increased latency for
write operations, simply because there is one more network hop within the
Cluster for every request. Therefore the user can set the _replicationFactor_
to 1, which means that only one copy of each shard is kept, thereby
switching off synchronous replication. This is a suitable setting for
less important or easily recoverable data for which low latency write
operations matter.
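
For example, a collection holding easily recoverable data could be created without
synchronous replication as follows (the collection name is only illustrative):

```js
// With replicationFactor 1 only a single copy of each shard exists,
// so a DBServer failure can make this data unavailable or lose it.
db._create("sessionCache", { replicationFactor: 1 });
```
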
Microservices and zero administration
------------------------------------
The design and capabilities of ArangoDB are geared towards usage in
modern microservice architectures of applications. With the
[Foxx services](../../Foxx/README.md) it is very easy to deploy a data
centric microservice within an ArangoDB Cluster.
In addition, one can deploy multiple instances of ArangoDB within the
same project. One part of the project might need a scalable document
store, another might need a graph database, and yet another might need
the full power of a multi-model database actually mixing the various
data models. There are enormous efficiency benefits to be reaped by
being able to use a single technology for various roles in a project.
To simplify the life of the _devops_ in such a scenario, we try as much as
possible to use a _zero administration_ approach for ArangoDB. A running
ArangoDB Cluster is resilient against failures and essentially repairs
itself in case of temporary failures. See the next section for further
capabilities in this direction.
Apache Mesos integration
------------------------
For the distributed setup, we use the Apache Mesos infrastructure by default.
ArangoDB is a fully certified package for DC/OS and can thus
be deployed essentially with a few mouse clicks or a single command, once
you have an existing DC/OS cluster. But even on a plain Apache Mesos cluster
one can deploy ArangoDB via Marathon with a single API call and some JSON
configuration.
The advantage of this approach is that we can not only implement the
initial deployment, but also the later management of automatic
replacement of failed instances and the scaling of the ArangoDB cluster
(triggered manually or even automatically). Since all manipulations are
either via the graphical web UI or via JSON/REST calls, one can even
implement auto-scaling very easily.
A DC/OS cluster is a very natural environment to deploy microservice
architectures, since it is so convenient to deploy various services,
including potentially multiple ArangoDB cluster instances within the
same DC/OS cluster. The built-in service discovery makes it extremely
simple to connect the various microservices and Mesos automatically
takes care of the distribution and deployment of the various tasks.
See the [Deployment](../../Deployment/README.md) chapter and its subsections
for instructions.
It is possible to deploy an ArangoDB cluster by simply launching a bunch of
Docker containers with the right command line options to link them up,
or even on a single machine starting multiple ArangoDB processes. In that
case, synchronous replication will work within the deployed ArangoDB cluster,
and automatic fail-over in the sense that the duties of a failed server will
automatically be assigned to another, surviving one. However, since the
ArangoDB cluster cannot within itself launch additional instances, replacement
of failed nodes is not automatic and scaling up and down has to be managed
manually. This is why we do not recommend this setup for production
deployment.

View File

@ -0,0 +1,14 @@
Cluster Limitations
===================
ArangoDB has no built-in limitations to horizontal scalability. The
central resilient _Agency_ will easily sustain hundreds of _DBServers_
and _Coordinators_, and the usual database operations are carried out in a
completely decentralized fashion and do not require assistance of the _Agency_.
Likewise, the supervision process in the _Agency_ can easily deal
with lots of servers, since all its activities are not performance
critical.
Obviously, an ArangoDB Cluster is limited by the available resources
of CPU, memory, disk and network bandwidth and latency.

View File

@ -1,10 +1,10 @@
Cluster
=======
This chapter introduces ArangoDB's Cluster.
This _Chapter_ introduces ArangoDB's Cluster.
For further information about the Cluster, please refer to the following sections:
- [Deployment](../../Deployment/Cluster/README.md)
- [Administration](../../Administration/Cluster/README.md)
- [Troubleshooting](../../Troubleshooting/Cluster/README.md)
- [Cluster Deployment](../../Deployment/Cluster/README.md)
- [Cluster Administration](../../Administration/Cluster/README.md)
- [Cluster Troubleshooting](../../Troubleshooting/Cluster/README.md)

View File

@ -1,14 +0,0 @@
Limitations
-----------
ArangoDB has no built-in limitations to horizontal scalability. The
central resilient Agency will easily sustain hundreds of DBservers
and coordinators, and the usual database operations work completely
decentrally and do not require assistance of the Agency.
Likewise, the supervision process in the Agency can easily deal
with lots of servers, since all its activities are not performance
critical.
Obviously, an ArangoDB cluster is limited by the available resources
of CPU, memory, disk and network bandwidth and latency.

View File

@ -0,0 +1,177 @@
Master/Slave Architecture
=========================
Introduction
------------
Components
----------
### Replication Logger
#### Purpose
The _replication logger_ will write all data-modification operations into the
_write-ahead log_. This log may then be read by clients to replay any data
modification on a different server.
#### Checking the state
To query the current state of the _logger_, use the *state* command:
require("@arangodb/replication").logger.state();
The result might look like this:
```js
{
"state" : {
"running" : true,
"lastLogTick" : "133322013",
"totalEvents" : 16,
"time" : "2014-07-06T12:58:11Z"
},
"server" : {
"version" : "2.2.0-devel",
"serverId" : "40897075811372"
},
"clients" : {
}
}
```
The *running* attribute will always be true. In earlier versions of ArangoDB the
replication was optional and this could have been *false*.
The *totalEvents* attribute indicates how many log events have been logged since
the start of the ArangoDB server. Finally, the *lastLogTick* value indicates the
_id_ of the last operation that was written to the server's _write-ahead log_.
It can be used to determine whether new operations were logged, and is also used
by the _replication applier_ for incremental fetching of data.
**Note**: The replication logger state can also be queried via the
[HTTP API](../../../HTTP/Replications/index.html).
To query which data ranges are still available for replication clients to fetch,
the logger provides the *firstTick* and *tickRanges* functions:
require("@arangodb/replication").logger.firstTick();
This will return the minimum tick value that the server can provide to replication
clients via its replication APIs. The *tickRanges* function returns the minimum
and maximum tick values per logfile:
require("@arangodb/replication").logger.tickRanges();
### Replication Applier
#### Purpose
The purpose of the _replication applier_ is to read data from a master database's
event log and apply it locally. The _applier_ will check the master database
for new operations periodically. It will perform an incremental synchronization,
i.e. only asking the master for operations that occurred after the last synchronization.
The _replication applier_ does not get notified by the master database when there
are "new" operations available, but instead uses the pull principle. It might thus
take some time (the so-called *replication lag*) before an operation from the master
database gets shipped to, and applied in, a slave database.
The _replication applier_ of a database is run in a separate thread. It may encounter
problems when an operation from the master cannot be applied safely, or when the
connection to the master database goes down (network outage, master database is
down or unavailable etc.). In this case, the database's _replication applier_ thread
might terminate itself. It is then up to the administrator to fix the problem and
restart the database's _replication applier_.
If the _replication applier_ cannot connect to the master database, or the
communication fails at some point during the synchronization, the _replication applier_
will try to reconnect to the master database. It will give up reconnecting only
after a configurable amount of connection attempts.
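
As a minimal sketch, the _applier_ of the current database could be configured and
(re)started from arangosh on the slave roughly as follows; the endpoint and
credentials are placeholders, and the exact set of supported properties should be
checked against the replication documentation of your ArangoDB version:

```js
// Run on the slave, in the database that should be replicated.
var replication = require("@arangodb/replication");

replication.applier.properties({
  endpoint: "tcp://master.example.org:8529", // master to poll (placeholder)
  username: "replicator",                    // placeholder credentials
  password: "secret",
  autoStart: true,           // start the applier automatically on server start
  maxConnectRetries: 100     // give up reconnecting after this many attempts
});

replication.applier.start(); // start applying operations from the master's log
```
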
The _replication applier_ state is queryable at any time by using the *state* command
of the _applier_. This will return the state of the _applier_ of the current database:
```js
require("@arangodb/replication").applier.state();
```
The result might look like this:
```js
{
"state" : {
"running" : true,
"lastAppliedContinuousTick" : "152786205",
"lastProcessedContinuousTick" : "152786205",
"lastAvailableContinuousTick" : "152786205",
"progress" : {
"time" : "2014-07-06T13:04:57Z",
"message" : "fetching master log from offset 152786205",
"failedConnects" : 0
},
"totalRequests" : 38,
"totalFailedConnects" : 0,
"totalEvents" : 1,
"lastError" : {
"errorNum" : 0
},
"time" : "2014-07-06T13:04:57Z"
},
"server" : {
"version" : "2.2.0-devel",
"serverId" : "210189384542896"
},
"endpoint" : "tcp://master.example.org:8529",
"database" : "_system"
}
```
The *running* attribute indicates whether the _replication applier_ of the current
database is currently running and polling the server at *endpoint* for new events.
The *progress.failedConnects* attribute shows how many failed connection attempts
the _replication applier_ currently has encountered in a row. In contrast, the
*totalFailedConnects* attribute indicates how many failed connection attempts the
_applier_ has made in total. The *totalRequests* attribute shows how many requests
the _applier_ has sent to the master database in total. The *totalEvents* attribute
shows how many log events the _applier_ has read from the master.
The *progress.message* sub-attribute provides a brief hint of what the _applier_
currently does (if it is running). The *lastError* attribute also has an optional
*errorMessage* sub-attribute, showing the latest error message. The *errorNum*
sub-attribute of the *lastError* attribute can be used by clients to programmatically
check for errors. It should be *0* if there is no error, and it should be non-zero
if the _applier_ terminated itself due to a problem.
Below is an example of the state after the _replication applier_ terminated itself
due to (repeated) connection problems:
```js
{
"state" : {
"running" : false,
"progress" : {
"time" : "2014-07-06T13:14:37Z",
"message" : "applier stopped",
"failedConnects" : 6
},
"totalRequests" : 79,
"totalFailedConnects" : 11,
"totalEvents" : 0,
"lastError" : {
"time" : "2014-07-06T13:09:41Z",
"errorMessage" : "could not connect to master at tcp://master.example.org:8529: Could not connect to 'tcp:/...",
"errorNum" : 1400
},
...
}
}
```
**Note**: the state of a database's replication applier is queryable via the HTTP
API, too. Please refer to [HTTP Interface for Replication](../../../HTTP/Replications/index.html)
for more details.

View File

@ -1,18 +1,18 @@
Replication Limitations
=======================
Master/Slave Limitations
=========================
The replication in ArangoDB has a few limitations. Some of these limitations may be
removed in later versions of ArangoDB:
The Master/Slave setup in ArangoDB has a few limitations. Some of these limitations
may be removed in later versions of ArangoDB:
* there is no feedback from the slaves to the master. If a slave cannot apply an event
it got from the master, the master will have a different state of data. In this
case, the replication applier on the slave will stop and report an error. Administrators
case, the _replication applier_ on the slave will stop and report an error. Administrators
can then either "fix" the problem or re-sync the data from the master to the slave
and start the applier again.
* at the moment it is assumed that only the replication applier executes write
* at the moment it is assumed that only the _replication applier_ executes write
operations on a slave. ArangoDB currently does not prevent users from carrying out
their own write operations on slaves, though this might lead to undefined behavior
and the replication applier stopping.
and the _replication applier_ stopping.
* when a replication slave asks a master for log events, the replication master will
return all write operations for user-defined collections, but it will exclude write
operations for certain system collections. The following collections are excluded
@ -28,16 +28,11 @@ removed in later versions of ArangoDB:
* master servers do not know which slaves are or will be connected to them. All servers
in a replication setup are currently only loosely coupled. There currently is no way
for a client to query which servers are present in a replication.
* when not using our mesos integration failover must be handled by clients or client APIs.
* there currently is one replication applier per ArangoDB database. It is thus not
possible to have a slave apply operations from multiple masters into the same target
database.
* replication is set up on a per-database level. When using ArangoDB with multiple
databases, replication must be configured individually for each database.
* the replication applier is single-threaded, but write operations on the master may
be executed in parallel if they affect different collections. Thus the replication
applier might not be able to catch up with a very powerful and loaded master.
* failover must be handled by clients or client APIs.
* the _replication applier_ is single-threaded, but write operations on the master may
be executed in parallel if they affect different collections. Thus the _replication
applier_ might not be able to catch up with a very powerful and loaded master.
* replication is only supported between the two ArangoDB servers running the same
ArangoDB version. It is currently not possible to replicate between different ArangoDB
versions.
* a replication applier cannot apply data from itself.
* a _replication applier_ cannot apply data from itself.

View File

@ -0,0 +1,11 @@
Master/Slave
============
This _Chapter_ introduces ArangoDB's _Master/Slave_ environment.
For further information about _Master/Slave_ in ArangoDB, please refer to the following
sections:
- [Master/Slave Deployment](../../Deployment/MasterSlave/README.md)
- [Master/Slave Administration](../../Administration/MasterSlave/README.md)

View File

@ -19,10 +19,13 @@ costs grow faster than linear with the size of the server, and
none of the resilience and dynamical capabilities can be achieved
in this way.
In this chapter we explain the distributed architecture of ArangoDB and
discuss its scalability features and limitations:
Options
-------
- [ArangoDB's distributed architecture](Architecture.md)
- [Different data models and scalability](DataModels.md)
- [Limitations](Limitations.md)
Several options are available to scale ArangoDB, each of which has its own pros
and cons:
- [Master/Slave](MasterSlave/README.md)
- [Active Failover](ActiveFailover/README.md)
- [Cluster](Cluster/README.md)
- [Multiple Datacenters](DC2DC/README.md)

View File

@ -15,7 +15,7 @@
"ga",
"callouts@git+https://github.com/Simran-B/gitbook-plugin-callouts.git",
"edit-link",
"page-toc",
"page-toc@git+https://github.com/Simran-B/gitbook-plugin-page-toc.git",
"localized-footer"
],
"pdf": {

Some files were not shown because too many files have changed in this diff.