mirror of https://gitee.com/bigwinds/arangodb
Doc - ArangoSync doc integration (#4590)
This commit is contained in:
parent
7498b67f67
commit
ecd033491f
@@ -1,10 +1,11 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Datacenter to datacenter replication administration

This section includes information related to the administration of the _datacenter
to datacenter replication_.

For a general introduction to the _datacenter to datacenter replication_, please
refer to the [Datacenter to datacenter replication](..\..\Scalability\DC2DC\README.md)
refer to the [Datacenter to datacenter replication](../../Scalability/DC2DC/README.md)
chapter.

## Starting synchronization
@@ -38,6 +39,8 @@ arangosync configure sync \
The command will finish quickly. Afterwards it will take some time until
the clusters in both datacenters are in sync.

## Inspect status

Use the following command to inspect the status of the synchronization of a datacenter:

```bash
@@ -84,7 +87,7 @@ arangosync get workers \
    -v
```

## Stoping synchronization
## Stopping synchronization

If you no longer want to synchronize data from a source to a target datacenter
you must stop it. To do so, run the following command:
@@ -109,6 +112,7 @@ arangosync abort sync \
    --auth.user=<username used for authentication of this command> \
    --auth.password=<password of auth.user>
```

If the source datacenter recovers after an `abort sync` has been executed, you
need to "clean up" ArangoSync in the source datacenter.
To do so, execute the following command:
@@ -1,12 +1,13 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Datacenter to datacenter replication deployment

This chapter describes how to deploy all the components needed for _datacenter to
datacenter replication_.

For a general introduction to _datacenter to datacenter replication_, please refer
to the [Datacenter to datacenter replication](..\Scalability\DC2DC\README.md) chapter.
to the [Datacenter to datacenter replication](../Scalability/DC2DC/README.md) chapter.

[Requirements](..\Scalability\DC2DC\Requirements.md) can be found in this section.
[Requirements](../Scalability/DC2DC/Requirements.md) can be found in this section.

Deployment steps:
@@ -1,8 +1,10 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# ArangoSync Master

The _ArangoSync Master_ is responsible for managing all synchronization, creating
tasks and assigning those to the _ArangoSync Workers_.
<br/> At least 2 instances muts be deployed in each datacenter.

At least 2 instances must be deployed in each datacenter.
One instance will be the "leader", the other will be an inactive slave. When the
leader is gone for a short while, one of the other instances will take over.
@@ -1,14 +1,16 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# ArangoSync Workers

The _ArangoSync Worker_ is responsible for executing synchronization tasks.
<br/> For optimal performance at least 1 _worker_ instance must be placed on

For optimal performance at least 1 _worker_ instance must be placed on
every machine that has an ArangoDB _DBserver_ running. This ensures that tasks
can be executed with minimal network traffic outside of the machine.

Since _sync workers_ will automatically stop once their TLS server certificate expires
(which is set to 2 years by default), it is recommended to run at least 2 instances
of a _worker_ on every machine in the datacenter. That way, tasks can still be
assigned in the most optimal way, even when a _worker_ in temporarily down for a
assigned in the most optimal way, even when a _worker_ is temporarily down for a
restart.

To start an _ArangoSync Worker_ using a `systemd` service, use a unit like this:
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# ArangoDB cluster

There are several ways to start an ArangoDB cluster. In this section we will focus
@@ -14,7 +15,7 @@ The _Starter_ simplifies things for the operator and will coordinate a distributed
cluster startup across several machines and assign cluster roles automatically.

When started on several machines and enough machines have joined, the _Starters_
will start _Agents_, s_Coordinators_ and _DBservers_ on these machines.
will start _Agents_, _Coordinators_ and _DBservers_ on these machines.

When running, the _Starter_ will supervise its child tasks (namely _Coordinators_,
_DBservers_ and _Agents_) and restart them in case of failure.
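
Since the full `arangodb` invocation is not shown on this page, here is a minimal sketch of a three-machine startup; the hostnames, data directory and `--starter.*` flag values are illustrative assumptions, not taken from this diff:

```bash
# Hypothetical sketch: run this on each of the three machines.
# The Starter listens on port 8528 (see below); peers discover each other via --starter.join.
arangodb --starter.data-dir=/var/lib/arangodb-cluster \
  --starter.join=dcA-m1.example.com,dcA-m2.example.com,dcA-m3.example.com
```
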
@@ -76,4 +77,4 @@ The _Starter_ itself will use port `8528`.
Since the _Agents_ are so critical to the availability of both the ArangoDB and the ArangoSync cluster,
it is recommended to run _Agents_ on dedicated machines. Consider these machines "pets".

_Coordinators_ and _DBServers_ can be deployed of other machines that should be considered "cattle".
_Coordinators_ and _DBServers_ can be deployed on other machines that should be considered "cattle".
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Kafka & Zookeeper

- How to deploy zookeeper
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Prometheus & Grafana (optional)

_ArangoSync_ provides metrics in a format supported by [Prometheus](https://prometheus.io).
@@ -0,0 +1,372 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Datacenter to datacenter Replication

## About

At some point in the growth of a database, there comes a need for
replicating it across multiple datacenters.

Reasons for that can be:

- Fallback in case of a disaster in one datacenter
- Regional availability
- Separation of concerns

And many more.

This tutorial describes what the ArangoSync datacenter to datacenter
replication solution (ArangoSync from now on) offers,
when to use it, when not to use it and how to configure,
operate, troubleshoot it & keep it safe.

### What is it

ArangoSync is a solution that enables you to asynchronously replicate
the entire structure and content of an ArangoDB cluster in one place to a cluster
in another place. Typically it is used from one datacenter to another.
<br/>It is not a solution for replicating single server instances.

The replication done by ArangoSync is **asynchronous**. This means that when
a client is writing data into the source datacenter, it will consider the
request finished before the data has been replicated to the other datacenter.
The time needed to completely replicate changes to the other datacenter is
typically in the order of seconds, but this can vary significantly depending on
load, network & computer capacity.

ArangoSync performs replication in a **single direction** only. That means that
you can replicate data from cluster A to cluster B or from cluster B to cluster A,
but never in both directions at the same time.
<br/>Data modified in the destination cluster **will be lost!**

Replication is a completely **autonomous** process. Once it is configured it is
designed to run 24/7 without frequent manual intervention.
<br/>This does not mean that it requires no maintenance or attention at all.
<br/>As with any distributed system some attention is needed to monitor its operation
and keep it secure (e.g. certificate & password rotation).

Once configured, ArangoSync will replicate both **structure and data** of an
**entire cluster**. This means that there is no need to make additional configuration
changes when adding/removing databases or collections.
<br/>Also, metadata such as users, Foxx applications & jobs is automatically replicated.

### When to use it... and when not

ArangoSync is a good solution in all cases where you want to replicate
data from one cluster to another without the requirement that the data
is available immediately in the other cluster.

ArangoSync is not a good solution when one of the following applies:

- You want to replicate data from cluster A to cluster B and from cluster B
  to cluster A at the same time.
- You need synchronous replication between 2 clusters.
- There is no network connection between cluster A and B.
- You want complete control over which databases, collections & documents are replicated and which are not.

## Requirements

To use ArangoSync you need the following:

- Two datacenters, each running an ArangoDB Enterprise cluster, version 3.3 or higher.
- A network connection between both datacenters with accessible endpoints
  for several components (see individual components for details).
- TLS certificates for ArangoSync master instances (can be self-signed).
- TLS certificates for Kafka brokers (can be self-signed).
- Optional (but recommended) TLS certificates for ArangoDB clusters (can be self-signed).
- A client-certificate CA for the ArangoSync masters (typically self-signed).
- Client certificates for the ArangoSync masters (typically self-signed).
- At least 2 instances of the ArangoSync master in each datacenter.
- One instance of the ArangoSync worker on every machine in each datacenter.

Note: In several places you will need a (x509) certificate.
<br/>The [certificates](#certificates) section below provides more guidance for creating
and renewing these certificates.

Besides the above list, you probably want to use the following:

- An orchestrator to keep all components running. In this tutorial we will use `systemd` as an example.
- A log file collector for centralized collection & access to the logs of all components.
- A metrics collector & viewing solution such as Prometheus + Grafana.

## Deployment

In the following paragraphs you'll learn which components have to be deployed
for datacenter to datacenter replication. For detailed deployment instructions,
consult the [reference manual](../../Deployment/DC2DC.md).

### ArangoDB cluster

Datacenter to datacenter replication requires an ArangoDB cluster in both datacenters,
configured with the `rocksdb` storage engine.
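
As a hedged illustration, the storage engine is selected per `arangod` instance at startup; the data directory below is an example path:

```bash
# Every arangod in both clusters must run with the rocksdb storage engine.
# /var/lib/arangodb3 is an example data directory.
arangod --server.storage-engine rocksdb /var/lib/arangodb3
```
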
Since the cluster agents are so critical to the availability of both the ArangoDB and the ArangoSync cluster,
it is recommended to run agents on dedicated machines. Consider these machines "pets".

Coordinators and dbservers can be deployed on other machines that should be considered "cattle".

### Kafka & Zookeeper

Kafka & Zookeeper are needed when using the `kafka` type message queue.

Since the kafka brokers are really CPU and memory intensive,
it is recommended to run zookeeper & kafka on dedicated machines.

Consider these machines "pets".

### Sync Master

The Sync Master is responsible for managing all synchronization, creating tasks and assigning
those to workers.
<br/> At least 2 instances must be deployed in each datacenter.
One instance will be the "leader", the other will be an inactive slave. When the leader
is gone for a short while, one of the other instances will take over.

With clusters of a significant size, the sync master will require a significant set of resources.
Therefore it is recommended to deploy sync masters on their own servers, equipped with sufficient
CPU power and memory capacity.

The sync master must be reachable on TCP port 8629 (default).
This port must be reachable from inside the datacenter (by sync workers and operations)
and from inside of the other datacenter (by sync masters in the other datacenter).
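
A quick reachability check with standard tooling, run from inside each datacenter (the hostname is an example):

```bash
# Verify that the sync master's TLS endpoint is reachable on the default port 8629.
openssl s_client -connect dc1-syncmaster1.example.com:8629 </dev/null
```
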
Since the sync masters can be CPU intensive when running lots of databases & collections,
it is recommended to run them on dedicated machines with a lot of CPU power.

Consider these machines "pets".

### Sync Workers

The Sync Worker is responsible for executing synchronization tasks.
<br/> For optimal performance at least 1 worker instance must be placed on
every machine that has an ArangoDB `dbserver` running. This ensures that tasks
can be executed with minimal network traffic outside of the machine.

Since sync workers will automatically stop once their TLS server certificate expires
(which is set to 2 years by default),
it is recommended to run at least 2 instances of a worker on every machine in the datacenter.
That way, tasks can still be assigned in the most optimal way, even when a worker is temporarily
down for a restart.

The sync worker must be reachable on TCP port 8729 (default).
This port must be reachable from inside the datacenter (by sync masters).
When using the `direct` message queue type, this port must also be reachable from
the other datacenter.

Note that a large file descriptor limit is needed when using the `kafka` message queue type.
With kafka, the sync worker requires about 30 file descriptors per shard.
If you use hardware with huge resources, and still run out of file descriptors,
you can decide to run multiple sync workers on each machine in order to spread the tasks across them.
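
As a back-of-the-envelope sketch (the shard count and limit value are examples, not recommendations):

```bash
# Estimate file descriptor needs for a kafka-based sync worker (~30 fds per shard).
SHARDS=2000
echo "approx. fds needed: $((SHARDS * 30))"   # => 60000
# Raise the per-process limit in the shell that starts the worker:
ulimit -n 65536
```
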
The sync workers should be run on all machines that also contain an ArangoDB dbserver.
The sync worker can be memory intensive when running lots of databases & collections.

Consider these machines "cattle".

### Prometheus & Grafana (optional)

ArangoSync provides metrics in a format supported by [Prometheus](https://prometheus.io).
We also provide a standard set of dashboards for viewing those metrics in [Grafana](https://grafana.org).

If you want to use these tools, go to their websites for instructions on how to deploy them.

After deployment, you must configure Prometheus using a configuration file that instructs
it about which targets to scrape. For ArangoSync you should configure scrape targets for
all sync masters and all sync workers.
Consult the [reference manual](../../Deployment/DC2DC/PrometheusGrafana.md) for a sample configuration.
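
As a rough illustration of such a configuration (not the reference manual's sample; hostnames, token and file path are assumptions):

```bash
# Write a minimal Prometheus scrape configuration covering sync masters & workers.
cat > /etc/prometheus/prometheus.yml <<'EOF'
scrape_configs:
  - job_name: arangosync
    scheme: https
    bearer_token: "<token passed to arangosync via --monitoring.token>"
    tls_config:
      insecure_skip_verify: true   # or point tls_config at your own CA
    static_configs:
      - targets:
          - dc1-syncmaster1.example.com:8629   # sync master (default port)
          - dc1-worker1.example.com:8729       # sync worker (default port)
EOF
```
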
Prometheus can be a memory & CPU intensive process. It is recommended to keep them
on machines other than those used to run the ArangoDB cluster or ArangoSync components.

Consider these machines "cattle", unless you configure alerting on prometheus,
in which case it is recommended to consider these machines "pets".

## Configuration

Once all components of the ArangoSync solution have been deployed and are
running properly, ArangoSync will not automatically replicate database structure
and content. For that, you need to configure synchronization.

To configure synchronization, you need the following:

- The endpoint of the sync master in the target datacenter.
- The endpoint of the sync master in the source datacenter.
- A certificate (in keyfile format) used for client authentication of the sync master
  (with the sync master in the source datacenter).
- A CA certificate (public key only) for verifying the integrity of the sync masters.
- A username+password pair (or client certificate) for authenticating the configure
  request with the sync master (in the target datacenter).

With that information, run:

```bash
arangosync configure sync \
    --master.endpoint=<endpoints of sync masters in target datacenter> \
    --master.keyfile=<keyfile of sync masters in target datacenter> \
    --source.endpoint=<endpoints of sync masters in source datacenter> \
    --source.cacert=<public key of CA certificate used to verify sync master in source datacenter> \
    --auth.user=<username used for authentication of this command> \
    --auth.password=<password of auth.user>
```

The command will finish quickly. Afterwards it will take some time until
the clusters in both datacenters are in sync.

Use the following command to inspect the status of the synchronization of a datacenter:

```bash
arangosync get status \
    --master.endpoint=<endpoints of sync masters in datacenter of interest> \
    --auth.user=<username used for authentication of this command> \
    --auth.password=<password of auth.user> \
    -v
```

Note: Invoking this command on the target datacenter will return different results from
invoking it on the source datacenter. You need insight into both results to get a "complete picture".

ArangoSync has more commands to inspect the status of synchronization.
Consult the [reference manual](../../Administration/DC2DC/README.md#inspect-status) for details.

### Stop synchronization

If you no longer want to synchronize data from a source to a target datacenter
you must stop it. To do so, run the following command:

```bash
arangosync stop sync \
    --master.endpoint=<endpoints of sync masters in target datacenter> \
    --auth.user=<username used for authentication of this command> \
    --auth.password=<password of auth.user>
```

The command will wait until synchronization has completely stopped before returning.
If the synchronization is not completely stopped within a reasonable period (2 minutes by default)
the command will fail.

If the source datacenter is no longer available it is not possible to stop synchronization in
a graceful manner. Consult the [reference manual](../../Administration/DC2DC/README.md#stopping-synchronization) for instructions on how to abort synchronization in
this case.

### Reversing synchronization direction

If you want to reverse the direction of synchronization (e.g. after a failure
in datacenter A and you switched to datacenter B for fallback), you
must first stop (or abort) the original synchronization.

Once that is finished (and cleanup has been applied in case of abort),
you must configure the synchronization again, but with swapped
source & target settings, as sketched below.
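
Putting those two steps together, a sketch of the reversal; endpoints and credentials are placeholders, and the flags are the same ones used earlier on this page:

```bash
# Reverse direction from (A -> B) to (B -> A).
# 1. Stop the existing synchronization, addressing the current target (B).
arangosync stop sync \
    --master.endpoint=https://dc-b-master.example.com:8629 \
    --auth.user=<username> --auth.password=<password>

# 2. Configure synchronization in the opposite direction: A becomes the target.
arangosync configure sync \
    --master.endpoint=https://dc-a-master.example.com:8629 \
    --master.keyfile=my-client-auth-cert.keyfile \
    --source.endpoint=https://dc-b-master.example.com:8629 \
    --source.cacert=my-tls-ca.crt \
    --auth.user=<username> --auth.password=<password>
```
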
## Operations & Maintenance

ArangoSync is a distributed system with a lot of different components.
As with any such system, it requires some, but not a lot, of operational
support.

### What means are available to monitor status

All of the components of ArangoSync provide means to monitor their status.
Below you'll find an overview per component.

- Sync master & workers: The `arangosync` servers running as either master
  or worker provide:
  - A status API, see `arangosync get status`. Make sure that all statuses report `running`.
    <br/>For even more detail the following commands are also available:
    `arangosync get tasks`, `arangosync get masters` & `arangosync get workers`.
  - A log on the standard output. Log levels can be configured using `--log.level` settings.
  - A metrics API `GET /metrics`. This API is compatible with Prometheus.
    Sample Grafana dashboards for inspecting these metrics are available.

- ArangoDB cluster: The `arangod` servers that make up the ArangoDB cluster
  provide:
  - A log file. This is configurable with settings with a `log.` prefix.
    E.g. `--log.output=file://myLogFile` or `--log.level=info`.
  - A statistics API `GET /_admin/statistics` (see the example after this list).

- Kafka cluster: The kafka brokers provide:
  - A log file, see settings with `log.` prefix in its `server.properties` configuration file.

- Zookeeper: The zookeeper agents provide:
  - A log on standard output.
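
For example, the statistics API can be queried directly over HTTP; the coordinator endpoint is an assumption for illustration, and credentials are only needed when authentication is enabled:

```bash
# Query the statistics API of an ArangoDB coordinator (example endpoint).
curl --user root:<password> http://dc1-coordinator1.example.com:8529/_admin/statistics
```
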
### What to look for while monitoring status

The very first thing to do when monitoring the status of ArangoSync is to
look into the status provided by `arangosync get status ... -v`.
When not everything is in the `running` state (on both datacenters), this is an
indication that something may be wrong. In case that happens, give it some time
(incremental synchronization may take quite some time for large collections)
and look at the status again. If the statuses do not change (or change, but do not reach `running`)
it is time to inspect the metrics & log files.
<br/> When the metrics or logs seem to indicate a problem in a sync master or worker, it is
safe to restart it, as long as only 1 instance is restarted at a time.
Give restarted instances some time to "catch up".

### 'What if ...'

Please consult the [reference manual](../../TroubleShooting/DC2DC/README.md) for detailed descriptions of what to do in case of certain
problems and how & what information to provide to support so they can assist you best when needed.

### Metrics

ArangoSync (master & worker) provide metrics that can be used for monitoring the ArangoSync
solution. These metrics are available using the following HTTPS endpoints:

- GET `/metrics`: Provides metrics in a format supported by Prometheus.
- GET `/metrics.json`: Provides the same metrics in JSON format.

Both endpoints include help information per metric.

Note: Both endpoints require authentication. Besides the usual authentication methods
these endpoints are also accessible using a special bearer token specified using the `--monitoring.token`
command line option.
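
For instance (hostname and token are placeholders), the Prometheus endpoint can be fetched with the monitoring token:

```bash
# Fetch metrics using the bearer token configured via --monitoring.token.
# -k skips TLS verification; drop it if the CA is trusted locally.
curl -k -H "Authorization: Bearer <monitoring token>" \
    https://dc1-syncmaster1.example.com:8629/metrics
```
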
Consult the [reference manual](../../Monitoring/DC2DC/README.md#metrics) for sample output of the metrics endpoints.

## Security

### Firewall settings

The components of ArangoSync use (TCP) network connections to communicate with each other.

Consult the [reference manual](../../Security/DC2DC/README.md#firewall-settings) for a detailed list of connections and the ports that should be accessible.

### Certificates

Digital certificates are used in many places in ArangoSync for both encryption
and authentication.

In ArangoSync all network connections use Transport Layer Security (TLS),
a set of protocols that ensure that all network traffic is encrypted.
For this, TLS certificates are used. The server side of the network connection
offers a TLS certificate. This certificate is (often) verified by the client side of the network
connection, to ensure that the certificate is signed by a trusted Certificate Authority (CA).
This ensures the integrity of the server.
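
To see this in practice, the certificate a server offers can be inspected with `openssl` (the hostname is an example):

```bash
# Show subject, issuer and validity of the certificate offered by a sync master.
openssl s_client -connect dc1-syncmaster1.example.com:8629 </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates
```
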
In several places additional certificates are used for authentication. In those cases
the client side of the connection offers a client certificate (on top of an existing TLS connection).
The server side of the connection uses the client certificate to authenticate
the client and (optionally) decides which rights should be assigned to the client.

Note: ArangoSync does allow the use of certificates signed by a well-known CA (e.g. Verisign);
however, it is more convenient (and common) to use your own CA.

Consult the [reference manual](../../Security/DC2DC/README.md#certificates) for detailed instructions on how to create these certificates.

#### Renewing certificates

All certificates have meta information in them that limits their use in function,
target & lifetime.
<br/> A certificate created for client authentication (function) cannot be used as a TLS server certificate
(the same is true for the reverse).
<br/> A certificate for host `myserver` (target) cannot be used for host `anotherserver`.
<br/> A certificate that is valid until October 2017 (lifetime) cannot be used after October 2017.

If anything changes in function, target or lifetime you need a new certificate.

The procedure for creating a renewed certificate is the same as for creating a "first" certificate.
<br/> After creating the renewed certificate the process(es) using them have to be updated.
This means restarting them. All ArangoSync components are designed to support stopping and starting
single instances, but do not restart more than 1 instance at the same time.
As soon as 1 instance has been restarted, give it some time to "catch up" before restarting
the next instance.
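
A rolling restart following that rule might look like the sketch below; the hostnames and the `arangosync-worker.service` unit name are hypothetical and depend on how you set up `systemd`:

```bash
# Restart one instance at a time after installing renewed certificates.
for host in dc1-worker1.example.com dc1-worker2.example.com; do
  ssh "$host" sudo systemctl restart arangosync-worker.service
  sleep 120   # give the restarted instance time to catch up
done
```
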
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Monitoring datacenter to datacenter replication

This section includes information related to the monitoring of the _datacenter
@@ -7,7 +8,7 @@ For a general introduction to the _datacenter to datacenter replication_, please
refer to the [Datacenter to datacenter replication](../../Scalability/DC2DC/README.md)
chapter.

# Metrics
## Metrics

_ArangoSync_ (_master_ & _worker_) provide metrics that can be used for monitoring
the _datacenter to datacenter replication_ solution. These metrics are available
@@ -16,6 +16,8 @@
* [Coming from SQL](GettingStarted/ComingFromSql.md)
# https://@github.com/arangodb-helper/arangodb.git;arangodb;docs/Manual;;/
* [ArangoDB Starter](GettingStarted/Starter/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter Replication](GettingStarted/DC2DC/README.md)
# * [Coming from MongoDB](GettingStarted/ComingFromMongoDb.md) #TODO
#
* [Highlights](Highlights.md)
@@ -25,6 +27,7 @@
* [Architecture](Scalability/Architecture.md)
* [Data models](Scalability/DataModels.md)
* [Limitations](Scalability/Limitations.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Scalability/DC2DC/README.md)
  * [Introduction](Scalability/DC2DC/Introduction.md)
  * [Applicability](Scalability/DC2DC/Applicability.md)
@@ -146,6 +149,7 @@
* [Cluster: Local test setups](Deployment/Local.md)
* [Cluster: Processes](Deployment/Distributed.md)
* [Cluster: Docker](Deployment/Docker.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Multiple Datacenters](Deployment/DC2DC.md)
  * [Cluster](Deployment/DC2DC/Cluster.md)
  * [Kafka & Zookeeper](Deployment/DC2DC/KafkaZookeeper.md)
@@ -211,6 +215,7 @@
* [Configuration](Administration/Replication/Synchronous/Configuration.md)
* [Satellite Collections](Administration/Replication/Synchronous/Satellites.md)
* [Cluster](Administration/Cluster/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
* [Datacenter to datacenter replication](Administration/DC2DC/README.md)
* [Sharding](Administration/Sharding/README.md)
# * [Authentication](Administration/Sharding/Authentication.md)
@@ -233,12 +238,15 @@
* [Datafile Debugger](Troubleshooting/DatafileDebugger.md)
* [Arangobench](Troubleshooting/Arangobench.md)
* [Cluster](Troubleshooting/Cluster/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
  * [Datacenter to datacenter replication](Troubleshooting/DC2DC/README.md)
#
* [Monitoring](Monitoring/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
  * [Datacenter to datacenter replication](Monitoring/DC2DC/README.md)
#
* [Security](Security/README.md)
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
  * [Datacenter to datacenter replication](Security/DC2DC/README.md)
#
* [Architecture](Architecture/README.md)
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# When to use it... and when not

The _datacenter to datacenter replication_ is a good solution in all cases where
@@ -6,6 +7,7 @@ that the data is available immediately in the other cluster.

The _datacenter to datacenter replication_ is not a good solution when one of the
following applies:

- You want to replicate data from cluster A to cluster B and from cluster B
  to cluster A at the same time.
- You need synchronous replication between 2 clusters.
@@ -1,9 +1,11 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Introduction

At some point in the growth of a database, there comes a need for replicating it
across multiple datacenters.

Reasons for that can be:

- Fallback in case of a disaster in one datacenter
- Regional availability
- Separation of concerns
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Datacenter to datacenter replication

This chapter introduces ArangoDB's _datacenter to datacenter replication_ (DC2DC).
@@ -5,8 +6,8 @@ This chapter introduces ArangoDB's _datacenter to datacenter replication_ (DC2DC)
For further information about _datacenter to datacenter replication_, please refer
to the following sections:

- [Deployment](..\..\Deployment\DC2DC.md)
- [Administration](..\..\Administration\DC2DC\README.md)
- [Troubleshooting](..\..\Troubleshooting\DC2DC\README.md)
- [Monitoring](..\..\Monitoring\DC2DC\README.md)
- [Security](..\..\Security\DC2DC\README.md)
- [Deployment](../../Deployment/DC2DC.md)
- [Administration](../../Administration/DC2DC/README.md)
- [Troubleshooting](../../Troubleshooting/DC2DC/README.md)
- [Monitoring](../../Monitoring/DC2DC/README.md)
- [Security](../../Security/DC2DC/README.md)
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Requirements

To use _datacenter to datacenter replication_ you need the following:
@@ -14,7 +15,7 @@ To use _datacenter to datacenter replication_ you need the following:
- One instance of the _ArangoSync worker_ on every machine in each datacenter.

Note: In several places you will need a (x509) certificate.
<br/>The [Certificates](..\..\Security\DC2DC\README.md#certificates) section provides more guidance for creating
<br/>The [Certificates](../../Security/DC2DC/README.md#certificates) section provides more guidance for creating
and renewing these certificates.

Besides the above list, you probably want to use the following:
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Datacenter to datacenter Security

This section includes information related to the _datacenter to datacenter replication_
@@ -144,12 +145,14 @@ To create a certificate used for client authentication in the **keyfile** format
you need the public key of the CA (`--cacert`), the private key of
the CA (`--cakey`) and one or more hostnames (or IP addresses) or email addresses.
Then run:
```

```bash
arangosync create client-auth keyfile \
    --cacert=my-client-auth-ca.crt --cakey=my-client-auth-ca.key \
    [--host=<hostname> | --email=<emailaddress>] \
    --keyfile=my-client-auth-cert.keyfile
```

Make sure to protect and store the generated keyfile (`my-client-auth-cert.keyfile`) in a safe place.

#### CA certificates
@@ -166,10 +169,12 @@ Make sure to protect and store both generated files (`my-tls-ca.crt` & `my-tls-ca.key`)
Therefore even more care is needed to store them safely.

To create a CA certificate used to **sign client authentication certificates**, run:
```

```bash
arangosync create client-auth ca \
    --cert=my-client-auth-ca.crt --key=my-client-auth-ca.key
```

Make sure to protect and store both generated files (`my-client-auth-ca.crt` & `my-client-auth-ca.key`)
in a safe place.
<br/>Note: CA certificates have a much longer lifetime than normal certificates.
@@ -1,3 +1,4 @@
<!-- don't edit here, its from https://@github.com/arangodb/arangosync.git / docs/Manual/ -->
# Troubleshooting datacenter to datacenter replication

The _datacenter to datacenter replication_ is a distributed system with a lot
@@ -8,7 +9,7 @@ This section includes information on how to troubleshoot the
_datacenter to datacenter replication_.

For a general introduction to the _datacenter to datacenter replication_, please
refer to the [Datacenter to datacenter replication](..\..\Scalability\DC2DC\README.md)
refer to the [Datacenter to datacenter replication](../../Scalability/DC2DC/README.md)
chapter.

## What means are available to monitor status
@@ -75,7 +76,7 @@ switching your applications to the target (backup) datacenter.

This is what you must do in that case:

1. [Stop synchronization](..\..\Administration\DC2DC\README.md#stoping-synchronization) using:
1. [Stop synchronization](../../Administration/DC2DC/README.md#stopping-synchronization) using:

   ```bash
   arangosync stop sync ...
@@ -87,7 +88,7 @@ This is what you must do in that case:
   arangosync abort sync ...
   ```

See [Stoping synchronization](..\..\Administration\DC2DC\README.md#stoping-synchronization)
See [Stopping synchronization](../../Administration/DC2DC/README.md#stopping-synchronization)
for how to clean up the source datacenter when it becomes available again.
1. Verify that synchronization has completely stopped using:
   ```bash
@@ -97,7 +98,7 @@ This is what you must do in that case:

When the original source datacenter is restored, you may switch roles and
make it the target datacenter. To do so, use `arangosync configure sync ...`
as described in [Reversing synchronization direction](..\..\Administration\DC2DC\README.md#reversing-synchronization-direction).
as described in [Reversing synchronization direction](../../Administration/DC2DC/README.md#reversing-synchronization-direction).

## What to do in case of a planned network outage
@@ -107,7 +108,7 @@ to indicate "it is still alive". The other datacenter assumes the connection is

If you're planning some sort of maintenance where you know the connectivity
will be lost for some time (e.g. 3 hours), you can prepare ArangoSync for that
such that it will hold of re-synchronization for a given period of time.
such that it will hold off re-synchronization for a given period of time.

To do so, on both datacenters, run:
@@ -118,13 +119,14 @@ arangosync set message timeout \
    --auth.password=<password of auth.user> \
    3h
```
The last argument is the period that ArangoSync should hold-of resynchronization for.

The last argument is the period that ArangoSync should hold-off resynchronization for.
This can be minutes (e.g. `15m`) or hours (e.g. `3h`).

If maintenance is taking longer than expected, you can use the same command to extend
the hold of period (e.g. to `4h`).
the hold-off period (e.g. to `4h`).

After the maintenance, use the same command restore the hold of period to its
After the maintenance, use the same command to restore the hold-off period to its
default of `1h`.

## What to do in case of a document that exceeds the message queue limits