Active Failover Administration
==============================

This _Section_ includes information related to the administration of an _Active Failover_
setup.

For a general introduction to the ArangoDB _Active Failover_ setup, please refer
to the _Active Failover_ [chapter](../../Scalability/ActiveFailover/README.md).

Introduction
------------

The _Active Failover_ setup requires almost no manual administration.

You may still need to replace, upgrade or remove individual nodes
in an _Active Failover_ setup.

Determining the current _Leader_
--------------------------------

You can determine the current _Leader_ by asking any of the involved single-server
instances: send a request to the `/_api/cluster/endpoints` REST API.

```bash
curl http://server.domain.org:8530/_api/cluster/endpoints
{
  "error": false,
  "code": 200,
  "endpoints": [
    {
      "endpoint": "tcp://[::1]:8530"
    },
    {
      "endpoint": "tcp://[::1]:8531"
    }
  ]
}
```

This API returns all available endpoints. The first endpoint in the list is defined to
be the current _Leader_. The API is always available and is not blocked
with an `HTTP/1.1 503 Service Unavailable` response on a _Follower_.

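As a minimal sketch of reading the leader from this response in a script, the helper below extracts the first `endpoint` value from JSON passed on standard input. The function name `first_endpoint` is hypothetical and not part of ArangoDB. It assumes the response layout shown above and uses plain `grep`/`sed` text matching rather than a real JSON parser, so it may need adjusting if the response format differs.

```bash
# Hypothetical helper (not part of ArangoDB): print the first "endpoint"
# value found in a /_api/cluster/endpoints response read from stdin.
# Assumes the response layout shown above; prints nothing if no endpoint is found.
first_endpoint() {
  # 1. Pick out every "endpoint": "<value>" pair (the "endpoints" key does not match).
  # 2. Keep only the first pair, which is defined to be the current leader.
  # 3. Strip everything except the quoted value.
  grep -o '"endpoint"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\([^"]*\)"$/\1/'
}
```

For example, `curl -s http://server.domain.org:8530/_api/cluster/endpoints | first_endpoint` would print the endpoint that the queried server reports as the current _Leader_.
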
Upgrading / Replacing / Removing a _Leader_
-------------------------------------------

A _Leader_ is the active server which can receive all read and write operations
in an _Active Failover_ setup.

Upgrading or removing a _Leader_ can be a little tricky, because as soon as you
stop the leader's process you will trigger a failover situation. This may be intended,
but you will probably want to halt all writes to the _leader_ for a certain
amount of time first, to allow the _follower_ to catch up on all operations.

After you have ensured that the _follower_ is sufficiently caught up, you can
stop the _leader_ process via the shutdown API or by sending a `SIGTERM` signal
to the process (e.g. `kill <process-id>`). This will trigger an orderly shutdown,
and should trigger an immediate switch to the _follower_. If your client drivers
are configured correctly, you should notice almost no interruption in your
applications.

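One way to confirm from a script that the switch has happened is to poll the `/_api/cluster/endpoints` API described earlier until the first listed endpoint differs from the old leader. The following is only a sketch under that assumption: the function name `wait_for_new_leader`, its arguments and the `POLL_INTERVAL` variable are hypothetical, and the JSON is matched as plain text rather than parsed.

```bash
# Hypothetical helper (not part of ArangoDB): poll until the first endpoint
# reported by the given command differs from the old leader.
# Usage: wait_for_new_leader <old-leader-endpoint> <max-attempts> <command...>
# <command...> must print a /_api/cluster/endpoints response on stdout.
# Prints the new leader and returns 0, or returns 1 after <max-attempts> tries.
wait_for_new_leader() {
  local old_leader="$1" attempts="$2" leader
  shift 2
  while [ "$attempts" -gt 0 ]; do
    # Run the supplied command and take the first "endpoint" value of its output.
    leader=$("$@" 2>/dev/null \
      | grep -o '"endpoint"[[:space:]]*:[[:space:]]*"[^"]*"' \
      | head -n 1 \
      | sed 's/.*"\([^"]*\)"$/\1/')
    if [ -n "$leader" ] && [ "$leader" != "$old_leader" ]; then
      printf '%s\n' "$leader"
      return 0
    fi
    attempts=$((attempts - 1))
    # Wait before asking again (assumed default: 1 second).
    sleep "${POLL_INTERVAL:-1}"
  done
  return 1
}
```

A possible invocation, querying a surviving server from the example above, is `wait_for_new_leader 'tcp://[::1]:8530' 30 curl -s http://server.domain.org:8531/_api/cluster/endpoints`. The number of attempts and the polling interval are arbitrary choices and should be adapted to your setup.
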
Once you have upgraded the local server via the `--database.auto-upgrade` option,
you can add it again to the _Active Failover_ setup. The server will resync automatically
with the new _Leader_ and become a _Follower_.

Upgrading / Replacing / Removing a _Follower_
---------------------------------------------

A _Follower_ is the passive server which tries to mirror all the data stored in
the _Leader_.

To upgrade a _Follower_ you only need to stop the process and start it
with `--database.auto-upgrade`. The server process will automatically resync
with the _Leader_ after a restart.

The clean way of removing a _Follower_ is to first start a replacement _Follower_
(otherwise you will lose resiliency). To start a _Follower_, please have a look
at our [deployment guide](../../Deployment/ActiveFailover/README.md).
Once the replacement is ready, you can kill the old _Follower_ process and remove it.