mirror of https://gitee.com/bigwinds/arangodb
Doc - 3.3.14 new starter upgrade procedure + partial sync of external repos (#6020)
parent
a4683fa118
commit
96db5356b8
|
@ -1,37 +1,7 @@
|
|||
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||
ArangoDB Starter Recovery Procedure
|
||||
===================================
|
||||
# ArangoDB Starter Administration
|
||||
|
||||
This procedure is intended to recover a cluster (that was started with the ArangoDB
|
||||
_Starter_) when a machine of that cluster is broken without the possibility to recover
|
||||
it (e.g. complete HD failure). In this procedure it does not matter if a replacement
|
||||
machine uses the old or a new IP address.
|
||||
This chapter documents administering the _ArangoDB Starter_.
|
||||
|
||||
To recover from this scenario, you must:
|
||||
- Create a new (replacement) machine with ArangoDB (including _Starter_) installed.
|
||||
- Create a file called `RECOVERY` in the directory you are going to use as data
|
||||
directory of the _Starter_ (the one that is passed via the option `--starter.data-dir`).
|
||||
This file must contain the IP address and port of the _Starter_ that has been
|
||||
broken (and will be replaced with this new machine).
|
||||
|
||||
E.g.
|
||||
|
||||
```bash
|
||||
echo "192.168.1.25:8528" > $DATADIR/RECOVERY
|
||||
```
|
||||
|
||||
After creating the `RECOVERY` file, start the _Starter_ using all the normal command
|
||||
line arguments.
|
||||
|
||||
The _Starter_ will now:
|
||||
1. Talk to the remaining _Starters_ to find the ID of the _Starter_ it replaces and
|
||||
use that ID to join the remaining _Starters_.
|
||||
1. Talk to the remaining _Agents_ to find the ID of the _Agent_ it replaces and
|
||||
adjust the command-line arguments of the _Agent_ (it will start) to use that ID.
|
||||
This is skipped if the _Starter_ was not running an _Agent_.
|
||||
1. Remove the `RECOVERY` file from the data directory.
|
||||
|
||||
The cluster will now recover automatically. It will, however, have one more _Coordinator_
|
||||
and one more _DBServer_ than expected. Exactly one _Coordinator_ and one _DBServer_ will
|
||||
be listed "red" in the web UI of the database. They will have to be removed manually
|
||||
using the ArangoDB Web UI.
|
||||
- [Remove a machine from the cluster](./Removal.md)
|
||||
- [Recover from a failed machine](./Recovery.md)
|
||||
|
|
|
@ -0,0 +1,38 @@
|
|||
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||
# ArangoDB Starter Recovery Procedure
|
||||
|
||||
This procedure is intended to recover a cluster (that was started with the ArangoDB
|
||||
_Starter_) when a machine of that cluster is broken without the possibility to recover
|
||||
it (e.g. complete HD failure). In this procedure it does not matter if a replacement
|
||||
machine uses the old or a new IP address.
|
||||
|
||||
To recover from this scenario, you must:
|
||||
|
||||
- Create a new (replacement) machine with ArangoDB (including _Starter_) installed.
|
||||
- Create a file called `RECOVERY` in the directory you are going to use as data
|
||||
directory of the _Starter_ (the one that is passed via the option `--starter.data-dir`).
|
||||
This file must contain the IP address and port of the _Starter_ that has been
|
||||
broken (and will be replaced with this new machine).
|
||||
|
||||
E.g.
|
||||
|
||||
```bash
|
||||
echo "192.168.1.25:8528" > $DATADIR/RECOVERY
|
||||
```
|
||||
|
||||
After creating the `RECOVERY` file, start the _Starter_ using all the normal command
|
||||
line arguments.
|
||||
|
||||
The _Starter_ will now:
|
||||
|
||||
1. Talk to the remaining _Starters_ to find the ID of the _Starter_ it replaces and
|
||||
use that ID to join the remaining _Starters_.
|
||||
1. Talk to the remaining _Agents_ to find the ID of the _Agent_ it replaces and
|
||||
adjust the command-line arguments of the _Agent_ (it will start) to use that ID.
|
||||
This is skipped if the _Starter_ was not running an _Agent_.
|
||||
1. Remove the `RECOVERY` file from the data directory.
|
||||
|
||||
The cluster will now recover automatically. It will, however, have one more _Coordinator_
|
||||
and one more _DBServer_ than expected. Exactly one _Coordinator_ and one _DBServer_ will
|
||||
be listed "red" in the web UI of the database. They will have to be removed manually
|
||||
using the ArangoDB Web UI.
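
The manual steps above can be sketched as a small shell helper. This is a minimal sketch, not part of the _Starter_ itself: the function name `prepare_recovery` and the example paths are assumptions made for illustration. Only the `RECOVERY` file name, its `IP:port` content and the `--starter.data-dir` option come from the procedure above.

```bash
# Hypothetical helper: writes the RECOVERY file described above.
#   $1 = data directory that the replacement Starter will use
#   $2 = IP address and port of the broken Starter, e.g. 192.168.1.25:8528
prepare_recovery() {
  local data_dir="$1" failed_starter="$2"
  if [ -z "$data_dir" ] || [ -z "$failed_starter" ]; then
    echo "usage: prepare_recovery <data-dir> <ip:port>" >&2
    return 1
  fi
  # Create the data directory if it does not exist yet, then write the file.
  mkdir -p "$data_dir"
  printf '%s\n' "$failed_starter" > "$data_dir/RECOVERY"
}

# Example (path and address are placeholders):
#   prepare_recovery /var/lib/arangodb-starter 192.168.1.25:8528
#   arangodb --starter.data-dir=/var/lib/arangodb-starter <your usual arguments>
```

After the _Starter_ has joined the cluster, the `RECOVERY` file should be gone from the data directory, as described in the list above.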
|
|
@ -0,0 +1,60 @@
|
|||
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||
# ArangoDB Starter Removal Procedure
|
||||
|
||||
This procedure is intended to remove a machine from a cluster
|
||||
(that was started with the ArangoDB _Starter_).
|
||||
|
||||
It is possible to run this procedure while the machine is still running
|
||||
or when it has already been removed.
|
||||
|
||||
It is not possible to remove a machine that has an agent on it!
|
||||
Use the [recovery procedure](./Recovery.md) if you have a failed machine
|
||||
with an agent on it.
|
||||
|
||||
Note that it is highly recommended to remove a machine while it is still running.
|
||||
|
||||
To remove a machine from a cluster, run the following command:
|
||||
|
||||
```bash
|
||||
arangodb remove starter --starter.endpoint=<endpoint> [--starter.id=<id>] [--force]
|
||||
```
|
||||
|
||||
Where `<endpoint>` is the endpoint of the starter that you want to remove,
|
||||
or the endpoint of one of the remaining starters. E.g. `http://localhost:8528`.
|
||||
|
||||
If you want to remove a machine that is no longer running, use the `--starter.id`
|
||||
option. Set it to the ID of the ArangoDB _Starter_ on the machine that you want to remove.
|
||||
|
||||
You can find this ID in a `setup.json` file in the data directory of one of
|
||||
the remaining ArangoDB _Starters_.
|
||||
|
||||
E.g.
|
||||
```json
|
||||
{
|
||||
...
|
||||
"peers": {
|
||||
"Peers": [
|
||||
{
|
||||
"ID": "21e42415",
|
||||
"Address": "10.21.56.123",
|
||||
"Port": 8528,
|
||||
"PortOffset": 0,
|
||||
"DataDir": "/mydata/server1",
|
||||
"HasAgent": true,
|
||||
"IsSecure": false
|
||||
},
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
If the machine you want to remove has address `10.21.56.123` and was listening
|
||||
on port `8528`, use ID `21e42415`.
|
||||
|
||||
The `remove starter` command will attempt to clean out all data from the servers
|
||||
of the machine that you want to remove.
|
||||
This can take a long time.
|
||||
If the cleanout fails, the `remove starter` command will fail.
|
||||
|
||||
If you want to remove the machine even when the cleanout has failed, use
|
||||
the `--force` option.
|
||||
Note that this may lead to data loss!
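
Looking up the ID by hand can be sketched as a short shell function. This is only a sketch under stated assumptions: the function name `find_starter_id` is hypothetical, and the `awk` script assumes that `setup.json` is pretty-printed with one key per line and that `ID` precedes `Address` and `Port` within each peer, as in the example above. A real JSON parser is more robust if one is available.

```bash
# Hypothetical helper: prints the ID of the peer with the given address and port.
#   $1 = path to setup.json of one of the remaining Starters
#   $2 = address of the machine to remove
#   $3 = port the Starter on that machine was listening on
find_starter_id() {
  local setup_json="$1" address="$2" port="$3"
  awk -v addr="$address" -v port="$port" '
    # Remember the most recent ID and Address; strip quotes and commas.
    /"ID":/      { gsub(/[",]/, "", $2); id = $2 }
    /"Address":/ { gsub(/[",]/, "", $2); a = $2 }
    # "Port": does not match "PortOffset": because of the closing quote.
    /"Port":/    { gsub(/[",]/, "", $2); if (a == addr && $2 == port) print id }
  ' "$setup_json"
}

# Example (paths and addresses are placeholders):
#   id="$(find_starter_id /mydata/server2/setup.json 10.21.56.123 8528)"
#   arangodb remove starter --starter.endpoint=http://localhost:8528 --starter.id="$id"
```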
|
|
@ -127,17 +127,14 @@ cluster is down, or in a bad state, irrespective of the value of this setting.
|
|||
|
||||
### `spec.rocksdb.encryption.keySecretName`
|
||||
|
||||
{% hint 'info' %}
|
||||
This feature is only available in the
|
||||
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||
{% endhint %}
|
||||
|
||||
This setting specifies the name of a Kubernetes `Secret` that contains
|
||||
an encryption key used for encrypting all data stored by ArangoDB servers.
|
||||
When an encryption key is used, encryption of the data in the cluster is enabled;
|
||||
without it, encryption is disabled.
|
||||
The default value is empty.
|
||||
|
||||
This requires the Enterprise version.
|
||||
|
||||
The encryption key cannot be changed after the cluster has been created.
|
||||
|
||||
The secret specified by this setting must have a data field named 'key' containing
|
||||
|
|
|
@ -94,36 +94,42 @@ This results in a file called `ca.crt` containing a PEM encoded, x509 CA certifi
|
|||
|
||||
## Query requests
|
||||
|
||||
For most client requests made by a driver, it does not matter if there is any kind
|
||||
of load-balancer between your client application and the ArangoDB deployment.
|
||||
For most client requests made by a driver, it does not matter if there is any
|
||||
kind of load-balancer between your client application and the ArangoDB
|
||||
deployment.
|
||||
|
||||
{% hint 'info' %}
|
||||
Note that even a simple `Service` of type `ClusterIP` already behaves as a load-balancer.
|
||||
Note that even a simple `Service` of type `ClusterIP` already behaves as a
|
||||
load-balancer.
|
||||
{% endhint %}
|
||||
|
||||
The exception to this is cursor related requests made to an ArangoDB `Cluster` deployment.
|
||||
The coordinator that handles an initial query request (that results in a `Cursor`)
|
||||
will save some in-memory state in that coordinator, if the result of the query
|
||||
is too big to be transferred back in the response of the initial request.
|
||||
The exception to this is cursor-related requests made to an ArangoDB `Cluster`
|
||||
deployment. The coordinator that handles an initial query request (that results
|
||||
in a `Cursor`) will save some in-memory state in that coordinator, if the result
|
||||
of the query is too big to be transferred back in the response of the initial
|
||||
request.
|
||||
|
||||
Follow-up requests have to be made to fetch the remaining data.
|
||||
These follow-up requests must be handled by the same coordinator to which the initial
|
||||
request was made.
|
||||
Follow-up requests have to be made to fetch the remaining data. These follow-up
|
||||
requests must be handled by the same coordinator to which the initial request
|
||||
was made. As soon as there is a load-balancer between your client application
|
||||
and the ArangoDB cluster, it is uncertain which coordinator will receive the
|
||||
follow-up request.
|
||||
|
||||
As soon as there is a load-balancer between your client application and the ArangoDB cluster,
|
||||
it is uncertain which coordinator will actually handle the follow-up request.
|
||||
ArangoDB will transparently forward any mismatched requests to the correct
|
||||
coordinator, so the requests can be answered correctly without any additional
|
||||
configuration. However, this incurs a small latency penalty due to the extra
|
||||
request across the internal network.
|
||||
|
||||
To resolve this uncertainty, make sure to run your client application in the same
|
||||
Kubernetes cluster and synchronize your endpoints before making the
|
||||
initial query request.
|
||||
This will result in the use (by the driver) of internal DNS names of all coordinators.
|
||||
A follow-up request can then be sent to exactly the same coordinator.
|
||||
To prevent this uncertainty client-side, make sure to run your client
|
||||
application in the same Kubernetes cluster and synchronize your endpoints before
|
||||
making the initial query request. This will result in the use (by the driver) of
|
||||
internal DNS names of all coordinators. A follow-up request can then be sent to
|
||||
exactly the same coordinator.
|
||||
|
||||
If your client application is running outside the Kubernetes cluster this is much harder
|
||||
to solve.
|
||||
The easiest way to work around it, is by making sure that the query results are small
|
||||
enough.
|
||||
When that is not feasible, it is also possible to resolve this
|
||||
when the internal DNS names of your Kubernetes cluster are exposed to your client application
|
||||
and the resulting IP addresses are routable from your client application.
|
||||
To expose internal DNS names of your Kubernetes cluster, you can use [CoreDNS](https://coredns.io).
|
||||
If your client application is running outside the Kubernetes cluster, the easiest
|
||||
way to work around it is by making sure that the query results are small enough
|
||||
to be returned by a single request. When that is not feasible, it is also
|
||||
possible to resolve this when the internal DNS names of your Kubernetes cluster
|
||||
are exposed to your client application and the resulting IP addresses are
|
||||
routable from your client application. To expose internal DNS names of your
|
||||
Kubernetes cluster, you can use [CoreDNS](https://coredns.io).
|
||||
|
|
|
@ -19,4 +19,4 @@ Each of these uses involves a different custom resource.
|
|||
|
||||
Continue with [Using the ArangoDB Kubernetes Operator](./Usage.md)
|
||||
to learn how to install the ArangoDB Kubernetes operator and create
|
||||
your first deployment.
|
||||
your first deployment.
|
||||
|
|
|
@ -39,4 +39,4 @@ kubectl apply -f examples/yourUpdatedDeployment.yaml
|
|||
|
||||
## See also
|
||||
|
||||
- [Scaling](./Scaling.md)
|
||||
- [Scaling](./Scaling.md)
|
||||
|
|
|
@ -262,6 +262,9 @@ This option only has to be specified if the standard search fails.
|
|||
Sets the storage engine used by the `arangod` servers.
|
||||
The value `rocksdb` is only allowed on `arangod` version 3.2 and up.
|
||||
|
||||
On `arangod` version 3.3 and earlier, the default value is `mmfiles`.
|
||||
On `arangod` version 3.4 and later, the default value is `rocksdb`.
|
||||
|
||||
- `--cluster.start-coordinator=bool`
|
||||
|
||||
This indicates whether or not a coordinator instance should be started
|
||||
|
|
|
@ -314,7 +314,9 @@
|
|||
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
|
||||
* [Datacenter to datacenter replication](Administration/DC2DC/README.md)
|
||||
# https://@github.com/arangodb-helper/arangodb.git;arangodb;docs/Manual;;/
|
||||
* [ArangoDB Starter Recovery Procedure](Administration/Starter/README.md)
|
||||
* [ArangoDB Starter Administration](Administration/Starter/README.md)
|
||||
* [ArangoDB Starter Removal Procedure](Administration/Starter/Removal.md)
|
||||
* [ArangoDB Starter Recovery Procedure](Administration/Starter/Recovery.md)
|
||||
* [Security](Security/README.md)
|
||||
* [Change Root Password](Security/ChangeRootPassword.md)
|
||||
* [Encryption](Administration/Encryption/README.md)
|
||||
|
|
|
@ -149,11 +149,15 @@ Note: When you restart the starter, it remembers the original `--starter.local`
|
|||
|
||||
## Starting a cluster with datacenter to datacenter synchronization
|
||||
|
||||
{% hint 'info' %}
|
||||
This feature is only available in the
|
||||
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||
{% endhint %}
|
||||
|
||||
Datacenter to datacenter replication (DC2DC) requires a normal ArangoDB cluster in both data centers
|
||||
and one or more (`arangosync`) syncmasters & syncworkers in both data centers.
|
||||
The starter enables you to run these syncmasters & syncworkers in combination with your normal
|
||||
cluster.
|
||||
Note: Datacenter to datacenter replication is an ArangoDB Enterprise Edition feature.
|
||||
|
||||
To run a starter with DC2DC support, add the following arguments to the starter's command line:
|
||||
|
||||
|
|
|
@ -61,5 +61,5 @@ In addition to the paragraph above, rolling upgrades via the tool _Starter_ are
|
|||
as documented in the _Section_ [Upgrading Starter Deployments](../Starter/README.md),
|
||||
with the following limitations:
|
||||
|
||||
- Rolling upgrades between 3.2 and 3.3 are not supported before 3.2.15 and 3.3.8.
|
||||
- Rolling upgrades between 3.2 and 3.3 are not supported before 3.2.15 and 3.3.9.
|
||||
|
||||
|
|
|
@ -1,6 +1,5 @@
|
|||
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||
Upgrading _Starter_ Deployments
|
||||
===============================
|
||||
# Upgrading _Starter_ Deployments
|
||||
|
||||
Starting from versions 3.2.15 and 3.3.8, the ArangoDB [_Starter_](../../Programs/Starter/README.md)
|
||||
supports a new, automated, procedure to perform upgrades, including rolling upgrades
|
||||
|
@ -8,12 +7,13 @@ of a [Cluster](../../Scalability/Cluster/README.md) setup.
|
|||
|
||||
The upgrade procedure of the _Starter_ described in this _Section_ can be used to
|
||||
upgrade to a new hotfix, or to perform an upgrade to a new minor version of ArangoDB.
|
||||
Please refer to the [Upgrade Paths](../GeneralInfo/README.md#upgrade-paths) for detailed
|
||||
information.
|
||||
|
||||
**Important:** rolling upgrades of Cluster setups from 3.2 to 3.3 are only supported
|
||||
from versions 3.2.15 and 3.3.9.
|
||||
from versions 3.2.15 and 3.3.9.
|
||||
|
||||
Upgrade Procedure
|
||||
-----------------
|
||||
## Upgrade Procedure
|
||||
|
||||
The following procedure has to be executed on every ArangoDB _Starter_ instance.
|
||||
It is assumed that a _Starter_ deployment with mode `single`, `activefailover` or
|
||||
|
@ -24,21 +24,21 @@ It is assumed that a _Starter_ deployment with mode `single`, `activefailover` o
|
|||
Installing the new ArangoDB version binary also includes the latest ArangoDB _Starter_
|
||||
binary, which is necessary to perform the rolling upgrade.
|
||||
|
||||
The first step is to install the new ArangoDB package.
|
||||
The first step is to install the new ArangoDB package.
|
||||
|
||||
**Note:** you do not have to stop the _Starter_ processes before upgrading it.
|
||||
|
||||
For example, if you want to upgrade to `3.3.8-1` on Debian or Ubuntu, either call
|
||||
For example, if you want to upgrade to `3.3.14-1` on Debian or Ubuntu, either call
|
||||
|
||||
```
|
||||
$ apt install arangodb=3.3.8
|
||||
```bash
|
||||
apt install arangodb=3.3.14
|
||||
```
|
||||
|
||||
(`apt-get` on older versions) if you have added the ArangoDB repository. Or
|
||||
install a specific package using
|
||||
|
||||
```
|
||||
$ dpkg -i arangodb3-3.3.8-1_amd64.deb
|
||||
```bash
|
||||
dpkg -i arangodb3-3.3.14-1_amd64.deb
|
||||
```
|
||||
|
||||
after you have downloaded the corresponding file from https://download.arangodb.com/.
|
||||
|
@ -50,8 +50,8 @@ stop it now, as otherwise this standalone instance that is started on your machi
|
|||
can create some confusion later. As you are using the _Starter_ you do not need
|
||||
this standalone instance, and you can hence stop it:
|
||||
|
||||
```
|
||||
$ service arangodb3 stop
|
||||
```bash
|
||||
service arangodb3 stop
|
||||
```
|
||||
|
||||
Also, you might want to remove the standalone instance from the default
|
||||
|
@ -59,8 +59,8 @@ _runlevels_ to prevent it to start on the next reboot of your machine. How this
|
|||
is done depends on your distribution and _init_ system. For example, on older Debian
|
||||
and Ubuntu systems using a SystemV-compatible _init_, you can use:
|
||||
|
||||
```
|
||||
$ update-rc.d -f arangodb3 remove
|
||||
```bash
|
||||
update-rc.d -f arangodb3 remove
|
||||
```
|
||||
|
||||
### Stop the _Starter_ without stopping the ArangoDB Server processes
|
||||
|
@ -69,37 +69,36 @@ Now all the _Starter_ (_arangodb_) processes have to be stopped.
|
|||
|
||||
Please note that **no** _arangod_ processes should be stopped.
|
||||
|
||||
In order to stop the _arangodb_ processes, leaving the _arangod_ processes they
|
||||
In order to stop the _arangodb_ processes, leaving the _arangod_ processes they
|
||||
have started up and running (as we want for a rolling upgrade), we will need to
|
||||
use a command like `kill -9`:
|
||||
|
||||
```
|
||||
```bash
|
||||
kill -9 <pid-of-starter>
|
||||
```
|
||||
|
||||
The _pid_ associated with your _Starter_ can be checked using a command like _ps_:
|
||||
|
||||
|
||||
```
|
||||
```bash
|
||||
ps -C arangodb -fww
|
||||
```
|
||||
|
||||
The output of the command above shows not only the PIDs of all _arangodb_
|
||||
The output of the command above shows not only the PIDs of all _arangodb_
|
||||
processes but also the used commands, which can be useful for the following
|
||||
restart of all _arangodb_ processes.
|
||||
|
||||
The output belove is from a test machine where three instances of a _Starter_ are
|
||||
The output below is from a test machine where three instances of a _Starter_ are
|
||||
running locally. In a more production-like scenario, you will find only one instance
|
||||
of _arangodb_ running:
|
||||
|
||||
```
|
||||
```bash
|
||||
ps -C arangodb -fww
|
||||
UID PID PPID C STIME TTY TIME CMD
|
||||
max 29419 3684 0 11:46 pts/1 00:00:00 arangodb --starter.data-dir=./db1
|
||||
max 29504 3695 0 11:46 pts/2 00:00:00 arangodb --starter.data-dir=./db2 --starter.join 127.0.0.1
|
||||
max 29513 3898 0 11:46 pts/4 00:00:00 arangodb --starter.data-dir=./db3 --starter.join 127.0.0.1
|
||||
```
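
Because the command lines are needed again for the restart, it can help to record them before stopping anything. The following is a minimal sketch; the function name `starter_pids_and_commands` and the output file name are hypothetical, and the script assumes the `ps -f` column layout shown above (`UID PID PPID C STIME TTY TIME CMD`).

```bash
# Hypothetical helper: reads `ps -C arangodb -fww` output on stdin and
# prints one line per Starter: "<pid> <full command line>".
starter_pids_and_commands() {
  awk 'NR > 1 {
    pid = $2
    # Columns 1-7 are UID PID PPID C STIME TTY TIME; the rest is CMD.
    cmd = ""
    for (i = 8; i <= NF; i++) cmd = cmd (i > 8 ? " " : "") $i
    print pid, cmd
  }'
}

# Example: save PIDs and commands, then stop a Starter by its PID.
#   ps -C arangodb -fww | starter_pids_and_commands > starters-before-upgrade.txt
#   kill -9 <pid-of-starter>
```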
|
||||
|
||||
|
||||
### Restart the _Starter_
|
||||
|
||||
When using a supervisor like _SystemD_, this will happen automatically. In case
|
||||
|
@ -113,150 +112,82 @@ situation:
|
|||
- The ArangoDB Server processes are up and running, and they are still on the
|
||||
old version
|
||||
|
||||
### Send an HTTP `POST` request to all _Starters_
|
||||
### Start the upgrade process of all _arangod_ & _arangosync_ servers
|
||||
|
||||
A `POST` request with an empty body has to be sent to `/database-auto-upgrade`
|
||||
on all _Starters_ one by one.
|
||||
Run the following command:
|
||||
|
||||
Once the upgrade on the first _Starter_ has finished, the same request can be sent
|
||||
to the next one.
|
||||
```bash
|
||||
arangodb upgrade --starter.endpoint=<endpoint-of-a-starter>
|
||||
```
|
||||
|
||||
The default port of the first _Starter_ is 8528. In a local test (all _Starters_
|
||||
running on the same machine), the ports of the additional _Starters_ are increased
|
||||
by 5 (before 3.2.15 and 3.3.8) or 10 (since 3.2.15 and 3.3.8).
|
||||
The `--starter.endpoint` option can be set to the endpoint of any
|
||||
of the starters. E.g. `http://localhost:8528`.
|
||||
|
||||
Please note that the _Starter_ port is also shown in the _Starter_ output e.g.
|
||||
`Listening on 0.0.0.0:8528 (:8528)`.
|
||||
**Important:**
|
||||
|
||||
As the port of the _Starter_ is a configurable variable, please identify and use
|
||||
the one of your specific setup.
|
||||
The command above was introduced with 3.3.14 (and 3.2.17). If you are performing a rolling upgrade
|
||||
of a 3.3.x version to 3.3.14 or higher, or of a 3.2.x version to 3.2.17 or higher,
|
||||
please use the command above.
|
||||
|
||||
You might use _curl_ to send the `POST` request. For example:
|
||||
If you are performing a rolling upgrade of a 3.3.x version to a version between 3.3.8 and 3.3.13 (inclusive),
|
||||
or of a 3.2.x version to 3.2.15 or 3.2.16, a different command has to be used
|
||||
(on all _Starters_ one by one):
|
||||
|
||||
```
|
||||
curl -X POST --dump - http://localhost:8538/database-auto-upgrade
|
||||
|
||||
HTTP/1.1 200 OK
|
||||
Date: Wed, 09 May 2018 10:35:35 GMT
|
||||
Content-Length: 2
|
||||
Content-Type: text/plain; charset=utf-8
|
||||
```
|
||||
|
||||
Response `200 OK` means that the request was accepted and the upgrade process
|
||||
for this _Starter_ has begun.
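
Which of the two mechanisms applies depends only on the target version, so the rule can be sketched as a small shell function. The function name `upgrade_method` and its output labels are hypothetical. The version thresholds are the ones stated in this section; the sketch does not check the separate restriction on Cluster upgrades from 3.2 to 3.3, and it reports `unknown` for release series this section does not cover.

```bash
# Hypothetical helper: prints which upgrade mechanism applies to a target version.
#   $1 = target version, e.g. 3.3.14 or 3.3.14-1
upgrade_method() {
  local target="${1%%-*}" major minor patch new_from old_from
  IFS=. read -r major minor patch <<<"$target"
  # Reject versions without a plain numeric patch level.
  case "$patch" in
    ''|*[!0-9]*) echo unknown; return 0 ;;
  esac
  case "$major.$minor" in
    3.3) new_from=14; old_from=8 ;;   # `arangodb upgrade` since 3.3.14
    3.2) new_from=17; old_from=15 ;;  # `arangodb upgrade` since 3.2.17
    *)   echo unknown; return 0 ;;
  esac
  if [ "$patch" -ge "$new_from" ]; then
    echo upgrade-command   # arangodb upgrade --starter.endpoint=<endpoint>
  elif [ "$patch" -ge "$old_from" ]; then
    echo post-request      # POST /database-auto-upgrade on every Starter
  else
    echo unsupported       # older than the automated procedure
  fi
}

# Example:
#   upgrade_method 3.3.14-1
```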
|
||||
|
||||
### _Starter_ response
|
||||
|
||||
The _Starter_ will respond to the HTTP `POST` request depending on the deployment
|
||||
mode.
|
||||
|
||||
#### Deployment mode `single`
|
||||
|
||||
The _Starter_ will:
|
||||
For deployment mode `single`, the `arangodb upgrade` command will:
|
||||
|
||||
- Restart the single server with an additional `--database.auto-upgrade=true` argument.
|
||||
The server will perform the auto-upgrade and then stop.
|
||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||
|
||||
#### Deployment mode `activefailover`
|
||||
The `arangodb upgrade` command will complete right away.
|
||||
Inspect the log of the _Starter_ to know when the upgrade has finished.
|
||||
|
||||
The _Starter_ will:
|
||||
#### Deployment mode `activefailover` or `cluster`
|
||||
|
||||
- Turning off _supervision_ in the _Agency_ and wait for it to be confirmed.
|
||||
- Restarting one _Agent_ at a time with an additional `--database.auto-upgrade=true` argument.
|
||||
The _Agent_ will perform the auto-upgrade and then stop.
|
||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||
- Restarting one single server at a time with an additional `--database.auto-upgrade=true` argument.
|
||||
This server will perform the auto-upgrade and then stop.
|
||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||
- Turning on _supervision_ in the _Agency_ and wait for it to be confirmed.
|
||||
The _Starters_ will now perform an initial check that upgrading is possible
|
||||
and, when that check succeeds, create an upgrade _plan_. This _plan_ is then
|
||||
executed by every _Starter_.
|
||||
|
||||
#### Deployment mode `cluster`
|
||||
The `arangodb upgrade` command will show the progress of the upgrade
|
||||
and stop when the upgrade has either finished successfully or finished
|
||||
with an error.
|
||||
|
||||
The _Starter_ will:
|
||||
### Retrying a failed upgrade
|
||||
|
||||
- Turning off _supervision_ in the _Agency_ and wait for it to be confirmed.
|
||||
- Restarting one _Agent_ at a time with an additional `--database.auto-upgrade=true` argument.
|
||||
The _Agent_ will perform the auto-upgrade and then stop.
|
||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||
- Restarting one _DBServer_ at a time with an additional `--database.auto-upgrade=true` argument.
|
||||
This _DBServer_ will perform the auto-upgrade and then stop.
|
||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||
- Restarting one _Coordinator_ at a time with an additional `--database.auto-upgrade=true` argument.
|
||||
This _Coordinator_ will perform the auto-upgrade and then stop.
|
||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||
- Turning on _supervision_ in the _Agency_ and wait for it to be confirmed.
|
||||
Starting with 3.3.14 and 3.2.17, when an upgrade _plan_ (in deployment
|
||||
mode `activefailover` or `cluster`) has failed, it can be retried.
|
||||
|
||||
To retry, run:
|
||||
|
||||
Example: Rolling Upgrade in a Cluster
|
||||
-------------------------------------
|
||||
|
||||
In this example we will perform a rolling upgrade of an ArangoDB Cluster setup
|
||||
from version 3.3.7 to version 3.3.8.
|
||||
|
||||
Once a `POST` request to the first _Starter_ is sent, the following output is shown
|
||||
when the upgrade has finished:
|
||||
|
||||
```
|
||||
2018/05/09 12:33:02 Upgrading agent
|
||||
2018/05/09 12:33:05 restarting agent
|
||||
2018/05/09 12:33:05 Looking for a running instance of agent on port 8531
|
||||
2018/05/09 12:33:05 Starting agent on port 8531
|
||||
2018/05/09 12:33:05 Agency is not yet healthy: Agent http://localhost:8531 is not responding
|
||||
2018/05/09 12:33:06 restarting agent
|
||||
2018/05/09 12:33:06 Looking for a running instance of agent on port 8531
|
||||
2018/05/09 12:33:06 Starting agent on port 8531
|
||||
2018/05/09 12:33:07 agent up and running (version 3.3.8).
|
||||
2018/05/09 12:33:10 Upgrading dbserver
|
||||
2018/05/09 12:33:15 restarting dbserver
|
||||
2018/05/09 12:33:15 Looking for a running instance of dbserver on port 8530
|
||||
2018/05/09 12:33:15 Starting dbserver on port 8530
|
||||
2018/05/09 12:33:15 DBServers are not yet all responding: Get http://localhost:8530/_admin/server/id: dial tcp 127.0.0.1:8530: connect: connection refused
|
||||
2018/05/09 12:33:15 restarting dbserver
|
||||
2018/05/09 12:33:15 Looking for a running instance of dbserver on port 8530
|
||||
2018/05/09 12:33:15 Starting dbserver on port 8530
|
||||
2018/05/09 12:33:16 dbserver up and running (version 3.3.8).
|
||||
2018/05/09 12:33:20 Upgrading coordinator
|
||||
2018/05/09 12:33:23 restarting coordinator
|
||||
2018/05/09 12:33:23 Looking for a running instance of coordinator on port 8529
|
||||
2018/05/09 12:33:23 Starting coordinator on port 8529
|
||||
2018/05/09 12:33:23 Coordinator are not yet all responding: Get http://localhost:8529/_admin/server/id: dial tcp 127.0.0.1:8529: connect: connection refused
|
||||
2018/05/09 12:33:23 restarting coordinator
|
||||
2018/05/09 12:33:23 Looking for a running instance of coordinator on port 8529
|
||||
2018/05/09 12:33:23 Starting coordinator on port 8529
|
||||
2018/05/09 12:33:24 coordinator up and running (version 3.3.8).
|
||||
2018/05/09 12:33:24 Your cluster can now be accessed with a browser at `http://localhost:8529` or
|
||||
2018/05/09 12:33:24 using `arangosh --server.endpoint tcp://localhost:8529`.
|
||||
2018/05/09 12:33:28 Server versions:
|
||||
2018/05/09 12:33:28 agent 1 3.3.8
|
||||
2018/05/09 12:33:28 agent 2 3.3.7
|
||||
2018/05/09 12:33:28 agent 3 3.3.7
|
||||
2018/05/09 12:33:28 dbserver 1 3.3.8
|
||||
2018/05/09 12:33:28 dbserver 2 3.3.7
|
||||
2018/05/09 12:33:28 dbserver 3 3.3.7
|
||||
2018/05/09 12:33:28 coordinator 1 3.3.8
|
||||
2018/05/09 12:33:28 coordinator 2 3.3.7
|
||||
2018/05/09 12:33:28 coordinator 3 3.3.7
|
||||
2018/05/09 12:33:28 Upgrading of all servers controlled by this starter done, you can continue with the next starter now.
|
||||
```bash
|
||||
arangodb retry upgrade --starter.endpoint=<endpoint-of-a-starter>
|
||||
```
|
||||
|
||||
_Agent_ 1, _DBServer_ 1 and _Coordinator_ 1 are successively updated and the last
|
||||
messages indicate that the `POST` request can be sent to the next _Starter_. After this
|
||||
procedure has been repeated for every _Starter_ the last _Starter_ will show:
|
||||
The `--starter.endpoint` option can be set to the endpoint of any
|
||||
of the starters. E.g. `http://localhost:8528`.
|
||||
|
||||
```
|
||||
2018/05/09 12:35:59 Server versions:
|
||||
2018/05/09 12:35:59 agent 1 3.3.8
|
||||
2018/05/09 12:35:59 agent 2 3.3.8
|
||||
2018/05/09 12:35:59 agent 3 3.3.8
|
||||
2018/05/09 12:35:59 dbserver 1 3.3.8
|
||||
2018/05/09 12:35:59 dbserver 2 3.3.8
|
||||
2018/05/09 12:35:59 dbserver 3 3.3.8
|
||||
2018/05/09 12:35:59 coordinator 1 3.3.8
|
||||
2018/05/09 12:35:59 coordinator 2 3.3.8
|
||||
2018/05/09 12:35:59 coordinator 3 3.3.8
|
||||
2018/05/09 12:35:59 Upgrading done.
|
||||
### Aborting an upgrade
|
||||
|
||||
Starting with 3.3.14 and 3.2.17, when an upgrade _plan_ (in deployment
|
||||
mode `activefailover` or `cluster`) is in progress or has failed, it can
|
||||
be aborted.
|
||||
|
||||
To abort, run:
|
||||
|
||||
```bash
|
||||
arangodb abort upgrade --starter.endpoint=<endpoint-of-a-starter>
|
||||
```
|
||||
|
||||
All _Agents_, _DBServers_ and _Coordinators_ are upgraded and the rolling upgrade
|
||||
has successfully finished.
|
||||
The `--starter.endpoint` option can be set to the endpoint of any
|
||||
of the starters. E.g. `http://localhost:8528`.
|
||||
|
||||
Note that an abort does not stop all upgrade processes immediately.
|
||||
If an _arangod_ or _arangosync_ server is being upgraded when the abort
|
||||
was issued, this upgrade will be finished. Remaining servers will not be
|
||||
upgraded.
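
The upgrade, retry and abort commands can be combined in a wrapper script. The following is a sketch only: the function name `upgrade_with_retry` is hypothetical, and it assumes that `arangodb upgrade` and `arangodb retry upgrade` exit with a non-zero status when the upgrade finishes with an error, which this section does not state explicitly. The wrapper never aborts on its own; it only prints the abort command as a hint.

```bash
# Hypothetical wrapper around the commands described above.
#   $1 = endpoint of any Starter, e.g. http://localhost:8528
#   $2 = maximum number of retries (default: 1)
upgrade_with_retry() {
  local endpoint="$1" max_retries="${2:-1}" attempt=0
  # Assumption: a failed upgrade is reported through a non-zero exit status.
  if arangodb upgrade --starter.endpoint="$endpoint"; then
    return 0
  fi
  while [ "$attempt" -lt "$max_retries" ]; do
    attempt=$((attempt + 1))
    echo "upgrade failed, retry $attempt of $max_retries" >&2
    if arangodb retry upgrade --starter.endpoint="$endpoint"; then
      return 0
    fi
  done
  echo "upgrade still failing; to give up, run:" >&2
  echo "  arangodb abort upgrade --starter.endpoint=$endpoint" >&2
  return 1
}

# Example:
#   upgrade_with_retry http://localhost:8528 2
```

Leaving the abort as a manual step is deliberate here, since an abort leaves the remaining servers on the old version.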
|
||||
|
|