mirror of https://gitee.com/bigwinds/arangodb
Doc - 3.3.14 new starter upgrade procedure + partial sync of external repos (#6020)
parent
a4683fa118
commit
96db5356b8
|
@ -1,37 +1,7 @@
|
||||||
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||||
ArangoDB Starter Recovery Procedure
|
# ArangoDB Starter Administration
|
||||||
===================================
|
|
||||||
|
|
||||||
This procedure is intended to recover a cluster (that was started with the ArangoDB
|
This chapter documents administering the _ArangoDB Starter_.
|
||||||
_Starter_) when a machine of that cluster is broken without the possibility to recover
|
|
||||||
it (e.g. complete HD failure). In this procedure it does not matter whether a replacement
|
|
||||||
machine uses the old or a new IP address.
|
|
||||||
|
|
||||||
To recover from this scenario, you must:
|
- [Remove a machine from the cluster](./Removal.md)
|
||||||
- Create a new (replacement) machine with ArangoDB (including _Starter_) installed.
|
- [Recover from a failed machine](./Recovery.md)
|
||||||
- Create a file called `RECOVERY` in the directory you are going to use as data
|
|
||||||
directory of the _Starter_ (the one that is passed via the option `--starter.data-dir`).
|
|
||||||
This file must contain the IP address and port of the _Starter_ that has been
|
|
||||||
broken (and will be replaced with this new machine).
|
|
||||||
|
|
||||||
E.g.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
echo "192.168.1.25:8528" > $DATADIR/RECOVERY
|
|
||||||
```
|
|
||||||
|
|
||||||
After creating the `RECOVERY` file, start the _Starter_ using all the normal command
|
|
||||||
line arguments.
|
|
||||||
|
|
||||||
The _Starter_ will now:
|
|
||||||
1. Talk to the remaining _Starters_ to find the ID of the _Starter_ it replaces and
|
|
||||||
use that ID to join the remaining _Starters_.
|
|
||||||
1. Talk to the remaining _Agents_ to find the ID of the _Agent_ it replaces and
|
|
||||||
adjust the command-line arguments of the _Agent_ (it will start) to use that ID.
|
|
||||||
This is skipped if the _Starter_ was not running an _Agent_.
|
|
||||||
1. Remove the `RECOVERY` file from the data directory.
|
|
||||||
|
|
||||||
The cluster will now recover automatically. It will however have one more _Coordinator_
|
|
||||||
and one more _DBServer_ than expected. Exactly one _Coordinator_ and one _DBServer_ will
|
|
||||||
be listed "red" in the web UI of the database. They will have to be removed manually
|
|
||||||
using the ArangoDB Web UI.
|
|
||||||
|
|
|
@ -0,0 +1,38 @@
|
||||||
|
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||||
|
# ArangoDB Starter Recovery Procedure
|
||||||
|
|
||||||
|
This procedure is intended to recover a cluster (that was started with the ArangoDB
|
||||||
|
_Starter_) when a machine of that cluster is broken without the possibility to recover
|
||||||
|
it (e.g. complete HD failure). In this procedure it does not matter whether a replacement
|
||||||
|
machine uses the old or a new IP address.
|
||||||
|
|
||||||
|
To recover from this scenario, you must:
|
||||||
|
|
||||||
|
- Create a new (replacement) machine with ArangoDB (including _Starter_) installed.
|
||||||
|
- Create a file called `RECOVERY` in the directory you are going to use as data
|
||||||
|
directory of the _Starter_ (the one that is passed via the option `--starter.data-dir`).
|
||||||
|
This file must contain the IP address and port of the _Starter_ that has been
|
||||||
|
broken (and will be replaced with this new machine).
|
||||||
|
|
||||||
|
E.g.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
echo "192.168.1.25:8528" > $DATADIR/RECOVERY
|
||||||
|
```
|
||||||
|
|
||||||
|
After creating the `RECOVERY` file, start the _Starter_ using all the normal command
|
||||||
|
line arguments.
|
||||||
|
|
||||||
|
The _Starter_ will now:
|
||||||
|
|
||||||
|
1. Talk to the remaining _Starters_ to find the ID of the _Starter_ it replaces and
|
||||||
|
use that ID to join the remaining _Starters_.
|
||||||
|
1. Talk to the remaining _Agents_ to find the ID of the _Agent_ it replaces and
|
||||||
|
adjust the command-line arguments of the _Agent_ (it will start) to use that ID.
|
||||||
|
This is skipped if the _Starter_ was not running an _Agent_.
|
||||||
|
1. Remove the `RECOVERY` file from the data directory.
|
||||||
|
|
||||||
|
The cluster will now recover automatically. It will however have one more _Coordinator_
|
||||||
|
and one more _DBServer_ than expected. Exactly one _Coordinator_ and one _DBServer_ will
|
||||||
|
be listed "red" in the web UI of the database. They will have to be removed manually
|
||||||
|
using the ArangoDB Web UI.
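
The preparation step of the recovery procedure can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `write_recovery_file` is hypothetical, the check only verifies that the value looks like `<ip>:<port>`, and the example writes into a throwaway directory instead of a real _Starter_ data directory.

```bash
#!/usr/bin/env bash
set -euo pipefail

# write_recovery_file <data-dir> <ip:port>
# Writes the address of the broken Starter into <data-dir>/RECOVERY.
# The function name is hypothetical; only the file name and its content
# come from the procedure described above.
write_recovery_file() {
  local datadir="$1" failed_starter="$2"
  case "$failed_starter" in
    *:*) ;;  # looks like <ip>:<port>
    *) echo "expected <ip>:<port>, got: $failed_starter" >&2; return 1 ;;
  esac
  mkdir -p "$datadir"
  echo "$failed_starter" > "$datadir/RECOVERY"
}

# Example run against a throwaway directory (assumption: in a real recovery
# this is the directory you pass via --starter.data-dir).
DATADIR="$(mktemp -d)"
write_recovery_file "$DATADIR" "192.168.1.25:8528"
cat "$DATADIR/RECOVERY"
```

After the file exists, the _Starter_ is started with its normal command line arguments, as described above.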
|
|
@ -0,0 +1,60 @@
|
||||||
|
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||||
|
# ArangoDB Starter Removal Procedure
|
||||||
|
|
||||||
|
This procedure is intended to remove a machine from a cluster
|
||||||
|
(that was started with the ArangoDB _Starter_).
|
||||||
|
|
||||||
|
It is possible to run this procedure while the machine is still running
|
||||||
|
or when it has already been removed.
|
||||||
|
|
||||||
|
It is not possible to remove machines that have an agent on them!
|
||||||
|
Use the [recovery procedure](./Recovery.md) if you have a failed machine
|
||||||
|
with an agent on it.
|
||||||
|
|
||||||
|
Note that it is highly recommended to remove a machine while it is still running.
|
||||||
|
|
||||||
|
To remove a machine from a cluster, run the following command:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
arangodb remove starter --starter.endpoint=<endpoint> [--starter.id=<id>] [--force]
|
||||||
|
```
|
||||||
|
|
||||||
|
Where `<endpoint>` is the endpoint of the starter that you want to remove,
|
||||||
|
or the endpoint of one of the remaining starters. E.g. `http://localhost:8528`.
|
||||||
|
|
||||||
|
If you want to remove a machine that is no longer running, use the `--starter.id`
|
||||||
|
option. Set it to the ID of the ArangoDB _Starter_ on the machine that you want to remove.
|
||||||
|
|
||||||
|
You can find this ID in a `setup.json` file in the data directory of one of
|
||||||
|
the remaining ArangoDB _Starters_.
|
||||||
|
|
||||||
|
E.g.
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
...
|
||||||
|
"peers": {
|
||||||
|
"Peers": [
|
||||||
|
{
|
||||||
|
"ID": "21e42415",
|
||||||
|
"Address": "10.21.56.123",
|
||||||
|
"Port": 8528,
|
||||||
|
"PortOffset": 0,
|
||||||
|
"DataDir": "/mydata/server1",
|
||||||
|
"HasAgent": true,
|
||||||
|
"IsSecure": false
|
||||||
|
},
|
||||||
|
...
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
If the machine you want to remove has address `10.21.56.123` and was listening
|
||||||
|
on port `8528`, use ID `21e42415`.
|
||||||
|
|
||||||
|
The `remove starter` command will attempt to clean out all data from the servers
|
||||||
|
of the machine that you want to remove.
|
||||||
|
This can take a long time.
|
||||||
|
If the cleanout fails, the `remove starter` command will fail.
|
||||||
|
|
||||||
|
If you want to remove the machine even when the cleanout has failed, use
|
||||||
|
the `--force` option.
|
||||||
|
Note that this may lead to data loss!
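
The different forms of the removal command can be sketched as follows. This is a minimal illustration, not part of the _Starter_ itself: the helper name `build_remove_cmd` is hypothetical, and it only prints the command line so that it can be reviewed before being run. The options are the ones documented above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# build_remove_cmd <endpoint> [<starter-id>] [force]
# Prints the `arangodb remove starter` command line for the given inputs.
# The helper name is hypothetical; the options are those listed above.
build_remove_cmd() {
  local endpoint="$1" id="${2:-}" force="${3:-}"
  local cmd="arangodb remove starter --starter.endpoint=$endpoint"
  if [ -n "$id" ]; then
    cmd="$cmd --starter.id=$id"     # needed when the machine is no longer running
  fi
  if [ "$force" = "force" ]; then
    cmd="$cmd --force"              # may lead to data loss
  fi
  echo "$cmd"
}

# Machine still running: its own endpoint is enough.
build_remove_cmd "http://localhost:8528"
# Machine already gone: use the endpoint of a remaining Starter plus the ID from setup.json.
build_remove_cmd "http://localhost:8528" "21e42415"
```

Printing the command first makes it harder to pass `--force` by accident.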
|
|
@ -127,17 +127,14 @@ cluster is down, or in a bad state, irrespective of the value of this setting.
|
||||||
|
|
||||||
### `spec.rocksdb.encryption.keySecretName`
|
### `spec.rocksdb.encryption.keySecretName`
|
||||||
|
|
||||||
{% hint 'info' %}
|
|
||||||
This feature is only available in the
|
|
||||||
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
|
||||||
{% endhint %}
|
|
||||||
|
|
||||||
This setting specifies the name of a Kubernetes `Secret` that contains
|
This setting specifies the name of a Kubernetes `Secret` that contains
|
||||||
an encryption key used for encrypting all data stored by ArangoDB servers.
|
an encryption key used for encrypting all data stored by ArangoDB servers.
|
||||||
When an encryption key is used, encryption of the data in the cluster is enabled;
|
When an encryption key is used, encryption of the data in the cluster is enabled;
|
||||||
without it, encryption is disabled.
|
without it, encryption is disabled.
|
||||||
The default value is empty.
|
The default value is empty.
|
||||||
|
|
||||||
|
This requires the Enterprise version.
|
||||||
|
|
||||||
The encryption key cannot be changed after the cluster has been created.
|
The encryption key cannot be changed after the cluster has been created.
|
||||||
|
|
||||||
The secret specified by this setting must have a data field named 'key' containing
|
The secret specified by this setting must have a data field named 'key' containing
|
||||||
|
|
|
@ -94,36 +94,42 @@ This results in a file called `ca.crt` containing a PEM encoded, x509 CA certifi
|
||||||
|
|
||||||
## Query requests
|
## Query requests
|
||||||
|
|
||||||
For most client requests made by a driver, it does not matter if there is any kind
|
For most client requests made by a driver, it does not matter if there is any
|
||||||
of load-balancer between your client application and the ArangoDB deployment.
|
kind of load-balancer between your client application and the ArangoDB
|
||||||
|
deployment.
|
||||||
|
|
||||||
{% hint 'info' %}
|
{% hint 'info' %}
|
||||||
Note that even a simple `Service` of type `ClusterIP` already behaves as a load-balancer.
|
Note that even a simple `Service` of type `ClusterIP` already behaves as a
|
||||||
|
load-balancer.
|
||||||
{% endhint %}
|
{% endhint %}
|
||||||
|
|
||||||
The exception to this is cursor related requests made to an ArangoDB `Cluster` deployment.
|
The exception to this is cursor-related requests made to an ArangoDB `Cluster`
|
||||||
The coordinator that handles an initial query request (that results in a `Cursor`)
|
deployment. The coordinator that handles an initial query request (that results
|
||||||
will save some in-memory state in that coordinator, if the result of the query
|
in a `Cursor`) will save some in-memory state in that coordinator, if the result
|
||||||
is too big to be transferred back in the response of the initial request.
|
of the query is too big to be transferred back in the response of the initial
|
||||||
|
request.
|
||||||
|
|
||||||
Follow-up requests have to be made to fetch the remaining data.
|
Follow-up requests have to be made to fetch the remaining data. These follow-up
|
||||||
These follow-up requests must be handled by the same coordinator to which the initial
|
requests must be handled by the same coordinator to which the initial request
|
||||||
request was made.
|
was made. As soon as there is a load-balancer between your client application
|
||||||
|
and the ArangoDB cluster, it is uncertain which coordinator will receive the
|
||||||
|
follow-up request.
|
||||||
|
|
||||||
As soon as there is a load-balancer between your client application and the ArangoDB cluster,
|
ArangoDB will transparently forward any mismatched requests to the correct
|
||||||
it is uncertain which coordinator will actually handle the follow-up request.
|
coordinator, so the requests can be answered correctly without any additional
|
||||||
|
configuration. However, this incurs a small latency penalty due to the extra
|
||||||
|
request across the internal network.
|
||||||
|
|
||||||
To resolve this uncertainty, make sure to run your client application in the same
|
To prevent this uncertainty client-side, make sure to run your client
|
||||||
Kubernetes cluster and synchronize your endpoints before making the
|
application in the same Kubernetes cluster and synchronize your endpoints before
|
||||||
initial query request.
|
making the initial query request. This will result in the use (by the driver) of
|
||||||
This will result in the use (by the driver) of internal DNS names of all coordinators.
|
internal DNS names of all coordinators. A follow-up request can then be sent to
|
||||||
A follow-up request can then be sent to exactly the same coordinator.
|
exactly the same coordinator.
|
||||||
|
|
||||||
If your client application is running outside the Kubernetes cluster this is much harder
|
If your client application is running outside the Kubernetes cluster the easiest
|
||||||
to solve.
|
way to work around it is by making sure that the query results are small enough
|
||||||
The easiest way to work around it, is by making sure that the query results are small
|
to be returned by a single request. When that is not feasible, it is also
|
||||||
enough.
|
possible to resolve this when the internal DNS names of your Kubernetes cluster
|
||||||
When that is not feasible, it is also possible to resolve this
|
are exposed to your client application and the resulting IP addresses are
|
||||||
when the internal DNS names of your Kubernetes cluster are exposed to your client application
|
routable from your client application. To expose internal DNS names of your
|
||||||
and the resulting IP addresses are routable from your client application.
|
Kubernetes cluster, you can use [CoreDNS](https://coredns.io).
|
||||||
To expose internal DNS names of your Kubernetes cluster, you can use [CoreDNS](https://coredns.io).
|
|
||||||
|
|
|
@ -19,4 +19,4 @@ Each of these uses involves a different custom resource.
|
||||||
|
|
||||||
Continue with [Using the ArangoDB Kubernetes Operator](./Usage.md)
|
Continue with [Using the ArangoDB Kubernetes Operator](./Usage.md)
|
||||||
to learn how to install the ArangoDB Kubernetes operator and create
|
to learn how to install the ArangoDB Kubernetes operator and create
|
||||||
your first deployment.
|
your first deployment.
|
||||||
|
|
|
@ -39,4 +39,4 @@ kubectl apply -f examples/yourUpdatedDeployment.yaml
|
||||||
|
|
||||||
## See also
|
## See also
|
||||||
|
|
||||||
- [Scaling](./Scaling.md)
|
- [Scaling](./Scaling.md)
|
||||||
|
|
|
@ -262,6 +262,9 @@ This option only has to be specified if the standard search fails.
|
||||||
Sets the storage engine used by the `arangod` servers.
|
Sets the storage engine used by the `arangod` servers.
|
||||||
The value `rocksdb` is only allowed on `arangod` version 3.2 and up.
|
The value `rocksdb` is only allowed on `arangod` version 3.2 and up.
|
||||||
|
|
||||||
|
On `arangod` version 3.3 and earlier, the default value is `mmfiles`.
|
||||||
|
On `arangod` version 3.4 and later, the default value is `rocksdb`.
|
||||||
|
|
||||||
- `--cluster.start-coordinator=bool`
|
- `--cluster.start-coordinator=bool`
|
||||||
|
|
||||||
This indicates whether or not a coordinator instance should be started
|
This indicates whether or not a coordinator instance should be started
|
||||||
|
|
|
@ -314,7 +314,9 @@
|
||||||
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
|
# https://@github.com/arangodb/arangosync.git;arangosync;docs/Manual;;/
|
||||||
* [Datacenter to datacenter replication](Administration/DC2DC/README.md)
|
* [Datacenter to datacenter replication](Administration/DC2DC/README.md)
|
||||||
# https://@github.com/arangodb-helper/arangodb.git;arangodb;docs/Manual;;/
|
# https://@github.com/arangodb-helper/arangodb.git;arangodb;docs/Manual;;/
|
||||||
* [ArangoDB Starter Recovery Procedure](Administration/Starter/README.md)
|
* [ArangoDB Starter Administration](Administration/Starter/README.md)
|
||||||
|
* [ArangoDB Starter Removal Procedure](Administration/Starter/Removal.md)
|
||||||
|
* [ArangoDB Starter Recovery Procedure](Administration/Starter/Recovery.md)
|
||||||
* [Security](Security/README.md)
|
* [Security](Security/README.md)
|
||||||
* [Change Root Password](Security/ChangeRootPassword.md)
|
* [Change Root Password](Security/ChangeRootPassword.md)
|
||||||
* [Encryption](Administration/Encryption/README.md)
|
* [Encryption](Administration/Encryption/README.md)
|
||||||
|
|
|
@ -149,11 +149,15 @@ Note: When you restart the starter, it remembers the original `--starter.local`
|
||||||
|
|
||||||
## Starting a cluster with datacenter to datacenter synchronization
|
## Starting a cluster with datacenter to datacenter synchronization
|
||||||
|
|
||||||
|
{% hint 'info' %}
|
||||||
|
This feature is only available in the
|
||||||
|
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||||
|
{% endhint %}
|
||||||
|
|
||||||
Datacenter to datacenter replication (DC2DC) requires a normal ArangoDB cluster in both data centers
|
Datacenter to datacenter replication (DC2DC) requires a normal ArangoDB cluster in both data centers
|
||||||
and one or more (`arangosync`) syncmasters & syncworkers in both data centers.
|
and one or more (`arangosync`) syncmasters & syncworkers in both data centers.
|
||||||
The starter enables you to run these syncmasters & syncworkers in combination with your normal
|
The starter enables you to run these syncmasters & syncworkers in combination with your normal
|
||||||
cluster.
|
cluster.
|
||||||
Note: Datacenter to datacenter replication is an ArangoDB Enterprise Edition feature.
|
|
||||||
|
|
||||||
To run a starter with DC2DC support you add the following arguments to the starter's command line:
|
To run a starter with DC2DC support you add the following arguments to the starter's command line:
|
||||||
|
|
||||||
|
|
|
@ -61,5 +61,5 @@ In addition to the paragraph above, rolling upgrades via the tool _Starter_ are
|
||||||
as documented in the _Section_ [Upgrading Starter Deployments](../Starter/README.md),
|
as documented in the _Section_ [Upgrading Starter Deployments](../Starter/README.md),
|
||||||
with the following limitations:
|
with the following limitations:
|
||||||
|
|
||||||
- Rolling upgrades between 3.2 and 3.3 are not supported before 3.2.15 and 3.3.8.
|
- Rolling upgrades between 3.2 and 3.3 are not supported before 3.2.15 and 3.3.9.
|
||||||
|
|
||||||
|
|
|
@ -1,6 +1,5 @@
|
||||||
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
<!-- don't edit here, its from https://@github.com/arangodb-helper/arangodb.git / docs/Manual/ -->
|
||||||
Upgrading _Starter_ Deployments
|
# Upgrading _Starter_ Deployments
|
||||||
===============================
|
|
||||||
|
|
||||||
Starting from versions 3.2.15 and 3.3.8, the ArangoDB [_Starter_](../../Programs/Starter/README.md)
|
Starting from versions 3.2.15 and 3.3.8, the ArangoDB [_Starter_](../../Programs/Starter/README.md)
|
||||||
supports a new, automated, procedure to perform upgrades, including rolling upgrades
|
supports a new, automated, procedure to perform upgrades, including rolling upgrades
|
||||||
|
@ -8,12 +7,13 @@ of a [Cluster](../../Scalability/Cluster/README.md) setup.
|
||||||
|
|
||||||
The upgrade procedure of the _Starter_ described in this _Section_ can be used to
|
The upgrade procedure of the _Starter_ described in this _Section_ can be used to
|
||||||
upgrade to a new hotfix, or to perform an upgrade to a new minor version of ArangoDB.
|
upgrade to a new hotfix, or to perform an upgrade to a new minor version of ArangoDB.
|
||||||
|
Please refer to the [Upgrade Paths](../GeneralInfo/README.md#upgrade-paths) for detailed
|
||||||
|
information.
|
||||||
|
|
||||||
**Important:** rolling upgrades of Cluster setups from 3.2 to 3.3 are only supported
|
**Important:** rolling upgrades of Cluster setups from 3.2 to 3.3 are only supported
|
||||||
from versions 3.2.15 and 3.3.9.
|
from versions 3.2.15 and 3.3.9.
|
||||||
|
|
||||||
Upgrade Procedure
|
## Upgrade Procedure
|
||||||
-----------------
|
|
||||||
|
|
||||||
The following procedure has to be executed on every ArangoDB _Starter_ instance.
|
The following procedure has to be executed on every ArangoDB _Starter_ instance.
|
||||||
It is assumed that a _Starter_ deployment with mode `single`, `activefailover` or
|
It is assumed that a _Starter_ deployment with mode `single`, `activefailover` or
|
||||||
|
@ -24,21 +24,21 @@ It is assumed that a _Starter_ deployment with mode `single`, `activefailover` o
|
||||||
Installing the new ArangoDB version binary also includes the latest ArangoDB _Starter_
|
Installing the new ArangoDB version binary also includes the latest ArangoDB _Starter_
|
||||||
binary, which is necessary to perform the rolling upgrade.
|
binary, which is necessary to perform the rolling upgrade.
|
||||||
|
|
||||||
The first step is to install the new ArangoDB package.
|
The first step is to install the new ArangoDB package.
|
||||||
|
|
||||||
**Note:** you do not have to stop the _Starter_ processes before upgrading it.
|
**Note:** you do not have to stop the _Starter_ processes before upgrading it.
|
||||||
|
|
||||||
For example, if you want to upgrade to `3.3.8-1` on Debian or Ubuntu, either call
|
For example, if you want to upgrade to `3.3.14-1` on Debian or Ubuntu, either call
|
||||||
|
|
||||||
```
|
```bash
|
||||||
$ apt install arangodb=3.3.8
|
apt install arangodb=3.3.14
|
||||||
```
|
```
|
||||||
|
|
||||||
(`apt-get` on older versions) if you have added the ArangoDB repository. Or
|
(`apt-get` on older versions) if you have added the ArangoDB repository. Or
|
||||||
install a specific package using
|
install a specific package using
|
||||||
|
|
||||||
```
|
```bash
|
||||||
$ dpkg -i arangodb3-3.3.8-1_amd64.deb
|
dpkg -i arangodb3-3.3.14-1_amd64.deb
|
||||||
```
|
```
|
||||||
|
|
||||||
after you have downloaded the corresponding file from https://download.arangodb.com/.
|
after you have downloaded the corresponding file from https://download.arangodb.com/.
|
||||||
|
@ -50,8 +50,8 @@ stop it now, as otherwise this standalone instance that is started on your machi
|
||||||
can create some confusion later. As you are using the _Starter_ you do not need
|
can create some confusion later. As you are using the _Starter_ you do not need
|
||||||
this standalone instance, and you can hence stop it:
|
this standalone instance, and you can hence stop it:
|
||||||
|
|
||||||
```
|
```bash
|
||||||
$ service arangodb3 stop
|
service arangodb3 stop
|
||||||
```
|
```
|
||||||
|
|
||||||
Also, you might want to remove the standalone instance from the default
|
Also, you might want to remove the standalone instance from the default
|
||||||
|
@ -59,8 +59,8 @@ _runlevels_ to prevent it to start on the next reboot of your machine. How this
|
||||||
is done depends on your distribution and _init_ system. For example, on older Debian
|
is done depends on your distribution and _init_ system. For example, on older Debian
|
||||||
and Ubuntu systems using a SystemV-compatible _init_, you can use:
|
and Ubuntu systems using a SystemV-compatible _init_, you can use:
|
||||||
|
|
||||||
```
|
```bash
|
||||||
$ update-rc.d -f arangodb3 remove
|
update-rc.d -f arangodb3 remove
|
||||||
```
|
```
|
||||||
|
|
||||||
### Stop the _Starter_ without stopping the ArangoDB Server processes
|
### Stop the _Starter_ without stopping the ArangoDB Server processes
|
||||||
|
@ -69,37 +69,36 @@ Now all the _Starter_ (_arangodb_) processes have to be stopped.
|
||||||
|
|
||||||
Please note that **no** _arangod_ processes should be stopped.
|
Please note that **no** _arangod_ processes should be stopped.
|
||||||
|
|
||||||
In order to stop the _arangodb_ processes, leaving the _arangod_ processes they
|
In order to stop the _arangodb_ processes, leaving the _arangod_ processes they
|
||||||
have started up and running (as we want for a rolling upgrade), we will need to
|
have started up and running (as we want for a rolling upgrade), we will need to
|
||||||
use a command like `kill -9`:
|
use a command like `kill -9`:
|
||||||
|
|
||||||
```
|
```bash
|
||||||
kill -9 <pid-of-starter>
|
kill -9 <pid-of-starter>
|
||||||
```
|
```
|
||||||
|
|
||||||
The _pid_ associated with your _Starter_ can be checked using a command like _ps_:
|
The _pid_ associated with your _Starter_ can be checked using a command like _ps_:
|
||||||
|
|
||||||
|
```bash
|
||||||
```
|
|
||||||
ps -C arangodb -fww
|
ps -C arangodb -fww
|
||||||
```
|
```
|
||||||
|
|
||||||
The output of the command above does not only show the PIDs of all _arangodb_
|
The output of the command above does not only show the PIDs of all _arangodb_
|
||||||
processes but also the used commands, which can be useful for the following
|
processes but also the used commands, which can be useful for the following
|
||||||
restart of all _arangodb_ processes.
|
restart of all _arangodb_ processes.
|
||||||
|
|
||||||
The output belove is from a test machine where three instances of a _Starter_ are
|
The output below is from a test machine where three instances of a _Starter_ are
|
||||||
running locally. In a more production-like scenario, you will find only one instance
|
running locally. In a more production-like scenario, you will find only one instance
|
||||||
of _arangodb_ running:
|
of _arangodb_ running:
|
||||||
|
|
||||||
```
|
```bash
|
||||||
ps -C arangodb -fww
|
ps -C arangodb -fww
|
||||||
UID PID PPID C STIME TTY TIME CMD
|
UID PID PPID C STIME TTY TIME CMD
|
||||||
max 29419 3684 0 11:46 pts/1 00:00:00 arangodb --starter.data-dir=./db1
|
max 29419 3684 0 11:46 pts/1 00:00:00 arangodb --starter.data-dir=./db1
|
||||||
max 29504 3695 0 11:46 pts/2 00:00:00 arangodb --starter.data-dir=./db2 --starter.join 127.0.0.1
|
max 29504 3695 0 11:46 pts/2 00:00:00 arangodb --starter.data-dir=./db2 --starter.join 127.0.0.1
|
||||||
max 29513 3898 0 11:46 pts/4 00:00:00 arangodb --starter.data-dir=./db3 --starter.join 127.0.0.1
|
max 29513 3898 0 11:46 pts/4 00:00:00 arangodb --starter.data-dir=./db3 --starter.join 127.0.0.1
|
||||||
```
|
```
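
As a sketch of how the `ps` output above can be used, the following helper extracts the PID and the original command line of each _Starter_, which is what you need for the `kill -9` step and the later restart. The function name `starter_pids` is hypothetical, and the column layout is assumed to be the one shown above (`UID PID PPID C STIME TTY TIME CMD`).

```bash
#!/usr/bin/env bash
set -euo pipefail

# starter_pids: reads `ps -C arangodb -fww` output on stdin and prints
# "<pid> <command line>" for each Starter process.
# Assumes seven columns before CMD, as in the listing above.
starter_pids() {
  awk 'NR > 1 { pid = $2; $1=$2=$3=$4=$5=$6=$7=""; sub(/^ +/, ""); print pid, $0 }'
}

# Fed with sample lines from the listing above; on a live machine you would
# pipe the real command instead: ps -C arangodb -fww | starter_pids
starter_pids <<'EOF'
UID        PID  PPID  C STIME TTY          TIME CMD
max      29419  3684  0 11:46 pts/1    00:00:00 arangodb --starter.data-dir=./db1
max      29504  3695  0 11:46 pts/2    00:00:00 arangodb --starter.data-dir=./db2 --starter.join 127.0.0.1
EOF
```

Saving this output before stopping the _Starters_ keeps the exact arguments at hand for the restart.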
|
||||||
|
|
||||||
### Restart the _Starter_
|
### Restart the _Starter_
|
||||||
|
|
||||||
When using a supervisor like _SystemD_, this will happen automatically. In case
|
When using a supervisor like _SystemD_, this will happen automatically. In case
|
||||||
|
@ -113,150 +112,82 @@ situation:
|
||||||
- The ArangoDB Server processes are up and running, and they are still on the
|
- The ArangoDB Server processes are up and running, and they are still on the
|
||||||
old version
|
old version
|
||||||
|
|
||||||
### Send an HTTP `POST` request to all _Starters_
|
### Start the upgrade process of all _arangod_ & _arangosync_ servers
|
||||||
|
|
||||||
A `POST` request with an empty body has to be sent to `/database-auto-upgrade`
|
Run the following command:
|
||||||
on all _Starters_ one by one.
|
|
||||||
|
|
||||||
Once the upgrade on the first _Starter_ has finished, the same request can be sent
|
```bash
|
||||||
to the next one.
|
arangodb upgrade --starter.endpoint=<endpoint-of-a-starter>
|
||||||
|
```
|
||||||
|
|
||||||
The default port of the first _Starter_ is 8528. In a local test (all _Starters_
|
The `--starter.endpoint` option can be set to the endpoint of any
|
||||||
running on the same machine), the ports of the additional _Starters_ are increased
|
of the starters. E.g. `http://localhost:8528`.
|
||||||
by 5 (before 3.2.15 and 3.3.8) or 10 (since 3.2.15 and 3.3.8).
|
|
||||||
|
|
||||||
Please note that the _Starter_ port is also shown in the _Starter_ output e.g.
|
**Important:**
|
||||||
`Listening on 0.0.0.0:8528 (:8528)`.
|
|
||||||
|
|
||||||
As the port of the _Starter_ is a configurable variable, please identify and use
|
The command above was introduced with 3.3.14 (and 3.2.17). If you are performing a rolling upgrade of a 3.3.x version
|
||||||
the one of your specific setup.
|
to version 3.3.14 or higher, or a rolling upgrade of a 3.2.x version to version 3.2.17
|
||||||
|
or higher, please use the command above.
|
||||||
|
|
||||||
You might use _curl_ to send the `POST` request. For example:
|
If you are performing a rolling upgrade of a 3.3.x version to a version between 3.3.8 and 3.3.13 (inclusive),
|
||||||
|
or a rolling upgrade of a 3.2.x version to 3.2.15 or 3.2.16, a different command has to be used
|
||||||
|
(on all _Starters_ one by one):
|
||||||
|
|
||||||
```
|
```
|
||||||
curl -X POST --dump - http://localhost:8538/database-auto-upgrade
|
curl -X POST --dump - http://localhost:8538/database-auto-upgrade
|
||||||
|
|
||||||
HTTP/1.1 200 OK
|
|
||||||
Date: Wed, 09 May 2018 10:35:35 GMT
|
|
||||||
Content-Length: 2
|
|
||||||
Content-Type: text/plain; charset=utf-8
|
|
||||||
```
|
```
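
The version rules above can be summarized in a small shell function. This is only a sketch: the function name `upgrade_method` and its output labels are hypothetical, and any target version that the text above does not mention is reported as `unknown` rather than guessed.

```bash
#!/usr/bin/env bash
set -euo pipefail

# upgrade_method <target-version>
# Prints "command" when `arangodb upgrade` applies, "http" when the
# per-Starter POST request applies, and "unknown" for versions that the
# text above does not cover. Name and labels are hypothetical.
upgrade_method() {
  local major minor patch
  IFS=. read -r major minor patch <<< "$1"
  patch="${patch%%-*}"                      # drop a package suffix such as "-1"
  case "$patch" in
    ''|*[!0-9]*) echo unknown; return ;;    # not a plain number
  esac
  if [ "$major" != "3" ]; then echo unknown; return; fi
  case "$minor" in
    3) if [ "$patch" -ge 14 ]; then echo command
       elif [ "$patch" -ge 8 ]; then echo http
       else echo unknown; fi ;;
    2) if [ "$patch" -ge 17 ]; then echo command
       elif [ "$patch" -ge 15 ]; then echo http
       else echo unknown; fi ;;
    *) echo unknown ;;
  esac
}

upgrade_method 3.3.14   # prints "command"
upgrade_method 3.3.9    # prints "http"
```

With the `http` result, the `POST` request has to be sent to every _Starter_ one by one. With the `command` result, a single `arangodb upgrade` call against any _Starter_ endpoint is enough.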
|
||||||
|
|
||||||
Response `200 OK` means that the request was accepted and the upgrade process
|
|
||||||
for this _Starter_ has begun.
|
|
||||||
|
|
||||||
### _Starter_ response
|
|
||||||
|
|
||||||
The _Starter_ will respond to the HTTP `POST` request depending on the deployment
|
|
||||||
mode.
|
|
||||||
|
|
||||||
#### Deployment mode `single`
|
#### Deployment mode `single`
|
||||||
|
|
||||||
The _Starter_ will:
|
For deployment mode `single`, the `arangodb upgrade` command will:
|
||||||
|
|
||||||
- Restart the single server with an additional `--database.auto-upgrade=true` argument.
|
- Restart the single server with an additional `--database.auto-upgrade=true` argument.
|
||||||
The server will perform the auto-upgrade and then stop.
|
The server will perform the auto-upgrade and then stop.
|
||||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
After that the _Starter_ will automatically restart it with its normal arguments.
|
||||||
|
|
||||||
#### Deployment mode `activefailover`
|
The `arangodb upgrade` command will complete right away.
|
||||||
|
Inspect the log of the _Starter_ to know when the upgrade has finished.
|
||||||
|
|
||||||
The _Starter_ will:
|
#### Deployment mode `activefailover` or `cluster`
|
||||||
|
|
||||||
- Turning off _supervision_ in the _Agency_ and wait for it to be confirmed.
|
The _Starters_ will now perform an initial check that upgrading is possible
|
||||||
- Restarting one _Agent_ at a time with an additional `--database.auto-upgrade=true` argument.
|
and when that all succeeds, create an upgrade _plan_. This _plan_ is then
|
||||||
The _Agent_ will perform the auto-upgrade and then stop.
|
executed by every _Starter_.
|
||||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
|
||||||
- Restarting one single server at a time with an additional `--database.auto-upgrade=true` argument.
|
|
||||||
This server will perform the auto-upgrade and then stop.
|
|
||||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
|
||||||
- Turning on _supervision_ in the _Agency_ and wait for it to be confirmed.
|
|
||||||
|
|
||||||
#### Deployment mode `cluster`
|
The `arangodb upgrade` command will show the progress of the upgrade
|
||||||
|
and stop when the upgrade has either finished successfully or finished
|
||||||
|
with an error.
|
||||||
|
|
||||||
The _Starter_ will:
|
### Retrying a failed upgrade
|
||||||
|
|
||||||
- Turning off _supervision_ in the _Agency_ and wait for it to be confirmed.
|
Starting with 3.3.14 and 3.2.17, when an upgrade _plan_ (in deployment
|
||||||
- Restarting one _Agent_ at a time with an additional `--database.auto-upgrade=true` argument.
|
mode `activefailover` or `cluster`) has failed, it can be retried.
|
||||||
The _Agent_ will perform the auto-upgrade and then stop.
|
|
||||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
|
||||||
- Restarting one _DBServer_ at a time with an additional `--database.auto-upgrade=true` argument.
|
|
||||||
This _DBServer_ will perform the auto-upgrade and then stop.
|
|
||||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
|
||||||
- Restarting one _Coordinator_ at a time with an additional `--database.auto-upgrade=true` argument.
|
|
||||||
This _Coordinator_ will perform the auto-upgrade and then stop.
|
|
||||||
After that the _Starter_ will automatically restart it with its normal arguments.
|
|
||||||
- Turning on _supervision_ in the _Agency_ and wait for it to be confirmed.
|
|
||||||
|
|
||||||
|
To retry, run:
|
||||||
|
|
||||||
Example: Rolling Upgrade in a Cluster
|
```bash
|
||||||
-------------------------------------
|
arangodb retry upgrade --starter.endpoint=<endpoint-of-a-starter>
|
||||||
|
|
||||||
In this example we will perform a rolling upgrade of an ArangoDB Cluster setup
|
|
||||||
from version 3.3.7 to version 3.3.8.
|
|
||||||
|
|
||||||
Once a `POST` request to the first _Starter_ is sent, the following output is shown
|
|
||||||
when the upgrade has finished:
|
|
||||||
|
|
||||||
```
|
|
||||||
2018/05/09 12:33:02 Upgrading agent
|
|
||||||
2018/05/09 12:33:05 restarting agent
|
|
||||||
2018/05/09 12:33:05 Looking for a running instance of agent on port 8531
|
|
||||||
2018/05/09 12:33:05 Starting agent on port 8531
|
|
||||||
2018/05/09 12:33:05 Agency is not yet healthy: Agent http://localhost:8531 is not responding
|
|
||||||
2018/05/09 12:33:06 restarting agent
|
|
||||||
2018/05/09 12:33:06 Looking for a running instance of agent on port 8531
|
|
||||||
2018/05/09 12:33:06 Starting agent on port 8531
|
|
||||||
2018/05/09 12:33:07 agent up and running (version 3.3.8).
|
|
||||||
2018/05/09 12:33:10 Upgrading dbserver
|
|
||||||
2018/05/09 12:33:15 restarting dbserver
|
|
||||||
2018/05/09 12:33:15 Looking for a running instance of dbserver on port 8530
|
|
||||||
2018/05/09 12:33:15 Starting dbserver on port 8530
|
|
||||||
2018/05/09 12:33:15 DBServers are not yet all responding: Get http://localhost:8530/_admin/server/id: dial tcp 127.0.0.1:8530: connect: connection refused
|
|
||||||
2018/05/09 12:33:15 restarting dbserver
|
|
||||||
2018/05/09 12:33:15 Looking for a running instance of dbserver on port 8530
|
|
||||||
2018/05/09 12:33:15 Starting dbserver on port 8530
|
|
||||||
2018/05/09 12:33:16 dbserver up and running (version 3.3.8).
|
|
||||||
2018/05/09 12:33:20 Upgrading coordinator
|
|
||||||
2018/05/09 12:33:23 restarting coordinator
|
|
||||||
2018/05/09 12:33:23 Looking for a running instance of coordinator on port 8529
|
|
||||||
2018/05/09 12:33:23 Starting coordinator on port 8529
|
|
||||||
2018/05/09 12:33:23 Coordinator are not yet all responding: Get http://localhost:8529/_admin/server/id: dial tcp 127.0.0.1:8529: connect: connection refused
|
|
||||||
2018/05/09 12:33:23 restarting coordinator
|
|
||||||
2018/05/09 12:33:23 Looking for a running instance of coordinator on port 8529
|
|
||||||
2018/05/09 12:33:23 Starting coordinator on port 8529
|
|
||||||
2018/05/09 12:33:24 coordinator up and running (version 3.3.8).
|
|
||||||
2018/05/09 12:33:24 Your cluster can now be accessed with a browser at `http://localhost:8529` or
|
|
||||||
2018/05/09 12:33:24 using `arangosh --server.endpoint tcp://localhost:8529`.
|
|
||||||
2018/05/09 12:33:28 Server versions:
|
|
||||||
2018/05/09 12:33:28 agent 1 3.3.8
|
|
||||||
2018/05/09 12:33:28 agent 2 3.3.7
|
|
||||||
2018/05/09 12:33:28 agent 3 3.3.7
|
|
||||||
2018/05/09 12:33:28 dbserver 1 3.3.8
|
|
||||||
2018/05/09 12:33:28 dbserver 2 3.3.7
|
|
||||||
2018/05/09 12:33:28 dbserver 3 3.3.7
|
|
||||||
2018/05/09 12:33:28 coordinator 1 3.3.8
|
|
||||||
2018/05/09 12:33:28 coordinator 2 3.3.7
|
|
||||||
2018/05/09 12:33:28 coordinator 3 3.3.7
|
|
||||||
2018/05/09 12:33:28 Upgrading of all servers controlled by this starter done, you can continue with the next starter now.
|
|
||||||
```
|
```
|
||||||
|
|
||||||
_Agent_ 1, _DBServer_ 1 and _Coordinator_ 1 are successively updated and the last
|
The `--starter.endpoint` option can be set to the endpoint of any
|
||||||
messages indicate that the `POST` request can be sent to the next _Starter_. After this
|
of the starters. E.g. `http://localhost:8528`.
|
||||||
procedure has been repeated for every _Starter_ the last _Starter_ will show:
|
|
||||||
|
|
||||||
```
|
### Aborting an upgrade
|
||||||
2018/05/09 12:35:59 Server versions:
|
|
||||||
2018/05/09 12:35:59 agent 1 3.3.8
|
Starting with 3.3.14 and 3.2.17, when an upgrade _plan_ (in deployment
|
||||||
2018/05/09 12:35:59 agent 2 3.3.8
|
mode `activefailover` or `cluster`) is in progress or has failed, it can
|
||||||
2018/05/09 12:35:59 agent 3 3.3.8
|
be aborted.
|
||||||
2018/05/09 12:35:59 dbserver 1 3.3.8
|
|
||||||
2018/05/09 12:35:59 dbserver 2 3.3.8
|
To abort, run:
|
||||||
2018/05/09 12:35:59 dbserver 3 3.3.8
|
|
||||||
2018/05/09 12:35:59 coordinator 1 3.3.8
|
```bash
|
||||||
2018/05/09 12:35:59 coordinator 2 3.3.8
|
arangodb abort upgrade --starter.endpoint=<endpoint-of-a-starter>
|
||||||
2018/05/09 12:35:59 coordinator 3 3.3.8
|
|
||||||
2018/05/09 12:35:59 Upgrading done.
|
|
||||||
```
|
```
|
||||||
|
|
||||||
All _Agents_, _DBServers_ and _Coordinators_ are upgraded and the rolling upgrade
|
The `--starter.endpoint` option can be set to the endpoint of any
|
||||||
has successfully finished.
|
of the starters. E.g. `http://localhost:8528`.
|
||||||
|
|
||||||
|
Note that an abort does not stop all upgrade processes immediately.
|
||||||
|
If an _arangod_ or _arangosync_ server is being upgraded when the abort
|
||||||
|
was issued, this upgrade will be finished. Remaining servers will not be
|
||||||
|
upgraded.
|
||||||
|
|