mirror of https://gitee.com/bigwinds/arangodb
Doc - Arangodump improvements (#5881)
This commit is contained in:
parent
d5180ef838
commit
b95767247c
|
@ -42,7 +42,8 @@ arguments:
|
||||||
- *--dump-data <bool>*: set to *true* to include documents in the dump. Set to *false*
|
- *--dump-data <bool>*: set to *true* to include documents in the dump. Set to *false*
|
||||||
to exclude documents. The default value is *true*.
|
to exclude documents. The default value is *true*.
|
||||||
- *--include-system-collections <bool>*: whether or not to include system collections
|
- *--include-system-collections <bool>*: whether or not to include system collections
|
||||||
in the dump. The default value is *false*.
|
in the dump. The default value is *false*. **Set to _true_ if you are using named
|
||||||
|
graphs that you are interested in restoring.**
|
||||||
|
|
||||||
For example, to only dump structural information of all collections (including system
|
For example, to only dump structural information of all collections (including system
|
||||||
collections), use:
|
collections), use:
|
||||||
|
@ -64,16 +65,15 @@ Document data for a collection will be saved in files with name pattern
|
||||||
*<collection-name>.data.json*. Each line in a data file is a document insertion/update or
|
*<collection-name>.data.json*. Each line in a data file is a document insertion/update or
|
||||||
deletion marker, along with some metadata.
|
deletion marker, along with some metadata.
|
||||||
|
|
||||||
|
Cluster Backup
|
||||||
|
--------------
|
||||||
|
|
||||||
Starting with Version 2.1 of ArangoDB, the *arangodump* tool also
|
Starting with Version 2.1 of ArangoDB, the *arangodump* tool also
|
||||||
supports sharding. Simply point it to one of the coordinators and it
|
supports sharding. Simply point it to one of the coordinators and it
|
||||||
will behave exactly as described above, working on sharded collections
|
will behave exactly as described above, working on sharded collections
|
||||||
in the cluster.
|
in the cluster.
|
||||||
|
|
||||||
However, as opposed to the single instance situation, this operation
|
Please see the [Limitations](Limitations.md).
|
||||||
does not guarantee to dump a consistent snapshot if write operations
|
|
||||||
happen during the dump operation. It is therefore recommended not to
|
|
||||||
perform any data-modification operations on the cluster whilst *arangodump*
|
|
||||||
is running.
|
|
||||||
|
|
||||||
As above, the output will be one structure description file and one data
|
As above, the output will be one structure description file and one data
|
||||||
file per sharded collection. Note that the data in the data file is
|
file per sharded collection. Note that the data in the data file is
|
||||||
|
@ -84,68 +84,98 @@ and the shard keys.
|
||||||
Note that the version of the arangodump client tool needs to match the
|
Note that the version of the arangodump client tool needs to match the
|
||||||
version of the ArangoDB server it connects to.
|
version of the ArangoDB server it connects to.
|
||||||
|
|
||||||
Advanced cluster options
|
### Advanced Cluster Options
|
||||||
------------------------
|
|
||||||
|
|
||||||
Starting with version 3.1.17, collections may be created with shard
|
Starting with version 3.1.17, collections may be [created with shard
|
||||||
distribution identical to an existing prototypical collection;
|
distribution](../../DataModeling/Collections/DatabaseMethods.md#create)
|
||||||
i.e. shards are distributed in the very same pattern as in the
|
identical to an existing prototypical collection; i.e. shards are distributed in
|
||||||
prototype collection. Such collections cannot be dumped without the
|
the very same pattern as in the prototype collection. Such collections cannot be
|
||||||
reference collection or arangodump yields an error.
|
dumped without the referenced collection or arangodump yields an error.
|
||||||
|
|
||||||
arangodump --collection clonedCollection --output-directory "dump"
|
arangodump --collection clonedCollection --output-directory "dump"
|
||||||
|
|
||||||
ERROR Collection clonedCollection's shard distribution is based on a that of collection prototypeCollection, which is not dumped along. You may dump the collection regardless of the missing prototype collection by using the --ignore-distribute-shards-like-errors parameter.
|
ERROR Collection clonedCollection's shard distribution is based on a that of collection prototypeCollection, which is not dumped along. You may dump the collection regardless of the missing prototype collection by using the --ignore-distribute-shards-like-errors parameter.
|
||||||
|
|
||||||
There are two ways to approach that problem.
|
There are two ways to approach that problem.
|
||||||
Dump the prototype collection along:
|
Dump the prototype collection as well:
|
||||||
|
|
||||||
arangodump --collection clonedCollection --collection prototypeCollection --output-directory "dump"
|
arangodump --collection clonedCollection --collection prototypeCollection --output-directory "dump"
|
||||||
|
|
||||||
Processed 2 collection(s), wrote 81920 byte(s) into datafiles, sent 1 batch(es)
|
Processed 2 collection(s), wrote 81920 byte(s) into datafiles, sent 1 batch(es)
|
||||||
|
|
||||||
Or override that behavior to be able to dump the collection
|
Or override that behavior to be able to dump the collection in isolation
|
||||||
individually:
|
individually:
|
||||||
|
|
||||||
arangodump --collection B clonedCollection --output-directory "dump" --ignore-distribute-shards-like-errors
|
arangodump --collection clonedCollection --output-directory "dump" --ignore-distribute-shards-like-errors
|
||||||
|
|
||||||
Processed 1 collection(s), wrote 34217 byte(s) into datafiles, sent 1 batch(es)
|
Processed 1 collection(s), wrote 34217 byte(s) into datafiles, sent 1 batch(es)
|
||||||
|
|
||||||
Note that in consequence, restoring such a collection without its
|
Note that in consequence, restoring such a collection without its prototype is
|
||||||
prototype is affected. [arangorestore](../Arangorestore/README.md)
|
affected. See documentation on [arangorestore](../Arangorestore/README.md) for
|
||||||
|
more details about restoring the collection.
|
||||||
|
|
||||||
Encryption
|
Encryption
|
||||||
----------
|
----------
|
||||||
|
|
||||||
In the ArangoDB Enterprise Edition there are the additional parameters:
|
{% hint 'info' %}
|
||||||
|
This feature is only available in the
|
||||||
|
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||||
|
{% endhint %}
|
||||||
|
|
||||||
### Encryption key stored in file
|
Starting from version 3.3, encryption of the dump is supported.
|
||||||
|
|
||||||
*--encryption.keyfile path-of-keyfile*
|
The dump is encrypted using an encryption keyfile, which must contain exactly 32
|
||||||
|
bytes of data (required by the AES block cipher).
|
||||||
|
|
||||||
The file `path-to-keyfile` must contain the encryption key. This
|
The keyfile can be created by an external program, or, on Linux, by using a command
|
||||||
file must be secured, so that only `arangod` can access it. You should
|
like the following:
|
||||||
also ensure that in case some-one steals the hardware, he will not be
|
|
||||||
able to read the file. For example, by encryption `/mytmpfs` or
|
|
||||||
creating a in-memory file-system under `/mytmpfs`.
|
|
||||||
|
|
||||||
### Encryption key generated by a program
|
|
||||||
|
|
||||||
*--encryption.key-generator path-to-my-generator*
|
|
||||||
|
|
||||||
The program `path-to-my-generator` must output the encryption on
|
|
||||||
standard output and exit.
|
|
||||||
|
|
||||||
### Creating keys
|
|
||||||
|
|
||||||
The encryption keyfile must contain 32 bytes of random data.
|
|
||||||
|
|
||||||
You can create it with a command line this.
|
|
||||||
|
|
||||||
```
|
```
|
||||||
dd if=/dev/random bs=1 count=32 of=yourSecretKeyFile
|
dd if=/dev/random bs=1 count=32 of=yourSecretKeyFile
|
||||||
```
|
```
|
||||||
|
|
||||||
For security, it is best to create these keys offline (away from your
|
For security reasons, it is best to create these keys offline (away from your
|
||||||
database servers) and directly store them in your secret management
|
database servers) and directly store them in your secret management
|
||||||
tool.
|
tool.
|
||||||
|
|
||||||
|
|
||||||
|
In order to create an encrypted backup, add the `--encryption.keyfile`
|
||||||
|
option when invoking _arangodump_, in addition to any other option you
|
||||||
|
are already using. The following example assumes that your secret key
|
||||||
|
is stored in ~/SECRET-KEY:
|
||||||
|
|
||||||
|
```
|
||||||
|
arangodump --collection "secret-collection" dump --encryption.keyfile ~/SECRET-KEY
|
||||||
|
```
|
||||||
|
|
||||||
|
Note that _arangodump_ will not store the key anywhere. It is the responsibility
|
||||||
|
of the user to find a safe place for the key. However, _arangodump_ will store
|
||||||
|
the used encryption method in a file named `ENCRYPTION` in the dump directory.
|
||||||
|
That way _arangorestore_ can later find out whether it is dealing with an
|
||||||
|
encrypted dump or not.
|
||||||
|
|
||||||
|
Trying to restore the encrypted dump without specifying the key will fail:
|
||||||
|
|
||||||
|
```
|
||||||
|
arangorestore --collection "secret-collection" dump --create-collection true
|
||||||
|
```
|
||||||
|
|
||||||
|
and _arangorestore_ will report the following error:
|
||||||
|
|
||||||
|
```
|
||||||
|
the dump data seems to be encrypted with aes-256-ctr, but no key information was specified to decrypt the dump
|
||||||
|
it is recommended to specify either `--encryption.keyfile` or `--encryption.key-generator` when invoking arangorestore with an encrypted dump
|
||||||
|
```
|
||||||
|
|
||||||
|
It is required to use the exact same key when restoring the data. Again this is
|
||||||
|
done by providing the `--encryption.keyfile` parameter:
|
||||||
|
|
||||||
|
```
|
||||||
|
arangorestore --collection "secret-collection" dump --create-collection true --encryption.keyfile ~/SECRET-KEY
|
||||||
|
```
|
||||||
|
|
||||||
|
Using a different key will lead to the backup being non-recoverable.
|
||||||
|
|
||||||
|
Note that encrypted backups can be used together with the already existing
|
||||||
|
RocksDB encryption-at-rest feature, but they can also be used for the MMFiles
|
||||||
|
engine, which does not have encryption-at-rest.
|
||||||
|
|
|
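The 32-byte keyfile requirement described in the Encryption section above can be checked before a dump is started. The following is a minimal sketch, not part of the commit: `make_keyfile` and `check_keyfile` are hypothetical helper names, and reading from `/dev/urandom` with a single 32-byte block is an assumption made here so that the command does not block.

```shell
# Sketch only: make_keyfile and check_keyfile are hypothetical helper names.

# Write exactly 32 random bytes to the given path.
# A newly created file is readable by its owner only.
make_keyfile() {
  (
    umask 077
    # /dev/urandom is an assumption made here so the command never blocks.
    dd if=/dev/urandom of="$1" bs=32 count=1 2>/dev/null
  )
}

# Succeed only if the file at the given path is exactly 32 bytes long.
check_keyfile() {
  size=$(wc -c < "$1" | tr -d ' ')
  [ "$size" -eq 32 ]
}
```

A possible use is `make_keyfile ~/SECRET-KEY && check_keyfile ~/SECRET-KEY`, after which the file can be passed to `--encryption.keyfile` as in the examples above.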
@ -0,0 +1,16 @@
|
||||||
|
Arangodump Limitations
|
||||||
|
======================
|
||||||
|
|
||||||
|
_Arangodump_ has the following limitations:
|
||||||
|
|
||||||
|
- In a Cluster, _arangodump_ does not guarantee to dump a consistent snapshot if write
|
||||||
|
operations happen while the dump is in progress. It is therefore recommended not to
|
||||||
|
perform any data-modification operations on the cluster while _arangodump_
|
||||||
|
is running. This is in contrast to what happens on a single instance, a master/slave,
|
||||||
|
or active failover setup, where even if write operations are ongoing, the created dump
|
||||||
|
is consistent, as a snapshot is taken when the dump starts.
|
||||||
|
- If the MMFiles engine is in use, on a single instance, a master/slave, or active failover
|
||||||
|
setup, even if the write operations are suspended, it is not guaranteed that the dump includes
|
||||||
|
all the data that has been previously written as _arangodump_ will only dump the data
|
||||||
|
included in the _datafiles_ but not the data that has not been transferred from the _WAL_
|
||||||
|
to the _datafiles_. A WAL flush can be forced as documented in the [WAL flush](../../Appendix/JavaScriptModules/WAL.md#flushing) section.
|
|
@ -4,3 +4,26 @@ Arangodump Options
|
||||||
Usage: `arangodump [<options>]`
|
Usage: `arangodump [<options>]`
|
||||||
|
|
||||||
@startDocuBlock program_options_arangodump
|
@startDocuBlock program_options_arangodump
|
||||||
|
|
||||||
|
Encryption Options
|
||||||
|
------------------
|
||||||
|
|
||||||
|
{% hint 'info' %}
|
||||||
|
This feature is only available in the
|
||||||
|
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||||
|
{% endhint %}
|
||||||
|
|
||||||
|
*--encryption.keyfile path-of-keyfile*
|
||||||
|
|
||||||
|
The file `path-of-keyfile` must contain the encryption key. This
|
||||||
|
file must be secured, so that only `arangodump` or `arangorestore` can access it.
|
||||||
|
You should also ensure that in case someone steals your hardware, they will not be
|
||||||
|
able to read the file. For example, by encrypting `/mytmpfs` or
|
||||||
|
creating an in-memory file-system under `/mytmpfs`. The encryption keyfile must
|
||||||
|
contain 32 bytes of data.
|
||||||
|
|
||||||
|
*--encryption.key-generator path-to-my-generator*
|
||||||
|
|
||||||
|
This option is used if you want to use a program to generate your encryption key.
|
||||||
|
The program `path-to-my-generator` must output the encryption key on standard output
|
||||||
|
and exit. The encryption keyfile must contain 32 bytes of data.
|
|
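The contract for `--encryption.key-generator` described above (print the key on standard output, then exit) can be illustrated with a small shell function. This is a sketch under stated assumptions, not the documented implementation: `key_generator` and the `KEY_SOURCE` variable are hypothetical names, and in practice the body would live in an executable script whose path is passed to the option.

```shell
# Sketch only: key_generator and KEY_SOURCE are hypothetical names.
# Prints the stored key on standard output and returns,
# refusing to emit anything that is not exactly 32 bytes long.
key_generator() {
  if [ -z "${KEY_SOURCE:-}" ] || [ ! -r "$KEY_SOURCE" ]; then
    echo "KEY_SOURCE must name a readable key file" >&2
    return 1
  fi
  size=$(wc -c < "$KEY_SOURCE" | tr -d ' ')
  if [ "$size" -ne 32 ]; then
    echo "key must be 32 bytes, got $size" >&2
    return 1
  fi
  cat "$KEY_SOURCE"
}
```

Checking the length inside the generator means a truncated or padded key is rejected before it reaches the dump or restore step.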
@ -2,14 +2,16 @@ Arangodump
|
||||||
==========
|
==========
|
||||||
|
|
||||||
_Arangodump_ is a command-line client tool to create backups of the data and
|
_Arangodump_ is a command-line client tool to create backups of the data and
|
||||||
structures stored in [ArangoDB servers](../Arangod/README.md).
|
structures stored in ArangoDB.
|
||||||
|
|
||||||
Dumps are meant to be restored with [_Arangorestore_](../Arangorestore/README.md).
|
Dumps are meant to be restored with [_Arangorestore_](../Arangorestore/README.md).
|
||||||
|
|
||||||
If you want to export for external programs to formats like JSON or CSV, see
|
If you want to export for external programs to formats like JSON or CSV, see
|
||||||
[_Arangoexport_](../Arangoexport/README.md) instead.
|
[_Arangoexport_](../Arangoexport/README.md) instead.
|
||||||
|
|
||||||
_Arangodump_ can backup selected collections or all collections of a database,
|
_Arangodump_ can be used for all ArangoDB deployment modes (Single Instance,
|
||||||
optionally including _system_ collections. One can backup the structure, i.e.
|
Master/Slave, Active Failover, Cluster and DC2DC) and it can backup selected collections
|
||||||
the collections with their configuration without any data, only the data stored
|
or all collections of a database, optionally including _system_ collections. One
|
||||||
in them, or both. Dumps can optionally be encrypted.
|
can backup the structure, i.e. the collections with their configuration without
|
||||||
|
any data, only the data stored in them, or both. If you are using the Enterprise
|
||||||
|
Edition, dumps can optionally be encrypted.
|
||||||
|
|
|
@ -64,10 +64,10 @@ Trying to restore the encrypted dump without specifying the key will fail:
|
||||||
|
|
||||||
arangorestore will complain with:
|
arangorestore will complain with:
|
||||||
|
|
||||||
> the dump data seems to be encrypted with aes-256-ctr, but no key information
|
```
|
||||||
> was specified to decrypt the dump it is recommended to specify either
|
the dump data seems to be encrypted with aes-256-ctr, but no key information was specified to decrypt the dump
|
||||||
> `--encryption.key-file` or `--encryption.key-generator` when invoking
|
it is recommended to specify either `--encryption.keyfile` or `--encryption.key-generator` when invoking arangorestore with an encrypted dump
|
||||||
> arangorestore with an encrypted dump
|
```
|
||||||
|
|
||||||
It is required to use the exact same key when restoring the data. Again this is
|
It is required to use the exact same key when restoring the data. Again this is
|
||||||
done by providing the `--encryption.keyfile` parameter:
|
done by providing the `--encryption.keyfile` parameter:
|
||||||
|
|
|
@ -47,6 +47,7 @@
|
||||||
* [Arangodump](Programs/Arangodump/README.md)
|
* [Arangodump](Programs/Arangodump/README.md)
|
||||||
* [Examples](Programs/Arangodump/Examples.md)
|
* [Examples](Programs/Arangodump/Examples.md)
|
||||||
* [Options](Programs/Arangodump/Options.md)
|
* [Options](Programs/Arangodump/Options.md)
|
||||||
|
* [Limitations](Programs/Arangodump/Limitations.md)
|
||||||
* [Arangorestore](Programs/Arangorestore/README.md)
|
* [Arangorestore](Programs/Arangorestore/README.md)
|
||||||
* [Examples](Programs/Arangorestore/Examples.md)
|
* [Examples](Programs/Arangorestore/Examples.md)
|
||||||
* [Options](Programs/Arangorestore/Options.md)
|
* [Options](Programs/Arangorestore/Options.md)
|
||||||
|
|
|
@ -197,7 +197,7 @@ void checkEncryption(arangodb::ManagedDirectory& directory) {
|
||||||
<< ", but no key information was specified to decrypt the dump";
|
<< ", but no key information was specified to decrypt the dump";
|
||||||
LOG_TOPIC(WARN, Logger::RESTORE)
|
LOG_TOPIC(WARN, Logger::RESTORE)
|
||||||
<< "it is recommended to specify either "
|
<< "it is recommended to specify either "
|
||||||
"`--encryption.key-file` or `--encryption.key-generator` "
|
"`--encryption.keyfile` or `--encryption.key-generator` "
|
||||||
"when invoking arangorestore with an encrypted dump";
|
"when invoking arangorestore with an encrypted dump";
|
||||||
} else {
|
} else {
|
||||||
LOG_TOPIC(INFO, Logger::RESTORE)
|
LOG_TOPIC(INFO, Logger::RESTORE)
|
||||||
|
|