mirror of https://gitee.com/bigwinds/arangodb
Doc - Arangodump improvements (#5881)
parent
d5180ef838
commit
b95767247c
|
@ -42,7 +42,8 @@ arguments:
|
|||
- *--dump-data <bool>*: set to *true* to include documents in the dump. Set to *false*
|
||||
to exclude documents. The default value is *true*.
|
||||
- *--include-system-collections <bool>*: whether or not to include system collections
|
||||
in the dump. The default value is *false*.
|
||||
in the dump. The default value is *false*. **Set to _true_ if you are using named
|
||||
graphs that you are interested in restoring.**
|
||||
|
||||
For example, to only dump structural information of all collections (including system
|
||||
collections), use:
|
||||
|
@ -64,16 +65,15 @@ Document data for a collection will be saved in files with name pattern
|
|||
*<collection-name>.data.json*. Each line in a data file is a document insertion/update or
|
||||
deletion marker, along with some metadata.
|
||||
|
||||
Cluster Backup
|
||||
--------------
|
||||
|
||||
Starting with Version 2.1 of ArangoDB, the *arangodump* tool also
|
||||
supports sharding. Simply point it to one of the coordinators and it
|
||||
will behave exactly as described above, working on sharded collections
|
||||
in the cluster.
|
||||
|
||||
However, as opposed to the single instance situation, this operation
|
||||
does not guarantee to dump a consistent snapshot if write operations
|
||||
happen during the dump operation. It is therefore recommended not to
|
||||
perform any data-modification operations on the cluster whilst *arangodump*
|
||||
is running.
|
||||
Please see the [Limitations](Limitations.md).
|
||||
|
||||
As above, the output will be one structure description file and one data
|
||||
file per sharded collection. Note that the data in the data file is
|
||||
|
@ -84,68 +84,98 @@ and the shard keys.
|
|||
Note that the version of the arangodump client tool needs to match the
|
||||
version of the ArangoDB server it connects to.
|
||||
|
||||
Advanced cluster options
|
||||
------------------------
|
||||
### Advanced Cluster Options
|
||||
|
||||
Starting with version 3.1.17, collections may be created with shard
|
||||
distribution identical to an existing prototypical collection;
|
||||
i.e. shards are distributed in the very same pattern as in the
|
||||
prototype collection. Such collections cannot be dumped without the
|
||||
reference collection or arangodump yields an error.
|
||||
Starting with version 3.1.17, collections may be [created with shard
|
||||
distribution](../../DataModeling/Collections/DatabaseMethods.md#create)
|
||||
identical to an existing prototypical collection; i.e. shards are distributed in
|
||||
the very same pattern as in the prototype collection. Such collections cannot be
|
||||
dumped without the referenced collection or arangodump yields an error.
|
||||
|
||||
arangodump --collection clonedCollection --output-directory "dump"
|
||||
|
||||
ERROR Collection clonedCollection's shard distribution is based on a that of collection prototypeCollection, which is not dumped along. You may dump the collection regardless of the missing prototype collection by using the --ignore-distribute-shards-like-errors parameter.
|
||||
|
||||
There are two ways to approach that problem.
|
||||
Dump the prototype collection along:
|
||||
Dump the prototype collection as well:
|
||||
|
||||
arangodump --collection clonedCollection --collection prototypeCollection --output-directory "dump"
|
||||
|
||||
Processed 2 collection(s), wrote 81920 byte(s) into datafiles, sent 1 batch(es)
|
||||
|
||||
Or override that behavior to be able to dump the collection
|
||||
Or override that behavior to be able to dump the collection in isolation
|
||||
individually:
|
||||
|
||||
arangodump --collection B clonedCollection --output-directory "dump" --ignore-distribute-shards-like-errors
|
||||
arangodump --collection clonedCollection --output-directory "dump" --ignore-distribute-shards-like-errors
|
||||
|
||||
Processed 1 collection(s), wrote 34217 byte(s) into datafiles, sent 1 batch(es)
|
||||
|
||||
Note that in consequence, restoring such a collection without its
|
||||
prototype is affected. [arangorestore](../Arangorestore/README.md)
|
||||
Note that in consequence, restoring such a collection without its prototype is
|
||||
affected. See documentation on [arangorestore](../Arangorestore/README.md) for
|
||||
more details about restoring the collection.
|
||||
|
||||
Encryption
|
||||
----------
|
||||
|
||||
In the ArangoDB Enterprise Edition there are the additional parameters:
|
||||
{% hint 'info' %}
|
||||
This feature is only available in the
|
||||
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||
{% endhint %}
|
||||
|
||||
### Encryption key stored in file
|
||||
Starting from version 3.3 encryption of the dump is supported.
|
||||
|
||||
*--encryption.keyfile path-of-keyfile*
|
||||
The dump is encrypted using an encryption keyfile, which must contain exactly 32
|
||||
bytes of data (the key size required by AES-256).
|
||||
|
||||
The file `path-to-keyfile` must contain the encryption key. This
|
||||
file must be secured, so that only `arangod` can access it. You should
|
||||
also ensure that in case someone steals the hardware, they will not be
|
||||
able to read the file. For example, by encrypting `/mytmpfs` or
|
||||
creating an in-memory file-system under `/mytmpfs`.
|
||||
|
||||
### Encryption key generated by a program
|
||||
|
||||
*--encryption.key-generator path-to-my-generator*
|
||||
|
||||
The program `path-to-my-generator` must output the encryption key on
|
||||
standard output and exit.
|
||||
|
||||
### Creating keys
|
||||
|
||||
The encryption keyfile must contain 32 bytes of random data.
|
||||
|
||||
You can create it with a command like this.
|
||||
The keyfile can be created by an external program, or, on Linux, by using a command
|
||||
like the following:
|
||||
|
||||
```
|
||||
dd if=/dev/random bs=1 count=32 of=yourSecretKeyFile
|
||||
```
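As a hedged variation on the command above, the following sketch creates the keyfile with owner-only permissions and then checks its length. It reads from `/dev/urandom` rather than `/dev/random`, and the temporary directory (`key_dir`) is an assumption made only to keep the example self-contained; the file name is the placeholder used above.

```shell
#!/bin/sh
# Sketch: create a 32-byte keyfile that only its owner can read.
# key_dir is a throwaway directory used for illustration only.
set -eu

key_dir=$(mktemp -d)
keyfile="$key_dir/yourSecretKeyFile"

# Restrict permissions before any key material is written.
umask 077
dd if=/dev/urandom of="$keyfile" bs=1 count=32 2>/dev/null

# Confirm that the file holds exactly 32 bytes.
key_size=$(wc -c < "$keyfile" | tr -d ' ')
echo "key size: $key_size bytes"
```

Checking the size before the first dump avoids discovering a truncated or padded keyfile only when the dump is attempted.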
|
||||
|
||||
For security, it is best to create these keys offline (away from your
|
||||
For security reasons, it is best to create these keys offline (away from your
|
||||
database servers) and directly store them in your secret management
|
||||
tool.
|
||||
|
||||
|
||||
In order to create an encrypted backup, add the `--encryption.keyfile`
|
||||
option when invoking _arangodump_, in addition to any other option you
|
||||
are already using. The following example assumes that your secret key
|
||||
is stored in ~/SECRET-KEY:
|
||||
|
||||
```
|
||||
arangodump --collection "secret-collection" dump --encryption.keyfile ~/SECRET-KEY
|
||||
```
|
||||
|
||||
Note that _arangodump_ will not store the key anywhere. It is the responsibility
|
||||
of the user to find a safe place for the key. However, _arangodump_ will store
|
||||
the used encryption method in a file named `ENCRYPTION` in the dump directory.
|
||||
That way _arangorestore_ can later find out whether it is dealing with an
|
||||
encrypted dump or not.
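The `ENCRYPTION` marker file can also be inspected by hand before a restore. The sketch below assumes the file holds the method name as plain text (for example `aes-256-ctr`, the method named in the restore error message); the function name `dump_encryption_method` and the demo directory are hypothetical and stand in for a real dump directory.

```shell
#!/bin/sh
# Sketch: report the encryption method recorded in a dump directory.
# Assumes the ENCRYPTION file contains the method name as plain text.
set -eu

dump_encryption_method() {
  dump_dir=$1
  if [ -f "$dump_dir/ENCRYPTION" ]; then
    # Print the recorded method without surrounding whitespace.
    tr -d '[:space:]' < "$dump_dir/ENCRYPTION"
  else
    # No marker file found, so nothing can be said about the dump.
    echo "unknown"
  fi
}

# Demo with a stand-in directory (not a real dump).
demo_dir=$(mktemp -d)
printf 'aes-256-ctr\n' > "$demo_dir/ENCRYPTION"
method=$(dump_encryption_method "$demo_dir")
echo "encryption method: $method"
```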
|
||||
|
||||
Trying to restore the encrypted dump without specifying the key will fail:
|
||||
|
||||
```
|
||||
arangorestore --collection "secret-collection" dump --create-collection true
|
||||
```
|
||||
|
||||
and _arangorestore_ will report the following error:
|
||||
|
||||
```
|
||||
the dump data seems to be encrypted with aes-256-ctr, but no key information was specified to decrypt the dump
|
||||
it is recommended to specify either `--encryption.keyfile` or `--encryption.key-generator` when invoking arangorestore with an encrypted dump
|
||||
```
|
||||
|
||||
It is required to use the exact same key when restoring the data. Again this is
|
||||
done by providing the `--encryption.keyfile` parameter:
|
||||
|
||||
```
|
||||
arangorestore --collection "secret-collection" dump --create-collection true --encryption.keyfile ~/SECRET-KEY
|
||||
```
|
||||
|
||||
Using a different key will lead to the backup being non-recoverable.
|
||||
|
||||
Note that encrypted backups can be used together with the already existing
|
||||
RocksDB encryption-at-rest feature, but they can also be used for the MMFiles
|
||||
engine, which does not have encryption-at-rest.
|
||||
|
|
|
@ -0,0 +1,16 @@
|
|||
Arangodump Limitations
|
||||
======================
|
||||
|
||||
_Arangodump_ has the following limitations:
|
||||
|
||||
- In a Cluster, _arangodump_ does not guarantee to dump a consistent snapshot if write
|
||||
operations happen while the dump is in progress. It is therefore recommended not to
|
||||
perform any data-modification operations on the cluster while _arangodump_
|
||||
is running. This is in contrast to what happens on a single instance, a master/slave,
|
||||
or active failover setup, where even if write operations are ongoing, the created dump
|
||||
is consistent, as a snapshot is taken when the dump starts.
|
||||
- If the MMFiles engine is in use, on a single instance, a master/slave, or active failover
|
||||
setup, even if the write operations are suspended, it is not guaranteed that the dump includes
|
||||
all the data that has been previously written as _arangodump_ will only dump the data
|
||||
included in the _datafiles_ but not the data that has not been transferred from the _WAL_
|
||||
to the _datafiles_. A WAL flush can be forced as documented in the [WAL flush](../../Appendix/JavaScriptModules/WAL.md#flushing) section.
|
|
@ -4,3 +4,26 @@ Arangodump Options
|
|||
Usage: `arangodump [<options>]`
|
||||
|
||||
@startDocuBlock program_options_arangodump
|
||||
|
||||
Encryption Options
|
||||
------------------
|
||||
|
||||
{% hint 'info' %}
|
||||
This feature is only available in the
|
||||
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
|
||||
{% endhint %}
|
||||
|
||||
*--encryption.keyfile path-of-keyfile*
|
||||
|
||||
The file `path-of-keyfile` must contain the encryption key. This
|
||||
file must be secured, so that only `arangodump` or `arangorestore` can access it.
|
||||
You should also ensure that in case someone steals your hardware, they will not be
|
||||
able to read the file. For example, by encrypting `/mytmpfs` or
|
||||
creating an in-memory file-system under `/mytmpfs`. The encryption keyfile must
|
||||
contain 32 bytes of data.
|
||||
|
||||
*--encryption.key-generator path-to-my-generator*
|
||||
|
||||
This option is used if you want a program to generate your encryption key.
|
||||
The program `path-to-my-generator` must output the encryption key on standard output
|
||||
and exit. The encryption key must contain 32 bytes of data.
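A minimal sketch of such a generator follows. It assumes the key was created earlier and is kept somewhere the generator can read; `KEY_SOURCE`, `work_dir`, and `my-generator` are hypothetical names, and the random file stands in for a real secret store. The generator only prints the stored key bytes and exits, because restoring requires the same key that was used for the dump.

```shell
#!/bin/sh
# Sketch: a key generator that prints a stored 32-byte key on stdout and exits.
# KEY_SOURCE is a hypothetical location standing in for a secret store.
set -eu

work_dir=$(mktemp -d)
KEY_SOURCE="$work_dir/stored-key"
export KEY_SOURCE

# Stand-in for a key that was created earlier and stored securely.
dd if=/dev/urandom of="$KEY_SOURCE" bs=1 count=32 2>/dev/null

# The generator itself: emit the key bytes unchanged and nothing else.
cat > "$work_dir/my-generator" <<'EOF'
#!/bin/sh
exec cat "$KEY_SOURCE"
EOF
chmod +x "$work_dir/my-generator"

# Check that the generator emits exactly 32 bytes.
emitted=$(sh "$work_dir/my-generator" | wc -c | tr -d ' ')
echo "generator emitted $emitted bytes"
```

The path of the generator script would then be passed to `--encryption.key-generator`.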
|
|
@ -2,14 +2,16 @@ Arangodump
|
|||
==========
|
||||
|
||||
_Arangodump_ is a command-line client tool to create backups of the data and
|
||||
structures stored in [ArangoDB servers](../Arangod/README.md).
|
||||
structures stored in ArangoDB.
|
||||
|
||||
Dumps are meant to be restored with [_Arangorestore_](../Arangorestore/README.md).
|
||||
|
||||
If you want to export data for external programs in formats like JSON or CSV, see
|
||||
[_Arangoexport_](../Arangoexport/README.md) instead.
|
||||
|
||||
_Arangodump_ can backup selected collections or all collections of a database,
|
||||
optionally including _system_ collections. One can backup the structure, i.e.
|
||||
the collections with their configuration without any data, only the data stored
|
||||
in them, or both. Dumps can optionally be encrypted.
|
||||
_Arangodump_ can be used for all ArangoDB deployment modes (Single Instance,
|
||||
Master/Slave, Active Failover, Cluster and DC2DC) and it can back up selected collections
|
||||
or all collections of a database, optionally including _system_ collections. One
|
||||
can back up the structure, i.e. the collections with their configuration without
|
||||
any data, only the data stored in them, or both. If you are using the Enterprise
|
||||
Edition, dumps can optionally be encrypted.
|
||||
|
|
|
@ -64,10 +64,10 @@ Trying to restore the encrypted dump without specifying the key will fail:
|
|||
|
||||
arangorestore will complain with:
|
||||
|
||||
> the dump data seems to be encrypted with aes-256-ctr, but no key information
|
||||
> was specified to decrypt the dump it is recommended to specify either
|
||||
> `--encryption.key-file` or `--encryption.key-generator` when invoking
|
||||
> arangorestore with an encrypted dump
|
||||
```
|
||||
the dump data seems to be encrypted with aes-256-ctr, but no key information was specified to decrypt the dump
|
||||
it is recommended to specify either `--encryption.keyfile` or `--encryption.key-generator` when invoking arangorestore with an encrypted dump
|
||||
```
|
||||
|
||||
It is required to use the exact same key when restoring the data. Again this is
|
||||
done by providing the `--encryption.keyfile` parameter:
|
||||
|
|
|
@ -47,6 +47,7 @@
|
|||
* [Arangodump](Programs/Arangodump/README.md)
|
||||
* [Examples](Programs/Arangodump/Examples.md)
|
||||
* [Options](Programs/Arangodump/Options.md)
|
||||
* [Limitations](Programs/Arangodump/Limitations.md)
|
||||
* [Arangorestore](Programs/Arangorestore/README.md)
|
||||
* [Examples](Programs/Arangorestore/Examples.md)
|
||||
* [Options](Programs/Arangorestore/Options.md)
|
||||
|
|
|
@ -197,7 +197,7 @@ void checkEncryption(arangodb::ManagedDirectory& directory) {
|
|||
<< ", but no key information was specified to decrypt the dump";
|
||||
LOG_TOPIC(WARN, Logger::RESTORE)
|
||||
<< "it is recommended to specify either "
|
||||
"`--encryption.key-file` or `--encryption.key-generator` "
|
||||
"`--encryption.keyfile` or `--encryption.key-generator` "
|
||||
"when invoking arangorestore with an encrypted dump";
|
||||
} else {
|
||||
LOG_TOPIC(INFO, Logger::RESTORE)
|
||||
|
|