
Doc - _rev warning (#8044)

This commit is contained in:
sleto-it 2019-02-19 00:14:39 +01:00 committed by GitHub
parent dd6329da25
commit 43faa28d90
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 104 additions and 60 deletions


@ -118,11 +118,7 @@ keys (see
Document Revision
-----------------
As ArangoDB supports MVCC, documents can exist in more than one revision. The document revision is the MVCC token used to identify a particular revision of a document. It is a string value currently containing an integer number and is unique within the list of document revisions for a single document. Document revisions can be used to conditionally update, replace or delete documents in the database. In order to find a particular revision of a document, you need the document handle and the document revision.
The document revision is stored in the `_rev` attribute of a document, and is set and updated by ArangoDB automatically. The `_rev` value cannot be set from the outside.
ArangoDB currently uses 64-bit unsigned integer values to maintain document revisions internally. When returning document revisions to clients, ArangoDB will put them into a string to ensure the revision id is not clipped by clients that do not support big integers. Clients should treat the revision id returned by ArangoDB as an opaque string when they store or use it locally. This will allow ArangoDB to change the format of revision ids later if this should be required. Clients can use revision ids to perform simple equality/non-equality comparisons (e.g. to check whether a document has changed or not), but they should not use revision ids for greater-than/less-than comparisons to check whether one document revision is older than another, even if this might work in some cases.
@startDocuBlock documentRevision
Edge
----


@ -78,61 +78,7 @@ values.
Document Revision
-----------------
As ArangoDB supports MVCC (Multiple Version Concurrency Control),
documents can exist in more than one
revision. The document revision is the MVCC token used to specify
a particular revision of a document (identified by its `_id`).
It is a string value that contained (up to ArangoDB 3.0)
an integer number and is unique within the list of document
revisions for a single document.
In ArangoDB >= 3.1 the _rev strings
are in fact time stamps. They use the local clock of the DBserver that
actually writes the document and have millisecond accuracy.
Actually, a "Hybrid Logical Clock" is used (for
this concept see
[this paper](http://www.cse.buffalo.edu/tech-reports/2014-04.pdf)).
Within one shard it is guaranteed that two different document revisions
have a different _rev string, even if they are written in the same
millisecond, and that these stamps are ascending.
Note however that different servers in your cluster might have a clock
skew, and therefore between different shards or even between different
collections the time stamps are not guaranteed to be comparable.
The Hybrid Logical Clock feature does one thing to address this
issue: Whenever a message is sent from some server A in your cluster to
another one B, it is ensured that any timestamp taken on B after the
message has arrived is greater than any timestamp taken on A before the
message was sent. This ensures that if there is some "causality" between
events on different servers, time stamps increase from cause to effect.
A direct consequence of this is that sometimes a server has to take
timestamps that seem to come from the future of its own clock. It will
however still produce ever increasing timestamps. If the clock skew is
small, then your timestamps will relatively accurately describe the time
when the document revision was actually written.
ArangoDB uses 64-bit unsigned integer values to maintain
document revisions internally. At this stage we intentionally do not
document the exact format of the revision values. When returning
document revisions to
clients, ArangoDB will put them into a string to ensure the revision
is not clipped by clients that do not support big integers. Clients
should treat the revision returned by ArangoDB as an opaque string
when they store or use it locally. This will allow ArangoDB to change
the format of revisions later if this should be required (as has happened
with 3.1 with the Hybrid Logical Clock). Clients can
use revisions to perform simple equality/non-equality comparisons
(e.g. to check whether a document has changed or not), but they should
not use revisions for greater-than/less-than comparisons
to check whether one document revision is older than another, even if this
might work in some cases.
Document revisions can be used to
conditionally query, update, replace or delete documents in the database. In
order to find a particular revision of a document, you need the document
handle or key, and the document revision.
@startDocuBlock documentRevision
Multiple Documents in a single Command
--------------------------------------


@ -0,0 +1,59 @@
As ArangoDB uses MVCC (Multiple Version Concurrency Control)
internally, documents can exist in more than one revision.
The document revision is the MVCC token used to specify
a particular revision of a document (identified by its `_id`).
It is a string value that is unique within the list of document
revisions for a single document. Up to ArangoDB 3.0, it contained
an integer number.
In ArangoDB >= 3.1 the `_rev` strings
are in fact timestamps. They use the local clock of the DBserver that
actually writes the document and have millisecond accuracy.
A [_Hybrid Logical Clock_](http://www.cse.buffalo.edu/tech-reports/2014-04.pdf)
is used.
In a single server setup, `_rev` values are unique across all documents
and all collections. In a cluster setup,
within one shard it is guaranteed that two different document revisions
have a different `_rev` string, even if they are written in the same
millisecond, and that these stamps are ascending.
Note, however, that different servers in your cluster might have a clock
skew, and therefore the timestamps are not guaranteed to be comparable
between different shards or even between different collections.
The Hybrid Logical Clock feature does one thing to address this
issue: Whenever a message is sent from some server A in your cluster to
another one B, it is ensured that any timestamp taken on B after the
message has arrived is greater than any timestamp taken on A before the
message was sent. This ensures that if there is some "causality" between
events on different servers, timestamps increase from cause to effect.
A direct consequence of this is that sometimes a server has to take
timestamps that seem to come from the future of its own clock. It will,
however, still produce ever-increasing timestamps. If the clock skew is
small, then your timestamps will describe fairly accurately the time
when the document revision was actually written.
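
The send/receive rule described above can be sketched with a toy clock. This is a minimal illustration of the general Hybrid Logical Clock idea from the linked paper, not ArangoDB's implementation: the class name, the method names and the `(milliseconds, counter)` tuple layout are assumptions made for this example, and the real `_rev` encoding is intentionally undocumented.

```python
from dataclasses import dataclass
from typing import Tuple

Stamp = Tuple[int, int]  # (physical milliseconds, logical counter), illustrative only


@dataclass
class HybridLogicalClock:
    """Toy hybrid logical clock; not ArangoDB's actual implementation."""
    wall_ms: int = 0   # highest physical time seen so far
    counter: int = 0   # tie-breaker within the same millisecond

    def now(self, physical_ms: int) -> Stamp:
        """Take a timestamp for a local write or an outgoing message."""
        if physical_ms > self.wall_ms:
            self.wall_ms, self.counter = physical_ms, 0
        else:
            # The local clock did not advance (or lags behind a value we
            # already adopted), so only the counter moves forward.
            self.counter += 1
        return (self.wall_ms, self.counter)

    def receive(self, physical_ms: int, remote: Stamp) -> Stamp:
        """Merge the timestamp carried by an incoming message."""
        remote_ms, remote_counter = remote
        highest = max(self.wall_ms, remote_ms, physical_ms)
        if highest == self.wall_ms and highest == remote_ms:
            self.counter = max(self.counter, remote_counter) + 1
        elif highest == self.wall_ms:
            self.counter += 1
        elif highest == remote_ms:
            self.counter = remote_counter + 1
        else:
            self.counter = 0
        self.wall_ms = highest
        return (self.wall_ms, self.counter)


# Server A's clock is 10 ms ahead of server B's clock.
a = HybridLogicalClock()
b = HybridLogicalClock()
sent = a.now(physical_ms=1_000)                 # stamp taken on A before sending
got = b.receive(physical_ms=990, remote=sent)   # B adopts a stamp "from its future"
later = b.now(physical_ms=991)                  # still ascending on B
```

In this sketch, B's stamps after the message are greater than A's stamp before it, even though B's physical clock is behind. Two writes in the same millisecond still get different stamps because the counter advances.
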
ArangoDB uses 64-bit unsigned integer values to maintain
document revisions internally. At this stage we intentionally do not
document the exact format of the revision values. When returning
document revisions to
clients, ArangoDB will put them into a string to ensure the revision
is not clipped by clients that do not support big integers. Clients
should treat the revision returned by ArangoDB as an opaque string
when they store or use it locally. This will allow ArangoDB to change
the format of revisions later if this should be required (as has happened
with 3.1 with the Hybrid Logical Clock). Clients can
use revisions to perform simple equality/non-equality comparisons
(e.g. to check whether a document has changed or not), but they should
not use revisions for greater-than/less-than comparisons
to check whether one document revision is older than another, even if this
might work in some cases.
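
The two points above (why revisions are returned as strings, and why only equality checks are safe) can be illustrated with a short sketch. The numeric value and the `rev-a`/`rev-b` strings are made-up placeholders, and `has_changed` is a hypothetical helper, not part of any ArangoDB driver.

```python
# Hypothetical internal revision value; any 64-bit integer above 2**53
# behaves the same way.
raw = 2**63 + 1

# A client that only has IEEE 754 doubles for numbers cannot hold this
# value exactly: the low bits are lost.
clipped = int(float(raw))
assert clipped != raw

# Sent as a string, the value survives unchanged.
as_string = str(raw)
assert int(as_string) == raw


def has_changed(cached_rev: str, current_rev: str) -> bool:
    """Compare two revision strings for equality only; never order them."""
    return cached_rev != current_rev


# Ordering opaque strings is meaningless: string order is not numeric order.
assert "9" > "10"

# "rev-a" and "rev-b" are placeholders, not real ArangoDB revisions.
assert has_changed("rev-a", "rev-b")
assert not has_changed("rev-a", "rev-a")
```
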
Document revisions can be used to conditionally query, update, replace
or delete documents in the database.
In order to find a particular revision of a document, you need the document
handle or key, and the document revision.


@ -0,0 +1,43 @@
@startDocuBlock documentRevision
Every document in ArangoDB has a revision, stored in the system attribute
`_rev`. It is fully managed by the server and read-only for the user.
Its value should be treated as opaque; no guarantees regarding its format
and properties are given, except that it will be different after a
document update. More specifically, `_rev` values are unique across all
documents and all collections in a single server setup. In a cluster setup,
within one shard it is guaranteed that two different document revisions
have a different `_rev` string, even if they are written in the same
millisecond.
The `_rev` attribute can be used as a pre-condition for queries, to avoid
_lost update_ situations. That is, if a client fetches a document from the server,
modifies it locally (but with the `_rev` attribute untouched) and sends it back
to the server to update the document, but meanwhile the document was changed by
another operation, then the revisions do not match anymore and the operation
is cancelled by the server. Without this mechanism, the client would
accidentally overwrite changes made to the document without knowing about it.
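
The lost-update scenario above can be modelled with a small in-memory store. This is a sketch under stated assumptions: `ToyStore`, `RevisionMismatch` and the `if_match` parameter are hypothetical names invented for this example, and the counter-based revisions stand in for ArangoDB's opaque `_rev` values. It does not call ArangoDB or any of its drivers.

```python
import itertools


class RevisionMismatch(Exception):
    """Raised when the supplied _rev no longer matches the stored document."""


class ToyStore:
    """In-memory stand-in for a collection; not an ArangoDB API."""

    def __init__(self):
        self._docs = {}
        self._next_rev = itertools.count(1)

    def _store(self, key, body):
        # System attributes are always set by the store, never by the caller.
        user_data = {k: v for k, v in body.items() if k not in ("_key", "_rev")}
        doc = dict(user_data, _key=key, _rev=str(next(self._next_rev)))
        self._docs[key] = doc
        return dict(doc)

    def insert(self, key, body):
        return self._store(key, body)

    def get(self, key):
        return dict(self._docs[key])

    def replace(self, key, body, if_match=None):
        # With a precondition, refuse the write if the document has changed.
        if if_match is not None and if_match != self._docs[key]["_rev"]:
            raise RevisionMismatch(key)
        return self._store(key, body)


store = ToyStore()
store.insert("alice", {"visits": 1})

mine = store.get("alice")                 # client 1 fetches the document
store.replace("alice", {"visits": 2})     # client 2 changes it in the meantime

mine["visits"] = 10                       # client 1 edits its stale copy
try:
    store.replace("alice", mine, if_match=mine["_rev"])
    outcome = "overwritten"
except RevisionMismatch:
    outcome = "rejected"
```

Here the second write is rejected, so client 2's change survives. Calling `replace` without `if_match` would have silently overwritten it.
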
When an existing document is updated or replaced, ArangoDB will write a new
version of this document to the write-ahead logfile (regardless of the
storage engine). When the new version of the document has been written, the
old version(s) will still be present, at least on disk. The same is true when
an existing document (version) gets removed: the old version of the document
plus the removal operation will be on disk for some time.
On disk it is therefore possible that multiple revisions of the same document
(as identified by the same `_key` value) exist at the same time. However,
stale revisions **are not accessible**. Once a document has been updated or removed
successfully, no query or other data retrieval operation done by the user
will be able to see the old revision any more. Furthermore, after some time, old revisions
will be removed internally. This is to avoid ever-growing disk usage.
{% hint 'warning' %}
From a **user perspective**, there is just **one single document revision
present per different `_key`** at every point in time. There is no built-in
system to automatically keep a history of all changes made to a document,
and old versions of a document cannot be restored via the `_rev` value.
{% endhint %}
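
The difference between what is on disk and what a user can read may be easier to see in a toy model. The sketch below is an assumption-laden illustration only: `ToyLog` and its methods are hypothetical, and it does not reflect how ArangoDB's storage engines or write-ahead log are actually implemented.

```python
class ToyLog:
    """Append-only log that exposes only the newest revision per key."""

    def __init__(self):
        self.entries = []   # "on disk": (key, rev, body); body None marks a removal
        self.live = {}      # what readers can see: key -> (rev, body)
        self._counter = 0

    def write(self, key, body):
        self._counter += 1
        rev = str(self._counter)
        self.entries.append((key, rev, body))
        self.live[key] = (rev, body)
        return rev

    def remove(self, key):
        self._counter += 1
        self.entries.append((key, str(self._counter), None))
        del self.live[key]

    def read(self, key, rev=None):
        """Return the live body, or None if the key is gone or rev is stale."""
        found = self.live.get(key)
        if found is None or (rev is not None and rev != found[0]):
            return None
        return found[1]

    def compact(self):
        """Drop superseded entries, as a background cleanup might do."""
        self.entries = [
            (k, r, b) for (k, r, b) in self.entries
            if k in self.live and self.live[k][0] == r
        ]


log = ToyLog()
first = log.write("alice", {"visits": 1})
second = log.write("alice", {"visits": 2})

on_disk_before = len(log.entries)        # both revisions are still stored
stale = log.read("alice", rev=first)     # but the old one cannot be read
current = log.read("alice")

log.compact()
on_disk_after = len(log.entries)         # cleanup removes the stale revision
```

Asking for the first revision returns nothing even while it is still stored, which is why `_rev` cannot be used to restore an earlier version.
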
@endDocuBlock