mirror of https://gitee.com/bigwinds/arangodb
Doc - _rev warning (#8044)
parent dd6329da25
commit 43faa28d90
@@ -118,11 +118,7 @@ keys (see
Document Revision
-----------------

As ArangoDB supports MVCC, documents can exist in more than one revision. The document revision is the MVCC token used to identify a particular revision of a document. It is a string value currently containing an integer number and is unique within the list of document revisions for a single document. Document revisions can be used to conditionally update, replace or delete documents in the database. In order to find a particular revision of a document, you need the document handle and the document revision.

The document revision is stored in the `_rev` attribute of a document, and is set and updated by ArangoDB automatically. The `_rev` value cannot be set from the outside.

ArangoDB currently uses 64-bit unsigned integer values to maintain document revisions internally. When returning document revisions to clients, ArangoDB will put them into a string to ensure the revision id is not clipped by clients that do not support big integers. Clients should treat the revision id returned by ArangoDB as an opaque string when they store or use it locally. This will allow ArangoDB to change the format of revision ids later if this should be required. Clients can use revision ids to perform simple equality/non-equality comparisons (e.g. to check whether a document has changed or not), but they should not use revision ids to perform greater/less than comparisons to check whether one document revision is older than another, even if this might work in some cases.
@startDocuBlock documentRevision

Edge
----
@@ -78,61 +78,7 @@ values.
Document Revision
-----------------

As ArangoDB supports MVCC (Multi-Version Concurrency Control),
documents can exist in more than one
revision. The document revision is the MVCC token used to specify
a particular revision of a document (identified by its `_id`).
It is a string value that contained (up to ArangoDB 3.0)
an integer number and is unique within the list of document
revisions for a single document.
In ArangoDB >= 3.1 the _rev strings
are in fact timestamps. They use the local clock of the DBserver that
actually writes the document and have millisecond accuracy.
Actually, a "Hybrid Logical Clock" is used (for
this concept see
[this paper](http://www.cse.buffalo.edu/tech-reports/2014-04.pdf)).

Within one shard it is guaranteed that two different document revisions
have a different _rev string, even if they are written in the same
millisecond, and that these stamps are ascending.

Note however that different servers in your cluster might have a clock
skew, and therefore between different shards or even between different
collections the timestamps are not guaranteed to be comparable.

The Hybrid Logical Clock feature does one thing to address this
issue: Whenever a message is sent from some server A in your cluster to
another one B, it is ensured that any timestamp taken on B after the
message has arrived is greater than any timestamp taken on A before the
message was sent. This ensures that if there is some "causality" between
events on different servers, timestamps increase from cause to effect.
A direct consequence of this is that sometimes a server has to take
timestamps that seem to come from the future of its own clock. It will
however still produce ever increasing timestamps. If the clock skew is
small, then your timestamps will relatively accurately describe the time
when the document revision was actually written.

ArangoDB uses 64-bit unsigned integer values to maintain
document revisions internally. At this stage we intentionally do not
document the exact format of the revision values. When returning
document revisions to
clients, ArangoDB will put them into a string to ensure the revision
is not clipped by clients that do not support big integers. Clients
should treat the revision returned by ArangoDB as an opaque string
when they store or use it locally. This will allow ArangoDB to change
the format of revisions later if this should be required (as has happened
with 3.1 with the Hybrid Logical Clock). Clients can
use revisions to perform simple equality/non-equality comparisons
(e.g. to check whether a document has changed or not), but they should
not use revision ids to perform greater/less than comparisons
to check whether one document revision is older than another, even if this
might work in some cases.

Document revisions can be used to
conditionally query, update, replace or delete documents in the database. In
order to find a particular revision of a document, you need the document
handle or key, and the document revision.

@startDocuBlock documentRevision

Multiple Documents in a single Command
--------------------------------------
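The advice in the passage above, to keep revisions as opaque strings and to compare them only for equality, can be illustrated with a minimal Python sketch. The revision value is made up for illustration, and `has_changed` is a hypothetical helper, not part of any ArangoDB driver. The sketch assumes a client that maps numbers to IEEE-754 doubles, which is one common way a 64-bit value gets clipped.

```python
import json

# Hypothetical 64-bit revision value, chosen only for illustration.
raw_revision = 2**63 + 1

# A client that stores numbers as IEEE-754 doubles silently clips the value.
clipped = int(float(raw_revision))

# Transported as a JSON string, the value survives a round trip unchanged.
payload = json.dumps({"_rev": str(raw_revision)})
received_rev = json.loads(payload)["_rev"]


def has_changed(known_rev: str, current_rev: str) -> bool:
    """Compare revisions for equality only; never try to order them."""
    return known_rev != current_rev


print(clipped == raw_revision)                       # precision was lost
print(has_changed(received_rev, str(raw_revision)))  # string form is intact
```

Because the helper never parses the string, it keeps working if the server changes the revision format.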
@@ -0,0 +1,59 @@
As ArangoDB uses MVCC (Multi-Version Concurrency Control)
internally, documents can exist in more than one revision.
The document revision is the MVCC token used to specify
a particular revision of a document (identified by its `_id`).



It is a string value that contained (up to ArangoDB 3.0)
an integer number and is unique within the list of document
revisions for a single document.
In ArangoDB >= 3.1 the `_rev` strings
are in fact timestamps. They use the local clock of the DBserver that
actually writes the document and have millisecond accuracy.
A [_Hybrid Logical Clock_](http://www.cse.buffalo.edu/tech-reports/2014-04.pdf)
is used.

In a single server setup, `_rev` values are unique across all documents
and all collections. In a cluster setup,
within one shard it is guaranteed that two different document revisions
have a different `_rev` string, even if they are written in the same
millisecond, and that these stamps are ascending.

Note however that different servers in your cluster might have a clock
skew, and therefore between different shards or even between different
collections the timestamps are not guaranteed to be comparable.

The Hybrid Logical Clock feature does one thing to address this
issue: Whenever a message is sent from some server A in your cluster to
another one B, it is ensured that any timestamp taken on B after the
message has arrived is greater than any timestamp taken on A before the
message was sent. This ensures that if there is some "causality" between
events on different servers, timestamps increase from cause to effect.
A direct consequence of this is that sometimes a server has to take
timestamps that seem to come from the future of its own clock. It will
however still produce ever increasing timestamps. If the clock skew is
small, then your timestamps will relatively accurately describe the time
when the document revision was actually written.

ArangoDB uses 64-bit unsigned integer values to maintain
document revisions internally. At this stage we intentionally do not
document the exact format of the revision values. When returning
document revisions to
clients, ArangoDB will put them into a string to ensure the revision
is not clipped by clients that do not support big integers. Clients
should treat the revision returned by ArangoDB as an opaque string
when they store or use it locally. This will allow ArangoDB to change
the format of revisions later if this should be required (as has happened
with 3.1 with the Hybrid Logical Clock). Clients can
use revisions to perform simple equality/non-equality comparisons
(e.g. to check whether a document has changed or not), but they should
not use revision ids to perform greater/less than comparisons
to check whether one document revision is older than another, even if this
might work in some cases.

Document revisions can be used to conditionally query, update, replace
or delete documents in the database.

In order to find a particular revision of a document, you need the document
handle or key, and the document revision.
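The causality guarantee described in the file above (a timestamp taken on B after receiving a message is greater than any timestamp taken on A before sending it) can be sketched as follows. This is a simplified rendering of the hybrid logical clock algorithm from the linked paper, not ArangoDB's implementation, and it says nothing about how the clock is encoded into `_rev`. The class name, the `(milliseconds, counter)` tuple, and the fixed clock readings of 1000 and 900 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Timestamp = Tuple[int, int]  # (logical milliseconds, tie-breaking counter)


@dataclass
class HybridLogicalClock:
    physical_ms: Callable[[], int]  # source of local wall-clock milliseconds
    l: int = 0
    c: int = 0

    def now(self) -> Timestamp:
        """Take a timestamp for a local write or an outgoing message."""
        pt = self.physical_ms()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1  # same or earlier wall-clock reading: bump counter
        return (self.l, self.c)

    def receive(self, remote: Timestamp) -> Timestamp:
        """Merge the timestamp carried by an incoming message."""
        pt = self.physical_ms()
        rl, rc = remote
        new_l = max(self.l, rl, pt)
        if new_l == self.l and new_l == rl:
            self.c = max(self.c, rc) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == rl:
            self.c = rc + 1
        else:
            self.c = 0
        self.l = new_l
        return (self.l, self.c)


# Server B's wall clock lags 100 ms behind server A's (clock skew).
a = HybridLogicalClock(physical_ms=lambda: 1000)
b = HybridLogicalClock(physical_ms=lambda: 900)

sent = a.now()         # taken on A before the message is sent
b.receive(sent)        # B merges the timestamp carried by the message
after = b.now()        # taken on B after the message has arrived

print(sent, after, after > sent)
```

In this run B's timestamps carry a millisecond part ahead of B's own wall clock, which is the "timestamps from the future" effect the text mentions, while still increasing from cause to effect.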
@@ -0,0 +1,43 @@
@startDocuBlock documentRevision

Every document in ArangoDB has a revision, stored in the system attribute
`_rev`. It is fully managed by the server and read-only for the user.

Its value should be treated as opaque; no guarantees regarding its format
and properties are given except that it will be different after a
document update. More specifically, `_rev` values are unique across all
documents and all collections in a single server setup. In a cluster setup,
within one shard it is guaranteed that two different document revisions
have a different `_rev` string, even if they are written in the same
millisecond.

The `_rev` attribute can be used as a pre-condition for queries, to avoid
_lost update_ situations. That is, if a client fetches a document from the server,
modifies it locally (but with the `_rev` attribute untouched) and sends it back
to the server to update the document, but meanwhile the document was changed by
another operation, then the revisions do not match anymore and the operation
is cancelled by the server. Without this mechanism, the client would
accidentally overwrite changes made to the document without knowing about it.

When an existing document is updated or replaced, ArangoDB will write a new
version of this document to the write-ahead logfile (regardless of the
storage engine). When the new version of the document has been written, the
old version(s) will still be present, at least on disk. The same is true when
an existing document (version) gets removed: the old version of the document
plus the removal operation will be on disk for some time.

On disk it is therefore possible that multiple revisions of the same document
(as identified by the same `_key` value) exist at the same time. However,
stale revisions **are not accessible**. Once a document has been updated or removed
successfully, no query or other data retrieval operation done by the user
will be able to see it any more. Furthermore, after some time, old revisions
will be removed internally. This is to avoid ever-growing disk usage.

{% hint 'warning' %}
From a **user perspective**, there is just **one single document revision
present per different `_key`** at every point in time. There is no built-in
system to automatically keep a history of all changes done to a document
and old versions of a document cannot be restored via the `_rev` value.
{% endhint %}

@endDocuBlock
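The lost-update scenario described in the DocuBlock above can be walked through with a toy in-memory model. `ToyCollection`, `RevisionMismatch`, and the `r1`, `r2` revision strings are hypothetical names invented for this sketch. It does not use or imitate the ArangoDB client API; it only mimics the rule that a write carrying a stale `_rev` is rejected.

```python
import itertools


class RevisionMismatch(Exception):
    """Raised when the caller's `_rev` no longer matches the stored one."""


class ToyCollection:
    """Hypothetical in-memory stand-in for a collection (not an ArangoDB API)."""

    def __init__(self):
        self._docs = {}
        self._counter = itertools.count(1)

    def _next_rev(self) -> str:
        # Opaque to callers; only guaranteed to differ after each write.
        return f"r{next(self._counter)}"

    def insert(self, key: str, body: dict) -> dict:
        doc = {**body, "_key": key, "_rev": self._next_rev()}
        self._docs[key] = doc
        return dict(doc)

    def get(self, key: str) -> dict:
        return dict(self._docs[key])

    def replace(self, doc: dict, check_rev: bool = True) -> dict:
        current = self._docs[doc["_key"]]
        if check_rev and doc.get("_rev") != current["_rev"]:
            raise RevisionMismatch(doc["_key"])
        new_doc = {**doc, "_rev": self._next_rev()}
        self._docs[doc["_key"]] = new_doc
        return dict(new_doc)


coll = ToyCollection()
coll.insert("alice", {"visits": 1})

client_a = coll.get("alice")   # both clients read the same revision
client_b = coll.get("alice")

client_a["visits"] = 2
coll.replace(client_a)         # succeeds and changes `_rev`

client_b["visits"] = 10
try:
    coll.replace(client_b)     # stale `_rev`: the write is rejected
    rejected = False
except RevisionMismatch:
    rejected = True

print(rejected, coll.get("alice")["visits"])
```

With `check_rev=False` the second write would go through and silently discard client A's change, which is the situation the precondition is meant to prevent.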