mirror of https://gitee.com/bigwinds/arangodb
updated documentation
This commit is contained in:
parent
52783bd9fa
commit
bcdbf30ca2
56
CHANGELOG
|
@ -1,6 +1,62 @@
|
||||||
v2.5.0 (XXXX-XX-XX)
|
v2.5.0 (XXXX-XX-XX)
|
||||||
-------------------
|
-------------------
|
||||||
|
|
||||||
|
* added support for sparse hash and skiplist indexes
|
||||||
|
|
||||||
|
Hash and skiplist indexes can optionally be made sparse. Sparse indexes exclude documents
|
||||||
|
in which at least one of the index attributes is either not set or has a value of `null`.
|
||||||
|
|
||||||
|
As such documents are excluded from sparse indexes, they may contain fewer documents than
|
||||||
|
their non-sparse counterparts. This enables faster indexing and can lead to reduced memory
|
||||||
|
usage if the indexed attribute occurs in only some, but not all, documents of the
|
||||||
|
collection. Sparse indexes will also reduce the number of collisions in non-unique hash
|
||||||
|
indexes in case non-existing or optional attributes are indexed.
|
||||||
|
|
||||||
|
In order to create a sparse index, an object with the attribute `sparse` can be added to
|
||||||
|
the index creation commands:
|
||||||
|
|
||||||
|
db.collection.ensureHashIndex(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureHashIndex(attributeName1, attributeName2, { sparse: true });
|
||||||
|
db.collection.ensureUniqueConstraint(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureUniqueConstraint(attributeName1, attributeName2, { sparse: true });
|
||||||
|
|
||||||
|
db.collection.ensureSkiplist(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureSkiplist(attributeName1, attributeName2, { sparse: true });
|
||||||
|
db.collection.ensureUniqueSkiplist(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureUniqueSkiplist(attributeName1, attributeName2, { sparse: true });
|
||||||
|
|
||||||
|
When not explicitly set, the `sparse` attribute defaults to `false` for new indexes.
|
||||||
|
Index types other than hash and skiplist do not support sparsity.
|
||||||
|
|
||||||
|
As sparse indexes may exclude some documents from the collection, they cannot be used for
|
||||||
|
all types of queries. Sparse hash indexes cannot be used to find documents for which at
|
||||||
|
least one of the indexed attributes has a value of `null`. For example, the following AQL
|
||||||
|
query cannot use a sparse index, even if one was created on attribute `attr`:
|
||||||
|
|
||||||
|
FOR doc IN collection
|
||||||
|
FILTER doc.attr == null
|
||||||
|
RETURN doc
|
||||||
|
|
||||||
|
If the lookup value is non-constant, a sparse index may or may not be used, depending on
|
||||||
|
the other types of conditions in the query. If the optimizer can safely determine that
|
||||||
|
the lookup value cannot be `null`, a sparse index may be used. When uncertain, the optimizer
|
||||||
|
will not make use of a sparse index in a query in order to produce correct results.
|
||||||
|
|
||||||
|
For example, the following queries cannot use a sparse index on `attr` because the optimizer
|
||||||
|
will not know beforehand whether the comparison values for `doc.attr` will include `null`:
|
||||||
|
|
||||||
|
FOR doc IN collection
|
||||||
|
FILTER doc.attr == SOME_FUNCTION(...)
|
||||||
|
RETURN doc
|
||||||
|
|
||||||
|
FOR other IN otherCollection
|
||||||
|
FOR doc IN collection
|
||||||
|
FILTER doc.attr == other.attr
|
||||||
|
RETURN doc
|
||||||
|
|
||||||
|
Sparse skiplist indexes can be used for sorting if the optimizer can safely detect that the
|
||||||
|
index range does not include `null` for any of the index attributes.
|
||||||
|
|
||||||
* inspection of AQL data-modification queries will now detect if the data-modification part
|
* inspection of AQL data-modification queries will now detect if the data-modification part
|
||||||
of the query can run in lockstep with the data retrieval part of the query, or if the data
|
of the query can run in lockstep with the data retrieval part of the query, or if the data
|
||||||
retrieval part must be executed before the data modification can start.
|
retrieval part must be executed before the data modification can start.
|
||||||
|
|
|
@ -23,7 +23,9 @@ ArangoDB provides the following index types:
|
||||||
For each collection there will always be a *primary index* which is a hash index
|
For each collection there will always be a *primary index* which is a hash index
|
||||||
for the [document keys](../Glossary/README.html#document_key) (`_key` attribute)
|
for the [document keys](../Glossary/README.html#document_key) (`_key` attribute)
|
||||||
of all documents in the collection. The primary index allows quick selection
|
of all documents in the collection. The primary index allows quick selection
|
||||||
of documents in the collection using either the `_key` or `_id` attributes.
|
of documents in the collection using either the `_key` or `_id` attributes. It will
|
||||||
|
be used from within AQL queries automatically when performing equality lookups on
|
||||||
|
`_key` or `_id`.
|
||||||
|
|
||||||
There are also dedicated functions to find a document given its `_key` or `_id`
|
There are also dedicated functions to find a document given its `_key` or `_id`
|
||||||
that will always make use of the primary index:
|
that will always make use of the primary index:
|
||||||
|
@ -33,7 +35,11 @@ db.collection.document("<document-key>");
|
||||||
db._document("<document-id>");
|
db._document("<document-id>");
|
||||||
```
|
```
|
||||||
|
|
||||||
The primary index of a collection cannot be dropped or changed.
|
As the primary index is a hash index, it cannot be used for range queries or for sorting
|
||||||
|
on `_key` or `_id`.
|
||||||
|
|
||||||
|
The primary index of a collection cannot be dropped or changed, and there is no
|
||||||
|
mechanism to create user-defined primary indexes.
|
||||||
|
|
||||||
|
|
||||||
!SUBSECTION Edges Index
|
!SUBSECTION Edges Index
|
||||||
|
@ -44,11 +50,10 @@ documents by either their `_from` or `_to` attributes. It can therefore be
|
||||||
used to quickly find connections between vertex documents and is invoked when
|
used to quickly find connections between vertex documents and is invoked when
|
||||||
the connecting edges of a vertex are queried.
|
the connecting edges of a vertex are queried.
|
||||||
|
|
||||||
The edges index cannot be dropped or changed. Extra edges indexes cannot be
|
Edges indexes are used from within AQL when performing equality lookups on `_from`
|
||||||
created on other attributes or in non-edge collections.
|
or `_to` values in an edge collection. There are also dedicated functions to
|
||||||
|
find edges given their `_from` or `_to` values that will always make use of the
|
||||||
There are also dedidacted functions to find edges given their `_from` or `_to`
|
edges index:
|
||||||
values that will always make use of the edges index:
|
|
||||||
|
|
||||||
```js
|
```js
|
||||||
db.collection.edges("<from-value>");
|
db.collection.edges("<from-value>");
|
||||||
|
@ -59,146 +64,172 @@ db.collection.inEdges("<from-value>");
|
||||||
db.collection.inEdges("<to-value>");
|
db.collection.inEdges("<to-value>");
|
||||||
```
|
```
|
||||||
|
|
||||||
|
The edges index is a hash index. It can be used for equality lookups only, but not for range
|
||||||
|
queries or for sorting. As edges indexes are automatically created for edge collections, it
|
||||||
|
is not possible to create user-defined edges indexes.
|
||||||
|
|
||||||
|
The edges index cannot be dropped or changed.
|
||||||
|
|
||||||
|
|
||||||
!SUBSECTION Hash Index
|
!SUBSECTION Hash Index
|
||||||
|
|
||||||
A hash index can be used to quickly find documents with specific attribute values.
|
A hash index can be used to quickly find documents with specific attribute values.
|
||||||
The hash index is unsorted, so it supports equality lookups but no range queries.
|
The hash index is unsorted, so it supports equality lookups but no range queries or sorting.
|
||||||
|
|
||||||
A hash index can be created on one or multiple document attributes. A hash index will
|
A hash index can be created on one or multiple document attributes. A hash index will
|
||||||
only be used by a query if all indexed attributes are present in the search condition,
|
only be used by a query if all indexed attributes are present in the search condition,
|
||||||
and if all attributes are compared using the equality (`==`) operator.
|
and if all attributes are compared using the equality (`==`) operator. Hash indexes are
|
||||||
|
used from within AQL and several query functions, e.g. `byExample`, `firstExample` etc.
|
||||||
|
|
||||||
Hash indexes can optionally be declared to be unique, disallowing saving the same
|
Hash indexes can optionally be declared to be unique, disallowing saving the same
|
||||||
value in the indexed attribute.
|
value in the indexed attribute. Hash indexes can optionally be sparse.
|
||||||
|
|
||||||
Hash indexes are supported by AQL and several query functions, e.g. `byExample`,
|
The different types of hash indexes have the following characteristics:
|
||||||
`firstExample` etc.
|
|
||||||
|
* **unique hash index**: all documents in the collection must have different values for
|
||||||
|
the attributes covered by the unique index. Trying to insert a document with the same
|
||||||
|
key value as an already existing document will lead to a unique constraint
|
||||||
|
violation.
|
||||||
|
|
||||||
|
This type of index is not sparse. Documents that do not contain the index attributes or
|
||||||
|
that have a value of `null` in the index attribute(s) will still be indexed.
|
||||||
|
A key value of `null` may only occur once in the index, so this type of index cannot
|
||||||
|
be used for optional attributes.
|
||||||
|
|
||||||
|
* **unique, sparse hash index**: all documents in the collection must have different
|
||||||
|
values for the attributes covered by the unique index. Documents in which at least one
|
||||||
|
of the index attributes is not set or has a value of `null` are not included in the
|
||||||
|
index. This type of index can be used to ensure that there are no duplicate keys in
|
||||||
|
the collection for documents which have the indexed attributes set. As the index will
|
||||||
|
exclude documents for which the indexed attributes are `null` or not set, it can be
|
||||||
|
used for optional attributes.
|
||||||
|
|
||||||
|
* **non-unique hash index**: all documents in the collection will be indexed. This type
|
||||||
|
of index is not sparse. Documents that do not contain the index attributes or that have
|
||||||
|
a value of `null` in the index attribute(s) will still be indexed. Duplicate key values
|
||||||
|
can occur and do not lead to unique constraint violations.
|
||||||
|
|
||||||
|
* **non-unique, sparse hash index**: only those documents will be indexed that have all
|
||||||
|
the indexed attributes set to a value other than `null`. It can be used for optional
|
||||||
|
attributes.
|
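The four variants above differ only in which documents get indexed and whether duplicate keys are rejected. A rough model of those rules in plain JavaScript might look as follows. This is a sketch, not ArangoDB's implementation: `makeHashIndex`, its `insert` and `lookup` methods, and the error message are hypothetical names chosen for illustration.

```js
// Toy model of the hash index variants described above.
// Not ArangoDB code: makeHashIndex and its methods are hypothetical.
function makeHashIndex(fields, { unique = false, sparse = false } = {}) {
  const buckets = new Map(); // serialized key -> array of documents

  // Missing attributes are treated like null, as in a non-sparse index.
  const keyOf = (doc) => fields.map((f) => (doc[f] === undefined ? null : doc[f]));

  return {
    insert(doc) {
      const key = keyOf(doc);
      // Sparse: skip documents with at least one unset or null attribute.
      if (sparse && key.some((v) => v === null)) {
        return false; // not indexed
      }
      const k = JSON.stringify(key);
      const bucket = buckets.get(k) || [];
      if (unique && bucket.length > 0) {
        throw new Error("unique constraint violated");
      }
      bucket.push(doc);
      buckets.set(k, bucket);
      return true; // indexed
    },
    lookup(...values) {
      return buckets.get(JSON.stringify(values)) || [];
    },
  };
}
```

In this model a unique, non-sparse index accepts only one document without the attribute, while a unique, sparse index accepts any number of them because they never enter the index.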
||||||
|
|
||||||
|
The amortized complexity of lookup, insert, update, and removal operations in unique hash
|
||||||
|
indexes is O(1).
|
||||||
|
|
||||||
|
Non-unique hash indexes have an amortized complexity of O(1) for inserts. Lookup, update
|
||||||
|
and removal operations in non-unique hash indexes have an amortized complexity that is
|
||||||
|
linearly correlated with the number of duplicates for a given key. That means non-unique
|
||||||
|
hash indexes should not be used on attributes with very low cardinality.
|
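To see why duplicates matter, consider a simplified non-unique hash index in which all documents sharing a key live in one bucket. The bucket layout and the `NonUniqueHashIndex` class are assumptions made for illustration, not ArangoDB internals. Removing one document has to scan its bucket, so the work grows with the number of duplicates for that key.

```js
// Simplified non-unique hash index: key -> array of document keys.
// Illustrative only; the bucket layout is an assumption, not ArangoDB internals.
class NonUniqueHashIndex {
  constructor() {
    this.buckets = new Map();
  }

  // Amortized O(1): append the document to the bucket for its key.
  insert(key, docKey) {
    if (!this.buckets.has(key)) this.buckets.set(key, []);
    this.buckets.get(key).push(docKey);
  }

  // Linear in the number of duplicates: scan the bucket for the document.
  // Returns how many bucket entries were inspected.
  remove(key, docKey) {
    const bucket = this.buckets.get(key) || [];
    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i] === docKey) {
        bucket.splice(i, 1);
        return i + 1;
      }
    }
    return bucket.length;
  }
}
```

With a low-cardinality attribute such as a status flag, nearly every document lands in one of a handful of buckets, and each removal may inspect a large share of the collection.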
||||||
|
|
||||||
|
If a hash index is created on an attribute that is missing in all or many of the documents,
|
||||||
|
the behavior is as follows:
|
||||||
|
|
||||||
|
* if the index is sparse, the documents missing the attribute will not be indexed and not
|
||||||
|
use index memory. These documents will not influence the update or removal performance
|
||||||
|
for the index.
|
||||||
|
|
||||||
|
* if the index is non-sparse, the documents missing the attribute will be contained in the
|
||||||
|
index with a key value of `null`. If many such documents get indexed, a lot of collisions
|
||||||
|
will occur, and lookup, update and removal of documents will become expensive. This
|
||||||
|
should be avoided if possible.
|
||||||
|
|
||||||
|
|
||||||
!SUBSECTION Skiplist Index
|
!SUBSECTION Skiplist Index
|
||||||
|
|
||||||
A skiplist is a sorted index structure. They can be used to quickly find documents
|
A skiplist is a sorted index structure. It can be used to quickly find documents
|
||||||
with specific attribute values but also support range queries. They can also be used
|
with specific attribute values but also for range queries and returning documents from
|
||||||
for sorting in AQL.
|
the index in sorted order. Skiplists will be used from within AQL and several query
|
||||||
|
functions, e.g. `byExample`, `firstExample` etc.
|
||||||
|
|
||||||
A skiplist can be created on one or multiple document attributes.
|
Skiplist indexes will be used for lookups, range queries and sorting only if either all
|
||||||
|
index attributes are provided in a query, or if a leftmost prefix of the index attributes
|
||||||
|
is specified.
|
||||||
|
|
||||||
|
For example, if a skiplist index is created on attributes `value1` and `value2`, the
|
||||||
|
following conditions could use the index (note: the `<=` and `>=` operators are intentionally
|
||||||
|
omitted here for the sake of brevity):
|
||||||
|
|
||||||
|
FILTER doc.value1 == ...
|
||||||
|
FILTER doc.value1 < ...
|
||||||
|
FILTER doc.value1 > ...
|
||||||
|
FILTER doc.value1 > ... && doc.value1 < ...
|
||||||
|
|
||||||
|
FILTER doc.value1 == ... && doc.value2 == ...
|
||||||
|
FILTER doc.value1 == ... && doc.value2 > ...
|
||||||
|
FILTER doc.value1 == ... && doc.value2 > ... && doc.value2 < ...
|
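The leftmost-prefix rule behind the conditions above can be sketched as a small check. `usablePrefixLength` is a hypothetical helper, and the rule that a range condition ends the usable prefix is an assumption based on how sorted multi-attribute indexes generally behave, not a statement about the optimizer's code.

```js
// Hypothetical helper: how many leading index attributes a filter can use.
// conditions maps an attribute name to "eq" (==) or "range" (<, >, or both).
function usablePrefixLength(indexFields, conditions) {
  let used = 0;
  for (const field of indexFields) {
    const kind = conditions[field];
    if (kind === undefined) break; // gap: later attributes cannot be used
    used++;
    if (kind === "range") break; // assumption: a range ends the usable prefix
  }
  return used;
}

const skiplistFields = ["value1", "value2"];
usablePrefixLength(skiplistFields, { value1: "eq" });                  // 1
usablePrefixLength(skiplistFields, { value1: "eq", value2: "range" }); // 2
usablePrefixLength(skiplistFields, { value2: "eq" });                  // 0, index not usable
```

A result of `0` corresponds to a condition that skips `value1`, which is why a filter on `value2` alone is absent from the list above.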
||||||
|
|
||||||
|
In order to use a skiplist index for sorting, the index attributes must be specified in
|
||||||
|
the `SORT` clause of the query in the same order as they appear in the index definition.
|
||||||
|
Sort orders cannot be mixed, i.e. the sort orders specified in the `SORT` clause must all
|
||||||
|
be either ascending (optionally omitted as ascending is the default) or descending.
|
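The sorting rules can be expressed as a short predicate. This is only a sketch: `canSortWithSkiplist` is a hypothetical name, and accepting a leftmost prefix of the index attributes in the `SORT` clause is an assumption carried over from the prefix rule stated earlier.

```js
// Hypothetical check: can a skiplist on indexFields serve this SORT clause?
// sortClauses is an array of { field, direction } with direction "ASC" or "DESC".
function canSortWithSkiplist(indexFields, sortClauses) {
  if (sortClauses.length === 0 || sortClauses.length > indexFields.length) {
    return false;
  }
  // Attributes must appear in the same order as in the index definition.
  const sameOrder = sortClauses.every((c, i) => c.field === indexFields[i]);
  // All directions must match; a missing direction means ascending.
  const dirs = sortClauses.map((c) => c.direction || "ASC");
  const sameDirection = dirs.every((d) => d === dirs[0]);
  return sameOrder && sameDirection;
}
```

For an index on `value1` and `value2`, sorting by `value1, value2` or by `value1 DESC, value2 DESC` passes the check, while `value2, value1` or `value1 ASC, value2 DESC` does not.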
||||||
|
|
||||||
Skiplists can optionally be declared to be unique, disallowing saving the same
|
Skiplists can optionally be declared to be unique, disallowing saving the same
|
||||||
value in the indexed attribute.
|
value in the indexed attribute. They can be sparse or non-sparse.
|
||||||
|
|
||||||
Skiplists are supported by AQL and several query functions, e.g. `byExample`,
|
The different types of skiplist indexes have the following characteristics:
|
||||||
`firstExample` etc.
|
|
||||||
|
* **unique skiplist index**: all documents in the collection must have different values for
|
||||||
|
the attributes covered by the unique index. Trying to insert a document with the same
|
||||||
|
key value as an already existing document will lead to a unique constraint
|
||||||
|
violation.
|
||||||
|
|
||||||
|
This type of index is not sparse. Documents that do not contain the index attributes or
|
||||||
|
that have a value of `null` in the index attribute(s) will still be indexed.
|
||||||
|
A key value of `null` may only occur once in the index, so this type of index cannot
|
||||||
|
be used for optional attributes.
|
||||||
|
|
||||||
|
* **unique, sparse skiplist index**: all documents in the collection must have different
|
||||||
|
values for the attributes covered by the unique index. Documents in which at least one
|
||||||
|
of the index attributes is not set or has a value of `null` are not included in the
|
||||||
|
index. This type of index can be used to ensure that there are no duplicate keys in
|
||||||
|
the collection for documents which have the indexed attributes set. As the index will
|
||||||
|
exclude documents for which the indexed attributes are `null` or not set, it can be
|
||||||
|
used for optional attributes.
|
||||||
|
|
||||||
|
* **non-unique skiplist index**: all documents in the collection will be indexed. This type
|
||||||
|
of index is not sparse. Documents that do not contain the index attributes or that have
|
||||||
|
a value of `null` in the index attribute(s) will still be indexed. Duplicate key values
|
||||||
|
can occur and do not lead to unique constraint violations.
|
||||||
|
|
||||||
|
* **non-unique, sparse skiplist index**: only those documents will be indexed that have all
|
||||||
|
the indexed attributes set to a value other than `null`. It can be used for optional
|
||||||
|
attributes.
|
||||||
|
|
||||||
|
The operational amortized complexity for skiplist indexes is logarithmically correlated
|
||||||
|
with the number of documents in the index.
|
||||||
|
|
||||||
|
|
||||||
!SUBSECTION Geo Index
|
!SUBSECTION Geo Index
|
||||||
|
|
||||||
A geo index is used to find places on the surface of the earth fast. The
|
Users can create additional geo indexes on one or multiple attributes in collections.
|
||||||
geo index in ArangoDB supports near and within queries. There are special functions
|
A geo index is used to quickly find places on the surface of the earth.
|
||||||
to query geo indexes.
|
|
||||||
|
The geo index stores two-dimensional coordinates. It can be created on either two
|
||||||
|
separate document attributes (latitude and longitude) or a single array attribute that
|
||||||
|
contains both latitude and longitude. Latitude and longitude must be numeric values.
|
||||||
|
|
||||||
|
The geo index provides operations to find documents with coordinates nearest to a given
|
||||||
|
comparison coordinate, and to find documents with coordinates that are within a specifiable
|
||||||
|
radius around a comparison coordinate.
|
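The two operations can be modelled by brute force to make their semantics concrete. The sketch below is not an index and not ArangoDB code. It assumes coordinates stored as `[latitude, longitude]` in degrees in a `coords` attribute, and it measures great-circle distance in metres with the haversine formula on a sphere of radius 6371 km; `distance`, `near` and `within` are illustrative names.

```js
// Brute-force model of "near" and "within" semantics, not an index.
// Assumptions: coordinates are [latitude, longitude] in degrees, distances are
// great-circle metres on a sphere of radius 6371 km (haversine formula).
const EARTH_RADIUS_M = 6371000;

function distance([lat1, lon1], [lat2, lon2]) {
  const rad = (deg) => (deg * Math.PI) / 180;
  const dLat = rad(lat2 - lat1);
  const dLon = rad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Documents closest to the reference point, nearest first.
function near(docs, point, limit) {
  return [...docs]
    .sort((a, b) => distance(a.coords, point) - distance(b.coords, point))
    .slice(0, limit);
}

// Documents no further than radius metres from the reference point.
function within(docs, point, radius) {
  return docs.filter((d) => distance(d.coords, point) <= radius);
}
```

Both functions here visit every document; the point of a geo index is to answer the same questions without a full scan.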
||||||
|
|
||||||
|
The geo index is used via dedicated functions in AQL or the simple queries, but will
|
||||||
|
not be enabled for other types of queries or conditions.
|
||||||
|
|
||||||
|
|
||||||
!SUBSECTION Fulltext Index
|
!SUBSECTION Fulltext Index
|
||||||
|
|
||||||
A fulltext index can be used to find words, or prefixes of words inside documents.
|
A fulltext index can be used to find words, or prefixes of words inside documents.
|
||||||
A fulltext index can be set on one attribute only, and will index all words contained
|
A fulltext index can be created on a single attribute only, and will index all words
|
||||||
in documents that have a textual value in this attribute. Only words with a (specifyable)
|
contained in documents that have a textual value in that attribute. Only words with a (specifiable)
|
||||||
minimum length are indexed. Word tokenization is done using the word boundary analysis
|
minimum length are indexed. Word tokenization is done using the word boundary analysis
|
||||||
provided by libicu, which is taking into account the selected language provided at
|
provided by libicu, which is taking into account the selected language provided at
|
||||||
server start. Words are indexed in their lower-cased form. The index supports complete
|
server start. Words are indexed in their lower-cased form. The index supports complete
|
||||||
match queries (full words) and prefix queries.
|
match queries (full words) and prefix queries, plus basic logical operations such as
|
||||||
|
`and`, `or` and `not` for combining partial results.
|
||||||
|
|
||||||
|
The fulltext index is sparse, meaning it will only index documents for which the index
|
||||||
|
attribute is set and contains a string value. Additionally, only words with a configurable
|
||||||
|
minimum length will be included in the index.
|
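The indexing rules above (string values only, lower-casing, minimum word length, complete and prefix matches) can be illustrated with a toy inverted index. This sketch is not ArangoDB's implementation: it splits words with a simple Unicode letter/digit regular expression instead of libicu's word boundary analysis, and `buildFulltextIndex`, `matchComplete` and `matchPrefix` are hypothetical names.

```js
// Toy inverted index illustrating the rules above; not ArangoDB's implementation.
// Assumption: words are split with a simple Unicode letter/digit regex instead of
// libicu's word boundary analysis.
function buildFulltextIndex(docs, attribute, minLength) {
  const index = new Map(); // lower-cased word -> Set of document keys
  for (const doc of docs) {
    const text = doc[attribute];
    if (typeof text !== "string") continue; // sparse: only string values are indexed
    for (const word of text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || []) {
      if (word.length < minLength) continue;
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(doc._key);
    }
  }
  return index;
}

// Complete match: the whole word must be in the index.
function matchComplete(index, word) {
  return [...(index.get(word.toLowerCase()) || [])];
}

// Prefix match: any indexed word starting with the prefix.
function matchPrefix(index, prefix) {
  const p = prefix.toLowerCase();
  const keys = new Set();
  for (const [word, docKeys] of index) {
    if (word.startsWith(p)) docKeys.forEach((k) => keys.add(k));
  }
  return [...keys];
}
```

Combining partial results with `and`, `or` and `not` would then amount to intersecting, uniting or subtracting the returned key sets.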
||||||
|
|
||||||
!SECTION Index Identifiers and Handles
|
The fulltext index is used via dedicated functions in AQL or the simple queries, but will
|
||||||
|
not be enabled for other types of queries or conditions.
|
||||||
An *index handle* uniquely identifies an index in the database. It is a string and
|
|
||||||
consists of the collection name and an *index identifier* separated by a `/`. The
|
|
||||||
index identifier part is a numeric value that is auto-generated by ArangoDB.
|
|
||||||
|
|
||||||
A specific index of a collection can be accessed using its *index handle* or
|
|
||||||
*index identifier* as follows:
|
|
||||||
|
|
||||||
```js
|
|
||||||
db.collection.index("<index-handle>");
|
|
||||||
db.collection.index("<index-identifier>");
|
|
||||||
db._index("<index-handle>");
|
|
||||||
```
|
|
||||||
|
|
||||||
For example: Assume that the index handle, which is stored in the `_id`
|
|
||||||
attribute of the index, is `demo/362549736` and the index was created in a collection
|
|
||||||
named `demo`. Then this index can be accessed as:
|
|
||||||
|
|
||||||
```js
|
|
||||||
db.demo.index("demo/362549736");
|
|
||||||
```
|
|
||||||
|
|
||||||
Because the index handle is unique within the database, you can leave out the
|
|
||||||
*collection* and use the shortcut:
|
|
||||||
|
|
||||||
```js
|
|
||||||
db._index("demo/362549736");
|
|
||||||
```
|
|
||||||
|
|
||||||
!SECTION Which Index type to use when
|
|
||||||
|
|
||||||
ArangoDB automatically indexes the `_key` attribute in each collection. There
|
|
||||||
is no need to index this attribute separately. Please note that a document's
|
|
||||||
`_id` attribute is derived from the `_key` attribute, and is thus implicitly
|
|
||||||
indexed, too.
|
|
||||||
|
|
||||||
ArangoDB will also automatically create an index on `_from` and `_to` in any
|
|
||||||
edge collection, meaning incoming and outgoing connections can be determined
|
|
||||||
efficiently.
|
|
||||||
|
|
||||||
Users can define additional indexes on one or multiple document attributes.
|
|
||||||
Several different index types are provided by ArangoDB. These indexes have
|
|
||||||
different usage scenarios:
|
|
||||||
|
|
||||||
- hash index: provides quick access to individual documents if (and only if)
|
|
||||||
all indexed attributes are provided in the search query. The index will only
|
|
||||||
be used for equality comparisons. It does not support range queries and
|
|
||||||
cannot be used for sorting..
|
|
||||||
|
|
||||||
The hash index is a good candidate if all or most queries on the indexed
|
|
||||||
attribute(s) are equality comparisons. It will be the most efficient index
|
|
||||||
type if the index is declared unique.
|
|
||||||
|
|
||||||
Insertions into a non-unique hash index are also very efficent. Removal
|
|
||||||
performance in a non-unique hash index depends on how often the indexed
|
|
||||||
attribute's values repeat. If there are a lot of value repetitions, the
|
|
||||||
removal performance in a non-unique hash index will suffer.
|
|
||||||
|
|
||||||
A non-unique hash index should there not be used if duplicate index values
|
|
||||||
are allowed (i.e. when the hash index is not declared *unique*) and there
|
|
||||||
will be many duplicate values in the index plus a lot of document removal
|
|
||||||
operations in the collection.
|
|
||||||
|
|
||||||
- skip list index: skip lists keep the indexed values in an order, so they can
|
|
||||||
be used for equality lookups, range queries and for sorting. Skip list indexes
|
|
||||||
will have a higher overhead than hash indexes but they are more general and
|
|
||||||
allow more use cases (e.g. range queries). Additionally, they can be used
|
|
||||||
for lower selectivity attributes, when non-unique hash indexes are not a
|
|
||||||
good fit.
|
|
||||||
|
|
||||||
- geo index: the geo index provided by ArangoDB allows searching for documents
|
|
||||||
within a radius around a two-dimensional earth coordinate (point), or to
|
|
||||||
find documents with are closest to a point. Document coordinates can either
|
|
||||||
be specified in two different document attributes or in a single attribute, e.g.
|
|
||||||
|
|
||||||
{ "latitude": 50.9406645, "longitude": 6.9599115 }
|
|
||||||
|
|
||||||
or
|
|
||||||
|
|
||||||
{ "coords": [ 50.9406645, 6.9599115 ] }
|
|
||||||
|
|
||||||
- fulltext index: a fulltext index can be used to index all words contained in
|
|
||||||
a specific attribute of all documents in a collection. Only words with a
|
|
||||||
(specifiable) minimum length are indexed. Word tokenization is done using
|
|
||||||
the word boundary analysis provided by libicu, which is taking into account
|
|
||||||
the selected language provided at server start.
|
|
||||||
|
|
||||||
The index supports complete match queries (full words) and prefix queries.
|
|
||||||
|
|
||||||
- cap constraint: the cap constraint provided by ArangoDB indexes documents
|
|
||||||
not to speed up search queries, but to limit (cap) the number or size of
|
|
||||||
documents in a collection.
|
|
||||||
|
|
||||||
|
|
|
@ -0,0 +1,135 @@
|
||||||
|
!SECTION Which Index to use when
|
||||||
|
|
||||||
|
ArangoDB automatically indexes the `_key` attribute in each collection. There
|
||||||
|
is no need to index this attribute separately. Please note that a document's
|
||||||
|
`_id` attribute is derived from the `_key` attribute, and is thus implicitly
|
||||||
|
indexed, too.
|
||||||
|
|
||||||
|
ArangoDB will also automatically create an index on `_from` and `_to` in any
|
||||||
|
edge collection, meaning incoming and outgoing connections can be determined
|
||||||
|
efficiently.
|
||||||
|
|
||||||
|
!SUBSECTION Index types
|
||||||
|
|
||||||
|
Users can define additional indexes on one or multiple document attributes.
|
||||||
|
Several different index types are provided by ArangoDB. These indexes have
|
||||||
|
different usage scenarios:
|
||||||
|
|
||||||
|
- hash index: provides quick access to individual documents if (and only if)
|
||||||
|
all indexed attributes are provided in the search query. The index will only
|
||||||
|
be used for equality comparisons. It does not support range queries and
|
||||||
|
cannot be used for sorting.
|
||||||
|
|
||||||
|
The hash index is a good candidate if all or most queries on the indexed
|
||||||
|
attribute(s) are equality comparisons. It will be the most efficient index
|
||||||
|
type if the index is declared unique.
|
||||||
|
|
||||||
|
Insertions into a non-unique hash index are also very efficient. Update and
|
||||||
|
removal performance in a non-unique hash index depend on the key selectivity.
|
||||||
|
If the selectivity is low and keys repeat a lot, update and removal performance
|
||||||
|
in a non-unique hash index will degrade.
|
||||||
|
|
||||||
|
A non-unique hash index should therefore not be used if duplicate index values
|
||||||
|
are allowed and it is known that there will be many duplicate values in the index
|
||||||
|
and there will be updates or removals.
|
||||||
|
|
||||||
|
A non-unique hash index on an optional document attribute should be declared
|
||||||
|
sparse so that it will not index documents for which the index attribute is
|
||||||
|
not set.
|
||||||
|
|
||||||
|
- skiplist index: skiplists keep the indexed values in an order, so they can
|
||||||
|
be used for equality lookups, range queries and for sorting. For high selectivity
|
||||||
|
attributes, skiplist indexes will have a higher overhead than hash indexes. For
|
||||||
|
low selectivity attributes, skiplist indexes will be more efficient than non-unique
|
||||||
|
hash indexes.
|
||||||
|
|
||||||
|
Additionally, skiplist indexes allow more use cases (e.g. range queries, sorting)
|
||||||
|
than hash indexes. Furthermore, they can be used for lookups based on a leftmost
|
||||||
|
prefix of the index attributes.
|
||||||
|
|
||||||
|
- geo index: the geo index provided by ArangoDB allows searching for documents
|
||||||
|
within a radius around a two-dimensional earth coordinate (point), or to
|
||||||
|
find documents which are closest to a point. Document coordinates can either
|
||||||
|
be specified in two different document attributes or in a single attribute, e.g.
|
||||||
|
|
||||||
|
{ "latitude": 50.9406645, "longitude": 6.9599115 }
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
{ "coords": [ 50.9406645, 6.9599115 ] }
|
||||||
|
|
||||||
|
Geo indexes will only be invoked via special functions.
|
||||||
|
|
||||||
|
- fulltext index: a fulltext index can be used to index all words contained in
|
||||||
|
a specific attribute of all documents in a collection. Only words with a
|
||||||
|
(specifiable) minimum length are indexed. Word tokenization is done using
|
||||||
|
the word boundary analysis provided by libicu, which is taking into account
|
||||||
|
the selected language provided at server start.
|
||||||
|
|
||||||
|
The index supports complete match queries (full words) and prefix queries.
|
||||||
|
Fulltext indexes will only be invoked via special functions.
|
||||||
|
|
||||||
|
- cap constraint: the cap constraint provided by ArangoDB indexes documents
|
||||||
|
not to speed up search queries, but to limit (cap) the number or size of
|
||||||
|
documents in a collection. This can be used to prevent collections from growing
|
||||||
|
permanently.
|
||||||
|
|
||||||
|
|
||||||
|
!SUBSECTION Sparse vs. non-sparse indexes
|
||||||
|
|
||||||
|
Hash indexes and skiplist indexes can optionally be created sparse. A sparse index
|
||||||
|
does not contain documents for which at least one of the index attributes is not set
|
||||||
|
or contains a value of `null`.
|
||||||
|
|
||||||
|
As such documents are excluded from sparse indexes, they may contain fewer documents than
|
||||||
|
their non-sparse counterparts. This enables faster indexing and can lead to reduced memory
|
||||||
|
usage if the indexed attribute occurs in only some, but not all, documents of the
|
||||||
|
collection. Sparse indexes will also reduce the number of collisions in non-unique hash
|
||||||
|
indexes in case non-existing or optional attributes are indexed.
|
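The size effect can be made concrete by counting entries for a made-up collection. The helper `countIndexEntries`, the `nickname` attribute and the 1-in-10 ratio are all assumptions chosen for illustration; the sketch only mirrors the inclusion rule stated above.

```js
// Counts index entries for a hypothetical collection; names are illustrative.
function countIndexEntries(docs, fields, sparse) {
  let entries = 0;
  let nullKeyEntries = 0; // entries with at least one null/missing attribute
  for (const doc of docs) {
    const hasNull = fields.some((f) => doc[f] === undefined || doc[f] === null);
    if (hasNull && sparse) continue; // excluded from a sparse index
    entries++;
    if (hasNull) nullKeyEntries++;
  }
  return { entries, nullKeyEntries };
}

// 1000 documents, only every tenth one has the optional attribute set.
const docs = Array.from({ length: 1000 }, (_, i) =>
  i % 10 === 0 ? { _key: String(i), nickname: "n" + i } : { _key: String(i) }
);
countIndexEntries(docs, ["nickname"], false); // { entries: 1000, nullKeyEntries: 900 }
countIndexEntries(docs, ["nickname"], true);  // { entries: 100, nullKeyEntries: 0 }
```

In the non-sparse case the 900 documents without the attribute all share the key `null`, which is the collision scenario a sparse index avoids.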
||||||
|
|
||||||
|
In order to create a sparse index, an object with the attribute `sparse` can be added to
|
||||||
|
the index creation commands:
|
||||||
|
|
||||||
|
```js
|
||||||
|
db.collection.ensureHashIndex(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureHashIndex(attributeName1, attributeName2, { sparse: true });
|
||||||
|
db.collection.ensureUniqueConstraint(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureUniqueConstraint(attributeName1, attributeName2, { sparse: true });
|
||||||
|
|
||||||
|
db.collection.ensureSkiplist(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureSkiplist(attributeName1, attributeName2, { sparse: true });
|
||||||
|
db.collection.ensureUniqueSkiplist(attributeName, { sparse: true });
|
||||||
|
db.collection.ensureUniqueSkiplist(attributeName1, attributeName2, { sparse: true });
|
||||||
|
```
|
||||||
|
|
||||||
|
When not explicitly set, the `sparse` attribute defaults to `false` for new indexes.
|
||||||
|
Index types other than hash and skiplist do not support sparsity.
|
||||||
|
|
||||||
|
As sparse indexes may exclude some documents from the collection, they cannot be used for
|
||||||
|
all types of queries. Sparse hash indexes cannot be used to find documents for which at
|
||||||
|
least one of the indexed attributes has a value of `null`. For example, the following AQL
|
||||||
|
query cannot use a sparse index, even if one was created on attribute `attr`:
|
||||||
|
|
||||||
|
FOR doc IN collection
|
||||||
|
FILTER doc.attr == null
|
||||||
|
RETURN doc
|
||||||
|
|
||||||
|
If the lookup value is non-constant, a sparse index may or may not be used, depending on
|
||||||
|
the other types of conditions in the query. If the optimizer can safely determine that
|
||||||
|
the lookup value cannot be `null`, a sparse index may be used. When uncertain, the optimizer
|
||||||
|
will not make use of a sparse index in a query in order to produce correct results.
|
||||||
|
|
||||||
|
For example, the following queries cannot use a sparse index on `attr` because the optimizer
|
||||||
|
will not know beforehand whether the comparison values for `doc.attr` will include `null`:
|
||||||
|
|
||||||
|
FOR doc IN collection
|
||||||
|
FILTER doc.attr == SOME_FUNCTION(...)
|
||||||
|
RETURN doc
|
||||||
|
|
||||||
|
FOR other IN otherCollection
|
||||||
|
FOR doc IN collection
|
||||||
|
FILTER doc.attr == other.attr
|
||||||
|
RETURN doc
|
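The decision described above can be summarised as a toy rule. It is not the optimizer's actual code: `maySparseIndexBeUsed` and the shape of its argument (`constant`, `value`, `provablyNonNull`) are hypothetical and exist only to restate the prose as a predicate.

```js
// Toy decision rule, not the optimizer's actual code. A lookup value is
// described as { constant: true, value } or { constant: false, provablyNonNull }.
function maySparseIndexBeUsed(lookup) {
  if (lookup.constant) {
    return lookup.value !== null; // == null can never be answered by a sparse index
  }
  // Non-constant: only when null has been ruled out; otherwise stay safe.
  return lookup.provablyNonNull === true;
}

maySparseIndexBeUsed({ constant: true, value: null });  // false: FILTER doc.attr == null
maySparseIndexBeUsed({ constant: true, value: "abc" }); // true
maySparseIndexBeUsed({ constant: false });              // false: SOME_FUNCTION(...) or other.attr
```

The conservative default for non-constant values reflects the stated priority: a slower plan is acceptable, a plan that silently drops `null` matches is not.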
||||||
|
|
||||||
|
Sparse skiplist indexes can be used for sorting if the optimizer can safely detect that the
|
||||||
|
index range does not include `null` for any of the index attributes.
|
|
@ -1,5 +1,35 @@
|
||||||
!CHAPTER Working with Indexes
|
!CHAPTER Working with Indexes
|
||||||
|
|
||||||
|
!SECTION Index Identifiers and Handles
|
||||||
|
|
||||||
|
An *index handle* uniquely identifies an index in the database. It is a string and
|
||||||
|
consists of the collection name and an *index identifier* separated by a `/`. The
|
||||||
|
index identifier part is a numeric value that is auto-generated by ArangoDB.
|
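Since a handle is just the collection name and the identifier joined by `/`, it can be taken apart with ordinary string operations. `parseIndexHandle` below is a hypothetical helper written for illustration and is not part of the ArangoDB API.

```js
// Splits an index handle such as "demo/362549736" into its two parts.
// parseIndexHandle is a hypothetical helper, not part of the ArangoDB API.
function parseIndexHandle(handle) {
  const pos = handle.indexOf("/");
  if (pos <= 0 || pos === handle.length - 1) {
    throw new Error("not an index handle: " + handle);
  }
  return {
    collection: handle.slice(0, pos),
    identifier: handle.slice(pos + 1),
  };
}

parseIndexHandle("demo/362549736");
// { collection: "demo", identifier: "362549736" }
```

The identifier is kept as a string here, which avoids any loss of precision should the numeric value exceed what a JavaScript number can represent exactly.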
||||||
|
|
||||||
|
A specific index of a collection can be accessed using its *index handle* or
|
||||||
|
*index identifier* as follows:
|
||||||
|
|
||||||
|
```js
|
||||||
|
db.collection.index("<index-handle>");
|
||||||
|
db.collection.index("<index-identifier>");
|
||||||
|
db._index("<index-handle>");
|
||||||
|
```
|
||||||
|
|
||||||
|
For example: Assume that the index handle, which is stored in the `_id`
|
||||||
|
attribute of the index, is `demo/362549736` and the index was created in a collection
|
||||||
|
named `demo`. Then this index can be accessed as:
|
||||||
|
|
||||||
|
```js
|
||||||
|
db.demo.index("demo/362549736");
|
||||||
|
```
|
||||||
|
|
||||||
|
Because the index handle is unique within the database, you can leave out the
|
||||||
|
*collection* and use the shortcut:
|
||||||
|
|
||||||
|
```js
|
||||||
|
db._index("demo/362549736");
|
||||||
|
```
|
||||||
|
|
||||||
!SECTION Collection Methods
|
!SECTION Collection Methods
|
||||||
|
|
||||||
!SUBSECTION Listing all indexes of a collection
|
!SUBSECTION Listing all indexes of a collection
|
||||||
|
|
|
@ -211,6 +211,7 @@
|
||||||
* [Administrating ArangoDB](AdministratingArango/README.md)
|
* [Administrating ArangoDB](AdministratingArango/README.md)
|
||||||
* [Indexing](IndexHandling/README.md)
|
* [Indexing](IndexHandling/README.md)
|
||||||
* [Index Basics](IndexHandling/IndexBasics.md)
|
* [Index Basics](IndexHandling/IndexBasics.md)
|
||||||
|
* [Which Index to use when](IndexHandling/WhichIndex.md)
|
||||||
* [Working with Indexes](IndexHandling/WorkingWithIndexes.md)
|
* [Working with Indexes](IndexHandling/WorkingWithIndexes.md)
|
||||||
* [Hash Indexes](IndexHandling/Hash.md)
|
* [Hash Indexes](IndexHandling/Hash.md)
|
||||||
* [Skiplists](IndexHandling/Skiplist.md)
|
* [Skiplists](IndexHandling/Skiplist.md)
|
||||||
|
|