mirror of https://gitee.com/bigwinds/arangodb
!SECTION Index basics

Indexes allow fast access to documents, provided the indexed attribute(s)
are used in a query. While ArangoDB automatically indexes some system
attributes, users are free to create extra indexes on non-system attributes
of documents.

A user-defined index is created at the collection level. Most user-defined indexes
can be created by specifying the names of the attributes which should be indexed.
Some index types allow indexing just one attribute (e.g. fulltext index) whereas
other index types allow indexing multiple attributes at the same time.

The system attributes `_id`, `_key`, `_from` and `_to` are automatically indexed
by ArangoDB, without the user being required to create extra indexes for them.

Therefore, indexing `_id`, `_key`, `_from`, and `_to` in a user-defined index is
not required. Indexing these attributes or `_rev` in a user-defined index is
currently not supported by ArangoDB.

ArangoDB provides the following index types:

!SUBSECTION Primary Index

For each collection there will always be a *primary index* which is a hash index
for the [document keys](../Glossary/README.html#document_key) (`_key` attribute)
of all documents in the collection. The primary index allows quick selection
of documents in the collection using either the `_key` or `_id` attributes.

There are also dedicated functions to find a document given its `_key` or `_id`
that will always make use of the primary index:

```js
db.collection.document("<document-key>");
db._document("<document-id>");
```

The primary index of a collection cannot be dropped or changed.

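
To illustrate why lookups by `_key` and by `_id` can both be served by a single index, here is a minimal plain JavaScript sketch that models a primary index as a `Map` keyed by `_key`. It assumes that a document's `_id` is the collection name and the `_key` joined by a `/`. The class `PrimaryIndexSketch` and its methods are hypothetical names chosen for this example; they are not part of ArangoDB's API and do not reflect its actual implementation.

```js
// Hypothetical model of a primary index: a hash map from _key to document.
class PrimaryIndexSketch {
  constructor(collectionName) {
    this.collectionName = collectionName;
    this.byKey = new Map();
  }

  insert(doc) {
    // Document keys are unique within a collection.
    if (this.byKey.has(doc._key)) {
      throw new Error("unique constraint violated for _key " + doc._key);
    }
    this.byKey.set(doc._key, doc);
  }

  // Lookup by document key: a single hash lookup.
  document(key) {
    return this.byKey.get(key);
  }

  // Lookup by document id: strip the collection name, then look up the key.
  // Assumption: an id has the form "<collection-name>/<document-key>".
  documentById(id) {
    const slash = id.indexOf("/");
    if (slash === -1 || id.slice(0, slash) !== this.collectionName) {
      return undefined;
    }
    return this.byKey.get(id.slice(slash + 1));
  }
}

const users = new PrimaryIndexSketch("users");
users.insert({ _key: "alice", name: "Alice" });
const byKey = users.document("alice");
const byId = users.documentById("users/alice"); // same document as byKey
```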
!SUBSECTION Edges Index

Every [edge collection](../Glossary/README.html#edge_collection) also has an
automatically created *edges index*. The edges index provides quick access to
documents by either their `_from` or `_to` attributes. It can therefore be
used to quickly find connections between vertex documents and is invoked when
the connecting edges of a vertex are queried.

The edges index cannot be dropped or changed. Extra edges indexes cannot be
created on other attributes or in non-edge collections.

There are also dedicated functions to find edges given their `_from` or `_to`
values that will always make use of the edges index:

```js
db.collection.edges("<from-value>");
db.collection.edges("<to-value>");
db.collection.outEdges("<from-value>");
db.collection.outEdges("<to-value>");
db.collection.inEdges("<from-value>");
db.collection.inEdges("<to-value>");
```

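
The idea of looking up edges by either endpoint can be sketched in plain JavaScript with two hash maps, one keyed by `_from` and one keyed by `_to`. This is only an illustration under that assumption; `EdgesIndexSketch` is a hypothetical class, and the real edges index may be organized differently.

```js
// Hypothetical model of an edges index: two hash maps, one per direction.
class EdgesIndexSketch {
  constructor() {
    this.byFrom = new Map(); // _from value -> edges leaving that vertex
    this.byTo = new Map();   // _to value   -> edges arriving at that vertex
  }

  static append(map, key, edge) {
    if (!map.has(key)) {
      map.set(key, []);
    }
    map.get(key).push(edge);
  }

  insert(edge) {
    EdgesIndexSketch.append(this.byFrom, edge._from, edge);
    EdgesIndexSketch.append(this.byTo, edge._to, edge);
  }

  // Edges whose _from value is the given vertex.
  outEdges(vertexId) {
    return this.byFrom.get(vertexId) || [];
  }

  // Edges whose _to value is the given vertex.
  inEdges(vertexId) {
    return this.byTo.get(vertexId) || [];
  }

  // Edges connected to the vertex in either direction, without duplicates.
  edges(vertexId) {
    return [...new Set([...this.outEdges(vertexId), ...this.inEdges(vertexId)])];
  }
}

const edgeIndex = new EdgesIndexSketch();
edgeIndex.insert({ _key: "e1", _from: "v/a", _to: "v/b" });
edgeIndex.insert({ _key: "e2", _from: "v/b", _to: "v/c" });
const outgoing = edgeIndex.outEdges("v/b").map((e) => e._key);
const incoming = edgeIndex.inEdges("v/b").map((e) => e._key);
const connected = edgeIndex.edges("v/b").map((e) => e._key);
```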
!SUBSECTION Hash Index

A hash index can be used to quickly find documents with specific attribute values.
The hash index is unsorted, so it supports equality lookups but no range queries.

A hash index can be created on one or multiple document attributes. A hash index will
only be used by a query if all indexed attributes are present in the search condition,
and if all attributes are compared using the equality (`==`) operator.

Hash indexes can optionally be declared to be unique, which disallows saving more than
one document with the same value in the indexed attribute(s).

Hash indexes are supported by AQL and several query functions, e.g. `byExample`
and `firstExample`.

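
The rule that every indexed attribute must appear in an equality condition follows from how a hash lookup works: the lookup key is computed from all indexed values at once. The following plain JavaScript sketch shows this under simplified assumptions. `HashIndexSketch`, `keyFor` and `lookup` are hypothetical names for this example, not ArangoDB functions.

```js
// Hypothetical model of a hash index over one or more attributes.
class HashIndexSketch {
  constructor(fields, unique = false) {
    this.fields = fields;
    this.unique = unique;
    this.buckets = new Map(); // combined attribute values -> documents
  }

  // Build the lookup key from the indexed attribute values, in field order.
  keyFor(values) {
    return JSON.stringify(this.fields.map((f) => values[f]));
  }

  insert(doc) {
    const key = this.keyFor(doc);
    const bucket = this.buckets.get(key) || [];
    if (this.unique && bucket.length > 0) {
      throw new Error("unique constraint violated");
    }
    bucket.push(doc);
    this.buckets.set(key, bucket);
  }

  // Returns the matching documents, or null if the index cannot be used
  // because the condition does not cover every indexed attribute.
  lookup(example) {
    const covered = this.fields.every((f) => f in example);
    if (!covered) {
      return null;
    }
    return this.buckets.get(this.keyFor(example)) || [];
  }
}

const cityIndex = new HashIndexSketch(["country", "city"]);
cityIndex.insert({ _key: "1", country: "DE", city: "Cologne" });
cityIndex.insert({ _key: "2", country: "DE", city: "Berlin" });
const hits = cityIndex.lookup({ country: "DE", city: "Cologne" });
const unusable = cityIndex.lookup({ country: "DE" }); // null: "city" is missing

const uniqueIndex = new HashIndexSketch(["email"], true);
uniqueIndex.insert({ _key: "1", email: "a@example.com" });
```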
!SUBSECTION Skiplist Index

A skiplist is a sorted index structure. It can be used to quickly find documents
with specific attribute values, and it also supports range queries. It can also be
used for sorting in AQL.

A skiplist can be created on one or multiple document attributes.

Skiplists can optionally be declared to be unique, which disallows saving more than
one document with the same value in the indexed attribute(s).

Skiplists are supported by AQL and several query functions, e.g. `byExample`
and `firstExample`.

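
What makes range queries and sorting possible is that the index keeps its entries in order. The plain JavaScript sketch below demonstrates this with a sorted array and binary search, which is a deliberately simpler stand-in and not an actual skip list. `SortedIndexSketch` and its methods are hypothetical names for this example.

```js
// Simplified stand-in for a sorted index: a sorted array, not a real skip list.
class SortedIndexSketch {
  constructor(field) {
    this.field = field;
    this.entries = []; // documents in ascending order of the indexed value
  }

  // Position of the first entry whose value is >= the given value.
  lowerBound(value) {
    let lo = 0;
    let hi = this.entries.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.entries[mid][this.field] < value) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  insert(doc) {
    this.entries.splice(this.lowerBound(doc[this.field]), 0, doc);
  }

  // Range query: all documents with min <= value <= max, in sorted order.
  range(min, max) {
    const result = [];
    for (let i = this.lowerBound(min); i < this.entries.length; i++) {
      if (this.entries[i][this.field] > max) {
        break;
      }
      result.push(this.entries[i]);
    }
    return result;
  }
}

const ageIndex = new SortedIndexSketch("age");
[31, 18, 47, 25].forEach((age, i) => ageIndex.insert({ _key: String(i), age }));
const inRange = ageIndex.range(20, 40).map((d) => d.age);
const sortedAges = ageIndex.entries.map((d) => d.age);
```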
!SUBSECTION Geo Index

A geo index is used to quickly find places on the surface of the earth. The
geo index in ArangoDB supports near and within queries. There are special functions
to query geo indexes.

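
To make the two query types concrete, the following plain JavaScript sketch computes "within" and "near" results by brute force, using the haversine formula for great-circle distance. It only illustrates what the queries return. A real geo index avoids scanning every document, and the function names, the `coords` attribute layout (`[latitude, longitude]`) and the earth radius constant are assumptions made for this example.

```js
const EARTH_RADIUS_METERS = 6371000; // mean earth radius, an approximation

// Great-circle distance between two [latitude, longitude] pairs (haversine).
function distanceMeters([lat1, lon1], [lat2, lon2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

// "within": all documents inside the given radius around the point.
function within(docs, point, radiusMeters) {
  return docs.filter((d) => distanceMeters(d.coords, point) <= radiusMeters);
}

// "near": the `limit` documents closest to the point, nearest first.
function near(docs, point, limit) {
  return docs
    .slice()
    .sort((a, b) => distanceMeters(a.coords, point) - distanceMeters(b.coords, point))
    .slice(0, limit);
}

const places = [
  { name: "Cologne", coords: [50.9406645, 6.9599115] },
  { name: "Berlin", coords: [52.52, 13.405] },
  { name: "Paris", coords: [48.8566, 2.3522] },
];
const cologne = [50.9406645, 6.9599115];
const closeBy = within(places, cologne, 100000).map((p) => p.name);
const nearestTwo = near(places, cologne, 2).map((p) => p.name);
```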
!SUBSECTION Fulltext Index

A fulltext index can be used to find words, or prefixes of words, inside documents.
A fulltext index can be set on one attribute only, and will index all words contained
in documents that have a textual value in this attribute. Only words with a (specifiable)
minimum length are indexed. Word tokenization is done using the word boundary analysis
provided by libicu, which takes into account the selected language provided at
server start. Words are indexed in their lower-cased form. The index supports complete
match queries (full words) and prefix queries.

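
The indexing rules above (one attribute, minimum word length, lower-casing, complete and prefix matches) can be sketched in plain JavaScript as follows. Note the simplification: tokenization here uses a Unicode regular expression instead of the language-aware word boundary analysis of libicu, so results may differ for some languages. `FulltextIndexSketch` and its methods are hypothetical names for this example.

```js
// Hypothetical model of a fulltext index on a single attribute.
class FulltextIndexSketch {
  constructor(field, minLength = 2) {
    this.field = field;
    this.minLength = minLength;
    this.words = new Map(); // lower-cased word -> Set of document keys
  }

  // Simplified tokenizer: runs of Unicode letters or digits, lower-cased.
  tokenize(text) {
    const tokens = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
    return tokens.filter((w) => w.length >= this.minLength);
  }

  insert(doc) {
    if (typeof doc[this.field] !== "string") {
      return; // only textual values are indexed
    }
    for (const word of this.tokenize(doc[this.field])) {
      if (!this.words.has(word)) {
        this.words.set(word, new Set());
      }
      this.words.get(word).add(doc._key);
    }
  }

  // Complete match: keys of documents containing exactly this word.
  complete(word) {
    return [...(this.words.get(word.toLowerCase()) || [])].sort();
  }

  // Prefix match: keys of documents containing a word starting with the prefix.
  prefix(prefix) {
    const wanted = prefix.toLowerCase();
    const keys = new Set();
    for (const [word, docKeys] of this.words) {
      if (word.startsWith(wanted)) {
        docKeys.forEach((k) => keys.add(k));
      }
    }
    return [...keys].sort();
  }
}

const textIndex = new FulltextIndexSketch("text", 3);
textIndex.insert({ _key: "a", text: "Indexes allow fast access" });
textIndex.insert({ _key: "b", text: "An index is a data structure" });
const exact = textIndex.complete("Index");
const byPrefix = textIndex.prefix("ind");
const tooShort = textIndex.complete("is"); // shorter than the minimum length
```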
!SECTION Index Identifiers and Handles

An *index handle* uniquely identifies an index in the database. It is a string and
consists of the collection name and an *index identifier* separated by a `/`. The
index identifier part is a numeric value that is auto-generated by ArangoDB.

A specific index of a collection can be accessed using its *index handle* or
*index identifier* as follows:

```js
db.collection.index("<index-handle>");
db.collection.index("<index-identifier>");
db._index("<index-handle>");
```

For example: Assume that the index handle, which is stored in the `_id`
attribute of the index, is `demo/362549736` and the index was created in a collection
named `demo`. Then this index can be accessed as:

```js
db.demo.index("demo/362549736");
```

Because the index handle is unique within the database, you can leave out the
*collection* and use the shortcut:

```js
db._index("demo/362549736");
```

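
The relationship between a handle and an identifier can be shown with two small helper functions in plain JavaScript. `parseIndexHandle` and `toIndexHandle` are hypothetical helpers written for this example and are not provided by ArangoDB. They only rely on the `<collection-name>/<index-identifier>` format described above.

```js
// Split an index handle such as "demo/362549736" into its two parts.
function parseIndexHandle(handle) {
  const slash = handle.indexOf("/");
  if (slash <= 0 || slash === handle.length - 1) {
    throw new Error("not an index handle: " + handle);
  }
  return {
    collection: handle.slice(0, slash),
    identifier: handle.slice(slash + 1),
  };
}

// Build the full handle from either a handle or a bare identifier,
// given the collection the index belongs to.
function toIndexHandle(collection, handleOrIdentifier) {
  return handleOrIdentifier.includes("/")
    ? handleOrIdentifier
    : collection + "/" + handleOrIdentifier;
}

const parsed = parseIndexHandle("demo/362549736");
const fromIdentifier = toIndexHandle("demo", "362549736");
const fromHandle = toIndexHandle("demo", "demo/362549736");
```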
!SECTION Which Index type to use when

ArangoDB automatically indexes the `_key` attribute in each collection. There
is no need to index this attribute separately. Please note that a document's
`_id` attribute is derived from the `_key` attribute, and is thus implicitly
indexed, too.

ArangoDB will also automatically create an index on `_from` and `_to` in any
edge collection, meaning incoming and outgoing connections can be determined
efficiently.

Users can define additional indexes on one or multiple document attributes.
Several different index types are provided by ArangoDB. These indexes have
different usage scenarios:

- hash index: provides quick access to individual documents if (and only if)
  all indexed attributes are provided in the search query. The index will only
  be used for equality comparisons. It does not support range queries and
  cannot be used for sorting.

  The hash index is a good candidate if all or most queries on the indexed
  attribute(s) are equality comparisons. It will be the most efficient index
  type if the index is declared unique.

  Insertions into a non-unique hash index are also very efficient. Removal
  performance in a non-unique hash index depends on how often the indexed
  attribute's values repeat. If there are a lot of value repetitions, the
  removal performance in a non-unique hash index will suffer.

  A non-unique hash index should therefore not be used if there will be many
  duplicate values in the index plus a lot of document removal operations
  in the collection.

- skip list index: skip lists keep the indexed values in sorted order, so they can
  be used for equality lookups, range queries and for sorting. Skip list indexes
  will have a higher overhead than hash indexes but they are more general and
  allow more use cases (e.g. range queries). Additionally, they can be used
  for lower selectivity attributes, when non-unique hash indexes are not a
  good fit.

- geo index: the geo index provided by ArangoDB allows searching for documents
  within a radius around a two-dimensional earth coordinate (point), or
  finding the documents which are closest to a point. Document coordinates can
  either be specified in two different document attributes or in a single
  attribute, e.g.

      { "latitude": 50.9406645, "longitude": 6.9599115 }

  or

      { "coords": [ 50.9406645, 6.9599115 ] }

- fulltext index: a fulltext index can be used to index all words contained in
  a specific attribute of all documents in a collection. Only words with a
  (specifiable) minimum length are indexed. Word tokenization is done using
  the word boundary analysis provided by libicu, which takes into account
  the selected language provided at server start.

  The index supports complete match queries (full words) and prefix queries.

- cap constraint: the cap constraint provided by ArangoDB indexes documents
  not to speed up search queries, but to limit (cap) the number or size of
  documents in a collection.

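
Since the cap constraint differs from the other types in that it limits a collection instead of speeding up lookups, a small plain JavaScript sketch may help. It models a cap on the number of documents only, and it assumes that the oldest documents are the ones removed once the limit is exceeded. `CappedCollectionSketch` is a hypothetical class for this example, not an ArangoDB API.

```js
// Hypothetical model of a collection capped by document count.
class CappedCollectionSketch {
  constructor(maxCount) {
    this.maxCount = maxCount;
    this.docs = []; // insertion order, oldest first
  }

  save(doc) {
    this.docs.push(doc);
    // Assumption: the oldest documents are dropped once the cap is exceeded.
    while (this.docs.length > this.maxCount) {
      this.docs.shift();
    }
  }
}

const cappedLog = new CappedCollectionSketch(3);
["a", "b", "c", "d", "e"].forEach((k) => cappedLog.save({ _key: k }));
const remaining = cappedLog.docs.map((d) => d._key);
```

After five saves into a collection capped at three documents, only the three most recent documents remain in this model.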