1
0
Fork 0
arangodb/Documentation/Books/Users/IndexHandling/HowArangoDBUsesIndexes.mdpp

74 lines
4.2 KiB
Plaintext

!SECTION How ArangoDB uses Indexes
In general, ArangoDB will use a single index per collection in a given query.
Creating multiple indexes on different attributes of the same collection will not make ArangoDB
use more than one index for the collection in a query, but may give the optimizer more choices
when picking an index. Creating multiple indexes on different attributes can also help in speeding
up different queries, with FILTER conditions on different attributes.
However, as a single index will be used per collection in a given query, it is often beneficial to
create an index on more than just one attribute. By adding more attributes to an index, an index can
become more selective and thus reduce the number of documents that operations later in a query need
to process.
ArangoDB will provide index selectivity estimates for edge and hash indexes in the web interface,
the `getIndexes()` return value and in the `explain()` outputs for a given query. The more selective an
index is, the more documents it will filter on average. The query optimizer will also try to use the
most selective index possible when it has the choice between multiple indexes with a known selectivity
estimate.
Sparse indexes do not contain `null` values. If the optimizer cannot safely determine whether a filter
condition used includes `null` values, it will not make use of a sparse index. The optimizer policy is
to produce correct results, regardless of whether or which index is used to satisfy filter conditions.
If it is unsure about whether using an index will violate the policy, it will not make use of the index.
!SUBSECTION Troubleshooting
When in doubt about whether and which indexes will be used for executing a given AQL query, use
the `explain()` method for the statement as follows (from the ArangoShell):
```js
var query = "FOR doc IN collection FILTER doc.value > 42 RETURN doc";
var stmt = db._createStatement(query);
stmt.explain();
```
The `explain()` command will return a JSON representation of the query's execution plan, with many
details. This might be overwhelming when just looking for index usage.
In this case, the following command will provide a much more compact explanation of the query:
```js
var query = "FOR doc IN collection FILTER doc.value > 42 RETURN doc";
require("org/arangodb/aql/explainer").explain(query);
```
If any of the explain methods shows that a query is not using indexes, the following steps may help:
* check if the attribute names in the query are correctly spelled. In a schema-free database, documents
in the same collection can have varying structures. There is no such thing as a *non-existing attribute*
error. A query that refers to attribute names not present in any of the documents will not return an
error, and obviously will not benefit from indexes.
* check the return value of the `getIndexes()` method for the collections used in the query and validate
that indexes are actually present on the attributes used in the query's filter conditions.
* if indexes are present but not used by the query, the indexes may have the wrong type. For example, a
hash index will only be used for equality comparisons (i.e. `==`) but not for other comparison types such
as `<`, `<=`, `>`, `>=`. Additionally, a hash index will only be used if all of the index attributes are
used in the query's FILTER conditions. A skiplist index will only be used if at least its first attribute
is used in a FILTER condition. If additionally of the skiplist index attributes are specified in the query
(from left-to-right), they may also be used and allow to filter more documents.
* using indexed attributes as function parameters or in arbitrary expressions will likely lead to the index
on the attribute not being used. For example, the following queries will not use an index on `value`:
FOR doc IN collection FILTER TO_NUMBER(doc.value) == 42 RETURN doc
FOR doc IN collection FILTER doc.value - 1 == 42 RETURN doc
FOR doc IN collection FILTER doc.value == 42 || LENGTH(doc.value) == 0 RETURN doc
In the cases the queries should be rewritten so that only the index attribute is present on one side of
the operator, or additional filters and indexes should be used to restrict the amount of documents otherwise.