mirror of https://gitee.com/bigwinds/arangodb
Merge branch 'devel' of github.com:arangodb/arangodb into devel
commit fe455671d4
@@ -7,8 +7,10 @@ single query. It will then calculate the costs for all plans and pick the plan w
 lowest total cost. This resulting plan is considered to be the *optimal plan*, which is
 then executed.
 
-The optimizer is designed to only perform optimization if they are *safe*, in the
-meaning that an optimization does not modify the result of a query.
+The optimizer is designed to only perform optimizations if they are *safe*, in the
+meaning that an optimization should not modify the result of a query. A notable exception
+to this is that the optimizer is allowed to change the order of results for queries that
+do not explicitly specify how results should be sorted.
 
 
 !SUBSECTION Execution plans
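The plan selection described in this hunk (cost every candidate plan, execute the cheapest) can be sketched as follows. This is a toy model only: the plan objects, the `estimatedCost` field, the cost numbers, and the `pickOptimalPlan` helper are all hypothetical and are not ArangoDB's internal representation.

```javascript
// Toy model: plan shape and cost numbers are hypothetical illustrations.
function pickOptimalPlan(plans) {
  if (plans.length === 0) {
    throw new Error("at least one plan is required");
  }
  // Keep the plan with the lowest estimated total cost.
  return plans.reduce((best, plan) =>
    plan.estimatedCost < best.estimatedCost ? plan : best);
}

const candidatePlans = [
  { name: "full collection scan + sort", estimatedCost: 1250.5 },
  { name: "sorted index scan", estimatedCost: 42.0 },
  { name: "hash index + sort", estimatedCost: 310.2 }
];

const optimalPlan = pickOptimalPlan(candidatePlans);
console.log(optimalPlan.name);
```

All candidate plans are assumed to return the same result set; only their estimated cost differs, which is what makes choosing among them *safe* in the sense used above.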
@@ -69,16 +71,16 @@ the evaluation of an expression. The filters expression result (`i.value > 97`)
 is calculated in the *CalculationNode* above the *FilterNode*.
 
 Finally, all of this needs to be done for documents of collection `test`. This is
-where the *IndexRangeNode* enters the game. It will use an index (thus its name)
+where the *IndexNode* enters the game. It will use an index (thus its name)
 to find certain documents in the collection and ship it down the pipeline in the
-order required by `SORT i.value`. The *IndexRangeNode* itself has a *SingletonNode*
+order required by `SORT i.value`. The *IndexNode* itself has a *SingletonNode*
 as its input. The sole purpose of a *SingletonNode* node is to provide a single empty
 document as input for other processing steps. It is always the end of the pipeline.
 
 Here's a summary:
-* SingletonNode: produces empty document as input for other processing steps.
-* IndexRangeNode: iterates over the index on attribute `value` in collection `test`
-in the order required by `SORT i.value`.
+* SingletonNode: produces an empty document as input for other processing steps.
+* IndexNode: iterates over the index on attribute `value` in collection `test`
+in the order required by `SORT i.value`.
 * CalculationNode: evaluates the result of the calculation `i.value > 97` to `true` or `false`
 * FilterNode: only lets documents pass where above calculation returned `true`
 * CalculationNode: calculates return value `i.value`
@@ -88,9 +90,9 @@ Here's a summary:
 !SUBSUBSECTION Optimizer rules
 
 Note that in the example, the optimizer has optimized the `SORT` statement away.
-It can do it safely because there is a sorted index on `i.value`, which it has
-picked in the *IndexRangeNode*. As the index values are iterated in sorted order
-anyway, the extra *SortNode* would be redundant and was removed.
+It can do it safely because there is a sorted skiplist index on `i.value`, which it has
+picked in the *IndexNode*. As the index values are iterated over in sorted order
+anyway, the extra *SortNode* would have been redundant and was removed.
 
 Additionally, the optimizer has done more work to generate an execution plan that
 avoids as much expensive operations as possible. Here is the list of optimizer rules
@@ -115,7 +117,7 @@ Here is the meaning of these rules in context of this query:
 * `remove-unnecessary-calculations`: removes *CalculationNode*s whose result values are
 not used in the query. In the example this happens due to the `remove-redundant-calculations`
 rule having made some calculations unnecessary.
-* `use-index-range`: use an index to iterate over a collection instead of performing a
+* `use-index`: use an index to iterate over a collection instead of performing a
 full collection scan. In the example case this makes sense, as the index can be
 used for filtering and sorting.
 * `use-index-for-sort`: removes a `SORT` operation if it is already satisfied by
@@ -268,8 +270,8 @@ The following execution node types will appear in the output of `explain`:
 exactly one *SingletonNode* as its top node.
 * *EnumerateCollectionNode*: enumeration over documents of a collection (given in
 its *collection* attribute) without using an index.
-* *IndexRangeNode*: enumeration over a specific index (given in its *index* attribute)
-of a collection. The index range is specified in the *ranges* attribute of the node.
+* *IndexNode*: enumeration over one or many indexes (given in its *indexes* attribute)
+of a collection. The index ranges are specified in the *condition* attribute of the node.
 * *EnumerateListNode*: enumeration over a list of (non-collection) values.
 * *FilterNode*: only lets values pass that satisfy a filter condition. Will appear once
 per *FILTER* statement.
@@ -291,6 +293,8 @@ The following execution node types will appear in the output of `explain`:
 attribute). Will appear exactly once in a query that contains a *REPLACE* statement.
 * *UpdateNode*: updates documents in a collection (given in its *collection*
 attribute). Will appear exactly once in a query that contains an *UPDATE* statement.
+* *UpsertNode*: upserts documents in a collection (given in its *collection*
+attribute). Will appear exactly once in a query that contains an *UPSERT* statement.
 * *NoResultsNode*: will be inserted if *FILTER* statements turn out to be never
 satisfiable. The *NoResultsNode* will pass an empty result set into the processing
 pipeline.
@@ -349,11 +353,11 @@ The following optimizer rules may appear in the `rules` attribute of a plan:
 on the same variable or attribute were replaced with an *IN* condition.
 * `remove-redundant-or`: will appear if multiple *OR* conditions for the same variable
 or attribute were combined into a single condition.
-* `use-index-range`: will appear if an index can be used to iterate over a collection.
+* `use-indexes`: will appear when an index is used to iterate over a collection.
 As a consequence, an *EnumerateCollectionNode* was replaced with an
-*IndexRangeNode* in the plan.
+*IndexNode* in the plan.
 * `remove-filters-covered-by-index`: will appear if a *FilterNode* was removed or replaced
-because the filter condition is already covered by an *IndexRangeNode*.
+because the filter condition is already covered by an *IndexNode*.
 * `use-index-for-sort`: will appear if an index can be used to avoid a *SORT*
 operation. If the rule was applied, a *SortNode* was removed from the plan.
 * `move-calculations-down`: will appear if a *CalculationNode* was moved down in a plan.

@@ -1,6 +1,6 @@
 !SECTION How ArangoDB uses Indexes
 
-In general, ArangoDB will use a single index per collection in a given query. AQL queries can
+In most cases ArangoDB will use a single index per collection in a given query. AQL queries can
 use more than one index per collection when multiple FILTER conditions are combined with a
 logical `OR` and these can be covered by indexes. AQL queries will use a single index per
 collection when FILTER conditions are combined with logical `AND`.
@@ -23,11 +23,11 @@ multiple indexes the optimizer can choose from. The optimizer will then select a
 indexes with the lowest estimated total cost. In general, the optimizer will pick the indexes with
 the highest estimated selectivity.
 
-Sparse indexes do not contain `null` values. If the optimizer cannot safely determine whether a
-FILTER condition includes `null` values, it will not make use of a sparse index. The optimizer
-policy is to produce correct results, regardless of whether or which index is used to satisfy
-FILTER conditions. If it is unsure about whether using an index will violate the policy, it will
-not make use of the index.
+Sparse indexes may or may not be picked by the optimizer in a query. As sparse indexes do not contain
+`null` values, they will not be used for queries if the optimizer cannot safely determine whether a
+FILTER condition includes `null` values for the index attributes. The optimizer policy is to produce
+correct results, regardless of whether or which index is used to satisfy FILTER conditions. If it is
+unsure about whether using an index will violate the policy, it will not make use of the index.
 
 
 !SUBSECTION Troubleshooting
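The reason a sparse index is unsafe for conditions that may match `null` can be illustrated with a small in-memory model. This is a sketch under stated assumptions: `buildToyIndex` and `canUseToyIndex` are hypothetical helpers written for this note, and a plain `Map` stands in for the real index structure.

```javascript
// Toy model: helper names are hypothetical, a Map stands in for the index.
function buildToyIndex(docs, attribute, sparse) {
  const index = new Map();
  for (const doc of docs) {
    // A missing attribute is treated like an explicit null.
    const value = doc[attribute] === undefined ? null : doc[attribute];
    if (sparse && value === null) {
      continue; // sparse: documents without a value are not indexed
    }
    if (!index.has(value)) {
      index.set(value, []);
    }
    index.get(value).push(doc._key);
  }
  return index;
}

// A sparse index is only safe if the condition can never match null.
function canUseToyIndex(sparse, conditionMayMatchNull) {
  return !sparse || !conditionMayMatchNull;
}

const sampleDocs = [
  { _key: "a", value: 1 },
  { _key: "b", value: null },
  { _key: "c" }
];

const sparseIndex = buildToyIndex(sampleDocs, "value", true);
const fullIndex = buildToyIndex(sampleDocs, "value", false);
console.log(sparseIndex.has(null), fullIndex.get(null));
```

In this model, a lookup for `value == null` against `sparseIndex` would silently miss documents `b` and `c`, which is why the conservative choice is to skip the sparse index whenever `null` cannot be ruled out.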
@@ -76,12 +76,15 @@ If any of the explain methods shows that a query is not using indexes, the follo
 In these cases the queries should be rewritten so that only the index attribute is present on one side of
 the operator, or additional filters and indexes should be used to restrict the amount of documents otherwise.
 
-* the query optimizer will in general picking one index per collection in a query. It can pick more than
+* the query optimizer will in general pick one index per collection in a query. It can pick more than
 one index per collection if the FILTER condition contains multiple branches combined with logical `OR`.
-For example, the following queries can use more than one index:
+For example, the following queries can use indexes:
 
 FOR doc IN collection FILTER doc.value1 == 42 || doc.value1 == 23 RETURN doc
 FOR doc IN collection FILTER doc.value1 == 42 || doc.value2 == 23 RETURN doc
+FOR doc IN collection FILTER doc.value1 < 42 || doc.value2 > 23 RETURN doc
 
-In the latter case, the query optimizer can only use indexes if there are indexes present on both `value1`
-and `value2`.
+The two `OR`s in the first query will be converted to an `IN` list, and if there is a suitable index on
+`value1`, it will be used. The second query requires two separate indexes on `value1` and `value2` and
+will use them if present. The third query can use the indexes on `value1` and `value2` when they are
+sorted.
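The difference between the first two queries can be sketched by grouping the `OR` branches per attribute. This is an illustration only: `mergeOrBranches` and the `{ attribute, value }` branch shape are hypothetical and do not mirror the optimizer's actual data structures.

```javascript
// Toy model: each OR branch is an equality comparison { attribute, value }.
function mergeOrBranches(branches) {
  const byAttribute = new Map();
  for (const { attribute, value } of branches) {
    if (!byAttribute.has(attribute)) {
      byAttribute.set(attribute, []);
    }
    const values = byAttribute.get(attribute);
    if (!values.includes(value)) {
      values.push(value); // drop duplicate comparison values
    }
  }
  // One entry per attribute: each entry needs its own index lookup.
  return [...byAttribute].map(([attribute, values]) => ({ attribute, values }));
}

// doc.value1 == 42 || doc.value1 == 23
const sameAttribute = mergeOrBranches([
  { attribute: "value1", value: 42 },
  { attribute: "value1", value: 23 }
]);

// doc.value1 == 42 || doc.value2 == 23
const twoAttributes = mergeOrBranches([
  { attribute: "value1", value: 42 },
  { attribute: "value2", value: 23 }
]);

console.log(sameAttribute.length, twoAttributes.length);
```

The first case collapses into a single `IN`-style lookup on `value1`, while the second keeps two entries and therefore needs an index on each attribute.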

@@ -35,8 +35,8 @@ db.collection.document("<document-key>");
 db._document("<document-id>");
 ```
 
-As the primary index is a hash index, it cannot be used for non-equality range
-queries or for sorting.
+As the primary index is an unsorted hash index, it cannot be used for non-equality
+range queries or for sorting.
 
 The primary index of a collection cannot be dropped or changed, and there is no
 mechanism to create user-defined primary indexes.
@@ -64,11 +64,11 @@ db.collection.inEdges("<from-value>");
 db.collection.inEdges("<to-value>");
 ```
 
-The edges index is a hash index. It can be used for equality lookups only, but not for range
-queries or for sorting. As edges indexes are automatically created for edge collections, it
-is not possible to create user-defined edges indexes.
+Internally, the edges index is implemented as a hash index. It can be used for equality
+lookups, but not for range queries or for sorting. As edges indexes are automatically
+created for edge collections, it is not possible to create user-defined edges indexes.
 
-The edges index cannot be dropped or changed.
+An edges index cannot be dropped or changed.
 
 
 !SUBSECTION Hash Index
@@ -120,7 +120,7 @@ Non-unique hash indexes have an amortized complexity of O(1) for insert, update,
 removal operations. That means non-unique hash indexes can be used on attributes with
 low cardinality.
 
-If a hash index is created on an attribute that it is missing in all or many of the documents,
+If a hash index is created on an attribute that is missing in all or many of the documents,
 the behavior is as follows:
 
 * if the index is sparse, the documents missing the attribute will not be indexed and not
@@ -130,6 +130,9 @@ the behavior is as follows:
 * if the index is non-sparse, the documents missing the attribute will be contained in the
 index with a key value of `null`.
 
+Hash indexes support indexing array values if the index attribute name is extended with
+a <i>[\*]</i>`.
+
 
 !SUBSECTION Skiplist Index
 
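What indexing array values means can be sketched with a small in-memory model in which every array member becomes its own index key. This is an illustration only: `buildToyArrayIndex` is a hypothetical helper, a `Map` stands in for the hash index, and skipping documents whose attribute is not an array is an assumption of this sketch rather than documented behavior.

```javascript
// Toy model of an index on "tags[*]": one index entry per array member.
function buildToyArrayIndex(docs, attribute) {
  const index = new Map();
  for (const doc of docs) {
    // Assumption of this sketch: non-array values are simply not indexed.
    const members = Array.isArray(doc[attribute]) ? doc[attribute] : [];
    for (const member of new Set(members)) { // each distinct member once
      if (!index.has(member)) {
        index.set(member, []);
      }
      index.get(member).push(doc._key);
    }
  }
  return index;
}

const posts = [
  { _key: "p1", tags: ["a", "b"] },
  { _key: "p2", tags: ["b", "c", "b"] },
  { _key: "p3", tags: "not-an-array" }
];

const tagIndex = buildToyArrayIndex(posts, "tags");
console.log(tagIndex.get("b"));
```

A lookup for a single member, such as `"b"`, then finds every document whose array contains it, without scanning the arrays themselves.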
@@ -217,6 +220,9 @@ The different types of skiplist indexes have the following characteristics:
 The operational amortized complexity for skiplist indexes is logarithmically correlated
 with the number of documents in the index.
 
+Skiplist indexes support indexing array values if the index attribute name is extended with
+a <i>[\*]</i>`.
+
 
 !SUBSECTION Geo Index
 
@@ -231,8 +237,8 @@ Th geo index provides operations to find documents with coordinates nearest to a
 comparison coordinate, and to find documents with coordinates that are within a specifiable
 radius around a comparison coordinate.
 
-The geo index is used via dedicated functions in AQL or the simple queries, but will
-not enabled for other types of queries or conditions.
+The geo index is used via dedicated functions in AQL or the simple queries functions,
+but will not be used for other types of queries or conditions.
 
 
 !SUBSECTION Fulltext Index

@@ -1035,7 +1035,7 @@ ArangoCollection.prototype.ensureSkiplist = function () {
 
 ////////////////////////////////////////////////////////////////////////////////
 /// @brief ensures that a fulltext index exists
-/// @startDocuBlock ensureIndex
+/// @startDocuBlock ensureFulltextIndex
 /// `collection.ensureIndex({ type: "fulltext", fields: [ "field" ], minLength: minLength })`
 ///
 /// Creates a fulltext index on all documents on attribute *field*.

@@ -430,7 +430,7 @@ function ModelAnnotationSpec () {
 var Model = FoxxModel.extend({});
 jsonSchema = toJSONSchema("myname", Model);
 assertEqual(jsonSchema.id, "myname");
-assertEqual(jsonSchema.required, []);
+assertEqual(jsonSchema.required, undefined);
 assertEqual(jsonSchema.properties, {});
 },
 
@@ -450,7 +450,7 @@ function ModelAnnotationSpec () {
 
 jsonSchema = toJSONSchema("myname", Model);
 assertEqual(jsonSchema.id, "myname");
-assertEqual(jsonSchema.required, []);
+assertEqual(jsonSchema.required, undefined);
 assertEqual(jsonSchema.properties.x.type, "string");
 },
 