1
0
Fork 0
arangodb/Documentation/Books/Users/Aql/Operations.mdpp

570 lines
18 KiB
Plaintext

!CHAPTER High-level operations
!SUBSECTION FOR
The *FOR* keyword can be to iterate over all elements of a list.
The general syntax is:
```
FOR variable-name IN expression
```
Each list element returned by *expression* is visited exactly once. It is
required that *expression* returns a list in all cases. The empty list is
allowed, too. The current list element is made available for further processing
in the variable specified by *variable-name*.
```
FOR u IN users
RETURN u
```
This will iterate over all elements from the list *users* (note: this list
consists of all documents from the collection named "users" in this case) and
make the current list element available in variable *u*. *u* is not modified in
this example but simply pushed into the result using the *RETURN* keyword.
Note: When iterating over collection-based lists as shown here, the order of
documents is undefined unless an explicit sort order is defined using a *SORT*
statement.
The variable introduced by *FOR* is available until the scope the *FOR* is
placed in is closed.
Another example that uses a statically declared list of values to iterate over:
```
FOR year IN [ 2011, 2012, 2013 ]
RETURN { "year" : year, "isLeapYear" : year % 4 == 0 && (year % 100 != 0 || year % 400 == 0) }
```
Nesting of multiple *FOR* statements is allowed, too. When *FOR* statements are
nested, a cross product of the list elements returned by the individual *FOR*
statements will be created.
```
FOR u IN users
FOR l IN locations
RETURN { "user" : u, "location" : l }
```
In this example, there are two list iterations: an outer iteration over the list
*users* plus an inner iteration over the list *locations*. The inner list is
traversed as many times as there are elements in the outer list. For each
iteration, the current values of *users* and *locations* are made available for
further processing in the variable *u* and *l*.
!SUBSECTION RETURN
The *RETURN* statement can (and must) be used to produce the result of a query.
It is mandatory to specify a *RETURN* statement at the end of each block in a
query, otherwise the query result would be undefined.
The general syntax for *return* is:
```
RETURN expression
```
The *expression* returned by *RETURN* is produced for each iteration the
*RETURN* statement is placed in. That means the result of a *RETURN* statement
is always a list (this includes the empty list). To return all elements from
the currently iterated list without modification, the following simple form can
be used:
```
FOR variable-name IN expression
RETURN variable-name
```
As *RETURN* allows specifying an expression, arbitrary computations can be
performed to calculate the result elements. Any of the variables valid in the
scope the *RETURN* is placed in can be used for the computations.
Note: Return will close the current scope and eliminate all local variables in
it.
!SUBSECTION FILTER
The *FILTER* statement can be used to restrict the results to elements that
match an arbitrary logical condition. The general syntax is:
```
FILTER condition
```
*condition* must be a condition that evaluates to either *false* or *true*. If
the condition result is false, the current element is skipped, so it will not be
processed further and not be part of the result. If the condition is true, the
current element is not skipped and can be further processed.
```
FOR u IN users
FILTER u.active == true && u.age < 39
RETURN u
```
In the above example, all list elements from *users* will be included that have
an attribute *active* with value *true* and that have an attribute *age* with a
value less than *39* (including *null* ones). All other elements from *users*
will be skipped and not be included the result produced by *RETURN*.
It is allowed to specify multiple *FILTER* statements in a query, and even in
the same block. If multiple *FILTER* statements are used, their results will be
combined with a logical and, meaning all filter conditions must be true to
include an element.
```
FOR u IN users
FILTER u.active == true
FILTER u.age < 39
RETURN u
```
!SUBSECTION SORT
The *SORT* statement will force a sort of the list of already produced
intermediate results in the current block. *SORT* allows specifying one or
multiple sort criteria and directions. The general syntax is:
```
SORT expression direction
```
Specifying the *direction* is optional. The default (implicit) direction for a
sort is the ascending order. To explicitly specify the sort direction, the
keywords *ASC* (ascending) and *DESC* can be used. Multiple sort criteria can be
separated using commas.
Note: when iterating over collection-based lists, the order of documents is
always undefined unless an explicit sort order is defined using *SORT*.
```
FOR u IN users
SORT u.lastName, u.firstName, u.id DESC
RETURN u
```
!SUBSECTION LIMIT
The *LIMIT* statement allows slicing the list of result documents using an
offset and a count. It reduces the number of elements in the result to at most
the specified number. Two general forms of *LIMIT* are followed:
```
LIMIT count
LIMIT offset, count
```
The first form allows specifying only the *count* value whereas the second form
allows specifying both *offset* and *count*. The first form is identical using
the second form with an *offset* value of *0*.
The *offset* value specifies how many elements from the result shall be
discarded. It must be 0 or greater. The *count* value specifies how many
elements should be at most included in the result.
```
FOR u IN users
SORT u.firstName, u.lastName, u.id DESC
LIMIT 0, 5
RETURN u
```
!SUBSECTION LET
The *LET* statement can be used to assign an arbitrary value to a variable. The
variable is then introduced in the scope the *LET* statement is placed in. The
general syntax is:
```
LET variable-name = expression
```
*LET* statements are mostly used to declare complex computations and to avoid
repeated computations of the same value at multiple parts of a query.
```
FOR u IN users
LET numRecommendations = LENGTH(u.recommendations)
RETURN { "user" : u, "numRecommendations" : numRecommendations, "isPowerUser" : numRecommendations >= 10 }
```
In the above example, the computation of the number of recommendations is
factored out using a *LET* statement, thus avoiding computing the value twice in
the *RETURN* statement.
Another use case for *LET* is to declare a complex computation in a subquery,
making the whole query more readable.
```
FOR u IN users
LET friends = (
FOR f IN friends
FILTER u.id == f.userId
RETURN f
)
LET memberships = (
FOR m IN memberships
FILTER u.id == m.userId
RETURN m
)
RETURN { "user" : u, "friends" : friends, "numFriends" : LENGTH(friends), "memberShips" : memberships }
```
!SUBSECTION COLLECT
The *COLLECT* keyword can be used to group a list by one or multiple group
criteria. The two general syntaxes for *COLLECT* are:
```
COLLECT variable-name = expression
COLLECT variable-name = expression INTO groups
```
The first form only groups the result by the defined group criteria defined by
*expression*. In order to further process the results produced by *COLLECT*, a
new variable (specified by *variable-name*) is introduced. This variable
contains the group value.
The second form does the same as the first form, but additionally introduces a
variable (specified by *groups*) that contains all elements that fell into the
group. Specifying the *INTO* clause is optional-
```
FOR u IN users
COLLECT city = u.city INTO g
RETURN { "city" : city, "users" : g }
```
In the above example, the list of *users* will be grouped by the attribute
*city*. The result is a new list of documents, with one element per distinct
*city* value. The elements from the original list (here: *users*) per city are
made available in the variable *g*. This is due to the *INTO* clause.
*COLLECT* also allows specifying multiple group criteria. Individual group
criteria can be separated by commas.
```
FOR u IN users
COLLECT first = u.firstName, age = u.age INTO g
RETURN { "first" : first, "age" : age, "numUsers" : LENGTH(g) }
```
In the above example, the list of *users* is grouped by first names and ages
first, and for each distinct combination of first name and age, the number of
users found is returned.
Note: The *COLLECT* statement eliminates all local variables in the current
scope. After *COLLECT* only the variables introduced by *COLLECT* itself are
available.
!SUBSECTION REMOVE
The *REMOVE* keyword can be used to remove documents from a collection. On a
single server, the document removal is executed transactionally in an
all-or-nothing fashion. For sharded collections, the entire remove operation
is not transactional.
Only a single *REMOVE* statement is allowed per AQL query, and it cannot be combined
with other data-modification or retrieval operations. A remove operation is
restricted to a single collection, and the collection name must not be dynamic.
The syntax for a remove operation is:
```
REMOVE key-expression IN collection options
```
*collection* must contain the name of the collection to remove the documents
from. *key-expression* must be an expression that contains the document identification.
This can either be a string (which must then contain the document key) or a
document, which must contain a *_key* attribute.
The following queries are thus equivalent:
```
FOR u IN users
REMOVE { _key: u._key } IN users
FOR u IN users
REMOVE u._key IN users
FOR u IN users
REMOVE u IN users
```
**Note**: A remove operation can remove arbitrary documents, and the documents
do not need to be identical to the ones produced by a preceding *FOR* statement:
```
FOR i IN 1..1000
REMOVE { _key: CONCAT('test', TO_STRING(i)) } IN users
FOR u IN users
FILTER u.active == false
REMOVE { _key: u._key } IN backup
```
*options* can be used to suppress query errors that may occur when trying to
remove non-existing documents. For example, the following query will fail if one
of the to-be-deleted documents does not exist:
```
FOR i IN 1..1000
REMOVE { _key: CONCAT('test', TO_STRING(i)) } IN users
```
By specifying the *ignoreErrors* query option, these errors can be suppressed so
the query completes:
```
FOR i IN 1..1000
REMOVE { _key: CONCAT('test', TO_STRING(i)) } IN users OPTIONS { ignoreErrors: true }
```
To make sure data has been written to disk when a query returns, there is the *waitForSync*
query option:
```
FOR i IN 1..1000
REMOVE { _key: CONCAT('test', TO_STRING(i)) } IN users OPTIONS { waitForSync: true }
```
!SUBSECTION UPDATE
The *UPDATE* keyword can be used to partially update documents in a collection. On a
single server, updates are executed transactionally in an all-or-nothing fashion.
For sharded collections, the entire update operation is not transactional.
Only a single *UPDATE* statement is allowed per AQL query, and it cannot be combined
with other data-modification or retrieval operations. An update operation is
restricted to a single collection, and the collection name must not be dynamic.
The two syntaxes for an update operation are:
```
UPDATE document IN collection options
UPDATE key-expression WITH document IN collection options
```
*collection* must contain the name of the collection in which the documents should
be updated. *document* must be a document that contains the attributes and values
to be updated. When using the first syntax, *document* must also contain the *_key*
attribute to identify the document to be updated.
```
FOR u IN users
UPDATE { _key: u._key, name: CONCAT(u.firstName, u.lastName) } IN users
```
The following query is invalid because it does not contain a *_key* attribute and
thus it is not possible to determine the documents to be updated:
```
FOR u IN users
UPDATE { name: CONCAT(u.firstName, u.lastName) } IN users
```
When using the second syntax, *key-expression* provides the document identification.
This can either be a string (which must then contain the document key) or a
document, which must contain a *_key* attribute.
The following queries are equivalent:
```
FOR u IN users
UPDATE u._key WITH { name: CONCAT(u.firstName, u.lastName) } IN users
FOR u IN users
UPDATE { _key: u._key } WITH { name: CONCAT(u.firstName, u.lastName) } IN users
FOR u IN users
UPDATE u WITH { name: CONCAT(u.firstName, u.lastName) } IN users
```
An update operation may update arbitrary documents which do not need to be identical
to the ones produced by a preceding *FOR* statement:
```
FOR i IN 1..1000
UPDATE CONCAT('test', TO_STRING(i)) WITH { foobar: true } IN users
FOR u IN users
FILTER u.active == false
UPDATE u WITH { status: 'inactive' } IN backup
```
*options* can be used to suppress query errors that may occur when trying to
update non-existing documents or violating unique key constraints:
```
FOR i IN 1..1000
UPDATE { _key: CONCAT('test', TO_STRING(i)) } WITH { foobar: true } IN users OPTIONS { ignoreErrors: true }
```
An update operation will only update the attributes specified in *document* and
leave other attributes untouched. Internal attributes (such as *_id*, *_key*, *_rev*,
*_from* and *_to*) cannot be updated and are ignored when specified in *document*.
Updating a document will modify the document's revision number with a server-generated value.
When updating an attribute with a null value, ArangoDB will not remove the attribute
from the document but store a null value for it. To get rid of attributes in an update
operation, set them to null and provide the *keepNull* option:
```
FOR u IN users
UPDATE u WITH { foobar: true, notNeeded: null } IN users OPTIONS { keepNull: false }
```
The above query will remove the *notNeeded* attribute from the documents and update
the *foobar* attribute normally.
To make sure data are durable when an update query returns, there is the *waitForSync*
query option:
```
FOR u IN users
UPDATE u WITH { foobar: true } IN users OPTIONS { waitForSync: true }
```
!SUBSECTION REPLACE
The *REPLACE* keyword can be used to completely replace documents in a collection. On a
single server, the replace operation is executed transactionally in an all-or-nothing
fashion. For sharded collections, the entire replace operation is not transactional.
Only a single *REPLACE* statement is allowed per AQL query, and it cannot be combined
with other data-modification or retrieval operations. A replace operation is
restricted to a single collection, and the collection name must not be dynamic.
The two syntaxes for a replace operation are:
```
REPLACE document IN collection options
REPLACE key-expression WITH document IN collection options
```
*collection* must contain the name of the collection in which the documents should
be replaced. *document* is the replacement document. When using the first syntax, *document*
must also contain the *_key* attribute to identify the document to be replaced.
```
FOR u IN users
REPLACE { _key: u._key, name: CONCAT(u.firstName, u.lastName), status: u.status } IN users
```
The following query is invalid because it does not contain a *_key* attribute and
thus it is not possible to determine the documents to be replaced:
```
FOR u IN users
REPLACE { name: CONCAT(u.firstName, u.lastName, status: u.status) } IN users
```
When using the second syntax, *key-expression* provides the document identification.
This can either be a string (which must then contain the document key) or a
document, which must contain a *_key* attribute.
The following queries are equivalent:
```
FOR u IN users
REPLACE { _key: u._key, name: CONCAT(u.firstName, u.lastName) } IN users
FOR u IN users
REPLACE u._key WITH { name: CONCAT(u.firstName, u.lastName) } IN users
FOR u IN users
REPLACE { _key: u._key } WITH { name: CONCAT(u.firstName, u.lastName) } IN users
FOR u IN users
REPLACE u WITH { name: CONCAT(u.firstName, u.lastName) } IN users
```
A replace will fully replace an existing document, but it will not modify the values
of internal attributes (such as *_id*, *_key*, *_from* and *_to*). Replacing a document
will modify a document's revision number with a server-generated value.
A replace operation may update arbitrary documents which do not need to be identical
to the ones produced by a preceding *FOR* statement:
```
FOR i IN 1..1000
REPLACE CONCAT('test', TO_STRING(i)) WITH { foobar: true } IN users
FOR u IN users
FILTER u.active == false
REPLACE u WITH { status: 'inactive', name: u.name } IN backup
```
*options* can be used to suppress query errors that may occur when trying to
replace non-existing documents or when violating unique key constraints:
```
FOR i IN 1..1000
REPLACE { _key: CONCAT('test', TO_STRING(i)) } WITH { foobar: true } IN users OPTIONS { ignoreErrors: true }
```
To make sure data are durable when a replace query returns, there is the *waitForSync*
query option:
```
FOR i IN 1..1000
REPLACE { _key: CONCAT('test', TO_STRING(i)) } WITH { foobar: true } IN users OPTIONS { waitForSync: true }
```
!SUBSECTION INSERT
The *INSERT* keyword can be used to insert new documents into a collection. On a
single server, an insert operation is executed transactionally in an all-or-nothing
fashion. For sharded collections, the entire insert operation is not transactional.
Only a single *INSERT* statement is allowed per AQL query, and it cannot be combined
with other data-modification or retrieval operations. An insert operation is
restricted to a single collection, and the collection name must not be dynamic.
The syntax for an insert operation is:
```
INSERT document IN collection options
```
**Note**: The *INTO* keyword is also allowed in the place of *IN*.
*collection* must contain the name of the collection into which the documents should
be inserted. *document* is the document to be inserted, and it may or may not contain
a *_key* attribute. If no *_key* attribute is provided, ArangoDB will auto-generate
a value for *_key* value. Inserting a document will also auto-generate a document
revision number for the document.
```
FOR i IN 1..100
INSERT { value: i } IN numbers
```
When inserting into an edge collection, it is mandatory to specify the attributes
*_from* and *_to* in document:
```
FOR u IN users
FOR p IN products
FILTER u._key == p.recommendedBy
INSERT { _from: u._id, _to: p._id } IN recommendations
```
*options* can be used to suppress query errors that may occur when violating unique
key constraints:
```
FOR i IN 1..1000
INSERT { _key: CONCAT('test', TO_STRING(i)), name: "test" } WITH { foobar: true } IN users OPTIONS { ignoreErrors: true }
```
To make sure data are durable when an insert query returns, there is the *waitForSync*
query option:
```
FOR i IN 1..1000
INSERT { _key: CONCAT('test', TO_STRING(i)), name: "test" } WITH { foobar: true } IN users OPTIONS { waitForSync: true }
```