mirror of https://gitee.com/bigwinds/arangodb
!CHAPTER Language basics

!SUBSECTION Query types

An AQL query must either return a result (indicated by usage of the *RETURN*
keyword) or execute a data-modification operation (indicated by usage
of one of the keywords *INSERT*, *UPDATE*, *REPLACE*, *REMOVE* or *UPSERT*). The AQL
parser will return an error if it detects more than one data-modification
operation in the same query or if it cannot figure out if the query is meant
to be a data retrieval or a modification operation.

AQL only allows *one* query in a single query string; thus semicolons to
indicate the end of one query and separate multiple queries (as seen in SQL) are
not allowed.

!SUBSECTION Whitespace

Whitespace (blanks, carriage returns, line feeds, and tab stops) can be used
in the query text to increase its readability. Tokens have to be separated by
any number of whitespace characters. Whitespace within strings or names must be
enclosed in quotes in order to be preserved.

!SUBSECTION Comments

Comments can be embedded at any position in a query. The text contained in the
comment is ignored by the AQL parser.

Multi-line comments cannot be nested: a comment start inside a comment is
ignored, and the first comment end terminates the comment.

AQL supports two types of comments:
- Single line comments: These start with a double forward slash and end at
  the end of the line, or the end of the query string (whichever is first).
- Multi line comments: These start with a forward slash and asterisk, and
  end with an asterisk and a following forward slash. They can span as many
  lines as necessary.


    /* this is a comment */ RETURN 1
    /* these */ RETURN /* are */ 1 /* multiple */ + /* comments */ 1
    /* this is
       a multi line
       comment */
    // a single line comment
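
The non-nesting rule can be illustrated outside of AQL. The following Python sketch is not how the AQL parser works; it is a simplified model that removes both comment types with one regular expression. As an assumption made for brevity, it ignores string literals, so comment markers inside quoted strings would be stripped as well. `strip_comments` is a hypothetical helper name.

```python
import re

# One left-to-right pattern for both comment types (sketch only):
#   /* ... */  non-greedy, so it stops at the FIRST "*/"
#   // ...     runs to the end of the line or the end of the input
# String literals are NOT handled here.
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

def strip_comments(query: str) -> str:
    """Remove AQL-style comments from a query string."""
    return _COMMENT.sub("", query)

print(strip_comments("/* this is a comment */ RETURN 1"))

# The inner "/*" is ignored and the first "*/" closes the comment,
# so this leaves " RETURN 1 */" (a stray comment end) behind.
print(strip_comments("/* outer /* inner */ RETURN 1 */"))
```

Because the pattern is non-greedy, the first comment end closes the comment, which mirrors the rule that comment starts inside a comment are ignored.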

!SUBSECTION Keywords

On the top level, AQL offers the following operations:
- `FOR`: array iteration
- `RETURN`: results projection
- `FILTER`: results filtering
- `SORT`: result sorting
- `LIMIT`: result slicing
- `LET`: variable assignment
- `COLLECT`: result grouping
- `INSERT`: insertion of new documents
- `UPDATE`: (partial) update of existing documents
- `REPLACE`: replacement of existing documents
- `REMOVE`: removal of existing documents
- `UPSERT`: insertion or update of existing documents

Each of the above operations can be initiated in a query by using a keyword of
the same name. An AQL query can (and typically does) consist of multiple of the
above operations.

An example AQL query may look like this:

    FOR u IN users
      FILTER u.type == "newbie" && u.active == true
      RETURN u.name

In this example query, the terms *FOR*, *FILTER*, and *RETURN* initiate the
higher-level operation according to their name. These terms are also keywords,
meaning that they have a special meaning in the language.

The query parser uses the keywords to find out which high-level
operations to execute. That also means keywords can only be used at certain
locations in a query. This also makes all keywords reserved words that must not
be used for other purposes than they are intended for.

For example, it is not possible to use a keyword as a collection or attribute
name. If a collection or attribute needs to have the same name as a keyword, the
collection or attribute name needs to be quoted.

Keywords are case-insensitive, meaning they can be specified in lower, upper, or
mixed case in queries. In this documentation, all keywords are written in upper
case to make them distinguishable from other query parts.

In addition to the higher-level operations keywords, there are other keywords.
The current list of keywords is:

- FOR
- RETURN
- FILTER
- SORT
- LIMIT
- LET
- COLLECT
- INSERT
- UPDATE
- REPLACE
- REMOVE
- UPSERT
- WITH
- ASC
- DESC
- IN
- INTO
- NOT
- AND
- OR
- NULL
- TRUE
- FALSE

Additional keywords may be added in future versions of ArangoDB.

!SUBSECTION Names

In general, names are used to identify objects (collections, attributes,
variables, and functions) in AQL queries.

The maximum supported length of any name is 64 bytes. Names in AQL are always
case-sensitive.

Keywords must not be used as names. If a reserved keyword should be used as a
name, the name must be enclosed in backticks. For example:

    FOR f IN `filter`
      RETURN f.`sort`

Due to the backticks, *filter* and *sort* are interpreted as names and not as
keywords here.
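
An application that builds query strings can apply the backtick rule mechanically. The following Python sketch is only an illustration: the keyword set is copied from the list in this section (other ArangoDB versions may reserve more words), and `is_keyword` and `quote_name` are hypothetical helper names.

```python
# Keyword list copied from this section; it may be incomplete for other versions.
AQL_KEYWORDS = {
    "FOR", "RETURN", "FILTER", "SORT", "LIMIT", "LET", "COLLECT", "INSERT",
    "UPDATE", "REPLACE", "REMOVE", "UPSERT", "WITH", "ASC", "DESC", "IN",
    "INTO", "NOT", "AND", "OR", "NULL", "TRUE", "FALSE",
}

def is_keyword(name: str) -> bool:
    # Keywords are case-insensitive, so compare in upper case.
    return name.upper() in AQL_KEYWORDS

def quote_name(name: str) -> str:
    """Wrap a name in backticks only when it collides with a keyword."""
    if "`" in name:
        raise ValueError("names containing backticks are not handled by this sketch")
    return f"`{name}`" if is_keyword(name) else name

query = f"FOR f IN {quote_name('filter')} RETURN f.{quote_name('sort')}"
print(query)  # FOR f IN `filter` RETURN f.`sort`
```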

!SUBSUBSECTION Collection names

Collection names can be used in queries as they are. If a collection happens to
have the same name as a keyword, the name must be enclosed in backticks.

Please refer to the [Naming Conventions in ArangoDB](../NamingConventions/CollectionNames.md) for more information about
collection naming conventions.

!SUBSUBSECTION Attribute names

When referring to attributes of documents from a collection, the fully qualified
attribute name must be used. This is because multiple collections with ambiguous
attribute names may be used in a query. To avoid any ambiguity, it is not
allowed to refer to an unqualified attribute name.

Please refer to the [Naming Conventions in ArangoDB](../NamingConventions/AttributeNames.md) for more information about the
attribute naming conventions.

    FOR u IN users
      FOR f IN friends
        FILTER u.active == true && f.active == true && u.id == f.userId
        RETURN u.name

In the above example, the attribute names *active*, *name*, *id*, and *userId*
are qualified using the variables that iterate over the collections they belong
to (*u* and *f* respectively).

!SUBSUBSECTION Variable names

AQL allows the user to assign values to additional variables in a query. All
variables that are assigned a value must have a name that is unique within the
context of the query. Variable names must be different from the name of any
collection used in the same query.

    FOR u IN users
      LET friends = u.friends
      RETURN { "name" : u.name, "friends" : friends }

In the above query, *users* is a collection name, and both *u* and *friends* are
variable names. This is because the *FOR* and *LET* operations need target
variables to store their intermediate results.

Allowed characters in variable names are the letters *a* to *z* (both in lower
and upper case), the numbers *0* to *9* and the underscore (*_*) symbol. A
variable name must not start with a number. If a variable name starts with the
underscore character, it must also contain at least one letter (a-z or A-Z).
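
The character rules above can be expressed as a small check. The following Python sketch encodes only what this section states (allowed characters, no leading digit, at least one letter, and the 64-byte limit for names); `is_valid_variable_name` is a hypothetical helper, not part of any ArangoDB API.

```python
import re

# Letters, digits and underscore only; the first character must not be a digit.
_ALLOWED = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_HAS_LETTER = re.compile(r"[A-Za-z]")

def is_valid_variable_name(name: str) -> bool:
    if not _ALLOWED.fullmatch(name):
        return False
    if len(name.encode("utf-8")) > 64:      # maximum name length stated for AQL names
        return False
    # Names starting with "_" must contain a letter; names starting with a
    # letter satisfy this trivially.
    return _HAS_LETTER.search(name) is not None

for candidate in ["u", "friends", "_tmp", "_1", "1abc", "a-b"]:
    print(candidate, is_valid_variable_name(candidate))
```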

!SUBSECTION Data types

AQL supports both primitive and compound data types. The following types are
available:

- Primitive types: Consisting of exactly one value
  - null: An empty value, also: The absence of a value
  - bool: Boolean truth value with possible values *false* and *true*
  - number: Signed (real) number
  - string: UTF-8 encoded text value
- Compound types: Consisting of multiple values
  - array: Sequence of values, referred to by their positions
  - object / document: Sequence of values, referred to by their names

!SUBSUBSECTION Numeric literals

Numeric literals can be integers or real values. They can optionally be signed
using the *+* or *-* symbols. The scientific notation is also supported.

    1
    42
    -1
    -42
    1.23
    -99.99
    0.1
    -4.87e103

All numeric values are treated as 64-bit double-precision values internally.
The internal format used is IEEE 754.
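
Two practical consequences of the double-precision format are worth keeping in mind. The following Python sketch uses Python floats as a stand-in, on the assumption that they are IEEE 754 doubles too (true on common platforms); it does not talk to ArangoDB.

```python
import struct

# A double occupies 8 bytes, i.e. 64 bits.
encoded = struct.pack("<d", -4.87e103)
size_in_bits = len(encoded) * 8

# Decimal fractions are stored as the nearest binary fraction,
# so sums of decimals are not always exact.
sum_is_exact = (0.1 + 0.2 == 0.3)

# Integers are exact only up to 2**53; beyond that, neighbouring
# integers map to the same double.
largest_safe = 2**53
neighbours_collapse = float(largest_safe) == float(largest_safe + 1)

print(size_in_bits, sum_is_exact, neighbours_collapse)  # 64 False True
```

Values such as large integer identifiers may therefore be safer to store as strings if exactness matters.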

!SUBSUBSECTION String literals

String literals must be enclosed in single or double quotes. If the used quote
character is to be used itself within the string literal, it must be escaped
using the backslash symbol. Backslash literals themselves must also be escaped
using a backslash.

    "yikes!"
    "don't know"
    "this is a \"quoted\" word"
    "this is a longer string."
    "the path separator on Windows is \\"

    'yikes!'
    'don\'t know'
    'this is a longer string.'
    'the path separator on Windows is \\'

All string literals must be UTF-8 encoded. It is currently not possible to use
arbitrary binary data if it is not UTF-8 encoded. A workaround to use binary
data is to encode the data using base64 or other algorithms on the application
side before storing, and decoding it on application side after retrieval.
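
The base64 workaround could look like the following on the application side. This is a minimal Python sketch using only the standard library; `encode_binary` and `decode_binary` are hypothetical helper names, and storing or retrieving the resulting string is left out.

```python
import base64

def encode_binary(data: bytes) -> str:
    """Turn arbitrary bytes into an ASCII (and therefore UTF-8 safe) string."""
    return base64.b64encode(data).decode("ascii")

def decode_binary(text: str) -> bytes:
    """Recover the original bytes after the string has been read back."""
    return base64.b64decode(text)

raw = bytes([0x00, 0xFF, 0xFE, 0x80])   # not valid UTF-8 on its own
stored = encode_binary(raw)             # safe to use as a string value
restored = decode_binary(stored)

print(stored, restored == raw)
```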

!SUBSUBSECTION Arrays

AQL supports two compound types:

- arrays: A composition of unnamed values, each accessible by their positions
- objects / documents: A composition of named values, each accessible by their names

The first supported compound type is the array type. Arrays are effectively
sequences of (unnamed / anonymous) values. Individual array elements can be
accessed by their positions. The order of elements in an array is important.

An *array-declaration* starts with the *[* symbol and ends with the *]* symbol. An
*array-declaration* contains zero or more *expression*s, separated from each
other with the *,* symbol.

In the simplest case, an array is empty and thus looks like:

    [ ]

Array elements can be any legal *expression* values. Nesting of arrays is
supported.

    [ 1, 2, 3 ]
    [ -99, "yikes!", [ true, [ "no"], [ ] ], 1 ]
    [ [ "fox", "marshal" ] ]

Individual array values can later be accessed by their positions using the *[]*
accessor. The position of the accessed element must be a numeric
value. Positions start at 0. It is also possible to use negative index values
to access array values starting from the end of the array. This is convenient if
the length of the array is unknown and access to elements at the end of the array
is required.

    // access 1st array element (elements start at index 0)
    u.friends[0]

    // access 3rd array element
    u.friends[2]

    // access last array element
    u.friends[-1]

    // access second last array element
    u.friends[-2]
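
The mapping from negative positions to elements can be spelled out explicitly. The following Python sketch models only the positional rule described above; `element_at` is a hypothetical helper, and because this section does not describe out-of-range access, the sketch simply raises an error in that case.

```python
def element_at(values: list, position: int):
    """0-based access; negative positions count back from the end."""
    index = position if position >= 0 else len(values) + position
    if not 0 <= index < len(values):
        # Out-of-range behaviour is not covered by this section.
        raise IndexError(position)
    return values[index]

friends = ["anna", "bob", "carl", "dave"]

print(element_at(friends, 0))    # anna  (1st element)
print(element_at(friends, 2))    # carl  (3rd element)
print(element_at(friends, -1))   # dave  (last element)
print(element_at(friends, -2))   # carl  (second last element)
```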

!SUBSUBSECTION Objects / Documents

The other supported compound type is the object (or document) type. Objects are a
composition of zero to many attributes. Each attribute is a name/value pair.
Object attributes can be accessed individually by their names.

Object declarations start with the *{* symbol and end with the *}* symbol. An
object contains zero to many attribute declarations, separated from each other
with the *,* symbol. In the simplest case, an object is empty. Its
declaration would then be:

    { }

Each attribute in an object is a name / value pair. Name and value of an
attribute are separated using the *:* symbol.

The attribute name is mandatory and must be specified as a quoted or unquoted
string. If a keyword is to be used as an attribute name, the name must be
quoted.

Any valid expression can be used as an attribute value. That also means nested
objects can be used as attribute values:

    { name : "Peter" }
    { "name" : "Vanessa", "age" : 15 }
    { "name" : "John", likes : [ "Swimming", "Skiing" ], "address" : { "street" : "Cucumber lane", "zip" : "94242" } }

Individual object attributes can later be accessed by their names using the
*.* accessor. If a non-existing attribute is accessed, the result is *null*.

    u.address.city.name
    u.friends[0].name.first

!SUBSECTION Bind parameters

AQL supports the usage of bind parameters, which makes it possible to separate
the query text from literal values used in the query. It is good practice to
separate the query text from the literal values because this will prevent
(malicious) injection of keywords and other collection names into an existing
query. This injection would be dangerous because it may change the meaning of an
existing query.

Using bind parameters, the meaning of an existing query cannot be changed. Bind
parameters can be used everywhere in a query where literals can be used.

The syntax for bind parameters is *@nameparameter* where *nameparameter* is the
actual parameter name. The bind parameter values need to be passed along with
the query when it is executed, but not as part of the query text itself.

    FOR u IN users
      FILTER u.id == @id && u.name == @nameparameter
      RETURN u

Bind parameter names must start with any of the letters *a* to *z* (both in
lower and upper case) or a digit (*0* to *9*), and can be followed by any
letter, digit or the underscore symbol.

A special type of bind parameter exists for injecting collection names. This
type of bind parameter has a name prefixed with an additional *@* symbol (thus
when using the bind parameter in a query, two *@* symbols must be used).

    FOR u IN @@collection
      FILTER u.active == true
      RETURN u
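
To make the separation of query text and values concrete, the following Python sketch validates bind parameter names against the naming rule above and packs query and values into one request body. Several things here are assumptions that this section does not state: the `query` and `bindVars` field names follow ArangoDB's HTTP cursor API, collection parameters are keyed with a single leading `@`, and `is_valid_bind_name` and `build_request` are hypothetical helpers. No request is actually sent.

```python
import json
import re

# Starts with a letter or digit, followed by letters, digits or underscores.
_BIND_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*")

def is_valid_bind_name(name: str) -> bool:
    # Assumption: a single leading "@" marks a collection bind parameter.
    plain = name[1:] if name.startswith("@") else name
    return _BIND_NAME.fullmatch(plain) is not None

def build_request(query: str, bind_vars: dict) -> str:
    bad = [name for name in bind_vars if not is_valid_bind_name(name)]
    if bad:
        raise ValueError(f"invalid bind parameter names: {bad}")
    # The values travel next to the query text, never inside it.
    return json.dumps({"query": query, "bindVars": bind_vars})

body = build_request(
    "FOR u IN @@collection FILTER u.id == @id && u.name == @nameparameter RETURN u",
    {"@collection": "users", "id": 123, "nameparameter": "x' || true || '"},
)
print(body)
```

The suspicious-looking value for `nameparameter` stays a plain string value; it never becomes part of the query text, which is the point of using bind parameters.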

!SUBSECTION Type and value order

When checking for equality or inequality or when determining the sort order of
values, AQL uses a deterministic algorithm that takes both the data types and
the actual values into account.

The compared operands are first compared by their data types, and only by their
data values if the operands have the same data types.

The following type order is used when comparing data types:

    null < bool < number < string < array < object / document

This means *null* is the smallest type in AQL and *document* is the type with
the highest order. If the compared operands have a different type, then the
comparison result is determined and the comparison is finished.

For example, the boolean *true* value will always be less than any numeric or
string value, any array (even an empty array) or any object / document. Additionally, any
string value (even an empty string) will always be greater than any numeric
value or boolean value (*true* or *false*).

    null < false
    null < true
    null < 0
    null < ''
    null < ' '
    null < '0'
    null < 'abc'
    null < [ ]
    null < { }

    false < true
    false < 0
    false < ''
    false < ' '
    false < '0'
    false < 'abc'
    false < [ ]
    false < { }

    true < 0
    true < ''
    true < ' '
    true < '0'
    true < 'abc'
    true < [ ]
    true < { }

    0 < ''
    0 < ' '
    0 < '0'
    0 < 'abc'
    0 < [ ]
    0 < { }

    '' < ' '
    '' < '0'
    '' < 'abc'
    '' < [ ]
    '' < { }

    [ ] < { }

If the two compared operands have the same data types, then the operands' values
are compared. For the primitive types (null, boolean, number, and string), the
result is defined as follows:

- null: *null* is equal to *null*
- boolean: *false* is less than *true*
- number: numeric values are ordered by their cardinal value
- string: string values are ordered using a localized comparison

Note: unlike in SQL, *null* can be compared to any value, including *null*
itself, without the result being converted into *null* automatically.

For compound types, the following special rules are applied:

Two array values are compared by comparing their individual elements position by
position, starting at the first element. For each position, the element types
are compared first. If the types are not equal, the comparison result is
determined, and the comparison is finished. If the types are equal, then the
values of the two elements are compared. If one of the arrays is finished and
the other array still has an element at a compared position, then *null* will be
used as the element value of the fully traversed array.

If an array element is itself a compound value (an array or an object / document), then the
comparison algorithm will compare the element's sub-values recursively.

    [ ] < [ 0 ]
    [ 1 ] < [ 2 ]
    [ 1, 2 ] < [ 2 ]
    [ 99, 99 ] < [ 100 ]
    [ false ] < [ true ]
    [ false, 1 ] < [ false, '' ]

Two object / document operands are compared by checking attribute names and values. The
attribute names are compared first. Before attribute names are compared, a
combined array of all attribute names from both operands is created and sorted
lexicographically. This means that the order in which attributes are declared
in an object / document is not relevant when comparing two objects / documents.

The combined and sorted array of attribute names is then traversed, and the
respective attributes from the two compared operands are then looked up. If one
of the objects / documents does not have an attribute with the sought name, its attribute
value is considered to be *null*. Finally, the attribute values of both
objects / documents are compared using the aforementioned data type and value comparison.
The comparisons are performed for all object / document attributes until there is an
unambiguous comparison result. If an unambiguous comparison result is found, the
comparison is finished. If there is no unambiguous comparison result, the two
compared objects / documents are considered equal.

    { } < { "a" : 1 }
    { } == { "a" : null }
    { "a" : 1 } < { "a" : 2 }
    { "b" : 1 } < { "a" : 0 }
    { "a" : { "c" : true } } < { "a" : { "c" : 0 } }
    { "a" : { "c" : true, "a" : 0 } } < { "a" : { "c" : false, "a" : 1 } }

    { "a" : 1, "b" : 2 } == { "b" : 2, "a" : 1 }
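
The rules of this section can be modeled in a few lines. The following Python sketch illustrates the described algorithm and is not ArangoDB's implementation: Python's `None`, `bool`, numbers, `str`, `list` and `dict` stand in for the AQL types, plain code point ordering stands in for the localized string comparison, and `type_rank` and `compare` are hypothetical helper names.

```python
from functools import cmp_to_key

def type_rank(value) -> int:
    """Type order: null < bool < number < string < array < object."""
    if value is None:
        return 0
    if isinstance(value, bool):          # must precede the number check: bool is an int in Python
        return 1
    if isinstance(value, (int, float)):
        return 2
    if isinstance(value, str):
        return 3
    if isinstance(value, list):
        return 4
    if isinstance(value, dict):
        return 5
    raise TypeError(f"unsupported type: {type(value).__name__}")

def compare(a, b) -> int:
    """Return -1, 0 or 1 for a < b, a == b, a > b."""
    rank_a, rank_b = type_rank(a), type_rank(b)
    if rank_a != rank_b:                 # different types decide the result
        return -1 if rank_a < rank_b else 1
    if rank_a == 0:                      # null equals null
        return 0
    if rank_a in (1, 2, 3):              # bool, number, string (stand-in ordering)
        return (a > b) - (a < b)
    if rank_a == 4:                      # arrays: position by position, null-padded
        for i in range(max(len(a), len(b))):
            left = a[i] if i < len(a) else None
            right = b[i] if i < len(b) else None
            result = compare(left, right)
            if result != 0:
                return result
        return 0
    # objects: walk the sorted union of attribute names, missing means null
    for name in sorted(set(a) | set(b)):
        result = compare(a.get(name), b.get(name))
        if result != 0:
            return result
    return 0

print(compare([1, 2], [2]))                       # -1
print(compare({"b": 1}, {"a": 0}))                # -1
print(compare({"a": 1, "b": 2}, {"b": 2, "a": 1}))  # 0
print(sorted([{}, [], "abc", 0, True, None, False], key=cmp_to_key(compare)))
```

Under these rules an object without an attribute and an object holding *null* for that attribute compare as equal, because the missing value is looked up as *null*.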

!SUBSECTION Accessing data from collections

Collection data can be accessed by specifying a collection name in a query. A
collection can be understood as an array of documents, and that is how they are
treated in AQL. Documents from collections are normally accessed using the
*FOR* keyword. Note that when iterating over documents from a collection, the
order of documents is undefined. To traverse documents in an explicit and
deterministic order, the *SORT* keyword should be used in addition.

Data in collections is stored in documents, with each document potentially
having different attributes than other documents. This is true even for
documents of the same collection.

It is therefore quite normal to encounter documents that do not have some or all
of the attributes that are queried in an AQL query. In this case, the
non-existing attributes in the document will be treated as if they existed
with a value of *null*. That means that comparing a document attribute to
*null* will return true if the document has the particular attribute and the
attribute has a value of *null*, or if the document does not have the
particular attribute at all.

For example, the following query will return all documents from the collection
*users* that have a value of *null* in the attribute *name*, plus all documents
from *users* that do not have the *name* attribute at all:

    FOR u IN users
      FILTER u.name == null
      RETURN u

Furthermore, *null* is less than any other value (excluding *null* itself). That
means documents with non-existing attributes may be included in the result
when comparing attribute values with the less-than or less-than-or-equal operators.

For example, the following query will return all documents from the collection
*users* that have an attribute *age* with a value less than *39*, but also all
documents from the collection that do not have the attribute *age* at all.

    FOR u IN users
      FILTER u.age < 39
      RETURN u
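
The effect of both filters can be reproduced on plain data. The following Python sketch uses a list of dictionaries as a stand-in for the *users* collection, with `None` playing the role of *null*; the sample documents and the helper names `name_is_null` and `age_below` are made up for illustration, and ages are assumed to be numbers or missing.

```python
users = [
    {"name": "alice", "age": 31},
    {"name": None, "age": 45},      # "name" exists with a null value
    {"age": 12},                    # no "name" attribute at all
    {"name": "bob"},                # no "age" attribute at all
]

def name_is_null(doc: dict) -> bool:
    # Models FILTER u.name == null: a missing attribute is treated as null.
    return doc.get("name") is None

def age_below(doc: dict, limit: float) -> bool:
    # Models FILTER u.age < limit: null sorts below every number,
    # so documents without "age" pass the filter as well.
    age = doc.get("age")
    return age is None or age < limit

null_names = [u for u in users if name_is_null(u)]
young_or_unknown = [u for u in users if age_below(u, 39)]

print(null_names)         # the 2nd and 3rd documents
print(young_or_unknown)   # the 1st, 3rd and 4th documents
```

To exclude documents without the attribute, a query can add a condition such as `u.age != null` to the filter.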

This behavior should always be taken into account when writing queries.