ArangoDB

ArangoDB Query Language (AQL)



Introduction

The ArangoDB query language (AQL) can be used to retrieve data that is stored in ArangoDB. The general workflow when executing a query is as follows:

  • A client application ships an AQL query to the ArangoDB server. The query text contains everything ArangoDB needs to compile the result set.
  • ArangoDB parses the query, executes it, and compiles the results. If the query is invalid or cannot be executed, the server returns an error that the client can process and react to. If the query can be executed successfully, the server returns the query results to the client.

AQL is mainly a declarative language: a query expresses what result should be achieved, not how it should be achieved. AQL aims to be human-readable and therefore uses keywords from the English language. Another design goal of AQL was client independence, meaning that the language and syntax are the same for all clients, no matter what programming language the clients use. Further design goals of AQL were to support complex query patterns and the different data models ArangoDB offers.

In its purpose, AQL is similar to the Structured Query Language (SQL), but the two languages have major syntactic differences. Furthermore, to avoid any confusion between the two languages, the keywords in AQL have been chosen to be different from the keywords used in SQL.

AQL currently supports reading data only. That means you can use the language to issue read requests on your database, but modifying data via AQL is currently not supported.

For some example queries, please refer to the page ArangoDB Query Language (AQL) Examples.

How to invoke AQL

You can run AQL queries from your application via the HTTP REST API. The full API description is available at HTTP Interface for AQL Query Cursors.

You can also run AQL queries from arangosh. To do so, first create an ArangoStatement object as follows:

 arangosh> stmt = db._createStatement( { "query": "FOR i IN [ 1, 2 ] RETURN i * 2" } );
 [object ArangoStatement]

To execute the query, use the execute method:

 arangosh> c = stmt.execute();
 [object ArangoQueryCursor]

This has executed the query. The query results are available in a cursor now. The cursor can return all its results at once using the elements method:

 arangosh> c.elements();
 [2, 4]

To execute a query using bind parameters, you need to create a statement first and then bind the parameters to it before execution:

 arangosh> stmt = db._createStatement( { "query": "FOR i IN [ @one, @two ] RETURN i * 2" } );
 [object ArangoStatement]
 arangosh> stmt.bind("one", 1);
 arangosh> stmt.bind("two", 2);
 arangosh> c = stmt.execute();
 [object ArangoQueryCursor]

The cursor results can then be dumped or traversed:

 arangosh> while (c.hasNext()) { print(c.next()); }
 2
 4

Please note that each cursor can be used exactly once because cursors are forward-only. Once all cursor results have been dumped or iterated, the cursor is empty. To iterate through the results again, the query needs to be re-executed.

Please also note that when using bind parameters, you must not re-declare an existing bind parameter because this will be considered an error:

 arangosh> stmt = db._createStatement( { "query": "FOR i IN [ @one, @two ] RETURN i * 2" } );
 [object ArangoStatement]
 arangosh> stmt.bind("one", 1);
 arangosh> stmt.bind("one", 1);
 JavaScript exception in file 'client/client.js' at 771,9: redeclaration of bind parameter

Query results

Result sets

The result of an AQL query is a list of values. The individual values in the result list may or may not have a homogeneous structure, depending on what is actually queried.

For example, when returning data from a collection with inhomogeneous documents (the individual documents in the collection have different attribute names) without modification, the result values will also have an inhomogeneous structure. Each result value itself is a document:

FOR u IN users
  RETURN u

[ { "id" : 1, "name" : "John", "active" : false }, 
  { "age" : 32, "id" : 2, "name" : "Vanessa" }, 
  { "friends" : [ "John", "Vanessa" ], "id" : 3, "name" : "Amy" } ]

However, if a fixed set of attributes from the collection is queried, then the query result values will have a homogeneous structure. Each result value is still a document:

FOR u IN users
  RETURN { "id" : u.id, "name" : u.name }

[ { "id" : 1, "name" : "John" }, 
  { "id" : 2, "name" : "Vanessa" }, 
  { "id" : 3, "name" : "Amy" } ]

It is also possible to query just scalar values. In this case, the result set is a list of scalars, and each result value is a scalar value:

FOR u IN users
  RETURN u.id

[ 1, 2, 3 ]

If a query does not produce any results because no matching data can be found, it will produce an empty result list:

[ ]

Errors

Issuing an invalid query to the server will result in a parse error if the query is syntactically invalid. ArangoDB will detect such errors during query inspection and abort further processing. Instead, the error number and an error message are returned so that the errors can be fixed.

If a query passes the parsing stage, all collections referenced in the query will be opened. If any of the referenced collections is not present, query execution will again be aborted and an appropriate error message will be returned.

Executing a query might also produce run-time errors under some circumstances that cannot be predicted from inspecting the query text alone. This is because queries might use data from collections that might also be inhomogeneous. Some examples that will cause run-time errors are:

  • division by zero: will be triggered when an attempt is made to use the value 0 as the divisor in an arithmetic division or modulus operation.
  • invalid operands for arithmetic operations: will be triggered when an attempt is made to use any non-numeric values as operands in arithmetic operations. This includes unary (unary minus, unary plus) and binary operations (plus, minus, multiplication, division, and modulus).
  • invalid operands for logical operations: will be triggered when an attempt is made to use any non-boolean values as operand(s) in logical operations. This includes unary (logical not/negation), binary (logical and, logical or), and the ternary operators.

Please refer to the Error codes and meanings page for a list of error codes and meanings.

Language basics

Whitespace

Whitespace can be used in the query text to increase its readability. However, for the parser any whitespace (spaces, carriage returns, line feeds, and tab stops) does not have any special meaning except that it separates individual tokens in the query. Whitespace within strings or names must be enclosed in quotes in order to be preserved.

Comments

Comments can be embedded at any position in a query. The text contained in the comment is ignored by the language parser. Comments cannot be nested, meaning the comment text may not contain another comment.

/* this is a comment */ RETURN 1

/* these */ RETURN /* are */ 1 /* multiple */ + /* comments */ 1

Keywords

On the top level, AQL offers the following operations:

  • FOR: list iteration
  • RETURN: results projection
  • FILTER: results filtering
  • SORT: result sorting
  • LIMIT: result slicing
  • LET: variable assignment
  • COLLECT: result grouping

Each of the above operations can be initiated in a query by using a keyword of the same name. An AQL query can (and typically does) consist of multiple of the above operations.

An example AQL query might look like this:

FOR u IN users
  FILTER u.type == "newbie" && u.active == true
  RETURN u.name

In this example query, the terms FOR, FILTER, and RETURN initiate the higher-level operation according to their name. These terms are also keywords, meaning that they have a special meaning in the language.

For example, the query parser will use the keywords to find out which high-level operations to execute. That also means keywords can only be used at certain locations in a query. This also makes all keywords reserved words that must not be used for purposes other than those they are intended for.

For example, it is not possible to use a keyword as a collection or attribute name. If a collection or attribute needs to have the same name as a keyword, the collection or attribute name needs to be quoted.

Keywords are case-insensitive, meaning they can be specified in lower, upper, or mixed case in queries. In this documentation, all keywords are written in upper case to make them distinguishable from other query parts.

In addition to the keywords for the higher-level operations, there are other keywords. The current list of keywords is:

  • FOR
  • RETURN
  • FILTER
  • SORT
  • LIMIT
  • LET
  • COLLECT
  • ASC
  • DESC
  • IN
  • INTO
  • NULL
  • TRUE
  • FALSE

Additional keywords might be added in future versions of ArangoDB.

Names

In general, names are used to identify objects (collections, attributes, variables, and functions) in AQL queries.

The maximum supported length of any name is 64 bytes. Names in AQL are always case-sensitive.

Keywords must not be used as names. If a reserved keyword should be used as a name anyway, the name must be enclosed in backticks. For example:

FOR f IN `filter` 
  RETURN f.`sort`

Due to the backticks, filter and sort are interpreted as names and not as keywords here.

Collection names

Collection names can be used in queries as they are. If a collection happens to have the same name as a keyword, the name must be enclosed in backticks.

Allowed characters in collection names are the letters a to z (both in lower and upper case), the numbers 0 to 9, and the underscore (_) symbol. A collection name must start with either a letter or a number, but not with an underscore.

Attribute names

When referring to attributes of documents from a collection, the fully qualified attribute name must be used. This is because multiple collections with ambiguous attribute names might be used in a query. To avoid any ambiguity, it is not allowed to refer to an unqualified attribute name.

FOR u IN users
  FOR f IN friends
    FILTER u.active == true && f.active == true && u.id == f.userId
    RETURN u.name

In the above example, the attribute names active, name, id, and userId are qualified using the variables that iterate over the collections they belong to (u and f respectively).

Variable names

AQL allows the user to assign values to additional variables in a query. All variables that are assigned a value must have a name that is unique within the context of the query. Variable names must be different from the name of any collection used in the same query.

FOR u IN users
  LET friends = u.friends
  RETURN { "name" : u.name, "friends" : friends }

In the above query, users is a collection name, and both u and friends are variable names. This is because the FOR and LET operations need target variables to store their intermediate results.

Allowed characters in variable names are the letters a to z (both in lower and upper case), the numbers 0 to 9 and the underscore (_) symbol. A variable name must not start with a number. If a variable name starts with the underscore character, it must also contain at least one letter (a-z or A-Z).
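
The naming rules above can be sketched as a small check in plain JavaScript. The function name isValidVariableName is hypothetical and not part of ArangoDB. The sketch covers only the character rules and the 64 byte limit for names; it does not check for keywords or for clashes with collection names.

```javascript
// Hypothetical validator for AQL variable names, following the rules stated above.
function isValidVariableName(name) {
  // Only letters, digits, and underscores are allowed; an empty name is rejected.
  if (!/^[A-Za-z0-9_]+$/.test(name)) return false;
  // Names are limited to 64 bytes; all allowed characters are single-byte ASCII.
  if (name.length > 64) return false;
  // A variable name must not start with a digit.
  if (/^[0-9]/.test(name)) return false;
  // A name starting with an underscore must contain at least one letter.
  if (name.startsWith("_") && !/[A-Za-z]/.test(name)) return false;
  return true;
}
```

With this sketch, friends and _tmp1 are accepted, while 1st, _1, and my-var are rejected.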

Data types

AQL supports both primitive and compound data types. The following types are available:

  • primitive types: consisting of exactly one value
    • null: an empty value, also: the absence of a value
    • bool: boolean truth value with possible values false and true
    • number: signed (real) number
    • string: UTF-8 encoded text value
  • compound types: consisting of multiple values
    • list: sequence of values, referred to by their positions
    • document: sequence of values, referred to by their names

Numeric literals

Numeric literals can be integers or real values. They can optionally be signed using the + or - symbols. Scientific notation is also supported.

1
42
-1
-42
1.23
-99.99
0.1
-4.87e103

All numeric values are treated as 64-bit double-precision values internally. The internal format used is IEEE 754.
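
JavaScript numbers use the same IEEE 754 double-precision format, so the following plain JavaScript sketch illustrates the practical limits of that format. It is only an analogy, assuming that AQL arithmetic behaves like ordinary double arithmetic as stated above; the variable names are chosen for illustration.

```javascript
// 0.1 and 0.2 have no exact binary representation, so their sum is not exactly 0.3.
const sum = 0.1 + 0.2;
const sumIsExact = (sum === 0.3);

// Comparing against a small tolerance is a common way to handle rounding errors.
const closeEnough = Math.abs(sum - 0.3) < Number.EPSILON;

// Integers are exact only up to 2^53 - 1; beyond that, neighbouring integers collapse.
const largestExactInteger = Number.MAX_SAFE_INTEGER;
const precisionLost = (largestExactInteger + 1 === largestExactInteger + 2);
```

Here sumIsExact is false, while closeEnough and precisionLost are both true.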

String literals

String literals must be enclosed in single or double quotes. If the used quote character is to be used itself within the string literal, it must be escaped using the backslash symbol. Backslash literals themselves must also be escaped using a backslash.

"yikes!"
"don't know"
"this is a \"quoted\" word"
"this is a longer string."
"the path separator on Windows is \\"

'yikes!'
'don\'t know'
'this is a longer string.'
'the path separator on Windows is \\'

All string literals must be UTF-8 encoded. It is currently not possible to use arbitrary binary data if it is not UTF-8 encoded. A workaround to use binary data is to encode the data using base64 or other algorithms on the application side before storing, and to decode it on the application side after retrieval.

Lists

AQL supports two compound types:

  • lists: a composition of unnamed values, each accessible by their positions
  • documents: a composition of named values, each accessible by their names

The first supported compound type is the list type. Lists are effectively sequences of (unnamed/anonymous) values. Individual list elements can be accessed by their positions. The order of elements in a list is important.

A list declaration starts with the [ symbol and ends with the ] symbol. A list declaration contains zero or more expressions, separated from each other with the , symbol.

In the simplest case, a list is empty and thus looks like:

[ ]

List elements can be any legal expression values. Nesting of lists is supported.

[ 1, 2, 3 ]
[ -99, "yikes!", [ true, [ "no"], [ ] ], 1 ]
[ [ "fox", "marshal" ] ] 

Individual list values can later be accessed by their positions using the [] accessor. The position of the accessed element must be a numeric value. Positions start at 0.

u.friends[2]

Documents

The other supported compound type is the document type. Documents are a composition of zero to many attributes. Each attribute is a name/value pair. Document attributes can be accessed individually by their names.

Document declarations start with the { symbol and end with the } symbol. A document contains zero or more attribute declarations, separated from each other with the , symbol. In the simplest case, a document is empty. Its declaration would then be:

{ }

Each attribute in a document is a name/value pair. Name and value of an attribute are separated using the : symbol.

The attribute name is mandatory and must be specified as a quoted or unquoted string. If a keyword is to be used as an attribute name, the name must be quoted.

Any valid expression can be used as an attribute value. That also means nested documents can be used as attribute values:

{ name : "Peter" }
{ "name" : "Vanessa", "age" : 15 }
{ "name" : "John", likes : [ "Swimming", "Skiing" ], "address" : { "street" : "Cucumber lane", "zip" : "94242" } }

Individual document attributes can later be accessed by their names using the . accessor. If a non-existing attribute is accessed, the result is null.

u.address.city.name
u.friends[0].name.first
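
The rule that a non-existing attribute yields null can be modelled in plain JavaScript as follows. The helpers getAttribute and getPath are hypothetical names used for illustration only. The sketch additionally assumes that accessing an attribute of a non-document value (such as the null produced by an earlier step in a chain) also yields null; the text above does not state this explicitly.

```javascript
// Hypothetical model of the . accessor: a missing attribute yields null.
function getAttribute(value, name) {
  // Only documents (plain objects) have named attributes in this model.
  const isDocument = value !== null && typeof value === "object" && !Array.isArray(value);
  if (isDocument && Object.prototype.hasOwnProperty.call(value, name)) {
    return value[name];
  }
  return null;
}

// Follows a chain of attribute names, for example ["address", "city", "name"].
function getPath(doc, names) {
  return names.reduce((current, name) => getAttribute(current, name), doc);
}

const u = { name: "John", address: { street: "Cucumber lane" } };
const street = getPath(u, ["address", "street"]);          // existing attribute
const cityName = getPath(u, ["address", "city", "name"]);  // city does not exist
```

In this sketch, street holds the stored value and cityName is null because the city attribute is missing.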

Bind parameters

AQL supports the usage of bind parameters, which allow the query text to be separated from the literal values used in the query. It is good practice to separate the query text from the literal values because this prevents the (malicious) injection of keywords and additional collection names into an existing query. Such an injection would be dangerous because it might change the meaning of an existing query.

Using bind parameters, the meaning of an existing query cannot be changed. Bind parameters can be used everywhere in a query where literals can be used.

The syntax for bind parameters is @name where name is the actual parameter name. The bind parameter values need to be passed along with the query when it is executed, but not as part of the query text itself. Please refer to the Accessing Cursors via HTTP manual section for information about how to pass the bind parameter values to the server.

FOR u IN users
  FILTER u.id == @id && u.name == @name
  RETURN u

Bind parameter names must start with any of the letters a to z (both in lower and upper case) or a digit (0 to 9), and can be followed by any letter, digit, or the underscore symbol.

A special type of bind parameter exists for injecting collection names. This type of bind parameter has a name prefixed with an additional @ symbol (thus, when using the bind parameter in a query, two @ symbols must be used).

FOR u IN @@collection
  FILTER u.active == true
  RETURN u
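
The separation of query text and values can be sketched in plain JavaScript. The helper buildQueryRequest and the property names query and bindVars are assumptions made for illustration, as is the convention of storing the value for @@collection under the key "@collection"; the authoritative way to pass values is described in the manual section referenced above. The placeholder scan is deliberately simple and does not skip @ characters inside string literals.

```javascript
// Hypothetical helper: pairs an AQL query text with its bind parameter values.
// The "query" and "bindVars" property names are assumptions for illustration.
function buildQueryRequest(queryText, bindVars) {
  // Collect every @name or @@name placeholder that appears in the query text.
  const referenced = queryText.match(/@@?[A-Za-z0-9][A-Za-z0-9_]*/g) || [];
  for (const placeholder of referenced) {
    // Drop only the first "@": "@id" becomes "id", "@@collection" becomes "@collection".
    const key = placeholder.slice(1);
    if (!Object.prototype.hasOwnProperty.call(bindVars, key)) {
      throw new Error("missing value for bind parameter " + placeholder);
    }
  }
  // The query text is passed on unchanged; values travel separately.
  return { query: queryText, bindVars: bindVars };
}

const request = buildQueryRequest(
  "FOR u IN @@collection FILTER u.id == @id RETURN u",
  { "@collection": "users", "id": "1 || true" }
);
```

Even though the value supplied for id looks like a query fragment, the query text in request.query is unchanged, so the value can only ever be compared as a string.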

Type and value order

When checking for equality or inequality or when determining the sort order of values, AQL uses a deterministic algorithm that takes both the data types and the actual values into account.

The compared operands are first compared by their data types, and only by their data values if the operands have the same data types.

The following type order is used when comparing data types:

null < bool < number < string < list < document

This means null is the smallest type in AQL, and document is the type with the highest order. If the compared operands have different types, the type order alone determines the comparison result, and the comparison is finished.

For example, the boolean true value will always be less than any numeric or string value, any list (even an empty list), and any document. Additionally, any string value (even an empty string) will always be greater than any numeric value and any boolean value (true or false).

null < false
null < true
null < 0
null < ''
null < ' '
null < '0'
null < 'abc'
null < [ ]
null < { }

false < true
false < 0
false < ''
false < ' '
false < '0'
false < 'abc'
false < [ ]
false < { }

true < 0
true < ''
true < ' '
true < '0'
true < 'abc'
true < [ ]
true < { }

0 < ''
0 < ' '
0 < '0'
0 < 'abc'
0 < [ ]
0 < { }

'' < ' '
'' < '0'
'' < 'abc'
'' < [ ]
'' < { }

[ ] < { }

If the two compared operands have the same data type, then the operands' values are compared. For the primitive types (null, boolean, number, and string), the result is defined as follows:

  • null: null is equal to null
  • boolean: false is less than true
  • number: numeric values are ordered by their cardinal value
  • string: string values are ordered using a byte-wise comparison

Note: unlike in SQL, null can be compared to any value, including null itself, without the result being converted into null automatically.

For compound types, the following special rules are applied:

Two list values are compared by comparing their individual elements position by position, starting at the first element. For each position, the element types are compared first. If the types are not equal, the comparison result is determined, and the comparison is finished. If the types are equal, then the values of the two elements are compared. If one of the lists is finished and the other list still has an element at a compared position, then null will be used as the element value of the fully traversed list.

If a list element is itself a compound value (a list or a document), then the comparison algorithm will compare the element's sub-values recursively.

[ ] < [ 0 ]
[ 1 ] < [ 2 ]
[ 1, 2 ] < [ 2 ]
[ 99, 99 ] < [ 100 ]
[ false ] < [ true ]
[ false, 1 ] < [ false, '' ]

Two document operands are compared by checking attribute names and values. The attribute names are compared first. Before attribute names are compared, a combined list of all attribute names from both operands is created and sorted lexicographically. This means that the order in which attributes are declared in a document is not relevant when comparing two documents.

The combined and sorted list of attribute names is then traversed, and the respective attributes from the two compared operands are then looked up. If one of the documents does not have an attribute with the sought name, its attribute value is considered to be null. Finally, the attribute values of both documents are compared using the aforementioned data type and value comparison. The comparisons are performed for all document attributes until there is an unambiguous comparison result. If an unambiguous comparison result is found, the comparison is finished. If there is no unambiguous comparison result, the two compared documents are considered equal.

{ } < { "a" : 1 }
{ "a" : 1 } < { "a" : 2 }
{ "b" : 1 } < { "a" : 0 }
{ "a" : { "c" : true } } < { "a" : { "c" : 0 } }
{ "a" : 1, "b" : 2 } == { "b" : 2, "a" : 1 }
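
The comparison rules of this section can be summarized as a sketch in plain JavaScript. This is a model written from the description above, not ArangoDB's actual implementation; the function names typeRank, compareStrings, attributeOrNull, and compareValues are hypothetical. It assumes that when one string is a byte-wise prefix of another, the shorter string sorts first.

```javascript
// Position of each type in the order null < bool < number < string < list < document.
function typeRank(value) {
  if (value === null || value === undefined) return 0;
  if (typeof value === "boolean") return 1;
  if (typeof value === "number") return 2;
  if (typeof value === "string") return 3;
  if (Array.isArray(value)) return 4;
  return 5; // any other object is treated as a document
}

// Byte-wise string comparison on the UTF-8 encoding of both strings.
function compareStrings(a, b) {
  const encoder = new TextEncoder();
  const bytesA = encoder.encode(a);
  const bytesB = encoder.encode(b);
  const shared = Math.min(bytesA.length, bytesB.length);
  for (let i = 0; i < shared; i++) {
    if (bytesA[i] !== bytesB[i]) return bytesA[i] < bytesB[i] ? -1 : 1;
  }
  // Assumption: if one string is a prefix of the other, the shorter one sorts first.
  return Math.sign(bytesA.length - bytesB.length);
}

// Reads an attribute, treating a missing attribute as null.
function attributeOrNull(doc, name) {
  return Object.prototype.hasOwnProperty.call(doc, name) ? doc[name] : null;
}

// Returns -1, 0, or 1 if a sorts before, equal to, or after b.
function compareValues(a, b) {
  const rankA = typeRank(a);
  const rankB = typeRank(b);
  // Different types: the type order alone decides the result.
  if (rankA !== rankB) return rankA < rankB ? -1 : 1;
  if (rankA === 0) return 0; // null is equal to null
  if (rankA === 1 || rankA === 2) {
    // false is less than true; numbers are ordered by value.
    return a === b ? 0 : (a < b ? -1 : 1);
  }
  if (rankA === 3) return compareStrings(a, b);
  if (rankA === 4) {
    // Lists: compare position by position; a list that has ended contributes null.
    const length = Math.max(a.length, b.length);
    for (let i = 0; i < length; i++) {
      const result = compareValues(i < a.length ? a[i] : null,
                                   i < b.length ? b[i] : null);
      if (result !== 0) return result;
    }
    return 0;
  }
  // Documents: walk the combined, sorted attribute names of both operands.
  const names = Array.from(new Set(Object.keys(a).concat(Object.keys(b)))).sort();
  for (const name of names) {
    const result = compareValues(attributeOrNull(a, name), attributeOrNull(b, name));
    if (result !== 0) return result;
  }
  return 0;
}
```

Because compareValues returns a negative, zero, or positive number, it can be passed to Array.prototype.sort. For example, sorting [ "abc", 0, null, [ ], true ] with it yields [ null, true, 0, "abc", [ ] ].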

Accessing data from collections

Collection data can be accessed by specifying a collection name in a query. A collection can be understood as a list of documents, and that is how it is treated in AQL. Documents from collections are normally accessed using the FOR keyword. Note that when iterating over documents from a collection, the order of documents is undefined. To traverse documents in an explicit and deterministic order, the SORT keyword should be used in addition.

Data in collections is stored in documents, with each document potentially having different attributes than other documents. This is true even for documents of the same collection.

It is therefore quite normal to encounter documents that do not have some or all of the attributes that are queried in an AQL query. In this case, the non-existing attributes in the document will be treated as if they existed with a value of null. That means that comparing a document attribute to null will return true if the document has the particular attribute and the attribute has a value of null, or if the document does not have the particular attribute at all.

For example, the following query will return all documents from the collection users that have a value of null in the attribute name, plus all documents from users that do not have the name attribute at all:

FOR u IN users
  FILTER u.name == null
  RETURN u

Furthermore, null is less than any other value (excluding null itself). That means documents with non-existing attributes might be included in the result when comparing attribute values with the less than (<) or less than or equal (<=) operators.

For example, the following query will return all documents from the collection users that have an attribute age with a value less than 39, but also all documents from the collection that do not have the attribute age at all:

FOR u IN users
  FILTER u.age < 39
  RETURN u

This behavior should always be taken into account when writing queries.
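
The effect can be modelled in plain JavaScript. The sample data and the helpers attributeOrNull and lessThan are hypothetical, and the comparison is simplified to null and numbers only. The second filter models adding an explicit u.age != null condition to the query to exclude documents without the attribute.

```javascript
// Sample data: the second document has no "age" attribute at all.
const users = [
  { name: "John", age: 25 },
  { name: "Vanessa" },
  { name: "Amy", age: 45 }
];

// A missing attribute is read as null, as described above.
function attributeOrNull(doc, name) {
  return Object.prototype.hasOwnProperty.call(doc, name) ? doc[name] : null;
}

// Simplified "less than" for null and numbers only: null sorts before every number.
function lessThan(left, right) {
  if (left === null) return right !== null;
  if (right === null) return false;
  return left < right;
}

// Model of: FOR u IN users FILTER u.age < 39 RETURN u
const matching = users.filter(u => lessThan(attributeOrNull(u, "age"), 39));

// Model of: FOR u IN users FILTER u.age != null && u.age < 39 RETURN u
const matchingWithAge = users.filter(u => {
  const age = attributeOrNull(u, "age");
  return age !== null && lessThan(age, 39);
});
```

In this model, matching contains the documents for John and Vanessa, while matchingWithAge contains only the document for John.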

Operators

AQL supports a number of operators that can be used in expressions. There are comparison, logical, and arithmetic operators, as well as the ternary operator.

Comparison operators

Comparison (or relational) operators compare two operands. They can be used with any input data types, and will return a boolean result value.

The following comparison operators are supported:

  • == equality
  • != inequality
  • < less than
  • <= less or equal
  • > greater than
  • >= greater or equal
  • in test if a value is contained in a list

The in operator expects the second operand to be of type list. All other operators accept any data types for the first and second operands.

Each of the comparison operators returns a boolean value: true if the comparison evaluates to true, and false otherwise.

Some examples for comparison operations in AQL:

1 > 0
true != null
45 <= "yikes!"
65 != "65"
65 == 65
1.23 < 1.32
1.5 IN [ 2, 3, 1.5 ]

Logical operators

Logical operators combine two boolean operands in a logical operation and return a boolean result value.

The following logical operators are supported:

  • && logical and operator
  • || logical or operator
  • ! logical not/negation operator

Some examples for logical operations in AQL:

u.age > 15 && u.address.city != ""
true || false
!u.isInvalid

The &&, ||, and ! operators expect their input operands to be boolean values each. If a non-boolean operand is used, the operation will fail with an error. In case all operands are valid, the result of each logical operator is a boolean value.

Both the && and || operators use short-circuit evaluation and only evaluate the second operand if the result of the operation cannot be determined by checking the first operand alone.

Arithmetic operators

Arithmetic operators perform an arithmetic operation on two numeric operands. The result of an arithmetic operation is again a numeric value.

AQL supports the following arithmetic operators:

  • + addition
  • - subtraction
  • * multiplication
  • / division
  • % modulus

These operators work with numeric operands only. Invoking any of the operators with non-numeric operands will result in an error. An error will also be raised for some other edge cases such as division by zero or numeric overflow or underflow. If both operands are numeric and the computation result is also valid, the result will be returned as a numeric value.

The unary plus and unary minus are supported as well.

Some example arithmetic operations:

1 + 1
33 - 99
12.4 * 4.5
13.0 / 0.1
23 % 7
-15
+9.99

Ternary operator

AQL also supports a ternary operator that can be used for conditional evaluation. The ternary operator expects a boolean condition as its first operand, and it returns the result of the second operand if the condition evaluates to true, and the third operand otherwise.

Example:

u.age > 15 || u.active == true ? u.userId : null

Operator precedence

The operator precedence in AQL is as follows (lowest precedence first):

  • ? : ternary operator
  • || logical or
  • && logical and
  • ==, != equality and inequality
  • in in operator
  • <, <=, >=, > less than, less equal, greater equal, greater than
  • +, - addition, subtraction
  • *, /, % multiplication, division, modulus
  • !, +, - logical negation, unary plus, unary minus
  • [*] expansion
  • () function call
  • . member access
  • [] indexed value access

The parentheses ( and ) can be used to enforce a different operator evaluation order.
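
The grouping implied by the precedence list can be illustrated with the ternary example from above. JavaScript is used here only as an analogy: for the operators in that example (>, ==, ||, and ? :) it applies the same relative precedence as the list above. The sample document u is hypothetical.

```javascript
const u = { age: 20, active: false, userId: 42 };

// Without parentheses, the comparison and logical operators bind tighter than "? :".
const implicit = u.age > 15 || u.active == true ? u.userId : null;

// The same expression with the implied grouping written out.
const explicit = ((u.age > 15) || (u.active == true)) ? u.userId : null;

// Parentheses force a different evaluation order and change the result.
const regrouped = u.age > 15 || (u.active == true ? u.userId : null);
```

Here implicit and explicit both evaluate to 42, whereas regrouped evaluates to true because the first operand of || already decides the result.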

Functions

AQL supports functions to allow more complex computations. Functions can be called at any query position where an expression is allowed. The general function call syntax is:

FUNCTIONNAME(arguments)

where FUNCTIONNAME is the name of the function to be called, and arguments is a comma-separated list of function arguments. If a function does not need any arguments, the argument list can be left empty. However, even if the argument list is empty the parentheses around it are still mandatory to make function calls distinguishable from variable names.

Some example function calls:

HAS(user, "name")
LENGTH(friends)
COLLECTIONS()

Function names are not case-sensitive.

Type cast functions

As mentioned before, some of the operators expect their operands to have a certain data type. For example, the logical operators expect their operands to be boolean values, and the arithmetic operators expect their operands to be numeric values. If an operation is performed with operands of an unexpected type, the operation will fail with an error. To avoid such failures, value types can be converted explicitly in a query. This is called type casting.

In an AQL query, type casts are performed only upon request and not implicitly. This helps avoid unexpected results. All type casts have to be performed by invoking a type cast function. AQL offers several type cast functions for this task. Each of these functions takes an operand of any data type and returns a result value of the type corresponding to the function name (e.g. TO_NUMBER() will return a number value):

  • TO_BOOL(value): takes an input value of any type and converts it into the appropriate boolean value as follows:
    • null is converted to false.
    • Numbers are converted to true if they are unequal to 0, and to false otherwise.
    • Strings are converted to true if they are non-empty, and to false otherwise.
    • Lists are converted to true if they are non-empty, and to false otherwise.
    • Documents are converted to true if they are non-empty, and to false otherwise.
  • TO_NUMBER(value): takes an input value of any type and converts it into a numeric value as follows:
    • null, false, lists, and documents are converted to the value 0.
    • true is converted to 1.
    • Strings are converted to their numeric equivalent if the full string content is a valid number, and to 0 otherwise.
  • TO_STRING(value): takes an input value of any type and converts it into a string value as follows:
    • null is converted to the string "null"
    • false is converted to the string "false", true to the string "true"
    • numbers, lists, and documents are converted to their string equivalents.
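
The conversion rules for TO_BOOL and TO_NUMBER can be modelled in plain JavaScript as follows. The functions toBool and toNumber are hypothetical stand-ins, not the AQL functions themselves. Two points are assumptions not stated above: values that already have the target type are returned unchanged, and a string counts as a valid number only if it is an optionally signed decimal number, optionally in scientific notation, without surrounding whitespace.

```javascript
// Hypothetical model of TO_BOOL, following the rules listed above.
function toBool(value) {
  if (value === null) return false;
  if (typeof value === "boolean") return value;         // assumed: booleans are unchanged
  if (typeof value === "number") return value !== 0;
  if (typeof value === "string") return value.length > 0;
  if (Array.isArray(value)) return value.length > 0;    // lists
  return Object.keys(value).length > 0;                 // documents
}

// Hypothetical model of TO_NUMBER, following the rules listed above.
function toNumber(value) {
  if (typeof value === "number") return value;          // assumed: numbers are unchanged
  if (value === true) return 1;
  if (typeof value === "string") {
    // The accepted number grammar is an assumption made for this sketch.
    const isNumber = /^[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/.test(value);
    return isNumber ? Number(value) : 0;
  }
  return 0; // null, false, lists, and documents
}
```

For example, toBool("0") is true because the string is non-empty, and toNumber("4.2abc") is 0 because the full string content is not a valid number.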

Type check functions

AQL also offers functions to check the data type of a value at runtime. Each of these functions takes an argument of any data type and returns true if the value has the type that is checked for, and false otherwise.

The following type check functions are available:

  • IS_NULL(value): checks whether value is a null value
  • IS_BOOL(value): checks whether value is a boolean value
  • IS_NUMBER(value): checks whether value is a numeric value
  • IS_STRING(value): checks whether value is a string value
  • IS_LIST(value): checks whether value is a list value
  • IS_DOCUMENT(value): checks whether value is a document value

String functions

For string processing, AQL offers the following functions:

  • CONCAT(value1, value2, ... valuen): concatenate the strings passed as arguments value1 to valuen. null values are ignored.
  • CONCAT_SEPARATOR(separator, value1, value2, ... valuen): concatenate the strings passed as arguments value1 to valuen using the separator string. null values are ignored.
  • CHAR_LENGTH(value): return the number of characters in value
  • LOWER(value): lower-case value
  • UPPER(value): upper-case value
  • SUBSTRING(value, offset, length): return a substring of value, starting at offset and with a maximum length of length characters. Offsets start at position 0.
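
The behavior described for these functions can be modelled in plain JavaScript. The function names below are hypothetical stand-ins for the AQL functions. The sketch assumes that all non-null arguments are strings and that a character corresponds to one Unicode code point.

```javascript
// Model of CONCAT: joins all arguments, ignoring null values.
function concat(...values) {
  return values.filter(value => value !== null).join("");
}

// Model of CONCAT_SEPARATOR: like concat, with a separator between the kept values.
function concatSeparator(separator, ...values) {
  return values.filter(value => value !== null).join(separator);
}

// Model of CHAR_LENGTH: counts characters (code points), not bytes.
function charLength(value) {
  return Array.from(value).length;
}

// Model of SUBSTRING: offsets start at 0; at most length characters are returned.
function substring(value, offset, length) {
  return Array.from(value).slice(offset, offset + length).join("");
}
```

For example, concatSeparator(", ", "a", null, "b") yields "a, b", and substring("hello world", 6, 3) yields "wor".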

Numeric functions

AQL offers some numeric functions for calculations. The following functions are supported:

  • FLOOR(value): returns the integer closest to but not greater than value
  • CEIL(value): returns the integer closest to but not less than value
  • ROUND(value): returns the integer closest to value
  • ABS(value): returns the absolute value of value
  • RAND(): returns a pseudo-random number between 0 and 1

List functions

AQL supports the following functions to operate on list values:

  • LENGTH(list): returns the length (number of elements) of list
  • MIN(list): returns the smallest element of list. null values are ignored. If the list is empty or contains only null values, the function will return null.
  • MAX(list): returns the greatest element of list. null values are ignored. If the list is empty or contains only null values, the function will return null.
  • SUM(list): returns the sum of values of the elements in list. This requires the elements in list to be numbers. null values are ignored. If the list is empty or contains only null values, the function will return null.
  • REVERSE(list): returns the elements in list in reversed order.
  • FIRST(list): returns the first element in list or null if the list is empty.
  • LAST(list): returns the last element in list or null if the list is empty.
  • UNIQUE(list): returns all unique elements in list. To determine uniqueness, the function will use the comparison order defined in Type and value order. Calling this function might return the unique elements in any order.

Apart from these functions, AQL also offers several language constructs (e.g. FOR, SORT, LIMIT, COLLECT) to operate on lists.
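
The null handling described for MIN, MAX, SUM, FIRST, and LAST can be modelled in plain JavaScript. The function names are hypothetical stand-ins for the AQL functions, and the sketch assumes that all non-null elements are numbers; the AQL functions themselves are not limited to numbers for MIN and MAX.

```javascript
// Drops null values, which MIN, MAX, and SUM ignore.
function withoutNulls(list) {
  return list.filter(value => value !== null);
}

// Model of MIN: null if nothing remains after dropping null values.
function min(list) {
  const values = withoutNulls(list);
  return values.length === 0 ? null : values.reduce((a, b) => (b < a ? b : a));
}

// Model of MAX: null if nothing remains after dropping null values.
function max(list) {
  const values = withoutNulls(list);
  return values.length === 0 ? null : values.reduce((a, b) => (b > a ? b : a));
}

// Model of SUM: null if nothing remains after dropping null values.
function sum(list) {
  const values = withoutNulls(list);
  return values.length === 0 ? null : values.reduce((a, b) => a + b, 0);
}

// Models of FIRST and LAST: null for an empty list.
function first(list) {
  return list.length === 0 ? null : list[0];
}

function last(list) {
  return list.length === 0 ? null : list[list.length - 1];
}
```

For example, min([ 3, null, 1, 2 ]) yields 1, while sum([ ]) and max([ null ]) both yield null.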

Document functions

AQL supports the following functions to operate on document values:

  • MERGE(document1, document2, ... documentn): merges the documents document1 to documentn into a single document. If attribute keys are ambiguous, that is, the same key occurs in more than one document, the merged result will contain the value from the document that appears later in the argument list.
  • HAS(document, attributename): returns true if document has an attribute named attributename, and false otherwise.
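
The "later document wins" rule of MERGE can be illustrated with Python dictionaries. This is a minimal sketch, not ArangoDB's implementation: aql_merge and aql_has are hypothetical helper names, and the sketch only looks at top-level attributes, which is an assumption the text above does not spell out.

```python
def aql_merge(*documents):
    merged = {}
    for document in documents:
        # keys from later documents overwrite keys from earlier ones
        merged.update(document)
    return merged

def aql_has(document, attribute_name):
    # true if the attribute exists, regardless of its value
    return attribute_name in document

result = aql_merge({"name": "Ann", "age": 30}, {"age": 31, "city": "Cologne"})
print(result)                    # {'name': 'Ann', 'age': 31, 'city': 'Cologne'}
print(aql_has(result, "city"))   # True
print(aql_has(result, "email"))  # False
```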

Geo functions

AQL offers the following functions to filter data based on geo indexes:

  • NEAR(collection, latitude, longitude, limit, distancename): returns at most limit documents from collection collection that are near latitude and longitude. The result contains at most limit documents, returned in any order. If more than limit documents qualify, it is undefined which of the qualifying documents are returned. Optionally, the distances between the specified coordinate (latitude and longitude) and the document coordinates can be returned as well. To make use of that, an attribute name for the distance result has to be specified in the distancename argument. The result documents will contain the distance value in an attribute of that name.
  • WITHIN(collection, latitude, longitude, radius, distancename): returns all documents from collection collection that are within a radius of radius around that specified coordinate (latitude and longitude). The order in which the result documents are returned is undefined. Optionally, the distance between the coordinate and the document coordinates can be returned as well. To make use of that, an attribute name for the distance result has to be specified in the distancename argument. The result documents will contain the distance value in an attribute of that name.

Note: these functions require the collection collection to have at least one geo index. If no geo index can be found, calling these functions will fail with an error.

Graph functions

AQL has the following functions to traverse graphs:

  • PATHS(vertexcollection, edgecollection, direction, followcycles): returns a list of paths through the graph defined by the nodes in the collection vertexcollection and edges in the collection edgecollection. For each vertex in vertexcollection, it will determine the paths through the graph depending on the value of direction:
    • "outbound": follow all paths that start at the current vertex and lead to another vertex
    • "inbound": follow all paths that lead from another vertex to the current vertex
    • "any": combination of "outbound" and "inbound"

The default value for direction is "outbound". If followcycles is true, cyclic paths will be followed as well. This is turned off by default.

The result of the function is a list of paths. Paths of length 0 will also be returned. Each path is a document consisting of the following attributes:

  • vertices: list of vertices visited along the path
  • edges: list of edges visited along the path (might be empty)
  • source: start vertex of path
  • destination: destination vertex of path

Example calls:

PATHS(friends, friendrelations, "outbound", false)

FOR p IN PATHS(friends, friendrelations, "outbound") 
  FILTER p.source._id == "123456/123456" && LENGTH(p.edges) == 2
  RETURN p.vertices[*].name
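
To make the shape of the PATHS result more concrete, here is a small Python model of the "outbound" direction with followcycles turned off. It is a sketch under several assumptions: vertices are represented by plain ids and edges by (from, to) tuples instead of documents, a path is considered cyclic as soon as it would revisit a vertex, and the function name outbound_paths is made up for this illustration.

```python
def outbound_paths(vertices, edges):
    """Enumerate outbound paths, starting once from every vertex.

    vertices: list of vertex ids
    edges: list of (from_id, to_id) tuples
    """
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)

    paths = []

    def extend(visited_vertices, visited_edges):
        # every prefix is a path of its own, including the one of length 0
        paths.append({
            "vertices": list(visited_vertices),
            "edges": list(visited_edges),
            "source": visited_vertices[0],
            "destination": visited_vertices[-1],
        })
        for edge in outgoing.get(visited_vertices[-1], []):
            target = edge[1]
            if target in visited_vertices:
                continue  # assumption: models followcycles = false
            extend(visited_vertices + [target], visited_edges + [edge])

    for vertex in vertices:
        extend([vertex], [])
    return paths

paths = outbound_paths(["A", "B", "C"], [("A", "B"), ("B", "C")])
for p in paths:
    print(p["vertices"])
```

For the graph A -> B -> C this yields six paths: three starting at A, two starting at B, and one starting at C. Filtering for paths with exactly two edges, as in the AQL example above, leaves only the path A, B, C.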

Control flow functions

AQL offers the following functions to let the user control the flow of operations:

  • NOT_NULL(condition, alternative): returns condition if it is not null, and alternative otherwise.

Miscellaneous functions

Finally, AQL supports the following functions that do not belong to any of the other function categories:

  • COLLECTIONS(): returns a list of collections. Each collection is returned as a document with attributes name and _id.

High-level operations

FOR

The FOR keyword can be used to iterate over all elements of a list. The general syntax is:

FOR variable-name IN expression

Each list element returned by expression is visited exactly once. It is required that expression returns a list in all cases. The empty list is allowed, too. The current list element is made available for further processing in the variable specified by variable-name.

FOR u IN users
  RETURN u

This will iterate over all elements from the list users (note: this list consists of all documents from the collection named "users" in this case) and make the current list element available in variable u. u is not modified in this example but simply pushed into the result using the RETURN keyword.

Note: when iterating over collection-based lists as shown here, the order of documents is undefined unless an explicit sort order is defined using a SORT statement.

The variable introduced by FOR is available until the scope the FOR is placed in is closed.

Another example that uses a statically declared list of values to iterate over:

FOR year IN [ 2011, 2012, 2013 ]
  RETURN { "year" : year, "isLeapYear" : year % 4 == 0 && (year % 100 != 0 || year % 400 == 0) }

Nesting of multiple FOR statements is allowed, too. When FOR statements are nested, a cross product of the list elements returned by the individual FOR statements will be created.

FOR u IN users
  FOR l IN locations
    RETURN { "user" : u, "location" : l }

In this example, there are two list iterations: an outer iteration over the list users plus an inner iteration over the list locations. The inner list is traversed as many times as there are elements in the outer list. For each iteration, the current elements of users and locations are made available for further processing in the variables u and l.
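
The cross product produced by nested FOR statements corresponds to a nested loop in most programming languages. The following Python sketch models the query above with two small in-memory lists; the sample values are invented for illustration.

```python
users = ["alice", "bob"]
locations = ["Berlin", "Cologne", "Munich"]

result = [
    {"user": u, "location": l}
    for u in users       # outer FOR
    for l in locations   # inner FOR, traversed once per element of users
]

print(len(result))  # 6
print(result[0])    # {'user': 'alice', 'location': 'Berlin'}
print(result[3])    # {'user': 'bob', 'location': 'Berlin'}
```

With 2 users and 3 locations, the result has 2 * 3 = 6 elements.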

RETURN

The RETURN statement is used to produce the result of a query. It is mandatory to specify a RETURN statement at the end of each block in a query; otherwise the query result would be undefined.

The general syntax for RETURN is:

RETURN expression

The expression returned by RETURN is produced for each iteration the RETURN statement is placed in. That means the result of a RETURN statement is always a list (this includes the empty list). To return all elements from the currently iterated list without modification, the following simple form can be used:

FOR variable-name IN expression
  RETURN variable-name

As RETURN allows specifying an expression, arbitrary computations can be performed to calculate the result elements. Any of the variables valid in the scope the RETURN is placed in can be used for the computations.

Note: RETURN will close the current scope and eliminate all local variables in it.

FILTER

The FILTER statement can be used to restrict the results to elements that match an arbitrary logical condition. The general syntax is:

FILTER condition

condition must be an expression that evaluates to either false or true. If the condition result is false, the current element is skipped, so it will not be processed further and not be part of the result. If the condition is true, the current element is not skipped and can be further processed.

FOR u IN users
  FILTER u.active == true && u.age < 39
  RETURN u

In the above example, all list elements from users will be included that have an attribute active with value true and that have an attribute age with a value less than 39. All other elements from users will be skipped and not be included in the result produced by RETURN.

It is allowed to specify multiple FILTER statements in a query, and even in the same block. If multiple FILTER statements are used, their results will be combined with a logical AND, meaning all filter conditions must be true to include an element.

FOR u IN users
  FILTER u.active == true
  FILTER u.age < 39
  RETURN u
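
That two consecutive FILTER statements behave like a single condition joined with a logical AND can be checked with a small Python model. The sample documents below are invented for illustration.

```python
users = [
    {"name": "Ann", "active": True, "age": 25},
    {"name": "Bob", "active": False, "age": 30},
    {"name": "Cem", "active": True, "age": 45},
]

# one FILTER with both conditions
combined = [u for u in users if u["active"] and u["age"] < 39]

# two FILTER statements applied one after the other
only_active = [u for u in users if u["active"]]
chained = [u for u in only_active if u["age"] < 39]

print([u["name"] for u in chained])  # ['Ann']
print(combined == chained)           # True
```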

SORT

The SORT statement will force a sort of the list of already produced intermediate results in the current block. SORT allows specifying one or multiple sort criteria and directions. The general syntax is:

SORT expression direction

Specifying the direction is optional. The default (implicit) direction for a sort is ascending order. To explicitly specify the sort direction, the keywords ASC (ascending) and DESC (descending) can be used. Multiple sort criteria can be separated using commas.

Note: when iterating over collection-based lists, the order of documents is always undefined unless an explicit sort order is defined using SORT.

FOR u IN users
  SORT u.lastName, u.firstName, u.id DESC
  RETURN u
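
In the query above, the direction keyword belongs to the criterion it follows, so only u.id is sorted in descending order while u.lastName and u.firstName use the default ascending order. The Python sketch below models this with invented sample data. It relies on Python's sorted being stable and applies the criteria from the least to the most significant one.

```python
users = [
    {"lastName": "Smith", "firstName": "Ann", "id": 1},
    {"lastName": "Smith", "firstName": "Ann", "id": 2},
    {"lastName": "Jones", "firstName": "Zoe", "id": 3},
    {"lastName": "Smith", "firstName": "Al", "id": 4},
]

# least significant criterion first: u.id DESC
ordered = sorted(users, key=lambda u: u["id"], reverse=True)
# then u.firstName (ascending), keeping the id order among equal names
ordered = sorted(ordered, key=lambda u: u["firstName"])
# most significant criterion last: u.lastName (ascending)
ordered = sorted(ordered, key=lambda u: u["lastName"])

print([u["id"] for u in ordered])  # [3, 4, 2, 1]
```

The two users named Ann Smith end up in the order 2, 1 because the id is the only criterion that differs between them and it is sorted in descending order.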

LIMIT

The LIMIT statement allows slicing the list of result documents using an offset and a count. It reduces the number of elements in the result to at most the specified number. Two general forms of LIMIT are supported:

LIMIT count
LIMIT offset, count

The first form allows specifying only the count value whereas the second form allows specifying both offset and count. The first form is identical to using the second form with an offset value of 0.

The offset value specifies how many elements from the start of the result shall be discarded. It must be 0 or greater. The count value specifies the maximum number of elements to include in the result.

FOR u IN users
  SORT u.firstName, u.lastName, u.id DESC
  LIMIT 0, 5
  RETURN u
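
The effect of offset and count corresponds to list slicing. The following Python sketch models both forms of LIMIT on an in-memory list; the function name aql_limit is hypothetical, and count is passed first only so that offset can default to 0.

```python
def aql_limit(items, count, offset=0):
    # discard the first `offset` elements, then keep at most `count` elements
    return items[offset:offset + count]

results = ["a", "b", "c", "d", "e", "f", "g"]

print(aql_limit(results, 5))            # LIMIT 5     gives ['a', 'b', 'c', 'd', 'e']
print(aql_limit(results, 3, offset=2))  # LIMIT 2, 3  gives ['c', 'd', 'e']
print(aql_limit(results, 5, offset=5))  # LIMIT 5, 5  gives ['f', 'g']
```

The last call shows that fewer than count elements are returned when the list runs out.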

LET

The LET statement can be used to assign an arbitrary value to a variable. The variable is then introduced in the scope the LET statement is placed in. The general syntax is:

LET variable-name = expression

LET statements are mostly used to declare complex computations and to avoid repeated computations of the same value at multiple parts of a query.

FOR u IN users
  LET numRecommendations = LENGTH(u.recommendations)
  RETURN { "user" : u, "numRecommendations" : numRecommendations, "isPowerUser" : numRecommendations >= 10 } 

In the above example, the computation of the number of recommendations is factored out using a LET statement, thus avoiding computing the value twice in the RETURN statement.

Another use case for LET is to declare a complex computation in a subquery, making the whole query more readable.

FOR u IN users
  LET friends = (
    FOR f IN friends 
      FILTER u.id == f.userId
      RETURN f
  )
  LET memberships = (
    FOR m IN memberships
      FILTER u.id == m.userId
      RETURN m
  )
  RETURN { "user" : u, "friends" : friends, "numFriends" : LENGTH(friends), "memberShips" : memberships }
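
Each subquery in the example above is evaluated once per user and yields a list. A rough Python model of the same logic, using invented sample data and in-memory lists in place of collections, could look like this:

```python
users = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
friends = [{"userId": 1, "name": "Cem"}, {"userId": 1, "name": "Dee"}]
memberships = [{"userId": 2, "group": "chess"}]

result = []
for u in users:
    # LET friends = ( FOR f IN friends FILTER u.id == f.userId RETURN f )
    user_friends = [f for f in friends if f["userId"] == u["id"]]
    # LET memberships = ( FOR m IN memberships FILTER u.id == m.userId RETURN m )
    user_memberships = [m for m in memberships if m["userId"] == u["id"]]
    result.append({
        "user": u,
        "friends": user_friends,
        "numFriends": len(user_friends),
        "memberShips": user_memberships,
    })

print([(r["user"]["name"], r["numFriends"]) for r in result])
# [('Ann', 2), ('Bob', 0)]
```

A user without matching friends or memberships still appears in the result, with an empty list for the corresponding attribute.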

COLLECT

The COLLECT keyword can be used to group a list by one or multiple group criteria. The two general syntaxes for COLLECT are:

COLLECT variable-name = expression
COLLECT variable-name = expression INTO groups

The first form only groups the result by the group criteria defined by expression. In order to further process the results produced by COLLECT, a new variable (specified by variable-name) is introduced. This variable contains the group value.

The second form does the same as the first form, but additionally introduces a variable (specified by groups) that contains all elements that fell into the group. Specifying the INTO clause is optional.

FOR u IN users
  COLLECT city = u.city INTO g
  RETURN { "city" : city, "users" : g }

In the above example, the list of users will be grouped by the attribute city. The result is a new list of documents, with one element per distinct city value. The elements from the original list (here: users) per city are made available in the variable g. This is due to the INTO clause.
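
Grouping with COLLECT ... INTO can be modeled in Python with a dictionary keyed by the group value. This is only a sketch with invented sample data. It stores the user documents directly in each group, whereas the exact shape of the group members in g is not specified in the text above, and the order of the groups here simply follows the first occurrence of each city.

```python
users = [
    {"name": "Ann", "city": "Cologne"},
    {"name": "Bob", "city": "Berlin"},
    {"name": "Cem", "city": "Cologne"},
]

groups = {}
for u in users:
    # COLLECT city = u.city INTO g
    groups.setdefault(u["city"], []).append(u)

result = [{"city": city, "users": g} for city, g in groups.items()]

print([(r["city"], len(r["users"])) for r in result])
# [('Cologne', 2), ('Berlin', 1)]
```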

COLLECT also allows specifying multiple group criteria. Individual group criteria can be separated by commas.

FOR u IN users
  COLLECT first = u.firstName, age = u.age INTO g
  RETURN { "first" : first, "age" : age, "numUsers" : LENGTH(g) }

In the above example, the list of users is grouped by first name and age, and for each distinct combination of first name and age, the number of users found is returned.

Note: the COLLECT statement eliminates all local variables in the current scope. After COLLECT only the variables introduced by COLLECT itself are available.

Advanced features

Subqueries

Wherever an expression is allowed in AQL, a subquery can be placed. A subquery is a query part that can introduce its own local variables without affecting variables and values in its outer scope(s).

It is required that subqueries be put inside parentheses ( and ) to explicitly mark their start and end points:

FOR u IN users
  LET recommendations = ( 
    FOR r IN recommendations
      FILTER u.id == r.userId
      SORT r.rank DESC
      LIMIT 10
      RETURN r
  )
  RETURN { "user" : u, "recommendations" : recommendations }
   
  
FOR u IN users
  COLLECT city = u.city INTO g
  RETURN { "city" : city, "numUsers" : LENGTH(g), "maxRating": MAX(
    FOR r IN g 
      RETURN r.user.rating
  ) }

Subqueries might also include other subqueries themselves.

Variable expansion

In order to access a named attribute from all elements in a list easily, AQL offers the shortcut operator [*] for variable expansion.

Using the [*] operator with a variable will iterate over all elements in the variable, thus allowing access to a particular attribute of each element. It is required that the expanded variable is a list. The result of the [*] operator is again a list.

FOR u IN users
  RETURN { "user" : u, "friendNames" : u.friends[*].name }

In the above example, the attribute name is accessed for each element in the list u.friends. The result is a flat list of friend names, made available as the attribute friendNames.
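
For readers more familiar with general-purpose languages, the [*] operator plays a role similar to a list comprehension. The Python sketch below models u.friends[*].name for a single, invented user document.

```python
u = {
    "name": "Ann",
    "friends": [
        {"name": "Bob", "age": 30},
        {"name": "Cem", "age": 41},
    ],
}

# u.friends[*].name: pick the attribute "name" from every element of u.friends
friend_names = [friend["name"] for friend in u["friends"]]

print(friend_names)  # ['Bob', 'Cem']
```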