!CHAPTER Language basics

!SUBSECTION Whitespace

Whitespace can be used in the query text to increase its readability. However, for the parser any whitespace (spaces, carriage returns, line feeds, and tab stops) does not have any special meaning except that it separates individual tokens in the query. Whitespace within strings or names must be enclosed in quotes in order to be preserved.

!SUBSECTION Comments

Comments can be embedded at any position in a query. The text contained in the comment is ignored by the AQL parser. Comments cannot be nested, meaning the comment text may not contain another comment.

AQL supports two types of comments:

- Single line comments: These start with a double forward slash and end at the end of the line, or the end of the query string (whichever is first).
- Multi line comments: These start with a forward slash and asterisk, and end with an asterisk and a following forward slash. They can span as many lines as necessary.

    /* this is a comment */ RETURN 1
    /* these */ RETURN /* are */ 1 /* multiple */ + /* comments */ 1
    /* this is
       a multi line
       comment */
    // a single line comment

!SUBSECTION Keywords

On the top level, AQL offers the following operations:

- FOR: list iteration
- RETURN: results projection
- FILTER: results filtering
- SORT: result sorting
- LIMIT: result slicing
- LET: variable assignment
- COLLECT: result grouping

Each of the above operations can be initiated in a query by using a keyword of the same name. An AQL query can (and typically does) consist of multiple of the above operations.

An example AQL query might look like this:

    FOR u IN users
      FILTER u.type == "newbie" && u.active == true
      RETURN u.name

In this example query, the terms `FOR`, `FILTER`, and `RETURN` initiate the higher-level operation according to their name. These terms are also keywords, meaning that they have a special meaning in the language.

For example, the query parser will use the keywords to find out which high-level operations to execute.
That also means keywords can only be used at certain locations in a query. This also makes all keywords reserved words that must not be used for other purposes than they are intended for.

For example, it is not possible to use a keyword as a collection or attribute name. If a collection or attribute needs to have the same name as a keyword, the collection or attribute name needs to be quoted.

Keywords are case-insensitive, meaning they can be specified in lower, upper, or mixed case in queries. In this documentation, all keywords are written in upper case to make them distinguishable from other query parts.

In addition to the higher-level operations keywords, there are other keywords. The current list of keywords is:

- FOR
- RETURN
- FILTER
- SORT
- LIMIT
- LET
- COLLECT
- ASC
- DESC
- IN
- INTO
- NULL
- TRUE
- FALSE

Additional keywords might be added in future versions of ArangoDB.

!SUBSECTION Names

In general, names are used to identify objects (collections, attributes, variables, and functions) in AQL queries.

The maximum supported length of any name is 64 bytes. Names in AQL are always case-sensitive.

Keywords must not be used as names unless the name is enclosed in backticks. Enclosing a name in backticks allows otherwise-reserved keywords to be used as names. An example for this is:

    FOR f IN `filter`
      RETURN f.`sort`

Due to the backticks, `filter` and `sort` are interpreted as names and not as keywords here.

!SUBSUBSECTION Collection names

Collection names can be used in queries as they are. If a collection happens to have the same name as a keyword, the name must be enclosed in backticks.

Please refer to the [Naming Conventions in ArangoDB](../NamingConventions/CollectionNames.md) for more information about collection naming conventions.

!SUBSUBSECTION Attribute names

When referring to attributes of documents from a collection, the fully qualified attribute name must be used.
This is because multiple collections with ambiguous attribute names might be used in a query. To avoid any ambiguity, it is not allowed to refer to an unqualified attribute name.

Please refer to the [Naming Conventions in ArangoDB](../NamingConventions/AttributeNames.md) for more information about the attribute naming conventions.

    FOR u IN users
      FOR f IN friends
        FILTER u.active == true && f.active == true && u.id == f.userId
        RETURN u.name

In the above example, the attribute names `active`, `name`, `id`, and `userId` are qualified using the loop variables bound to the collections they belong to (`u` and `f` respectively).

!SUBSUBSECTION Variable names

AQL allows the user to assign values to additional variables in a query. All variables that are assigned a value must have a name that is unique within the context of the query. Variable names must be different from the name of any collection used in the same query.

    FOR u IN users
      LET friends = u.friends
      RETURN { "name" : u.name, "friends" : friends }

In the above query, `users` is a collection name, and both `u` and `friends` are variable names. This is because the `FOR` and `LET` operations need target variables to store their intermediate results.

Allowed characters in variable names are the letters `a` to `z` (both in lower and upper case), the numbers `0` to `9` and the underscore (`_`) symbol. A variable name must not start with a number. If a variable name starts with the underscore character, it must also contain at least one letter (a-z or A-Z).

!SUBSECTION Data types

AQL supports both primitive and compound data types.
The following types are available:

- Primitive types: Consisting of exactly one value
  - null: An empty value, also: The absence of a value
  - bool: Boolean truth value with possible values `false` and `true`
  - number: Signed (real) number
  - string: UTF-8 encoded text value
- Compound types: Consisting of multiple values
  - list: Sequence of values, referred to by their positions
  - document: Sequence of values, referred to by their names

!SUBSUBSECTION Numeric literals

Numeric literals can be integers or real values. They can optionally be signed using the `+` or `-` symbols. The scientific notation is also supported.

    1
    42
    -1
    -42
    1.23
    -99.99
    0.1
    -4.87e103

All numeric values are treated as 64-bit double-precision values internally. The internal format used is IEEE 754.

!SUBSUBSECTION String literals

String literals must be enclosed in single or double quotes. If the quote character used to enclose the string is to appear within the string literal itself, it must be escaped using the backslash symbol. Backslash literals themselves must also be escaped using a backslash.

    "yikes!"
    "don't know"
    "this is a \"quoted\" word"
    "this is a longer string."
    "the path separator on Windows is \\"

    'yikes!'
    'don\'t know'
    'this is a longer string.'
    'the path separator on Windows is \\'

All string literals must be UTF-8 encoded. It is currently not possible to use arbitrary binary data if it is not UTF-8 encoded. A workaround to use binary data is to encode the data using base64 or other algorithms on the application side before storing, and decoding it on the application side after retrieval.

!SUBSUBSECTION Lists

AQL supports two compound types:

- lists: A composition of unnamed values, each accessible by their positions
- documents: A composition of named values, each accessible by their names

The first supported compound type is the list type. Lists are effectively sequences of (unnamed/anonymous) values. Individual list elements can be accessed by their positions. The order of elements in a list is important.
A `list-declaration` starts with the `[` symbol and ends with the `]` symbol. A `list-declaration` contains zero or many `expression`s, separated from each other with the `,` symbol.

In the simplest case, a list is empty and thus looks like:

    [ ]

List elements can be any legal `expression` values. Nesting of lists is supported.

    [ 1, 2, 3 ]
    [ -99, "yikes!", [ true, [ "no"], [ ] ], 1 ]
    [ [ "fox", "marshal" ] ]

Individual list values can later be accessed by their positions using the `[]` accessor. The position of the accessed element must be a numeric value. Positions start at 0. It is also possible to use negative index values to access list values starting from the end of the list. This is convenient if the length of the list is unknown and access to elements at the end of the list is required.

    // access 1st list element (elements start at index 0)
    u.friends[0]

    // access 3rd list element
    u.friends[2]

    // access last list element
    u.friends[-1]

    // access second last list element
    u.friends[-2]

!SUBSUBSECTION Documents

The other supported compound type is the document type. Documents are a composition of zero to many attributes. Each attribute is a name/value pair. Document attributes can be accessed individually by their names.

Document declarations start with the `{` symbol and end with the `}` symbol. A document contains zero to many attribute declarations, separated from each other with the `,` symbol. In the simplest case, a document is empty. Its declaration would then be:

    { }

Each attribute in a document is a name/value pair. Name and value of an attribute are separated using the `:` symbol.

The attribute name is mandatory and must be specified as a quoted or unquoted string. If a keyword is to be used as an attribute name, the name must be quoted.

Any valid expression can be used as an attribute value.
That also means nested documents can be used as attribute values.

    { name : "Peter" }
    { "name" : "Vanessa", "age" : 15 }
    { "name" : "John", likes : [ "Swimming", "Skiing" ], "address" : { "street" : "Cucumber lane", "zip" : "94242" } }

Individual document attributes can later be accessed by their names using the `.` accessor. If a non-existing attribute is accessed, the result is `null`.

    u.address.city.name
    u.friends[0].name.first

!SUBSECTION Bind parameters

AQL supports the usage of bind parameters, which allows the query text to be separated from the literal values used in the query. It is good practice to separate the query text from the literal values because this will prevent (malicious) injection of keywords and other collection names into an existing query. This injection would be dangerous because it might change the meaning of an existing query.

Using bind parameters, the meaning of an existing query cannot be changed. Bind parameters can be used everywhere in a query where literals can be used.

The syntax for bind parameters is `@nameparameter` where `nameparameter` is the actual parameter name. The bind parameter values need to be passed along with the query when it is executed, but not as part of the query text itself. Please refer to the @ref HttpCursorHttp manual section for information about how to pass the bind parameter values to the server.

    FOR u IN users
      FILTER u.id == @id && u.name == @nameparameter
      RETURN u

Bind parameter names must start with any of the letters `a` to `z` (both in lower and upper case) or a digit (`0` to `9`), and can be followed by any letter, digit or the underscore symbol.

A special type of bind parameter exists for injecting collection names. This type of bind parameter has a name prefixed with an additional `@` symbol (thus when using the bind parameter in a query, two `@` symbols must be used).

    FOR u IN @@collection
      FILTER u.active == true
      RETURN u

!SUBSECTION Type and value order

When checking for equality or inequality or when determining the sort order of values, AQL uses a deterministic algorithm that takes both the data types and the actual values into account.

The compared operands are first compared by their data types, and only by their data values if the operands have the same data types.

The following type order is used when comparing data types:

    null < bool < number < string < list < document

This means `null` is the smallest type in AQL and `document` is the type with the highest order. If the compared operands have different types, the comparison result is determined by the type order alone and the comparison is finished.

For example, the boolean `true` value will always be less than any numeric or string value, any list (even an empty list) or any document. Additionally, any string value (even an empty string) will always be greater than any numeric value and any boolean value (`true` or `false`).

    null < false
    null < true
    null < 0
    null < ''
    null < ' '
    null < '0'
    null < 'abc'
    null < [ ]
    null < { }

    false < true
    false < 0
    false < ''
    false < ' '
    false < '0'
    false < 'abc'
    false < [ ]
    false < { }

    true < 0
    true < ''
    true < ' '
    true < '0'
    true < 'abc'
    true < [ ]
    true < { }

    0 < ''
    0 < ' '
    0 < '0'
    0 < 'abc'
    0 < [ ]
    0 < { }

    '' < ' '
    '' < '0'
    '' < 'abc'
    '' < [ ]
    '' < { }

    [ ] < { }

If the two compared operands have the same data type, then the operands' values are compared. For the primitive types (null, boolean, number, and string), the result is defined as follows:

- null: `null` is equal to `null`
- boolean: `false` is less than `true`
- number: numeric values are ordered by their cardinal value
- string: string values are ordered using a localized comparison, see @ref CommandLineDefaultLanguage "--default-language"

Note: unlike in SQL, `null` can be compared to any value, including `null` itself, without the result being converted into `null` automatically.
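
The type order can be observed by sorting a list that mixes values of all six types. The following query is a minimal sketch that relies only on the rules stated above; the literal values are arbitrary and chosen purely for illustration:

```aql
// iterate over a literal list containing one or two values of each type
FOR v IN [ "abc", 1, null, [ ], true, { }, false, "" ]
  // SORT applies the type order first, then the value order within a type
  SORT v
  RETURN v
```

According to the type order and the primitive comparison rules described above, the values should come back in the order `null`, `false`, `true`, `1`, `""`, `"abc"`, `[ ]`, `{ }`.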

For compound types, the following special rules are applied:

Two list values are compared by comparing their individual elements position by position, starting at the first element. For each position, the element types are compared first. If the types are not equal, the comparison result is determined, and the comparison is finished. If the types are equal, then the values of the two elements are compared. If one of the lists is finished and the other list still has an element at a compared position, then `null` will be used as the element value of the fully traversed list.

If a list element is itself a compound value (a list or a document), then the comparison algorithm will compare the element's sub values recursively.

    [ ] < [ 0 ]
    [ 1 ] < [ 2 ]
    [ 1, 2 ] < [ 2 ]
    [ 99, 99 ] < [ 100 ]
    [ false ] < [ true ]
    [ false, 1 ] < [ false, '' ]

Two document operands are compared by checking attribute names and values. The attribute names are compared first. Before attribute names are compared, a combined list of all attribute names from both operands is created and sorted lexicographically. This means that the order in which attributes are declared in a document is not relevant when comparing two documents.

The combined and sorted list of attribute names is then traversed, and the respective attributes from the two compared operands are then looked up. If one of the documents does not have an attribute with the sought name, its attribute value is considered to be `null`. Finally, the attribute values of both documents are compared using the aforementioned data type and value comparison. The comparisons are performed for all document attributes until there is an unambiguous comparison result. If an unambiguous comparison result is found, the comparison is finished. If there is no unambiguous comparison result, the two compared documents are considered equal.

    { } < { "a" : 1 }
    { } < { "a" : null }
    { "a" : 1 } < { "a" : 2 }
    { "b" : 1 } < { "a" : 0 }
    { "a" : { "c" : true } } < { "a" : { "c" : 0 } }
    { "a" : { "c" : true, "a" : 0 } } < { "a" : { "c" : false, "a" : 1 } }

    { "a" : 1, "b" : 2 } == { "b" : 2, "a" : 1 }

!SUBSECTION Accessing data from collections

Collection data can be accessed by specifying a collection name in a query. A collection can be understood as a list of documents, and that is how collections are treated in AQL. Documents from collections are normally accessed using the `FOR` keyword. Note that when iterating over documents from a collection, the order of documents is undefined. To traverse documents in an explicit and deterministic order, the `SORT` keyword should be used in addition.

Data in collections is stored in documents, with each document potentially having different attributes than other documents. This is true even for documents of the same collection.

It is therefore quite normal to encounter documents that do not have some or all of the attributes that are queried in an AQL query. In this case, the non-existing attributes in the document will be treated as if they existed with a value of `null`. That means that comparing a document attribute to `null` will return true if the document has the particular attribute and the attribute has a value of `null`, or if the document does not have the particular attribute at all.

For example, the following query will return all documents from the collection `users` that have a value of `null` in the attribute `name`, plus all documents from `users` that do not have the `name` attribute at all:

    FOR u IN users
      FILTER u.name == null
      RETURN u

Furthermore, `null` is less than any other value (excluding `null` itself). That means documents with non-existing attributes might be included in the result when comparing attribute values with the less than or less than or equal operators.

For example, the following query will return all documents from the collection `users` that have an attribute `age` with a value less than `39`, but also all documents from the collection that do not have the attribute `age` at all.

    FOR u IN users
      FILTER u.age < 39
      RETURN u

This behavior should always be taken into account when writing queries.
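
If documents without the attribute should not match, the comparison can be combined with an explicit `null` check. The following is a minimal sketch that reuses the `users` collection and the `age` attribute from the example above:

```aql
FOR u IN users
  // a missing attribute is treated as null, so this check also
  // excludes documents that do not have an age attribute at all
  FILTER u.age != null && u.age < 39
  RETURN u
```

Because a missing attribute is treated as `null`, the first condition excludes both documents that have no `age` attribute and documents whose `age` is explicitly `null`.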