Features and Improvements in ArangoDB 3.5
The following list shows in detail which features have been added or improved in ArangoDB 3.5. ArangoDB 3.5 also contains several bug fixes that are not listed here.
AQL
SORT-LIMIT optimization
A new SORT-LIMIT optimization has been added. The query optimizer applies it when a SORT statement is followed by a LIMIT node and the overall number of documents to return is relatively small in relation to the total number of documents to be sorted. In this case, the optimizer uses a size-constrained heap that keeps only the required number of results in memory, which can drastically reduce memory usage and, for some queries, also the execution time for sorting.
If the optimization is applied, it shows up as the "sort-limit" rule in the query execution plan.
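The idea behind a size-constrained heap can be sketched in plain JavaScript. This is only an illustration of the general technique, not ArangoDB's implementation; the function name topK and its parameters are made up for this example.

```javascript
// Illustrative sketch only: keeps the k smallest values in a max-heap of size k,
// so memory use is bounded by k instead of the full input size.
// topK is a hypothetical name, not part of ArangoDB.
function topK(values, k, compare = (a, b) => a - b) {
  const heap = []; // max-heap with respect to compare: heap[0] is the largest kept value

  const siftUp = (i) => {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (compare(heap[i], heap[parent]) <= 0) break;
      [heap[i], heap[parent]] = [heap[parent], heap[i]];
      i = parent;
    }
  };

  const siftDown = (i) => {
    for (;;) {
      const left = 2 * i + 1;
      const right = left + 1;
      let largest = i;
      if (left < heap.length && compare(heap[left], heap[largest]) > 0) largest = left;
      if (right < heap.length && compare(heap[right], heap[largest]) > 0) largest = right;
      if (largest === i) break;
      [heap[i], heap[largest]] = [heap[largest], heap[i]];
      i = largest;
    }
  };

  if (k <= 0) return [];
  for (const value of values) {
    if (heap.length < k) {
      // Heap not full yet: always keep the value.
      heap.push(value);
      siftUp(heap.length - 1);
    } else if (compare(value, heap[0]) < 0) {
      // Value beats the worst kept value: replace the heap root.
      heap[0] = value;
      siftDown(0);
    }
  }
  // Only the k kept values need a final sort.
  return heap.sort(compare);
}

const smallestThree = topK([5, 1, 9, 3, 7, 2], 3);
```

This roughly corresponds to a query of the shape SORT ... LIMIT 3: the full input is scanned once, but only three values are ever held in the heap.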
Sorted primary index (RocksDB engine)
The query optimizer can now make use of the sortedness of primary indexes if the RocksDB engine is used. This means the primary index can be utilized for queries that sort by either the _key or _id attributes of a collection and also for range queries on these attributes.
In the list of documents for a collection in the web interface, the documents will now always be sorted in lexicographical order of their _key values. An exception for keys representing quasi-numerical values has been removed when doing the sorting in the web interface. Removing this exception can also speed up the display of the list of documents.
This change potentially affects the order in which documents are displayed in the list of documents overview in the web interface. A document with a key value of "10" will now be displayed before a document with a key value of "9". In previous versions of ArangoDB it was the other way around.
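The effect of lexicographical ordering on numeric-looking keys can be reproduced in plain JavaScript. The sketch below uses JavaScript's default string comparison as a stand-in; the exact comparison used by the web interface is an assumption here, and the sample keys are made up.

```javascript
// Sample keys, invented for illustration.
const keys = ["9", "10", "2", "abc", "1"];

// Default Array.prototype.sort compares strings character by character,
// so "10" sorts before "2" and "9".
const lexicographic = [...keys].sort();

// For comparison: treating the numeric-looking keys as numbers.
const numericKeys = ["9", "10", "2", "1"];
const numericOrder = [...numericKeys].sort((a, b) => Number(a) - Number(b));

console.log(lexicographic); // lexicographical order
console.log(numericOrder);  // numeric order
```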
Edge index query optimization (RocksDB engine)
An AQL query that uses the edge index only and returns the opposite side of the edge can now be executed in a more optimized way, e.g.
FOR edge IN edgeCollection FILTER edge._from == "v/1" RETURN edge._to
is fully covered by the RocksDB edge index.
For MMFiles this rule does not apply.
AQL syntax improvements
AQL now allows the usage of floating point values without leading zeros, e.g. .1234. Previous versions of ArangoDB required a leading zero in front of the decimal separator, i.e. 0.1234.
Background Index Creation
Creating new indexes is by default done under an exclusive collection lock. This means that the collection (or the respective shards) is not available for write operations while the index is being created. This "foreground" index creation can be undesirable if you have to perform it on a live system without a dedicated maintenance window.
Starting with ArangoDB 3.5, indexes can also be created in the "background", without holding an exclusive lock during the entire index creation. The collection remains largely available, so that other CRUD operations can run on the collection while the index is being created. This can be achieved by setting the inBackground attribute when creating an index.
To create an index in the background in arangosh, just specify inBackground: true, like in the following example:
db.collection.ensureIndex({ type: "hash", fields: [ "value" ], inBackground: true });
Indexes that are still in the build process will not be visible via the ArangoDB APIs. Nevertheless, it is not possible to create the same index twice via the ensureIndex API while an index is still being created. AQL queries also will not use these indexes until the index reports back as fully created. Note that the initial ensureIndex call or HTTP request will still block until the index is completely ready. Existing single-threaded client programs can thus safely set the inBackground option to true and continue to work as before.
While an index is being built in the background, you cannot rename or drop the collection. These operations will block until the index creation is finished. This is equally the case with foreground indexing.
After an interrupted index build (e.g. due to a server crash) the partially built index will be removed. In the ArangoDB cluster the index might then be automatically recreated on affected shards.
Background index creation might be slower than "foreground" index creation and require more RAM. Under a write-heavy load (specifically many remove, update or replace operations), the background index creation needs to keep a list of removed documents in RAM. This might become unsustainable if this list grows to tens of millions of entries.
Building an index is always a write-heavy operation, so it is always a good idea to build indexes during times with less load.
Please note that background index creation is useful only in combination with the RocksDB storage engine. With the MMFiles storage engine, creating an index will always block any other operations on the collection.
TTL (time-to-live) Indexes
The new TTL indexes provided by ArangoDB can be used for removing expired documents from a collection.
A TTL index can be set up by setting an expireAfter value and by picking a single document attribute which contains the documents' creation date and time. Documents expire expireAfter seconds after their creation time. The creation time is specified as either a numeric timestamp or a UTC datestring.
For example, if expireAfter is set to 600 seconds (10 minutes) and the index attribute is "creationDate" and there is the following document:
{ "creationDate" : 1550165973 }
This document will be indexed with a creation timestamp value of 1550165973, which translates to the human-readable date string 2019-02-14T17:39:33.000Z. The document will expire 600 seconds afterwards, which is at timestamp 1550166573 (or 2019-02-14T17:49:33.000Z in the human-readable version).
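The arithmetic of this example can be checked with a few lines of JavaScript. This is just a worked calculation; the helper name expiryTimestamp is made up for illustration and is not an ArangoDB function.

```javascript
// Hypothetical helper: both arguments and the result are in seconds.
function expiryTimestamp(creationSeconds, expireAfterSeconds) {
  return creationSeconds + expireAfterSeconds;
}

const creationDate = 1550165973; // value of the index attribute, in seconds
const expiresAt = expiryTimestamp(creationDate, 600);

// JavaScript Date works in milliseconds, so multiply by 1000 for display.
const createdAtIso = new Date(creationDate * 1000).toISOString();
const expiresAtIso = new Date(expiresAt * 1000).toISOString();

console.log(expiresAt, createdAtIso, expiresAtIso);
```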
The actual removal of expired documents will not necessarily happen immediately. Expired documents will eventually be removed by a background thread that periodically goes through all TTL indexes and removes the expired documents.
There is no guarantee when exactly the removal of expired documents will be carried out, so queries may still find and return documents that have already expired. These will eventually be removed when the background thread kicks in and has capacity to remove the expired documents. It is guaranteed however that only documents which are past their expiration time will actually be removed.
Please note that the numeric timestamp values for the index attribute should be specified in seconds since January 1st 1970 (Unix timestamp). To calculate the current timestamp in this format from JavaScript, use Date.now() / 1000. To calculate it from an arbitrary Date instance named date, use date.getTime() / 1000.
Alternatively, the index attribute values can be specified as a date string in format YYYY-MM-DDTHH:MM:SS with optional milliseconds. All date strings will be interpreted as UTC dates.
The above example document using a datestring attribute value would be
{ "creationDate" : "2019-02-14T17:39:33.000Z" }
If the index attribute contains neither a numeric value nor a proper date string, the document will not be stored in the TTL index and thus will not become a candidate for expiration and removal. Providing either a non-numeric value or even no value for the index attribute is a supported way of keeping documents from being expired and removed.
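The rules for attribute values can be summarized in a small client-side sketch. It mirrors the description above (numeric seconds, or a YYYY-MM-DDTHH:MM:SS date string with optional milliseconds interpreted as UTC, anything else is not indexed); it is not the server's code, and the helper name toTtlSeconds is an assumption made for this example.

```javascript
// Hypothetical helper: returns the creation time in seconds,
// or null if the value would not qualify for the TTL index.
function toTtlSeconds(value) {
  if (typeof value === "number" && Number.isFinite(value)) {
    return value; // already a Unix timestamp in seconds
  }
  if (typeof value === "string") {
    // YYYY-MM-DDTHH:MM:SS with optional milliseconds and optional trailing Z.
    const match = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{3})?(Z)?$/.exec(value);
    if (match === null) return null;
    // JavaScript treats date-time strings without an offset as local time,
    // so append "Z" to force UTC interpretation.
    const utcString = match[2] ? value : value + "Z";
    const millis = Date.parse(utcString);
    return Number.isNaN(millis) ? null : millis / 1000;
  }
  return null; // missing or non-date values keep the document from expiring
}

// Current time in the same unit (seconds, possibly fractional).
const nowSeconds = Date.now() / 1000;

const fromNumber = toTtlSeconds(1550165973);
const fromString = toTtlSeconds("2019-02-14T17:39:33.000Z");
const notIndexed = toTtlSeconds("never");
```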
There can at most be one TTL index per collection. It is not recommended to use TTL indexes for user-land AQL queries, as TTL indexes may store a transformed, always numerical version of the index attribute value.
The frequency for invoking the background removal thread can be configured using the --ttl.frequency startup option.
In order to avoid "random" load spikes by the background thread suddenly kicking
in and removing a lot of documents at once, the number of to-be-removed documents
per thread invocation can be capped.
The total maximum number of documents to be removed per thread invocation is controlled by the startup option --ttl.max-total-removes. The maximum number of documents to be removed from a single collection at once can be controlled by the startup option --ttl.max-collection-removes.
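How a total cap and a per-collection cap interact can be illustrated with a simplified model. This is purely a sketch under stated assumptions: the function planRemovals, its parameters and the collection names are invented, and the order in which the server actually visits collections is not specified here.

```javascript
// Hypothetical model of one invocation of the removal thread.
// expiredByCollection maps a collection name to its number of expired documents.
function planRemovals(expiredByCollection, maxTotalRemoves, maxCollectionRemoves) {
  const plan = {};
  let remaining = maxTotalRemoves; // budget left for this invocation
  for (const [name, expired] of Object.entries(expiredByCollection)) {
    // Remove no more than what is expired, the per-collection cap, or the remaining budget.
    const count = Math.max(0, Math.min(expired, maxCollectionRemoves, remaining));
    plan[name] = count;
    remaining -= count;
  }
  return plan;
}

// Made-up numbers: 2000 removals in total, at most 1000 per collection.
const plan = planRemovals({ sessions: 5000, tokens: 300, logs: 2000 }, 2000, 1000);
console.log(plan);
```

In this model, documents that exceed either cap simply stay in place until a later invocation, which matches the statement above that expired documents may remain visible for a while.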
HTTP API extensions
The HTTP API for creating indexes at POST /_api/index has been extended two-fold:
- to create a TTL (time-to-live) index, it is now possible to specify a value of ttl in the type attribute. When creating a TTL index, the attribute expireAfter is also required. That attribute contains the expiration time (in seconds), which is based on the documents' index attribute value.
- to create an index in background, the attribute inBackground can be set to true.
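The two extensions can be sketched as request payloads. The helper below only assembles the request and does not send it; the helper name buildCreateIndexRequest and the collection name "sessions" are made up for illustration, and the attribute names are the ones described in this document.

```javascript
// Hypothetical helper: builds (but does not send) a request for POST /_api/index.
function buildCreateIndexRequest(collection, indexDefinition) {
  return {
    method: "POST",
    path: "/_api/index?collection=" + encodeURIComponent(collection),
    body: JSON.stringify(indexDefinition),
  };
}

// TTL index: documents expire 600 seconds after the time stored in creationDate.
const ttlRequest = buildCreateIndexRequest("sessions", {
  type: "ttl",
  fields: ["creationDate"],
  expireAfter: 600,
});

// Regular index created in the background.
const backgroundRequest = buildCreateIndexRequest("sessions", {
  type: "hash",
  fields: ["value"],
  inBackground: true,
});

console.log(ttlRequest.path, ttlRequest.body);
console.log(backgroundRequest.path, backgroundRequest.body);
```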
Web interface
For the RocksDB engine, the selection of index types "persistent" and "skiplist" has been removed from the web interface when creating new indexes.
The index types "hash", "skiplist" and "persistent" are just aliases of each other when using the RocksDB engine, so there is no need to offer all of them in parallel.
Miscellaneous
Improved overview of available program options
The --help-all command-line option for all ArangoDB executables will now also show all hidden program options.
Previously, hidden program options were only returned when invoking arangod or a client tool with the cryptic --help-. option. Now --help-all simply returns them as well.
Fewer system collections
The system collections _routing and _modules are not created anymore for new databases, as both are only needed for legacy functionality.
Existing _routing collections will not be touched as they may contain user-defined entries, and will continue to work.
Existing _modules collections will also remain functional.
Internal
We have moved from C++11 to C++14, which allows us to use some of the simplifications, features and guarantees that this standard offers. To compile ArangoDB from source, a compiler that supports C++14 is now required.
The bundled JEMalloc memory allocator used in ArangoDB release packages has been upgraded from version 5.0.1 to version 5.1.0.
The bundled version of the RocksDB library has been upgraded from 5.16 to 5.18.
The bundled version of the V8 JavaScript engine has been upgraded from 5.7.492.77 to 7.1.302.28.