Features and Improvements in ArangoDB 3.5

The following list shows in detail which features have been added or improved in ArangoDB 3.5. ArangoDB 3.5 also contains several bug fixes that are not listed here.

AQL

SORT-LIMIT optimization

A new SORT-LIMIT optimization has been added. The query optimizer applies it if a SORT statement is followed by a LIMIT node and the overall number of documents to return is relatively small in relation to the total number of documents to be sorted. In this case, the optimizer will use a size-constrained heap for keeping only the required number of results in memory, which can drastically reduce memory usage and, for some queries, also the execution time for the sorting.

If the optimization is applied, it will show up as the "sort-limit" rule in the query execution plan.
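
The idea behind a size-constrained heap can be sketched in plain JavaScript. This is only an illustration of the general technique, not ArangoDB's actual implementation; the function name sortLimit and the sample values are made up for this example.

```javascript
// Illustrative sketch only: not ArangoDB's implementation.
// Returns the first `limit` items of `items` in the order given by `compare`,
// while keeping at most `limit` items in memory at any time.
function sortLimit(items, limit, compare) {
  if (limit <= 0) return [];
  const heap = []; // binary max-heap: heap[0] is the "worst" item kept so far

  const swap = (i, j) => { [heap[i], heap[j]] = [heap[j], heap[i]]; };

  const siftUp = (i) => {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (compare(heap[i], heap[parent]) <= 0) break;
      swap(i, parent);
      i = parent;
    }
  };

  const siftDown = (i) => {
    for (;;) {
      const left = 2 * i + 1;
      const right = left + 1;
      let largest = i;
      if (left < heap.length && compare(heap[left], heap[largest]) > 0) largest = left;
      if (right < heap.length && compare(heap[right], heap[largest]) > 0) largest = right;
      if (largest === i) break;
      swap(i, largest);
      i = largest;
    }
  };

  for (const item of items) {
    if (heap.length < limit) {
      heap.push(item);
      siftUp(heap.length - 1);
    } else if (compare(item, heap[0]) < 0) {
      heap[0] = item; // replace the current worst item with a better one
      siftDown(0);
    }
  }
  return heap.sort(compare); // only `limit` items are left to sort
}

const values = [42, 7, 19, 3, 88, 1, 56];
const smallestThree = sortLimit(values, 3, (a, b) => a - b);
console.log(smallestThree); // [ 1, 3, 7 ]
```

The memory needed is proportional to the LIMIT value rather than to the number of input documents, which is where the savings described above come from.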

Sorted primary index (RocksDB engine)

The query optimizer can now make use of the sortedness of primary indexes if the RocksDB engine is used. This means the primary index can be utilized for queries that sort by either the _key or _id attributes of a collection and also for range queries on these attributes.

In the list of documents for a collection in the web interface, the documents will now always be sorted in lexicographical order of their _key values. An exception for keys representing quasi-numerical values has been removed when doing the sorting in the web interface. Removing this exception can also speed up the display of the list of documents.

This change potentially affects the order in which documents are displayed in the list of documents overview in the web interface. A document with a key value of "10" will now be displayed before a document with a key value of "9". In previous versions of ArangoDB the order was exactly the opposite.
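
The difference between the two orderings can be reproduced in plain JavaScript. The sample keys are made up, and the numeric comparison only approximates the earlier special handling of quasi-numerical keys; for keys consisting of ASCII digits, JavaScript's default string sort yields the lexicographical order described above.

```javascript
// Sample _key values, chosen for illustration only.
const keys = ["9", "10", "2", "100"];

// Lexicographical order: strings are compared character by character.
const lexicographic = [...keys].sort();

// Numeric order, roughly what the former special case produced for such keys.
const numeric = [...keys].sort((a, b) => Number(a) - Number(b));

console.log(lexicographic); // [ '10', '100', '2', '9' ]
console.log(numeric);       // [ '2', '9', '10', '100' ]
```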

Edge index query optimization (RocksDB engine)

An AQL query that uses the edge index only and returns the opposite side of the edge can now be executed in a more optimized way, e.g.

FOR edge IN edgeCollection FILTER edge._from == "v/1" RETURN edge._to

is fully covered by the RocksDB edge index.

For MMFiles this rule does not apply.

AQL syntax improvements

AQL now allows the usage of floating point values without leading zeros, e.g. .1234. Previous versions of ArangoDB required a leading zero in front of the decimal separator, i.e. 0.1234.

Background Index Creation

Creating new indexes is by default done under an exclusive collection lock. This means that the collection (or the respective shards) is not available for write operations while the index is being created. This "foreground" index creation can be undesirable if you have to perform it on a live system without a dedicated maintenance window.

Starting with ArangoDB 3.5, indexes can also be created in the "background", without holding an exclusive lock during the entire index creation. The collection remains largely available, so that other CRUD operations can run on the collection while the index is being created. This can be achieved by setting the inBackground attribute when creating an index.

To create an index in the background in arangosh just specify inBackground: true, like in the following example:

db.collection.ensureIndex({ type: "hash", fields: [ "value" ], inBackground: true });

Indexes that are still in the build process will not be visible via the ArangoDB APIs. Nevertheless, it is not possible to create the same index twice via the ensureIndex API while an index is still being created. AQL queries also will not use these indexes until the index reports back as fully created. Note that the initial ensureIndex call or HTTP request will still block until the index is completely ready. Existing single-threaded client programs can thus safely set the inBackground option to true and continue to work as before.

While an index is being built in the background, you cannot rename or drop the collection. These operations will block until the index creation is finished. This is equally the case with foreground indexing.

After an interrupted index build (e.g. due to a server crash) the partially built index will be removed. In the ArangoDB cluster the index might then be automatically recreated on affected shards.

Background index creation might be slower than the "foreground" index creation and require more RAM. Under a write-heavy load (specifically many remove, update or replace operations), the background index creation needs to keep a list of removed documents in RAM. This might become unsustainable if this list grows to tens of millions of entries.

Building an index is always a write-heavy operation, so it is always a good idea to build indexes during times with less load.

Please note that background index creation is useful only in combination with the RocksDB storage engine. With the MMFiles storage engine, creating an index will always block any other operations on the collection.

TTL (time-to-live) Indexes

The new TTL indexes provided by ArangoDB can be used for removing expired documents from a collection.

A TTL index can be set up by setting an expireAfter value and by picking a single document attribute which contains the documents' creation date and time. Documents expire expireAfter seconds after their creation time. The creation time is specified as either a numeric timestamp or a UTC datestring.

For example, assume expireAfter is set to 600 seconds (10 minutes), the index attribute is "creationDate", and the collection contains the following document:

{ "creationDate" : 1550165973 }

This document will be indexed with a creation timestamp value of 1550165973, which translates to the human-readable date string 2019-02-14T17:39:33.000Z. The document will expire 600 seconds afterwards, which is at timestamp 1550166573 (or 2019-02-14T17:49:33.000Z in the human-readable version).
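
The arithmetic of this example can be checked with a few lines of JavaScript. The helper names expirationTimestamp and toIsoString are hypothetical and not part of any ArangoDB API; they only redo the calculation by hand.

```javascript
// Hypothetical helpers for illustration, not ArangoDB APIs.
// Both arguments and the result are in seconds since January 1st 1970.
function expirationTimestamp(creationSeconds, expireAfterSeconds) {
  return creationSeconds + expireAfterSeconds;
}

// JavaScript Date objects work in milliseconds, hence the factor 1000.
const toIsoString = (seconds) => new Date(seconds * 1000).toISOString();

const createdAt = 1550165973;
const expiresAt = expirationTimestamp(createdAt, 600);

console.log(toIsoString(createdAt)); // 2019-02-14T17:39:33.000Z
// 1550166573 seconds, i.e. 1550166573000 milliseconds
console.log(expiresAt, toIsoString(expiresAt)); // 1550166573 2019-02-14T17:49:33.000Z
```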

The actual removal of expired documents will not necessarily happen immediately. Expired documents will eventually be removed by a background thread that periodically goes through all TTL indexes and removes the expired documents.

There is no guarantee when exactly the removal of expired documents will be carried out, so queries may still find and return documents that have already expired. These will eventually be removed when the background thread kicks in and has capacity to remove the expired documents. It is guaranteed however that only documents which are past their expiration time will actually be removed.

Please note that the numeric timestamp values for the index attribute should be specified in seconds since January 1st 1970 (Unix timestamp). To calculate the current timestamp from JavaScript in this format, use Date.now() / 1000. To calculate it from an arbitrary Date instance, call getTime() on that instance and divide the result by 1000.

Alternatively, the index attribute values can be specified as a date string in format YYYY-MM-DDTHH:MM:SS with optional milliseconds. All date strings will be interpreted as UTC dates.

The above example document using a datestring attribute value would be

{ "creationDate" : "2019-02-14T17:39:33.000Z" }
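
As a quick cross-check in plain JavaScript, the numeric and the date string form of the example describe the same moment. The variable names are made up for this sketch; only standard Date functions are used.

```javascript
// Current time as a Unix timestamp in seconds (may have a fractional part).
const nowInSeconds = Date.now() / 1000;

// Timestamp in seconds from an arbitrary Date instance.
const someDate = new Date(Date.UTC(2019, 1, 14, 17, 39, 33)); // month is 0-based
const fromInstance = someDate.getTime() / 1000;

// Timestamp in seconds from the date string used in the example document.
const fromDateString = Date.parse("2019-02-14T17:39:33.000Z") / 1000;

console.log(fromInstance);   // 1550165973
console.log(fromDateString); // 1550165973
```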

In case the index attribute contains neither a numeric value nor a proper date string, the document will not be stored in the TTL index and thus will not become a candidate for expiration and removal. Providing either a non-numeric value or even no value for the index attribute is a supported way of keeping documents from being expired and removed.

There can be at most one TTL index per collection. It is not recommended to use TTL indexes for user-land AQL queries, as TTL indexes may store a transformed, always numerical version of the index attribute value.

The frequency for invoking the background removal thread can be configured using the --ttl.frequency startup option. In order to avoid "random" load spikes by the background thread suddenly kicking in and removing a lot of documents at once, the number of to-be-removed documents per thread invocation can be capped. The total maximum number of documents to be removed per thread invocation is controlled by the startup option --ttl.max-total-removes. The maximum number of documents to be removed from a single collection at once can be controlled by the startup option --ttl.max-collection-removes.

HTTP API extensions

The HTTP API for creating indexes at POST /_api/index has been extended in two ways:

  • to create a TTL (time-to-live) index, it is now possible to specify a value of ttl in the type attribute. When creating a TTL index, the attribute expireAfter is also required. That attribute contains the expiration time (in seconds), which is based on the documents' index attribute value.

  • to create an index in background, the attribute inBackground can be set to true.
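
The two request bodies could look as follows. This sketch only builds the requests and does not send them. The collection name "sessions" and the helper buildCreateIndexRequest are hypothetical, and passing the collection name in a collection query parameter is an assumption about the index API that is not spelled out in the text above.

```javascript
// Hypothetical helper: builds (but does not send) a request for POST /_api/index.
// Assumption: the target collection is passed via the `collection` query parameter.
function buildCreateIndexRequest(collectionName, indexDefinition) {
  return {
    method: "POST",
    path: "/_api/index?collection=" + encodeURIComponent(collectionName),
    body: JSON.stringify(indexDefinition),
  };
}

// TTL index: documents expire 600 seconds after the time stored in creationDate.
const ttlRequest = buildCreateIndexRequest("sessions", {
  type: "ttl",
  fields: ["creationDate"],
  expireAfter: 600,
});

// Index built in the background.
const backgroundRequest = buildCreateIndexRequest("sessions", {
  type: "hash",
  fields: ["value"],
  inBackground: true,
});

console.log(ttlRequest.path, ttlRequest.body);
console.log(backgroundRequest.path, backgroundRequest.body);
```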

Web interface

For the RocksDB engine, the selection of index types "persistent" and "skiplist" has been removed from the web interface when creating new indexes.

The index types "hash", "skiplist" and "persistent" are just aliases of each other when using the RocksDB engine, so there is no need to offer all of them in parallel.

Miscellaneous

Improved overview of available program options

The --help-all command-line option for all ArangoDB executables will now also show all hidden program options.

Previously, hidden program options were only returned when invoking arangod or a client tool with the cryptic --help-. option. Now --help-all simply returns them as well.

Fewer system collections

The system collections _routing and _modules are not created anymore for new databases, as both are only needed for legacy functionality.

Existing _routing collections will not be touched as they may contain user-defined entries, and will continue to work.

Existing _modules collections will also remain functional.

Internal

We have moved from C++11 to C++14, which allows us to use some of the simplifications, features and guarantees that this standard has in stock. To compile ArangoDB from source, a compiler that supports C++14 is now required.

The bundled JEMalloc memory allocator used in ArangoDB release packages has been upgraded from version 5.0.1 to version 5.1.0.

The bundled version of the RocksDB library has been upgraded from 5.16 to 5.18.

The bundled version of the V8 JavaScript engine has been upgraded from 5.7.492.77 to 7.1.302.28.