From c5ed5db65c145b1a0d47671ea92a813b4bf2f7fa Mon Sep 17 00:00:00 2001 From: Jan Steemann Date: Tue, 8 Jul 2014 00:27:59 +0200 Subject: [PATCH] wrote new features and upgrading docs --- CHANGELOG | 4 +- .../Books/Users/Aql/DataModification.mdpp | 11 +- .../Users/NewFeatures/NewFeatures22.mdpp | 231 +++++++++++++- .../Books/Users/Upgrading/Upgrading22.mdpp | 285 +++++++++++++++++- UPGRADING | 1 - 5 files changed, 524 insertions(+), 8 deletions(-) delete mode 100644 UPGRADING diff --git a/CHANGELOG b/CHANGELOG index 6c6f722543..82f2163525 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -2,7 +2,7 @@ v2.2.0 (XXXX-XX-XX) ------------------- * The replication methods `logger.start`, `logger.stop` and `logger.properties` are - no-ops in ArangoDB 2.2 as there is no separate replication logger anymore. Changes + no-ops in ArangoDB 2.2 as there is no separate replication logger anymore. Data changes are logged into the write-ahead log in ArangoDB 2.2, and not separately by the replication logger. The replication logger object is still there in ArangoDB 2.2 to ensure backwards-compatibility, however, logging cannot be started, stopped or @@ -20,7 +20,7 @@ v2.2.0 (XXXX-XX-XX) * INCOMPATIBLE CHANGE: replication of transactions has changed. Previously, transactions were logged on a master in one big block and shipped to a slave in one block, too. Now transactions will be logged and replicated as separate entries, allowing transactions - to bigger and ensure replication progress. + to be bigger and also ensure replication progress. This change also affects the behavior of the `stop` method of the replication applier. 
If the replication applier is now stopped manually using the `stop` method and later diff --git a/Documentation/Books/Users/Aql/DataModification.mdpp b/Documentation/Books/Users/Aql/DataModification.mdpp index 4423ec47af..cc991f369b 100644 --- a/Documentation/Books/Users/Aql/DataModification.mdpp +++ b/Documentation/Books/Users/Aql/DataModification.mdpp @@ -9,14 +9,21 @@ data-modification operations: - REPLACE: completely replace existing documents in a collection - REMOVE: remove existing documents from a collection -Data-modification operations are normally combined with a *FOR* loop to -iterate over a given list of documents. It can optionally be combined with +Data-modification operations are normally combined with *FOR* loops to +iterate over a given list of documents. They can optionally be combined with *FILTER* statements and the like. FOR u IN users FILTER u.status == 'not active' UPDATE u WITH { status: 'inactive' } IN users +There is no need, though, to combine a data-modification query with other +AQL operations such as *FOR* and *FILTER*. For example, the following +stripped-down update query will work, too. It will update one document +(with key *foo*) in collection *users*: + + UPDATE "foo" WITH { status: 'inactive' } IN users + Data-modification queries are restricted to modifying data in a single collection per query. That means a data-modification query cannot modify data in multiple collections with a single query, though it is possible diff --git a/Documentation/Books/Users/NewFeatures/NewFeatures22.mdpp b/Documentation/Books/Users/NewFeatures/NewFeatures22.mdpp index 46dc45255a..33e9f4ef93 100644 --- a/Documentation/Books/Users/NewFeatures/NewFeatures22.mdpp +++ b/Documentation/Books/Users/NewFeatures/NewFeatures22.mdpp @@ -1,6 +1,235 @@ !CHAPTER Features and Improvements The following list shows in detail which features have been added or improved in -ArangoDB 2.2. ArangoDB 2.1 also contains several bugfixes that are not listed +ArangoDB 2.2. 
ArangoDB 2.2 also contains several bugfixes that are not listed here. +!SECTION AQL improvements + +!SUBSECTION Data modification AQL queries + +Up to and including version 2.1, AQL supported data retrieval operations only. +Starting with ArangoDB version 2.2, AQL also supports the following +data modification operations: + +- INSERT: insert new documents into a collection +- UPDATE: partially update existing documents in a collection +- REPLACE: completely replace existing documents in a collection +- REMOVE: remove existing documents from a collection + +Data-modification operations are normally combined with other AQL +statements such as *FOR* loops and *FILTER* conditions to determine +the set of documents to operate on. For example, the following query +will find all documents in collection *users* that match a specific +condition and set their *status* attribute to *inactive*: + + FOR u IN users + FILTER u.status == 'not active' + UPDATE u WITH { status: 'inactive' } IN users + +The following query copies all documents from collection *users* into +collection *backup*: + + FOR u IN users + INSERT u IN backup + +And this query removes documents from collection *backup*: + + FOR doc IN backup + FILTER doc.lastModified < DATE_NOW() - 3600 + REMOVE doc IN backup + +For more information on data-modification queries, please refer to +[Data modification queries](../Aql/DataModification.md). + +!SUBSECTION Updatable variables + +Previously, the value of a variable assigned in an AQL query with the `LET` keyword +was not updatable in an AQL query. This prevented statements like the following from +being executable: + + LET sum = 0 + FOR v IN values + SORT v.year + LET sum = sum + v.value + RETURN { year: v.year, value: v.value, sum: sum } + +!SUBSECTION Other AQL improvements + +* added AQL TRANSLATE function + + This function can be used to perform lookups from static lists, e.g. 
+ + LET countryNames = { US: "United States", UK: "United Kingdom", FR: "France" } + RETURN TRANSLATE("FR", countryNames) + + LET lookup = { foo: "foo-replacement", bar: "bar-replacement", baz: "baz-replacement" } + RETURN TRANSLATE("foobar", lookup, "not contained!") + + +!SECTION Write-ahead log + +All write operations in an ArangoDB server will now be automatically logged +in the server's write-ahead log. The write-ahead log is a set of append-only +logfiles, and it is used for crash recovery and for replication. + +Data from the write-ahead log will eventually be moved into the journals or +datafiles of collections, allowing the server to remove older write-ahead logfiles. + +Cross-collection transactions in ArangoDB should benefit considerably from this +change, as fewer writes than in previous versions are required to ensure the data +of multiple collections are atomically and durably committed. All data-modifying +operations inside transactions (insert, update, remove) will write their +operations into the write-ahead log directly now. In previous versions, such +operations were buffered until the commit or rollback occurred. Transactions with +multiple operations should therefore require less physical memory than in previous +versions of ArangoDB. + +The data in the write-ahead log can also be used in the replication context. In +previous versions of ArangoDB, replicating from a master required turning on a +special replication logger on the master. The replication logger caused an extra +write operation into the *_replication* system collection for each actual write +operation. This extra write is now superfluous. Instead, slaves can read directly +from the master's write-ahead log to learn about the most recent data changes. +This removes the need to store data-modification operations in the *_replication* +collection altogether. + +For the configuration of the write-ahead log, please refer to [Write-ahead log options](../ConfigureArango/Wal.md). 
+ +The introduction of the write-ahead log also removes the need to configure and +start the replication logger on a master. Though the replication logger object +is still available in ArangoDB 2.2 to ensure API compatibility, starting, stopping, +or configuring it will have no effect. + + +!SECTION Performance improvements + +* Removed sorting of attribute names in collection shaper + + In previous versions of ArangoDB, adding a document with previously unused + attribute names caused a full sort of all attribute names used in the + collection. The sorting was done to ensure fast comparisons of attribute + names in some rare edge cases, but it considerably slowed down inserts into + collections with many different or even unique attribute names. + +* Specialized primary index implementation to allow faster hash table + rebuilding and reduce lookups in datafiles for the actual value of `_key`. + This also reduces the number of random memory accesses for primary index inserts. + +* Reclamation of index memory when deleting last document in collection + + Previously, deleting documents from a collection did not lead to index sizes being reduced. + Instead, the index memory was kept allocated and re-used later when a collection + was refilled with new documents. Now, index memory of primary indexes and hash + indexes is reclaimed instantly when the last document in a collection is removed. + +* Prevent buffering of long print results in arangosh's and arangod's print + command + + This change will emit buffered intermediate print results and discard the + output buffer to quickly deliver print results to the user, and to prevent + constructing very large buffers for large results. + + +!SECTION Miscellaneous improvements + +* Added `insert` method as an alias for `save`. 
Documents can now be inserted into + a collection using either method: + + db.test.save({ foo: "bar" }); + db.test.insert({ foo: "bar" }); + +* Cleanup of options for data-modification operations + + Many of the data-modification operations had signatures with many optional + bool parameters, e.g.: + + db.test.update("foo", { bar: "baz" }, true, true, true) + db.test.replace("foo", { bar: "baz" }, true, true) + db.test.remove("foo", true, true) + db.test.save({ bar: "baz" }, true) + + Such long parameter lists were unintuitive and hard to use when only one of + the optional parameters should have been set. + + To make the APIs more usable, the operations now understand the following + alternative signature: + + collection.update(key, update-document, options) + collection.replace(key, replacement-document, options) + collection.remove(key, options) + collection.save(document, options) + + Examples: + + db.test.update("foo", { bar: "baz" }, { overwrite: true, keepNull: true, waitForSync: true }) + db.test.replace("foo", { bar: "baz" }, { overwrite: true, waitForSync: true }) + db.test.remove("foo", { overwrite: true, waitForSync: true }) + db.test.save({ bar: "baz" }, { waitForSync: true }) + +* Added `--overwrite` option to arangoimp + + This allows removing all documents in a collection before importing into it + using arangoimp. + +* Honor startup option `--server.disable-statistics` when deciding whether or not + to start periodic statistics collection jobs + + Previously, the statistics collection jobs were started even if the server was + started with the `--server.disable-statistics` flag being set to `true`. Now if + the option is set to `true`, no statistics will be collected on the server. + +* Disallow storing of JavaScript objects that contain JavaScript native objects + of type `Date`, `Function`, `RegExp` or `External`, e.g. 
+ + db.test.save({ foo: /bar/ }); + db.test.save({ foo: new Date() }); + + This will now print + + Error: cannot be converted into JSON shape: could not shape document + + Previously, objects of these types were silently converted into an empty object + (i.e. `{ }`) and no warning was issued. + + To store such objects in a collection, explicitly convert them into strings + like this: + + db.test.save({ foo: String(/bar/) }); + db.test.save({ foo: String(new Date()) }); + + +!SECTION Removed features + +!SUBSECTION MRuby integration for arangod + +ArangoDB had an experimental MRuby integration in some of the published builds. +The integration was not continuously developed, and so it has been removed in ArangoDB 2.2. + +This change has led to the following startup options being superfluous: + +- `--ruby.gc-interval` +- `--ruby.action-directory` +- `--ruby.modules-path` +- `--ruby.startup-directory` + +Specifying these startup options will do nothing in ArangoDB 2.2, so using these +options should be avoided from now on as they might be removed in a future version +of ArangoDB. + +!SUBSECTION Removed startup options + +The following startup options have been removed in ArangoDB 2.2. To make migration +easier, specifying them in the server's configuration file will not produce an +error. 
Still, usage of these options should be avoided as they will not have any +effect and might fully be removed in a future version of ArangoDB: + +- `--database.remove-on-drop` +- `--database.force-sync-properties` +- `--random.no-seed` +- `--ruby.gc-interval` +- `--ruby.action-directory` +- `--ruby.modules-path` +- `--ruby.startup-directory` +- `--server.disable-replication-logger` + diff --git a/Documentation/Books/Users/Upgrading/Upgrading22.mdpp b/Documentation/Books/Users/Upgrading/Upgrading22.mdpp index ecf3675495..06e7c60b94 100644 --- a/Documentation/Books/Users/Upgrading/Upgrading22.mdpp +++ b/Documentation/Books/Users/Upgrading/Upgrading22.mdpp @@ -1,4 +1,4 @@ -!CHAPTER Upgrading to ArangoDB 2.1 +!CHAPTER Upgrading to ArangoDB 2.2 Please read the following sections if you upgrade from a previous version to ArangoDB 2.2. @@ -7,4 +7,285 @@ Please note first that a database directory used with ArangoDB 2.2 cannot be used with earlier versions (e.g. ArangoDB 2.1) any more. Upgrading a database directory cannot be reverted. Therefore please make sure to create a full backup of your existing ArangoDB -installation before performing an upgrade. \ No newline at end of file +installation before performing an upgrade. + +!SECTION Database Directory Version Check and Upgrade + +ArangoDB will perform a database version check at startup. When ArangoDB 2.2 +encounters a database created with earlier versions of ArangoDB, it will refuse +to start. This is intentional. + +The output will then look like this: + +``` +2014-07-07T22:04:53Z [18675] ERROR In database '_system': Database directory version (2.1) is lower than server version (2.2). +2014-07-07T22:04:53Z [18675] ERROR In database '_system': ---------------------------------------------------------------------- +2014-07-07T22:04:53Z [18675] ERROR In database '_system': It seems like you have upgraded the ArangoDB binary. 
+2014-07-07T22:04:53Z [18675] ERROR In database '_system': If this is what you wanted to do, please restart with the +2014-07-07T22:04:53Z [18675] ERROR In database '_system': --upgrade +2014-07-07T22:04:53Z [18675] ERROR In database '_system': option to upgrade the data in the database directory. +2014-07-07T22:04:53Z [18675] ERROR In database '_system': Normally you can use the control script to upgrade your database +2014-07-07T22:04:53Z [18675] ERROR In database '_system': /etc/init.d/arangodb stop +2014-07-07T22:04:53Z [18675] ERROR In database '_system': /etc/init.d/arangodb upgrade +2014-07-07T22:04:53Z [18675] ERROR In database '_system': /etc/init.d/arangodb start +2014-07-07T22:04:53Z [18675] ERROR In database '_system': ---------------------------------------------------------------------- +2014-07-07T22:04:53Z [18675] FATAL Database version check failed for '_system'. Please start the server with the --upgrade option +``` + +To make ArangoDB 2.2 start with a database directory created with an earlier +ArangoDB version, you may need to invoke the upgrade procedure once. This can +be done by running ArangoDB from the command line and supplying the `--upgrade` +option: + + unix> arangod data --upgrade + +where `data` is ArangoDB's main data directory. + +Note: here the same database should be specified that is also specified when +arangod is started regularly. Please do not run the `--upgrade` command on each +individual database subfolder (named `database-`). + +For example, if you regularly start your ArangoDB server with + + unix> arangod mydatabasefolder + +then running + + unix> arangod mydatabasefolder --upgrade + +will perform the upgrade for the whole ArangoDB instance, including all of its +databases. + +Starting with `--upgrade` will run a database version check and perform any +necessary migrations. As usual, you should create a backup of your database +directory before performing the upgrade. 
+ +The output should look like this: +``` +2014-07-07T22:11:30Z [18867] INFO In database '_system': starting upgrade from version 2.1 to 2.2.0 +2014-07-07T22:11:30Z [18867] INFO In database '_system': Found 19 defined task(s), 2 task(s) to run +2014-07-07T22:11:30Z [18867] INFO In database '_system': upgrade successfully finished +2014-07-07T22:11:30Z [18867] INFO database upgrade passed +``` + +Please check the output of the `--upgrade` run. It may produce errors, which need +to be fixed before ArangoDB can be used properly. If no errors are present or +they have been resolved, you can start ArangoDB 2.2 regularly. + +!SECTION Upgrading a cluster planned in the web interface + +A cluster of ArangoDB instances has to be upgraded as well. This +involves upgrading all ArangoDB instances in the cluster, as well as +running the version check on the whole running cluster in the end. + +We have tried to make this procedure as painless and convenient as possible for you. +We assume that you planned, launched and administered a cluster using the +graphical front end in your browser. The upgrade procedure is then as +follows: + + 1. First shut down your cluster using the graphical front end as + usual. + + 2. Then upgrade all dispatcher instances on all machines in your + cluster using the version check as described above and restart them. + + 3. Now open the cluster dashboard in your browser by pointing it to + the same dispatcher that you used to plan and launch the cluster in + the graphical front end. In addition to the usual buttons + "Relaunch", "Edit cluster plan" and "Delete cluster plan" you will + see another button marked "Upgrade and relaunch cluster". + + 4. Hit this button. Your cluster will be upgraded and launched, and + everything is done for you behind the scenes. If all goes well, you will + see the usual cluster dashboard after a few seconds. If there is + an error, you have to inspect the log files of your cluster + ArangoDB instances. 
Please let us know if you run into problems. + +There is an alternative way using the `ArangoDB` shell. Instead of +steps 3. and 4. above you can launch `arangosh`, point it to the dispatcher +that you have used to plan and launch the cluster using the option +``--server.endpoint``, and execute + + arangosh> require("org/arangodb/cluster").Upgrade("root",""); + +This upgrades the cluster and launches it, exactly as with the button +above in the graphical front end. You have to replace `"root"` with +a user name and `""` with a password that is valid for authentication +with the cluster. + + +!SECTION Changed behavior + +!SUBSECTION Replication + +The *_replication* system collection is not used anymore in ArangoDB 2.2 because all +write operations will be logged in the write-ahead log. There is no need to additionally +log operations in the *_replication* system collection. Usage of the *_replication* +system collection in user scripts is discouraged. + +!SUBSECTION Replication logger + +The replication methods `logger.start`, `logger.stop` and `logger.properties` are +no-ops in ArangoDB 2.2 as there is no separate replication logger anymore. Data changes +are logged into the write-ahead log in ArangoDB 2.2, and need not be separately written +to the *_replication* system collection by the replication logger. + +The replication logger object is still there in ArangoDB 2.2 to ensure API +backwards-compatibility, however, starting, stopping or configuring the logger are +no-ops in ArangoDB 2.2. + +This change also affects the following HTTP API methods: +- `PUT /_api/replication/logger-start` +- `PUT /_api/replication/logger-stop` +- `GET /_api/replication/logger-config` +- `PUT /_api/replication/logger-config` + +The start and stop commands will do nothing, and retrieving the logger configuration +will return a dummy configuration. Setting the logger configuration does nothing and +will return the dummy configuration again. 
+ +Any user scripts that invoke the replication logger should be checked and adjusted +before performing the upgrade to 2.2. + +!SUBSECTION Replication of transactions + +Replication of transactions has changed in ArangoDB 2.2. Previously, transactions were +logged on the master in one big block and were shipped to a slave in one block, too. + +Now transaction operations will be logged and replicated as separate entries, allowing +transactions to be bigger and ensuring replication progress. + +This also means the replication format is not fully compatible between ArangoDB 2.2 +and previous versions. When upgrading a master-slave pair from ArangoDB 2.1 to 2.2, +please stop operations on the master first and make sure everything has been replicated +to the slave server. Then upgrade and restart both servers. + +!SUBSECTION Replication applier + +The change in transaction replication also affects the behavior of the *stop* method of the replication applier. +If the replication applier is now stopped manually using the *stop* method and later +restarted using the *start* method, any transactions that were unfinished at the +point of stopping will be aborted on a slave, even if they later commit on the master. + +In ArangoDB 2.2, stopping the replication applier manually should be avoided unless the +goal is to stop replication permanently or to do a full resync with the master anyway. +If the replication applier still must be stopped, make sure that the +slave has fetched and applied all pending operations from a master, and that no +extra transactions are started on the master before the `stop` command on the slave +is executed. + +Replication of transactions in ArangoDB 2.2 might also lock the involved collections on +the slave until a transaction is either committed or aborted on the master and the +change has been replicated to the slave. This change in behavior may be important for +slave servers that are used for read-scaling. 
In order to avoid long-lasting collection +locks on the slave, transactions should be kept small. + +Any user scripts that invoke the replication applier should be checked and adjusted +before performing the upgrade to 2.2. + +!SUBSECTION Collection figures + +The figures reported by the *collection.figures* method only reflect documents and +data contained in the journals and datafiles of collections. Documents or deletions +contained only in the write-ahead log will not influence collection figures until the +write-ahead log garbage collection kicks in and copies data over into the collections. + +The figures of a collection might therefore underreport the total resource usage of +a collection. + +Additionally, the attributes *lastTick* and *uncollectedLogfileEntries* have been +added to the figures. This also affects the HTTP API method *GET /_api/collection/{collection-name}/figures*. + +Any user scripts that process collection figures should be checked and adjusted +before performing the upgrade to 2.2. + +!SUBSECTION Storage of non-JSON attribute values + +Previous versions of ArangoDB allowed storing JavaScript native objects of type +`Date`, `Function`, `RegExp` or `External`, e.g. + + db.test.save({ foo: /bar/ }); + db.test.save({ foo: new Date() }); + +Objects of these types were silently converted into an empty object (`{ }`) when +being saved, and no warning was issued. This led to silent data loss. + +ArangoDB 2.2 changes this, and disallows storing JavaScript native objects of +the mentioned types. When this is attempted, the operation will now fail with the +following error: + + Error: cannot be converted into JSON shape: could not shape document + +To store such data in a collection, explicitly convert it into strings like so: + + db.test.save({ foo: String(/bar/) }); + db.test.save({ foo: String(new Date()) }); + +Please review your server-side data storage operation code (if any) before performing +the upgrade to 2.2. 
+ +!SUBSECTION AQL keywords + +The following keywords have been added to AQL in ArangoDB 2.2 to support +data modification queries: + +- *INSERT* +- *UPDATE* +- *REPLACE* +- *REMOVE* +- *WITH* + +Unquoted usage of these keywords for attribute names in AQL queries will likely +fail in ArangoDB 2.2. If any such attribute name needs to be used in a query, it +should be enclosed in backticks to indicate the usage of a literal attribute +name. + +For example, the following query will fail in ArangoDB 2.2 with a parse error: + + FOR i IN foo RETURN i.remove + +The attribute name *remove* needs to be quoted with backticks to indicate that +the literal *remove* is meant: + + FOR i IN foo RETURN i.`remove` + +Before upgrading to 2.2, please check whether any of your collections or queries use +any of the new keywords. + +!SECTION Removed features + +!SUBSECTION MRuby integration for arangod + +ArangoDB had an experimental MRuby integration in some of the published builds. +The integration was not continuously developed, and so it has been removed in ArangoDB 2.2. + +This change has led to the following startup options being superfluous: + +- `--ruby.gc-interval` +- `--ruby.action-directory` +- `--ruby.modules-path` +- `--ruby.startup-directory` + +Specifying these startup options will do nothing in ArangoDB 2.2, so using these +options should be avoided from now on as they might be removed in a future version +of ArangoDB. + +!SUBSECTION Removed startup options + +The following startup options have been removed in ArangoDB 2.2. To make migration +easier, specifying them in the server's configuration file will not produce an +error. 
Still, usage of these options should be avoided as they will not have any +effect and might fully be removed in a future version of ArangoDB: + +- `--database.remove-on-drop` +- `--database.force-sync-properties` +- `--random.no-seed` +- `--ruby.gc-interval` +- `--ruby.action-directory` +- `--ruby.modules-path` +- `--ruby.startup-directory` +- `--server.disable-replication-logger` + +Before upgrading to 2.2, please check your configuration files and adjust them so +no superfluous options are used. + diff --git a/UPGRADING b/UPGRADING deleted file mode 100644 index c412234e8d..0000000000 --- a/UPGRADING +++ /dev/null @@ -1 +0,0 @@ -Please refer to Documentation/Manual/Upgrading20.md