Merge branch 'devel' of https://github.com/arangodb/arangodb into devel

2016-08-05 09:42:00 +02:00 · 2016-08-05 09:42:00 +02:00 · aec762ef1b
parent bc042e858e 8f0b512794
commit aec762ef1b
8 changed files with 164 additions and 54 deletions
--- a/Documentation/Books/AQL/ExecutionAndPerformance/Optimizer.mdpp
+++ b/Documentation/Books/AQL/ExecutionAndPerformance/Optimizer.mdpp
@ -259,7 +259,6 @@ The following attributes will be returned in the `stats` attribute of an `explai
  indicate a plan was actually modified by a rule)
 - `rulesSkipped`: number of rules skipped by the optimizer

-
 !SUBSECTION Warnings

 For some queries, the optimizer may produce warnings. These will be returned in
@ -277,19 +276,26 @@ the `warnings` attribute of the `explain` result:
 There is an upper bound on the number of warning a query may produce. If that
 bound is reached, no further warnings will be returned.

+!SUBSECTION Optimization in a cluster

-!SUBSECTION Clustering
 When you're running AQL in the cluster, the parsing of the query is done on the
-Coordinator. The coordinator then chops the query into parts, which are to
+coordinator. The coordinator then chops the query into snippets, which are to
 remain on the coordinator, and others that are to be distributed over the network
 to the shards. The cutting sites are interconnected
 via *Scatter-*, *Gather-* and *RemoteNodes*.
-These nodes mark the network borders. The optimizer strives to reduce the amount
+
+These nodes mark the network borders of the snippets. The optimizer strives to reduce the amount
 of data transfered via these network interfaces by pushing `FILTER`s out to the shards,
-as its vital to the query performance to reduce that data amount to transfer over the
-network links. As usual the optimizer can only take certain assumptions for granted
-when doing so, I.e. [User defined functions have to be executed on the coordinator](../Extending/README.md).
-In doubt, you should modify your query to reduce the number of round trips.
+as it is vital to the query performance to reduce that data amount to transfer over the
+network links.
+
+Snippets marked with **DBS** are executed on the shards, **COOR** ones are executed on the coordinator.
+
+**As usual, the optimizer can only take certain assumptions for granted when doing so,
+i.e. [user-defined functions have to be executed on the coordinator](../Extending/README.md).
+If in doubt, you should modify your query to reduce the number of interconnections between your snippets.**
+
+When optimizing your query you may want to look at simpler parts of it first.

 !SUBSECTION List of execution nodes

--- a/Documentation/Books/AQL/Extending/README.mdpp
+++ b/Documentation/Books/AQL/Extending/README.mdpp
@ -1,42 +1,62 @@
 !CHAPTER Extending AQL with User Functions

-AQL comes with a built-in set of functions, but it is not a
-fully-featured programming language.
+AQL comes with a [built-in set of functions](../Functions/README.md), but it is
+not a fully-featured programming language.

-To add missing functionality or to simplify queries, users
-may add their own functions to AQL in the selected database. 
-These functions can be written in JavaScript, and have to be 
-registered via the API; see [Registering Functions](Functions.md).
+To add missing functionality or to simplify queries, users may add their own
+functions to AQL in the selected database. These functions are written in
+JavaScript, and are deployed via an API; see [Registering Functions](Functions.md).

-In order to avoid conflicts with existing or future built-in 
-function names, all user functions have to be put into separate
-namespaces. Invoking a user function is then possible by referring
-to the fully-qualified function name, which includes the namespace,
-too; see [Conventions](Conventions.md). 
+In order to avoid conflicts with existing or future built-in function names,
+all user defined functions (**UDF**) have to be put into separate namespaces.
+Invoking a UDF is then possible by referring to the fully-qualified function name,
+which includes the namespace, too; see [Conventions](Conventions.md).

 !SECTION Technical Details

-Internally, user-defined functions (UDF) are stored in a system collection named
-*_aqlfunctions* of the selected database. When an AQL statement refers to such a function,
-it is loaded from that collection. The functions will be exclusively
+!SUBSECTION Known Limitations
+
+**UDFs have some implications you should be aware of. Otherwise they can
+introduce serious effects on the performance of your queries and the resource
+usage in ArangoDB.**
+
+Since the optimizer doesn't know anything about the nature of your function,
+**the optimizer can't use indices for UDFs**. So you should never lean on a UDF
+as the primary criterion for a `FILTER` statement to reduce your query result set.
+Instead, put another `FILTER` statement in front of it. You should make sure
+that this [**`FILTER` statement** is effective](../ExecutionAndPerformance/Optimizer.md)
+to reduce the query result before passing it to your UDF.
+
+As a rule of thumb, the closer the UDF is to your final `RETURN` statement
+(or maybe even inside it), the better.
+
+When used in clusters, UDFs are always executed on the
+[coordinator](../../Manual/Scalability/Architecture.html).
+
+Using UDFs in clusters may result in a higher resource allocation
+in terms of used V8 contexts and server threads. If you run out 
+of these resources, your query may abort with a
+[**cluster backend unavailable**](../../Manual/Appendix/ErrorCodes.html) error.
+
+To overcome these mentioned limitations, you may want to
+[increase the number of available V8 contexts](../../Manual/Administration/Configuration/Arangod.html#v8-contexts)
+(at the expense of increased memory usage),
+and [the number of available server threads](../../Manual/Administration/Configuration/Arangod.html#server-threads).
+
+!SUBSECTION Deployment Details
+
+Internally, UDFs are stored in a system collection named `_aqlfunctions`
+of the selected database. When an AQL statement refers to such a UDF,
+it is loaded from that collection. The UDFs will be exclusively
 available for queries in that particular database.

-When used in clusters, the UDF is executed on the coordinator. Depending on your
-query layout, this may result in many documents having to be passed up from the
-DB-Servers to the Coordinator. To avoid this,
-[you should make sure that the query contains effective `FILTER` statements](../ExecutionAndPerformance/Optimizer.md)
-that can be used on the DB-Server side to reduce the query result
-before passing it up to the Coordinator and your UDF.
-
-Since the Coordinator doesn't have own local collections, the `_aqlfunctions`
+Since the coordinator doesn't have its own local collections, the `_aqlfunctions`
 collection is sharded across the cluster. Therefore (as usual), it has to be
-accessed through the coordinator - you mustn't talk to the DB-Servers directly.
-Once it is in there, it will be available on all coordinators.
-
-Since the optimizer doesn't know anything about this function, it won't be able
-to use indices for user defined functions.
+accessed through a coordinator - you mustn't talk to the shards directly.
+Once it is in the `_aqlfunctions` collection, it is available on all
+coordinators without additional effort.

 Keep in mind that system collections are excluded from dumps created with
 [arangodump](../../Manual/Administration/Arangodump.html) by default.
-To include AQL user functions in a dump, the dump needs to be started with
+To include AQL UDFs in a dump, the dump needs to be started with
 the option *--include-system-collections true*.
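
A minimal sketch of the pattern the "Known Limitations" hunk above recommends: narrow the result with a plain `FILTER` first and apply the UDF as late as possible, ideally inside `RETURN`. The collection `temps`, the attributes `city` and `celsius`, and the bind parameter `@city` are hypothetical. The registration call is shown only as a comment and assumes arangosh's `@arangodb/aql/functions` module.

```javascript
// Hypothetical UDF body: converts a temperature from Celsius to Fahrenheit.
function celsiusToFahrenheit(celsius) {
  return celsius * 9 / 5 + 32;
}

// In arangosh the function would be registered under a namespace, e.g.:
//   require("@arangodb/aql/functions")
//     .register("MYFUNCTIONS::TEMPERATURE::CELSIUSTOFAHRENHEIT", celsiusToFahrenheit);

// Hypothetical query: the plain FILTER reduces the result set first (and is
// the part the optimizer may serve from an index); the UDF only runs on the
// documents that survive it, inside RETURN.
const query = `
  FOR doc IN temps
    FILTER doc.city == @city
    RETURN MYFUNCTIONS::TEMPERATURE::CELSIUSTOFAHRENHEIT(doc.celsius)
`;
```

Keeping the UDF body a standalone function makes it testable outside the database before it is deployed.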
--- a/Documentation/Books/AQL/Functions/Document.mdpp
+++ b/Documentation/Books/AQL/Functions/Document.mdpp
@ -72,13 +72,24 @@ from similar ways to test for the existance of an attribute, in case the attribu
 has a falsy value or is not present (implicitly *null* on object access):

 ```js
-!!{ name: "" }.name // false
+!!{ name: "" }.name        // false
 HAS( { name: "" }, "name") // true

-{ name: null }.name == null // true
-{ }.name == null // true
+{ name: null }.name == null   // true
+{ }.name == null              // true
 HAS( { name: null }, "name" ) // true
-HAS( { }, "name" ) // false
+HAS( { }, "name" )            // false
+```
+
+Note that `HAS()` can not utilize indexes. If it's not necessary to distinguish
+between explicit and implicit *null* values in your query, you may use an equality
+comparison to test for *null* and create a non-sparse index on the attribute you
+want to test against:
+
+```js
+FILTER !HAS(doc, "name")    // can not use indexes
+FILTER IS_NULL(doc.name)    // can not use indexes
+FILTER doc.name == null     // can utilize non-sparse indexes
 ```

 !SUBSECTION IS_SAME_COLLECTION()
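
The difference between `HAS()` and an equality comparison against *null*, as documented in the hunk above, can be modeled in plain JavaScript. This is an illustrative sketch of the semantics only, not ArangoDB code; the helper names `has` and `attributeIsNull` are hypothetical.

```javascript
// has() mimics AQL's HAS(): true whenever the attribute exists,
// regardless of its value (even null or an empty string).
function has(doc, attr) {
  return Object.prototype.hasOwnProperty.call(doc, attr);
}

// attributeIsNull() mimics `doc.attr == null` in AQL, where accessing a
// missing attribute implicitly yields null.
function attributeIsNull(doc, attr) {
  const value = has(doc, attr) ? doc[attr] : null;
  return value === null;
}
```

Only `has()` can tell an explicit *null* apart from a missing attribute; the equality comparison treats both cases the same, which is why it is the form that can be answered from a non-sparse index.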
--- a/Documentation/Books/AQL/Functions/README.mdpp
+++ b/Documentation/Books/AQL/Functions/README.mdpp
@ -28,5 +28,5 @@ i.e. *LENGTH(foo)* and *length(foo)* are equivalent.
 !SUBSECTION Extending AQL
 
 It is possible to extend AQL with user-defined functions. These functions need to
-be written in JavaScript, and have to be registered before usaing them in a query.
+be written in JavaScript, and have to be registered before they can be used in a query.
 Please refer to [Extending AQL](../Extending/index.html) for more details.
--- a/Documentation/Books/AQL/Functions/TypeCast.mdpp
+++ b/Documentation/Books/AQL/Functions/TypeCast.mdpp
@ -143,7 +143,8 @@ checked for, and false otherwise.

 The following type check functions are available:

- `IS_NULL(value) → bool`: Check whether *value* is a *null* value
+- `IS_NULL(value) → bool`: Check whether *value* is a *null* value, also see
+  [HAS()](Document.md#has)

 - `IS_BOOL(value) → bool`: Check whether *value* is a *boolean* value

--- a/Documentation/Books/HTTP/AqlUserFunctions/README.mdpp
+++ b/Documentation/Books/HTTP/AqlUserFunctions/README.mdpp
@ -6,8 +6,8 @@ This is an introduction to ArangoDB's HTTP interface for managing AQL
 user functions. AQL user functions are a means to extend the functionality
 of ArangoDB's query language (AQL) with user-defined JavaScript code.
 
-For an overview of how AQL user functions work, please refer to
-[Extending AQL](../../AQL/Extending/index.html).
+For an overview of how AQL user functions work and what their implications are, please refer to
+the [Extending AQL](../../AQL/Extending/index.html) chapter.

 The HTTP interface provides an API for adding, deleting, and listing
 previously registered AQL user functions.
--- a/Documentation/Books/Manual/ReleaseNotes/UpgradingChanges30.mdpp
+++ b/Documentation/Books/Manual/ReleaseNotes/UpgradingChanges30.mdpp
@ -122,12 +122,12 @@ are missing from the replacement document, an `REPLACE` operation will fail.

 !SUBSUBSECTION Graph functions

-In version 3.0 all former graph related functions have been removed from AQL to be replaced
-by native AQL constructs.
+In version 3.0 all former graph related functions have been removed from AQL to
+be replaced by [native AQL constructs](../../AQL/Graphs/index.html).
 These constructs allow for more fine-grained filtering on several graph levels.
 Also this allows the AQL optimizer to automatically improve these queries by
 enhancing them with appropriate indexes.
-We have create recipes to upgrade from 2.8 to 3.0 when using these functions.
+We have created recipes to upgrade from 2.8 to 3.0 when using these functions.

 The functions:

@ -161,7 +161,7 @@ are covered in [Migrating GRAPH_* Measurements from 2.8 or earlier to 3.0](https
 * TRAVERSAL
 * TRAVERSAL_TREE

-are covered in  [#Migrating anonymous graph Functions from 2.8 or earlier to 3.0](https://docs.arangodb.com/cookbook/AQL/MigratingEdgeFunctionsTo3.html)
+are covered in [Migrating anonymous graph functions from 2.8 or earlier to 3.0](https://docs.arangodb.com/3/cookbook/AQL/MigratingEdgeFunctionsTo3.html)

 !SUBSECTION Typecasting functions

--- a/lib/V8/v8-utils.cpp
+++ b/lib/V8/v8-utils.cpp
@ -61,16 +61,18 @@
 #include "V8/v8-globals.h"
 #include "V8/v8-vpack.h"

+#include <velocypack/Builder.h>
+#include <velocypack/Slice.h>
+#include <velocypack/Validator.h>
+#include <velocypack/velocypack-aliases.h>
+
 using namespace arangodb;
 using namespace arangodb::application_features;
 using namespace arangodb::basics;
 using namespace arangodb::httpclient;
 using namespace arangodb::rest;

-////////////////////////////////////////////////////////////////////////////////
 /// @brief Random string generators
-////////////////////////////////////////////////////////////////////////////////
-
 namespace {
 static UniformCharacter JSAlphaNumGenerator(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");
@ -80,10 +82,7 @@ static UniformCharacter JSSaltGenerator(
    "[]:;<>,.?/|");
 }

-////////////////////////////////////////////////////////////////////////////////
 /// @brief Converts an object to a UTF-8-encoded and normalized character array.
-////////////////////////////////////////////////////////////////////////////////
-
 TRI_Utf8ValueNFC::TRI_Utf8ValueNFC(TRI_memory_zone_t* memoryZone,
                                   v8::Handle<v8::Value> const obj)
    : _str(nullptr), _length(0), _memoryZone(memoryZone) {
@ -3667,6 +3666,73 @@ static void JS_IsStopping(v8::FunctionCallbackInfo<v8::Value> const& args) {
  TRI_V8_TRY_CATCH_END
 }

+/// @brief convert a V8 value to VPack
+static void JS_V8ToVPack(v8::FunctionCallbackInfo<v8::Value> const& args) {
+  TRI_V8_TRY_CATCH_BEGIN(isolate);
+  v8::HandleScope scope(isolate);
+
+  // extract the argument
+  if (args.Length() != 1) {
+    TRI_V8_THROW_EXCEPTION_USAGE("V8_TO_VPACK(value)");
+  }
+
+  VPackBuilder builder;
+  int res = TRI_V8ToVPack(isolate, builder, args[0], false);
+
+  if (res != TRI_ERROR_NO_ERROR) {
+    TRI_V8_THROW_EXCEPTION(res);
+  }
+
+  VPackSlice slice = builder.slice();
+              
+  V8Buffer* buffer = V8Buffer::New(isolate, slice.startAs<char const>(), slice.byteSize());
+  v8::Local<v8::Object> bufferObject = v8::Local<v8::Object>::New(isolate, buffer->_handle);
+  TRI_V8_RETURN(bufferObject);
+  
+  TRI_V8_RETURN_FALSE();
+  TRI_V8_TRY_CATCH_END
+}
+
+/// @brief convert a VPack value to V8
+static void JS_VPackToV8(v8::FunctionCallbackInfo<v8::Value> const& args) {
+  TRI_V8_TRY_CATCH_BEGIN(isolate);
+  v8::HandleScope scope(isolate);
+
+  // extract the argument
+  if (args.Length() != 1) {
+    TRI_V8_THROW_EXCEPTION_USAGE("VPACK_TO_V8(value)");
+  }
+
+  if (args[0]->IsString() || args[0]->IsStringObject()) {
+    // supplied argument is a string
+    std::string const value = TRI_ObjectToString(args[0]);
+    
+    VPackValidator validator;
+    validator.validate(value.c_str(), value.size(), false); 
+
+    VPackSlice slice(value.c_str());
+    v8::Handle<v8::Value> result = TRI_VPackToV8(isolate, slice);
+    TRI_V8_RETURN(result);
+  } else if (args[0]->IsObject() && V8Buffer::hasInstance(isolate, args[0])) {
+    // argument is a buffer
+    char const* data = V8Buffer::data(args[0].As<v8::Object>());
+    size_t size = V8Buffer::length(args[0].As<v8::Object>());
+
+    VPackValidator validator;
+    validator.validate(data, size, false); 
+
+    VPackSlice slice(data);
+    v8::Handle<v8::Value> result = TRI_VPackToV8(isolate, slice);
+
+    TRI_V8_RETURN(result);
+  } else {
+    TRI_V8_THROW_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, "invalid argument type for VPACK_TO_V8()");
+  }
+  
+  TRI_V8_RETURN_FALSE();
+  TRI_V8_TRY_CATCH_END
+}
+
 ////////////////////////////////////////////////////////////////////////////////
 /// @brief ArangoError
 ////////////////////////////////////////////////////////////////////////////////
@ -4433,6 +4499,12 @@ void TRI_InitV8Utils(v8::Isolate* isolate, v8::Handle<v8::Context> context,

  TRI_AddGlobalFunctionVocbase(
      isolate, context, TRI_V8_ASCII_STRING("SYS_IS_STOPPING"), JS_IsStopping);
+  
+  TRI_AddGlobalFunctionVocbase(
+      isolate, context, TRI_V8_ASCII_STRING("V8_TO_VPACK"), JS_V8ToVPack);
+  
+  TRI_AddGlobalFunctionVocbase(
+      isolate, context, TRI_V8_ASCII_STRING("VPACK_TO_V8"), JS_VPackToV8);

  // .............................................................................
  // create the global variables
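
A hedged usage sketch for the two globals registered in this hunk. It assumes the code runs inside an ArangoDB V8 context where `V8_TO_VPACK` and `VPACK_TO_V8` are available; the wrapper `vpackRoundTrip` is a hypothetical helper, and outside such a context it only reports that the globals are missing.

```javascript
// Round-trips a JavaScript value through VelocyPack and back.
// Per the C++ above, V8_TO_VPACK(value) returns a buffer object with the
// encoded bytes, and VPACK_TO_V8(value) accepts a string or a buffer.
function vpackRoundTrip(value) {
  if (typeof V8_TO_VPACK !== "function" || typeof VPACK_TO_V8 !== "function") {
    // Not running inside an ArangoDB V8 context: nothing to convert with.
    return { available: false, result: undefined };
  }
  const buffer = V8_TO_VPACK(value);
  return { available: true, result: VPACK_TO_V8(buffer) };
}
```

Calling `vpackRoundTrip({ a: 1 })` in such a context would be expected to hand back an equivalent object in `result`; this has not been verified against a running server.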