Doc - Updated arangosearch doc (issue 442) (PR #6130)

2018-08-21 12:14:31 +03:00 · 2018-08-21 12:14:31 +03:00 · aeadad0790
parent cb12be3e4e
commit aeadad0790
6 changed files with 387 additions and 244 deletions
--- a/Documentation/Books/Manual/SUMMARY.md
+++ b/Documentation/Books/Manual/SUMMARY.md
@ -123,7 +123,10 @@
  * [Working with Edges](Graphs/Edges/README.md)
  * [Pregel](Graphs/Pregel/README.md)
 * [ArangoSearch Views](Views/ArangoSearch/README.md)
+  * [Getting Started](Views/ArangoSearch/GettingStarted.md)
+  * [Detailed Overview](Views/ArangoSearch/DetailedOverview.md)
  * [Analyzers](Views/ArangoSearch/Analyzers.md)
+  * [Scorers](Views/ArangoSearch/Scorers.md)

 ## ADVANCED TOPICS

--- a/Documentation/Books/Manual/Views/ArangoSearch/Analyzers.md
+++ b/Documentation/Books/Manual/Views/ArangoSearch/Analyzers.md
@ -1,13 +1,14 @@
-### Analyzers:
+ArangoSearch Analyzers
+======================

 To simplify query syntax ArangoSearch provides a concept of named analyzers which
 are merely aliases for type+configuration of IResearch analyzers. Management of
-named analyzers is exposed via both REST, GUI and JavaScript APIs, e.g.
+named analyzers is exposed via REST, GUI and JavaScript APIs, e.g.

 `db._globalSettings("iresearch.analyzers")`

 A user then merely uses these analyzer names in ArangoSearch view configurations
-and AQL queries, e.g.
+and AQL queries.

 ArangoSearch provides a 'text' analyzer to analyze human readable text. A required
 configuration parameter for this type of analyzer is 'locale' used to specify
@ -27,7 +28,7 @@ The ArangoDB administrator may then set up a named analyzer 'text_des':

 The user is then immediately able to run queries with the said analyzer, e.g.

-`FILTER doc.description IN TOKENS('Ein brauner Fuchs springt', 'text_des')`
+`SEARCH doc.description IN TOKENS('Ein brauner Fuchs springt', 'text_des')`

 Similarly an administrator may choose to deploy a custom DNA analyzer 'DnaSeq':

@ -41,7 +42,7 @@ Similarly an administrator may choose to deploy a custom DNA analyzer 'DnaSeq':

 The user is then immediately able to run queries with the said analyzer, e.g.

-`FILTER doc.dna IN TOKENS('ACGTCGTATGCACTGA', 'DnaSeq')`
+`SEARCH doc.dna IN TOKENS('ACGTCGTATGCACTGA', 'DnaSeq')`

 To a limited degree the concept of 'analysis' is even available in non-IResearch
 AQL, e.g. the `TOKENS(...)` function will utilize the power of IResearch to break
@ -53,9 +54,9 @@ e.g. to match docs with 'word == quick' OR 'word == brown' OR 'word == fox'

    FOR doc IN someCollection
      FILTER doc.word IN TOKENS('a quick brown fox', 'text_en')
-      RETRUN doc
+      RETURN doc

-Runtime-plugging functionality for analyzers is not avaiable in ArangoDB at this
+Runtime-plugging functionality for analyzers is not available in ArangoDB at this
 point in time, so ArangoDB comes with a few default-initialized analyzers:

 * `identity`
--- a/Documentation/Books/Manual/Views/ArangoSearch/DetailedOverview.md
+++ b/Documentation/Books/Manual/Views/ArangoSearch/DetailedOverview.md
@ -0,0 +1,169 @@
+# Detailed overview of ArangoSearch views
+
+ArangoSearch is a fulltext search component with additional functionality.
+Fulltext search is supported via the 'text' analyzer and the 'tfidf'/'bm25'
+[scorers](Scorers.md), without impact on performance when specifying documents
+from different collections or filtering on multiple document attributes.
+
+## View datasource
+
+The IResearch functionality is exposed to ArangoDB via the ArangoSearch view
+API because the ArangoSearch view is merely an identity transformation applied
+onto documents stored in linked collections of the same ArangoDB database.
+In plain terms an ArangoSearch view only allows filtering and sorting of documents
+located in collections of the same database. The matching documents themselves
+are returned as-is from their corresponding collections.
+
+## Links to ArangoDB collections
+
+A concept of an ArangoDB collection 'link' is introduced to allow specifying
+which ArangoDB collections a given ArangoSearch View should query for documents
+and how these documents should be queried.
+
+An ArangoSearch Link is a uni-directional connection from an ArangoDB collection
+to an ArangoSearch view describing how data coming from the said collection should
+be made available in the given view. Each ArangoSearch Link in an ArangoSearch
+view is uniquely identified by the name of the ArangoDB collection it links to.
+An ArangoSearch view may have zero or more links, each to a distinct ArangoDB
+collection. Similarly an ArangoDB collection may be referenced via links by zero
+or more distinct ArangoSearch views. In plain terms any given ArangoSearch view
+may be linked to any given ArangoDB collection of the same database with zero or
+at most one link. However, any ArangoSearch view may be linked to multiple
+distinct ArangoDB collections and similarly any ArangoDB collection may be
+referenced by multiple ArangoSearch views.
+
+To configure an ArangoSearch view for consideration of documents from a given
+ArangoDB collection a link definition must be added to the properties of the
+said ArangoSearch view defining the link parameters as per the section
+[View definition/modification](#view-definitionmodification).
+
+## Analyzers
+
+To simplify query syntax ArangoSearch provides a concept of 
+[named analyzers](Analyzers.md) which are merely aliases for
+type+configuration of IResearch analyzers. Management of named analyzers
+is exposed via REST, GUI and JavaScript APIs.
+
+## View definition/modification
+
+An ArangoSearch view is configured via an object containing a set of
+view-specific configuration directives and a map of link-specific configuration
+directives.
+
+During view creation the following directives apply:
+
+* id _(optional)_: the desired view identifier
+* name _(required)_: the view name
+* type _(required)_: the value "arangosearch"
+* any of the directives from the section [View properties](#view-properties-updatable)
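+
+As an illustration, a creation definition combining the directives listed
+above might look like the sketch below. The view name `ExampleView` is borrowed
+from the Getting Started guide and the values are simply the documented
+defaults; the exact nesting of the payload is an assumption based on this page
+and has not been verified against a running server.
+
+```json
+{
+  "name": "ExampleView",
+  "type": "arangosearch",
+  "locale": "C",
+  "commit": {
+    "cleanupIntervalStep": 10,
+    "commitIntervalMsec": 60000
+  }
+}
+```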
+
+During view modification the following directives apply:
+
+* links _(optional)_:
+  a mapping of collection-name/collection-identifier to one of:
+  * link creation - link definition as per the section [Link properties](#link-properties)
+  * link removal - JSON keyword *null* (i.e. nullify a link if present)
+* any of the directives from the section [modifiable view properties](#view-properties-updatable)
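+
+A modification that touches links could be sketched as follows. The collection
+names are the hypothetical ones used in the Getting Started guide: the first
+entry creates a link with a minimal link definition, the second removes an
+existing link by setting it to *null*. Treat this as an example of the shape
+described above rather than a verified request body.
+
+```json
+{
+  "links": {
+    "ExampleCollection0": {
+      "includeAllFields": true
+    },
+    "ExampleCollection1": null
+  }
+}
+```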
+
+## View properties (non-updatable)
+
+* **locale** (_optional_, default: `C`)<br/>
+  the default locale used for ordering processed attribute names
+
+## View properties (updatable)
+
+* **commit** (_optional_, default: use defaults for all values)<br/>
+  configure ArangoSearch View commit policy for single-item inserts/removals,
+  e.g. when adding or removing documents from a linked ArangoDB collection
+
+  * **cleanupIntervalStep** (_optional_, default: `10`; to disable use: `0`)<br/>
+    wait at least this many commits between removing unused files in the
+    ArangoSearch data directory.
+    For the case where the consolidation policies merge segments often (i.e. a
+    lot of commit+consolidate), a lower value will cause a lot of disk space to
+    be wasted.
+    For the case where the consolidation policies rarely merge segments (i.e.
+    few inserts/deletes), a higher value will impact performance without any
+    added benefits.
+
+  * **commitIntervalMsec** (_optional_, default: `60000`; to disable use: `0`)<br/>
+    wait at least this many milliseconds between committing view data store
+    changes and making documents visible to queries.
+    For the case where there are a lot of inserts/updates, a lower value will
+    cause the view not to account for them (until commit), and memory usage
+    would continue to grow.
+    For the case where there are a few inserts/updates, a higher value will
+    impact performance and waste disk space for each commit call without any
+    added benefits.
+
+  * **consolidate** (_optional_, default: `none`)<br/>
+    a per-policy mapping of thresholds in the range `[0.0, 1.0]` to determine data
+    store segment merge candidates; if specified, only the listed policies
+    are used. Keys are any of:
+
+    * **bytes** (_optional_, for default values use an empty object: `{}`)
+
+      * **segmentThreshold** (_optional_, default: `300`; to disable use: `0`)<br/>
+        apply consolidation policy IFF {segmentThreshold} >= #segments
+
+      * **threshold** (_optional_, default: `0.85`)<br/>
+        consolidate `IFF {threshold} > segment_bytes / (all_segment_bytes / #segments)`
+
+    * **bytes_accum** (_optional_, for default values use: `{}`)
+
+      * **segmentThreshold** (_optional_, default: `300`; to disable use: `0`)<br/>
+        apply consolidation policy IFF {segmentThreshold} >= #segments
+
+      * **threshold** (_optional_, default: `0.85`)<br/>
+        consolidate `IFF {threshold} > (segment_bytes + sum_of_merge_candidate_segment_bytes) / all_segment_bytes`
+
+    * **count** (_optional_, for default values use: `{}`)
+
+      * **segmentThreshold** (_optional_, default: `300`; to disable use: `0`)<br/>
+        apply consolidation policy IFF {segmentThreshold} >= #segments
+
+      * **threshold** (_optional_, default: `0.85`)<br/>
+        consolidate `IFF {threshold} > segment_docs{valid} / (all_segment_docs{valid} / #segments)`
+
+    * **fill** (_optional_, for default values use: `{}`)
+
+      * **segmentThreshold** (_optional_, default: `300`; to disable use: `0`)<br/>
+        apply consolidation policy IFF {segmentThreshold} >= #segments
+
+      * **threshold** (_optional_, default: `0.85`)<br/>
+        consolidate `IFF {threshold} > #segment_docs{valid} / (#segment_docs{valid} + #segment_docs{removed})`
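+
+Putting the updatable properties together, a `commit` definition that enables
+two consolidation policies might look like the sketch below. All keys and
+values are taken from the list above (the `fill` policy uses `{}` to request
+its defaults); the nesting under `commit` follows the structure of this
+section and should be treated as an assumption.
+
+```json
+{
+  "commit": {
+    "cleanupIntervalStep": 10,
+    "commitIntervalMsec": 60000,
+    "consolidate": {
+      "bytes": { "segmentThreshold": 300, "threshold": 0.85 },
+      "fill": {}
+    }
+  }
+}
+```
+
+Reading the formulas literally, consider a hypothetical data store with three
+segments of 10, 20 and 70 MB. The average segment size is 100 / 3 ≈ 33.3 MB, so
+the `bytes` ratios are 0.3, 0.6 and 2.1; with a threshold of `0.85` only the
+first two segments satisfy `threshold > ratio` and become merge candidates.
+For `fill`, a segment holding 60 valid and 40 removed documents has a ratio of
+60 / (60 + 40) = 0.6, which is also below `0.85`.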
+
+## Link properties
+
+* **analyzers** (_optional_, default: `[ 'identity' ]`)<br/>
+  a list of analyzers, by name as defined via the [Analyzers](Analyzers.md), that
+  should be applied to values of processed document attributes
+
+* **fields** (_optional_, default: `{}`)<br/>
+  an object `{attribute-name: [Link properties]}` of fields that should be
+  processed at each level of the document.
+  Each key specifies the document attribute to be processed; the value of
+  *includeAllFields* is also consulted when selecting fields to be processed.
+  Each value specifies the [Link properties](#link-properties) directives to be used when
+  processing the specified field; a Link properties value of `{}` denotes
+  inheritance of all (except *fields*) directives from the current level
+
+* **includeAllFields** (_optional_, default: `false`)<br/>
+  if true then process all document attributes (if not explicitly specified
+  then process the fields with default Link properties directives, i.e. `{}`),
+  otherwise only consider attributes mentioned in *fields*
+
+* **trackListPositions** (_optional_, default: `false`)<br/>
+  if true then for array values track the value position in the array, e.g. when
+  querying for the input: `{ attr: [ 'valueX', 'valueY', 'valueZ' ] }`
+  the user must specify: `doc.attr[1] == 'valueY'`;
+  otherwise all values in an array are treated as equal alternatives, e.g. when
+  querying for the input: `{ attr: [ 'valueX', 'valueY', 'valueZ' ] }`
+  the user must specify: `doc.attr == 'valueY'`
+
+* **storeValues** (_optional_, default: `"none"`)<br/>
+  how the view should track the attribute values; this setting allows for
+  additional value retrieval optimizations. One of:
+  * none: Do not store values in the view
+  * id: Store only information about value presence, to allow use of the EXISTS() function
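+
+For illustration, a single link definition using the properties above could be
+sketched as follows. The attribute names `name` and `text` and the `text_en`
+analyzer come from the Getting Started guide; everything else uses values
+documented in this section. This is a sketch of the described structure, not a
+verified configuration.
+
+```json
+{
+  "analyzers": ["identity"],
+  "includeAllFields": false,
+  "trackListPositions": false,
+  "storeValues": "id",
+  "fields": {
+    "name": { "analyzers": ["text_en"] },
+    "text": {}
+  }
+}
+```
+
+Here only `name` and `text` are processed because *includeAllFields* is
+`false`. The `name` attribute overrides the analyzer list, while `text` uses
+`{}` and therefore inherits all directives (except *fields*) from the enclosing
+level. Setting *storeValues* to `"id"` is what allows the EXISTS() function to
+be used.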
--- a/Documentation/Books/Manual/Views/ArangoSearch/GettingStarted.md
+++ b/Documentation/Books/Manual/Views/ArangoSearch/GettingStarted.md
@ -0,0 +1,126 @@
+# Getting started with ArangoSearch views
+
+## The DDL configuration
+
+[DDL](https://en.wikipedia.org/wiki/Data_definition_language) is a data
+definition language or data description language for defining data structures,
+especially database schemas.
+
+All DDL operations on Views can be done via JavaScript or REST calls. The DDL
+syntax follows the well established ArangoDB guidelines and thus is very
+similar between JavaScript and REST. This article uses the JavaScript syntax.
+
+Assume the following collections were initially defined in a database using
+the following commands:
+
+```js
+c0 = db._create("ExampleCollection0");
+c1 = db._create("ExampleCollection1");
+ 
+c0.save({ i: 0, name: "full", text: "是一个 多模 型数 据库" });
+c0.save({ i: 1, name: "half", text: "是一个 多模" });
+c0.save({ i: 2, name: "other half", text: "型数 据库" });
+c0.save({ i: 3, name: "quarter", text: "是一" });
+ 
+c1.save({ a: "foo", b: "bar", i: 4 });
+c1.save({ a: "foo", b: "baz", i: 5 });
+c1.save({ a: "bar", b: "foo", i: 6 });
+c1.save({ a: "baz", b: "foo", i: 7 });
+```
+
+## Creating a View (with default parameters)
+
+```js
+v0 = db._createView("ExampleView", "arangosearch", {});
+```
+
+## Linking created View with a collection and adding indexing parameters
+
+```js
+v0 = db._view("ExampleView");
+v0.properties({
+    links: {
+      'ExampleCollection0': /* collection Link 0 with additional custom configuration */
+      {
+        includeAllFields: true, /* examine all fields using default configuration */
+        fields:
+        {
+          name: /* a field to apply custom configuration that will index English text */
+          {
+            analyzers: ["text_en"]
+          },
+          text: /* another field to apply custom configuration that will index Chinese text */
+          {
+            analyzers: ["text_zh"]
+          }
+        }
+      },
+      'ExampleCollection1': /* collection Link 1 with custom configuration */
+      {
+        includeAllFields: true, /* examine all fields using default configuration */
+        fields:
+        {
+          a:
+          {
+            analyzers: ["text_en"] /* a field to apply custom configuration that will index English text */
+          }
+        }
+      }
+    }
+  }
+);
+```
+
+## Query data using created View with linked collections
+
+```js
+db._query(`FOR doc IN ExampleView
+  SEARCH PHRASE(doc.text, '型数 据库', 'text_zh') OR STARTS_WITH(doc.b, 'ba')
+  SORT TFIDF(doc) DESC
+  RETURN doc`);
+```
+
+## Examine query result
+
+The result of the query above includes all documents from both linked
+collections that contain the phrase `型数 据库` in Chinese anywhere in the
+`text` property, or whose `b` property starts with `ba`. Additionally, the
+results are sorted in descending order of their
+[TFIDF](https://en.wikipedia.org/wiki/TF-IDF) score:
+
+```json
+[
+  {
+    "_key" : "120",
+    "_id" : "ExampleCollection0/120",
+    "_rev" : "_XPoMzCi--_",
+    "i" : 0,
+    "name" : "full",
+    "text" : "是一个 多模 型数 据库"
+  },
+  {
+    "_key" : "124",
+    "_id" : "ExampleCollection0/124",
+    "_rev" : "_XPoMzCq--_",
+    "i" : 2,
+    "name" : "other half",
+    "text" : "型数 据库"
+  },
+  {
+    "_key" : "128",
+    "_id" : "ExampleCollection1/128",
+    "_rev" : "_XPoMzCu--_",
+    "a" : "foo",
+    "b" : "bar",
+    "i" : 4
+  },
+  {
+    "_key" : "130",
+    "_id" : "ExampleCollection1/130",
+    "_rev" : "_XPoMzCy--_",
+    "a" : "foo",
+    "b" : "baz",
+    "i" : 5
+  }
+]
+```
--- a/Documentation/Books/Manual/Views/ArangoSearch/README.md
+++ b/Documentation/Books/Manual/Views/ArangoSearch/README.md
@ -1,259 +1,45 @@
-# Bringing the power of IResearch to ArangoDB
+# ArangoSearch views powered by IResearch

-## What is ArangoSearch
+ArangoSearch is a natively integrated AQL extension making use of the
+IResearch library.

-ArangoSearch is a natively integrated AQL extension making use of the IResearch library.
+ArangoSearch allows one to:

-Arangosearch allows one to:
 * join documents located in different collections to one result list
-* search documents based on AQL boolean expressions and functions
-* sort the result set based on how closely each document matched the search condition
+* filter documents based on AQL boolean expressions and functions
+* sort the result set based on how closely each document matched the filter

 A concept of value 'analysis' that is meant to break up a given value into
 a set of sub-values internally tied together by metadata which influences both
-the search and sort stages to provide the most appropriate match for the
+the filter and sort stages to provide the most appropriate match for the
 specified conditions, similar to queries to web search engines.

 In plain terms this means a user can for example:
+
 * request documents where the 'body' attribute best matches 'a quick brown fox'
 * request documents where the 'dna' attribute best matches a DNA sub sequence
 * request documents where the 'name' attribute best matches gender
-* etc... (via custom analyzers described in the next section)
+* etc. (via custom analyzers)

-### The IResearch Library
+## The IResearch Library

-IResearch s a cross-platform open source indexing and searching engine written in C++,
-optimized for speed and memory footprint, with source available from:
-https://github.com/iresearch-toolkit/iresearch
+IResearch is a cross-platform open source indexing and searching engine written
+in modern C++, optimized for speed and memory footprint, with source available
+from https://github.com/iresearch-toolkit/iresearch

-IResearch is a framework for indexing, searching and sorting of data. The indexing stage can
-treat each data item as an atom or use custom 'analyzers' to break the data item
-into sub-atomic pieces tied together with internally tracked metadata.
+IResearch is the framework for indexing, filtering and sorting of data.
+The indexing stage can treat each data item as an atom or use custom 'analyzers'
+to break the data item into sub-atomic pieces tied together with internally
+tracked metadata.

 The IResearch framework in general can be further extended at runtime with
-custom implementations of analyzers (used during the indexing and searching
+custom implementations of analyzers (used during the indexing and filtering
 stages) and scorers (used during the sorting stage) allowing full control over
 the behavior of the engine.

+## Using ArangoSearch views

-### ArangoSearch Scorers
-
-ArangoSearch accesses scorers directly by their internal names. The
-name (in upper-case) of the scorer is the function name to be used in the
-['SORT' section](../../../AQL/Views/ArangoSearch/index.html#arangosearch-sort).
-Function arguments, (excluding the first argument), are serialized as a
-string representation of a JSON array and passed directly to the corresponding
-scorer. The first argument to any scorer function is the reference to the 
-current document emitted by the `FOR` statement, i.e. it would be 'doc' for this
-statement:
-
-    FOR doc IN someView
-
-IResearch provides a 'bm25' scorer implementing the
-[BM25 algorithm](https://en.wikipedia.org/wiki/Okapi_BM25). This scorer
-optionally takes 'k' and 'b' positional parameters.
-
-The user is able to run queries with the said scorer, e.g.
-
-    SORT BM25(doc, 1.2, 0.75)
-
-The function arguments will then be serialized into a JSON representation:
-
-```json
-[ 1.2, 0.75 ]
-```
-
-and passed to the scorer implementation.
-
-Similarly an administrator may choose to deploy a custom DNA analyzer 'DnaRank'.
-
-The user is then immediately able to run queries with the said scorer, e.g.
-
-    SORT DNARANK(doc, 123, 456, "abc", { "def", "ghi" })
-
-The function arguments will then be serialized into a JSON representation:
-
-```json
-[ 123, 456, "abc", { "def", "ghi" } ]
-```
-
-and passed to the scorer implementation.
-
-Runtime-plugging functionality for scores is not avaiable in ArangoDB at this
-point in time, so ArangoDB comes with a few default-initialized scores:
-
- *attribute-name*
-  order results based on the value of **attribute-name**
-
- BM25
-  order results based on the
-  [BM25 algorithm](https://en.wikipedia.org/wiki/Okapi_BM25)
-
- TFIDF
-  order results based on the
-  [TFIDF algorithm](https://en.wikipedia.org/wiki/TF-IDF)
-
-### ArangoSearch is much more than a fulltext search
-
-But fulltext searching is a subset of its available functionality, supported via
-the 'text' analyzer and 'tfidf'/'bm25' scorers, without impact to performance
-when specifying documents from different collections or searching on multiple
-document attributes.
-
-### View datasource
-
-The IResearch functionality is exposed to ArangoDB via the the ArangoSearch view
-API because the ArangoSearch view is merely an identity transformation applied
-onto documents stored in linked collections of the same ArangoDB database.
-In plain terms an ArangoSearch view only allows searching and sorting of documents
-located in collections of the same database.
-The matching documents themselves are returned as-is from their corresponding collections.
-
-### Links to ArangoDB collections
-
-A concept of an ArangoDB collection 'link' is introduced to allow specifying
-which ArangoDB collections a given ArangoSearch View should query for documents
-and how these documents should be queried.
-
-An ArangoSearch Link is a uni-directional connection from an ArangoDB collection
-to an ArangoSearch view describing how data coming from the said collection should
-be made available in the given view. Each ArangoSearch Link in an ArangoSearch view is
-uniquely identified by the name of the ArangoDB collection it links to. An
-ArangoSearch view may have zero or more links, each to a distinct ArangoDB
-collection. Similarly an ArangoDB collection may be referenced via links by zero
-or more distinct ArangoSearch views. In plain terms any given ArangoSearch view may be
-linked to any given ArangoDB collection of the same database with zero or at
-most one link. However, any ArangoSearch view may be linked to multiple distinct
-ArangoDB collections and similarly any ArangoDB collection may be referenced by
-multiple ArangoSearch views.
-
-To configure an ArangoSearch view for consideration of documents from a given
-ArangoDB collection a link definition must be added to the properties of the
-said ArangoSearch view defining the link parameters as per the section
-[View definition/modification](#view-definitionmodification).
-
-### Analyzers
-
-To simplify query syntax ArangoSearch provides a concept of 
-[named analyzers](Analyzers.md) which
-are merely aliases for type+configuration of IResearch analyzers. Management of
-named analyzers is exposed via both REST, GUI and JavaScript APIs, e.g.
-
-
-### View definition/modification
-
-An ArangoSearch view is configured via an object containing a set of
-view-specific configuration directives and a map of link-specific configuration
-directives.
-
-During view creation the following directives apply:
-* id: (optional) the desired view identifier
-* name: (required) the view name
-* type: \<required\> the value "arangosearch"
-  any of the directives from the section [View properties](#view-properties-updatable)
-
-During view modification the following directives apply:
-* links: (optional)
-  a mapping of collection-name/collection-identifier to one of:
-  * link creation - link definition as per the section [Link properties](#link-properties)
-  * link removal - JSON keyword *null* (i.e. nullify a link if present)
-    any of the directives from the section [modifiable view properties](#view-properties-updatable)
-
-### View properties (non-updatable)
-
-* locale: (optional; default: `C`)
-  the default locale used for ordering processed attribute names
-
-### View properties (updatable)
-
-* cleanupIntervalStep: (optional; default: `10`; to disable use: `0`)
-  wait at least this many commits between removing unused files in the
-  ArangoSearch data directory
-  for the case where the consolidation policies merge segments often (i.e. a
-  lot of commit+consolidate), a lower value will cause a lot of disk space to
-  be wasted
-  for the case where the consolidation policies rarely merge segments (i.e.
-  few inserts/deletes), a higher value will impact performance without any
-  added benefits
-
-* commitIntervalMsec: (optional; default: `60000`; to disable use: `0`)
-  wait at least *count* milliseconds between committing view data store
-  changes and making documents visible to queries
-  for the case where there are a lot of inserts/updates, a lower value will
-  cause the view not to account for them, (unlit commit), and memory usage
-  would continue to grow
-  for the case where there are a few inserts/updates, a higher value will
-  impact performance and waste disk space for each commit call without any
-  added benefits
-
-* consolidate: (optional; default: `none`)
-  a per-policy mapping of thresholds in the range `[0.0, 1.0]` to determine data
-  store segment merge candidates, if specified then only the listed policies
-  are used, keys are any of:
-
-  * bytes: (optional; for default values use an empty object: `{}`)
-
-    * segmentThreshold: (optional, default: `300`; to disable use: `0`)
-      apply consolidation policy IFF {segmentThreshold} >= #segments
-
-    * threshold: (optional; default: `0.85`)
-      consolidate `IFF {threshold} > segment_bytes / (all_segment_bytes / #segments)`
-
-  * bytes_accum: (optional; for default values use: `{}`)
-
-    * segmentThreshold: (optional; default: `300`; to disable use: `0`)
-      apply consolidation policy IFF {segmentThreshold} >= #segments
-
-    * threshold: (optional; default: `0.85`)
-      consolidate `IFF {threshold} > (segment_bytes + sum_of_merge_candidate_segment_bytes) / all_segment_bytes`
-
-  * count: (optional; for default values use: `{}`)
-
-    * segmentThreshold: (optional; default: `300`; to disable use: `0`)
-      apply consolidation policy IFF {segmentThreshold} >= #segments
-
-    * threshold: (optional; default: `0.85`)
-      consolidate `IFF {threshold} > segment_docs{valid} / (all_segment_docs{valid} / #segments)`
-
-  * fill: (optional)
-    if specified, use empty object for default values, i.e. `{}`
-
-    * segmentThreshold: (optional; default: `300`; to disable use: `0`)
-      apply consolidation policy IFF {segmentThreshold} >= #segments
-
-    * threshold: (optional; default: `0.85`)
-      consolidate `IFF {threshold} > #segment_docs{valid} / (#segment_docs{valid} + #segment_docs{removed})`
-
-### Link properties
-
-* analyzers: (optional; default: `[ 'identity' ]`)
-  a list of analyzers, by name as defined via the [Analyzers](Analyzers.md), that
-  should be applied to values of processed document attributes
-
-* fields: (optional; default: `{}`)
-  an object `{attribute-name: [Link properties]}` of fields that should be
-  processed at each level of the document
-  each key specifies the document attribute to be processed, the value of
-  *includeAllFields* is also consulted when selecting fields to be processed
-  each value specifies the [Link properties](#link-properties) directives to be used when
-  processing the specified field, a Link properties value of `{}` denotes
-  inheritance of all (except *fields*) directives from the current level
-
-* includeAllFields: (optional; default: `false`)
-  if true then process all document attributes (if not explicitly specified
-  then process the fields with default Link properties directives, i.e. `{}`),
-  otherwise only consider attributes mentioned in *fields*
-
-* trackListPositions: (optional; default: false)
-  if true then for array values track the value position in the array, e.g. when
-  querying for the input: `{ attr: [ 'valueX', 'valueY', 'valueZ' ] }`
-  the user must specify: `doc.attr[1] == 'valueY'`
-  otherwise all values in an array are treated as equal alternatives, e.g. when
-  querying for the input: `{ attr: [ 'valueX', 'valueY', 'valueZ' ] }`
-  the user must specify: `doc.attr == 'valueY'`
-
-* storeValues: (optional; default: "none")
-  how should the view track the attribute values, this setting allows for
-  additional value retrieval optimizations, one of:
-  * none: Do not store values by the view
-  * id: Store only information about value presence, to allow use of the EXISTS() function
+To get familiar with ArangoSearch usage, start with the
+[Getting Started](GettingStarted.md) guide and then explore the details of
+ArangoSearch in the [Detailed Overview](DetailedOverview.md),
+[Analyzers](Analyzers.md) and [Scorers](Scorers.md) topics.
--- a/Documentation/Books/Manual/Views/ArangoSearch/Scorers.md
+++ b/Documentation/Books/Manual/Views/ArangoSearch/Scorers.md
@ -0,0 +1,58 @@
+ArangoSearch Scorers
+====================
+
+ArangoSearch accesses scorers directly by their internal names. The
+name (in upper-case) of the scorer is the function name to be used in the
+['SORT' section](../../../AQL/Views/ArangoSearch/index.html#arangosearch-sort).
+Function arguments (excluding the first argument) are serialized as a
+string representation of a JSON array and passed directly to the corresponding
+scorer. The first argument to any scorer function is the reference to the 
+current document emitted by the `FOR` statement, i.e. it would be 'doc' for this
+statement:
+
+```js
+FOR doc IN someView
+```
+
+IResearch provides a 'bm25' scorer implementing the
+[BM25 algorithm](https://en.wikipedia.org/wiki/Okapi_BM25). This scorer
+optionally takes 'k' and 'b' positional parameters.
+
+The user is able to run queries with the said scorer, e.g.
+
+```js
+SORT BM25(doc, 1.2, 0.75)
+```
+
+The function arguments will then be serialized into a JSON representation:
+
+```json
+[ 1.2, 0.75 ]
+```
+
+and passed to the scorer implementation.
+
+Similarly an administrator may choose to deploy a custom DNA scorer 'DnaRank'.
+
+The user is then immediately able to run queries with the said scorer, e.g.
+
+```js
+SORT DNARANK(doc, 123, 456, "abc", { "def": "ghi" })
+```
+
+The function arguments will then be serialized into a JSON representation:
+
+```json
+[ 123, 456, "abc", { "def": "ghi" } ]
+```
+
+and passed to the scorer implementation.
+
+Runtime-plugging functionality for scorers is not available in ArangoDB at this
+point in time, so ArangoDB comes with a few default-initialized scorers:
+
+- *attribute-name*: order results based on the value of **attribute-name**
+
+- BM25: order results based on the [BM25 algorithm](https://en.wikipedia.org/wiki/Okapi_BM25)
+
+- TFIDF: order results based on the [TFIDF algorithm](https://en.wikipedia.org/wiki/TF-IDF)