
issue 526.9.1: implement swagger interface, add documentation (#8730)

* issue 526.9.1: implement swagger interface, add documentation

* address review comments

* add ngram

* Formatting

* Move REST description to new Analyzers top chapter in HTTP book

* Missed a DocuBlock

* Add Analyzers chapter to Manual SUMMARY.md

* Move REST API description back to Manual

Headlines were broken

* Add n-gram example
This commit is contained in:
Vasiliy 2019-04-16 18:54:30 +03:00 committed by Andrey Abramov
parent 80b021f915
commit 1a22d1360c
20 changed files with 847 additions and 65 deletions

View File

@ -445,6 +445,8 @@ devel
* do not wait for replication after each job execution in Supervision * do not wait for replication after each job execution in Supervision
* add support for configuring custom Analyzers via JavaScript and REST
v3.4.1 (XXXX-XX-XX) v3.4.1 (XXXX-XX-XX)
------------------- -------------------

View File

@ -0,0 +1,91 @@
HTTP Interface for Analyzers
============================
The REST API is accessible via HTTP requests to the `/_api/analyzer` endpoint.
Analyzer Operations
-------------------
@startDocuBlock post_api_analyzer
@startDocuBlock get_api_analyzer
@startDocuBlock get_api_analyzers
@startDocuBlock delete_api_analyzer
Analyzer Types
--------------
The currently implemented Analyzer types are:
- `identity`
- `delimited`
- `ngram`
- `text`
### Identity
An analyzer applying the `identity` transformation, i.e. returning the input
unmodified.
The value of the *properties* attribute is ignored.
### Delimited
An analyzer capable of breaking up delimited text into tokens as per RFC4180
(without starting new records on newlines).
The *properties* allowed for this analyzer are either:
- a string encoded delimiter to use
- an object with the attribute *delimiter* containing the string encoded
delimiter to use
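The splitting behaviour described above can be approximated in plain JavaScript. This is a minimal sketch, not the IResearch implementation: the helper name `splitDelimited` is hypothetical, and the handling of edge cases (such as a quote character in the middle of an unquoted field) is an assumption, since the text above only refers to RFC 4180 quoting and to newlines not starting a new record.

```js
// Hypothetical helper, for illustration only: splits one string into fields
// using RFC 4180 style double-quote handling. Newlines are kept as ordinary
// characters, so they never start a new record.
function splitDelimited(input, delimiter) {
  if (typeof delimiter !== "string" || delimiter.length === 0) {
    throw new TypeError("delimiter must be a non-empty string");
  }
  const tokens = [];
  let field = "";
  let inQuotes = false;
  let i = 0;
  while (i < input.length) {
    const ch = input[i];
    if (inQuotes) {
      if (ch === '"' && input[i + 1] === '"') {
        field += '"'; // a doubled quote inside a quoted field is a literal quote
        i += 2;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
        i += 1;
      } else {
        field += ch;
        i += 1;
      }
    } else if (ch === '"' && field === "") {
      inQuotes = true; // opening quote at the start of a field
      i += 1;
    } else if (input.startsWith(delimiter, i)) {
      tokens.push(field);
      field = "";
      i += delimiter.length;
    } else {
      field += ch;
      i += 1;
    }
  }
  tokens.push(field);
  return tokens;
}

console.log(splitDelimited('a,"b,c",d', ","));
```

The sketch treats the whole input as a single record, which is how the "without starting new records on newlines" remark is interpreted here.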
### N-gram
An analyzer capable of producing n-grams from a specified input in a range of
[min;max] (inclusive). Can optionally preserve the original input.
The *properties* allowed for this analyzer are an object with the following
attributes:
- *max*: unsigned integer (required) maximum n-gram length
- *min*: unsigned integer (required) minimum n-gram length
- *preserveOriginal*: boolean (required) output the original value as well
*Example*
With `min` = 4 and `max` = 5, the analyzer will produce the following n-grams
for the input *foobar*:
- foob
- ooba
- obar
- fooba
- oobar
With `preserveOriginal` enabled, it will additionally include *foobar* itself.
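The example above can be reproduced with a few lines of plain JavaScript. This is only a model of the described output, not the analyzer's actual implementation; the function name `ngrams` is hypothetical, and the ordering (shorter n-grams first, then by position) simply follows the list above. It also works on JavaScript string code units, which is an assumption about how characters are counted.

```js
// Hypothetical helper, for illustration only: produces all n-grams of the
// input with lengths in the inclusive range [min; max].
function ngrams(input, min, max, preserveOriginal) {
  if (!Number.isInteger(min) || !Number.isInteger(max) || min < 1 || max < min) {
    throw new RangeError("expected integers with 1 <= min <= max");
  }
  const result = [];
  for (let n = min; n <= max; n++) {
    for (let start = 0; start + n <= input.length; start++) {
      result.push(input.slice(start, start + n));
    }
  }
  // Optionally emit the unmodified input as well, avoiding a duplicate when
  // the input is already one of the generated n-grams.
  if (preserveOriginal && !result.includes(input)) {
    result.push(input);
  }
  return result;
}

console.log(ngrams("foobar", 4, 5, false));
```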
### Text
An analyzer capable of breaking up strings into individual words while also
optionally filtering out stop-words, applying case conversion and extracting
word stems.
The *properties* allowed for this analyzer are an object with the following
attributes:
- `locale`: string (required) format: (language[_COUNTRY][.encoding][@variant])
- `case_convert`: string enum (optional) one of: `lower`, `none`, `upper`,
default: `lower`
- `ignored_words`: array of strings (optional) words to omit from result,
default: load words from `ignored_words_path`
- `ignored_words_path`: string (optional) path with the `language` sub-directory
containing files with words to omit, default: if no
`ignored_words` provided then the value from the
environment variable `IRESEARCH_TEXT_STOPWORD_PATH` or
if undefined then the current working directory
- `no_accent`: boolean (optional) apply accent removal, default: true
- `no_stem`: boolean (optional) do not apply stemming on returned words,
default: false

View File

@ -49,6 +49,7 @@
* [Modifying](Views/Modifying.md) * [Modifying](Views/Modifying.md)
* [Retrieving](Views/Getting.md) * [Retrieving](Views/Getting.md)
* [ArangoSearch Views](Views/ArangoSearch.md) * [ArangoSearch Views](Views/ArangoSearch.md)
* [Analyzers](Analyzers/README.md)
* [Transactions](Transaction/README.md) * [Transactions](Transaction/README.md)
* [Replication](Replications/README.md) * [Replication](Replications/README.md)
* [Replication Dump](Replications/ReplicationDump.md) * [Replication Dump](Replications/ReplicationDump.md)

View File

@ -0,0 +1,181 @@
# Analyzers powered by IResearch
## Background
The concept of value "analysis" refers to the process of breaking up a given
value into a set of sub-values, which are internally tied together by metadata,
which in turn influences both the search and sort stages to provide the most
appropriate match for the specified conditions, similar to queries to web
search engines.
In plain terms this means a user can for example:
- request documents where the `body` attribute best matches `a quick brown fox`
- request documents where the `dna` attribute best matches a DNA sub sequence
- request documents where the `name` attribute best matches gender
- etc. (via custom analyzers)
## What are Analyzers
Analyzers are helpers that allow a user to parse and transform an arbitrary
value (currently only string values are supported) into zero or more resulting
values. The parsing and transformation applied is directed by the analyzer
*type* and the analyzer *properties*.
The Analyzer implementations themselves are provided by the underlying
[IResearch library](https://github.com/iresearch-toolkit/iresearch).
Therefore their most common use case for filter condition matching is with
[ArangoSearch Views](../Views/ArangoSearch/README.md).
However, Analyzers can be used as standalone helpers via the `TOKENS(...)`
function, allowing a user to leverage the value transformation power of the
Analyzer in any context where an AQL function can be used.
A user-visible Analyzer is simply an alias for an underlying implementation
*type* + configuration *properties* and a set of *features*. The *features*
dictate what term matching capabilities are available and as such are only
applicable in the context of ArangoSearch Views.
The aforementioned three configuration attributes that an Analyzer is composed
of are given a simple *name* that can be used to reference the said Analyzer.
Thus an analyzer definition is composed of the following attributes:
- *name*: the analyzer name
- *type*: the analyzer type
- *properties*: the properties used to configure the specified type
- *features*: the set of features to set on the analyzer generated fields
The valid values for *type* are any of the available Analyzer types.
The valid values for the *properties* are dependent on what *type* is used. For
example, for the *text* type its property may simply be an object with the value
`"locale": "en"`, whereas for the *delimited* type its property may simply be
the delimiter `,`.
The valid values for the *features* are dependent on both the capabilities of
the underlying *type* and the query filtering and sorting functions that the
result can be used with. For example the *text* type will produce
*frequency* + *norm* + *position* and the `PHRASE(...)` function requires
*frequency* + *position* to be available.
Currently the following *features* are supported:
- *frequency*: how often a term is seen, required for PHRASE(...)
- *norm*: the field normalization factor
- *position*: sequentially increasing term position, required for PHRASE(...)
if present then the *frequency* feature is also required
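The dependencies between features listed above can be expressed as a small check. This is a sketch in plain JavaScript under the assumption that the three names above are the complete set; `checkFeatures` and `supportsPhrase` are hypothetical helpers and not part of any ArangoDB API.

```js
// Feature names as listed above (assumed to be the complete set).
const KNOWN_FEATURES = ["frequency", "norm", "position"];

// Hypothetical helper: returns a list of problems found in a feature set.
function checkFeatures(features) {
  const problems = [];
  for (const feature of features) {
    if (!KNOWN_FEATURES.includes(feature)) {
      problems.push("unknown feature: " + feature);
    }
  }
  // *position* may only be used together with *frequency*.
  if (features.includes("position") && !features.includes("frequency")) {
    problems.push("position requires frequency");
  }
  return problems;
}

// Hypothetical helper: PHRASE(...) needs both *frequency* and *position*.
function supportsPhrase(features) {
  return features.includes("frequency") && features.includes("position");
}

console.log(checkFeatures(["position"]));
console.log(supportsPhrase(["frequency", "norm", "position"]));
```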
## Analyzer usage
For Analyzer usage in the context of ArangoSearch Views please see the section
[ArangoSearch Views](../Views/ArangoSearch/README.md).
The value transformation capabilities of a given analyzer can be invoked via
the `TOKENS(...)` function to for example:
- break up a string of words into individual words, while also optionally
filtering out stopwords, applying case conversion and extracting word stems
- parse CSV/TSV or other delimiter encoded string values into individual fields
The signature of the `TOKENS(...)` function is:
TOKENS(<value-to-parse>, <analyzer-name-to-apply>)
It currently accepts any string value, and an analyzer name, and will produce
an array of zero or more tokens generated by the specified analyzer
transformation.
## Analyzer management
The following operations are exposed via JavaScript and REST APIs for analyzer
management:
- *create*: creation of a new analyzer definition
- *get*: retrieve an existing analyzer definition
- *list*: retrieve a listing of all available analyzer definitions
- *remove*: remove an analyzer definition
### JavaScript
The JavaScript API is accessible via the `@arangodb/analyzers` module from
both server-side and client-side code, e.g.
```js
var analyzers = require("@arangodb/analyzers");
```
The *create* operation is accessible via:
```js
analyzers.save(<name>, <type>[, <properties>[, <features>]])
```
… where *properties* can be represented either as a string, an object or a null
value and *features* is an array of string encoded feature names.
The *get* operation is accessible via:
```js
analyzers.analyzer(<name>)
```
The *list* operation is accessible via:
```js
analyzers.toArray()
```
The *remove* operation is accessible via:
```js
analyzers.remove(<name> [, <force>])
```
Additionally individual analyzer instances expose getter accessors for the
aforementioned definition attributes:
```js
analyzer.name()
analyzer.type()
analyzer.properties()
analyzer.features()
```
### RESTful API
The *create* operation is accessible via the *POST* method on the URL:
/_api/analyzer
The Analyzer configuration is passed via the request body as an object with
the attributes:
- *name*: string (required)
- *type*: string (required)
- *properties*: string or object or null (optional) default: `null`
- *features*: array of strings (optional) default: empty array
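As an illustration of the body described above, the following plain JavaScript sketch builds such an object and fills in the documented defaults. The function name `analyzerDefinition` is hypothetical, and the type checks are an assumption about what a client might want to verify before sending the request; the server performs its own validation.

```js
// Hypothetical helper, for illustration only: assembles the request body for
// POST /_api/analyzer and applies the documented defaults.
function analyzerDefinition(name, type, properties, features) {
  if (typeof name !== "string") {
    throw new TypeError("name must be a string");
  }
  if (typeof type !== "string") {
    throw new TypeError("type must be a string");
  }
  // properties: string, object or null; defaults to null
  const props = properties === undefined ? null : properties;
  if (typeof props !== "string" && typeof props !== "object") {
    throw new TypeError("properties must be a string, an object or null");
  }
  // features: array of strings; defaults to an empty array
  const feats = features === undefined ? [] : features;
  if (!Array.isArray(feats) || !feats.every((f) => typeof f === "string")) {
    throw new TypeError("features must be an array of strings");
  }
  return { name: name, type: type, properties: props, features: feats };
}

console.log(JSON.stringify(analyzerDefinition("_system::testAnalyzer", "identity")));
```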
The *get* operation is accessible via the *GET* method on the URL:
/_api/analyzer/{analyzer-name}
A successful result will be an object with the fields:
- *name*
- *type*
- *properties*
- *features*
The *list* operation is accessible via the *GET* method on the URL:
/_api/analyzer
A successful result will be an array of objects with the fields:
- *name*
- *type*
- *properties*
- *features*
The *remove* operation is accessible via the *DELETE* method on the URL:
/_api/analyzer/{analyzer-name}[?force=true]
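Analyzer names may contain characters such as `::` that need to be percent-encoded in the URL path. A minimal sketch of building the URL in plain JavaScript follows; the helper name `analyzerUrl` is hypothetical, and `encodeURIComponent` is the standard JavaScript function.

```js
// Hypothetical helper, for illustration only: builds the URL for the *get*
// and *remove* operations, optionally appending the force query parameter.
function analyzerUrl(name, force) {
  const base = "/_api/analyzer/" + encodeURIComponent(name);
  if (force === undefined) {
    return base;
  }
  return base + "?force=" + (force ? "true" : "false");
}

console.log(analyzerUrl("_system::testAnalyzer", true));
```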
Also see [Analyzers](../../HTTP/Analyzers/index.html) in the HTTP book
including a list of available [Analyzer Types](../../HTTP/Analyzers/index.html#analyzer-types).

View File

@ -160,6 +160,7 @@
* [Detailed Overview](Views/ArangoSearch/DetailedOverview.md) * [Detailed Overview](Views/ArangoSearch/DetailedOverview.md)
* [Analyzers](Views/ArangoSearch/Analyzers.md) * [Analyzers](Views/ArangoSearch/Analyzers.md)
* [Scorers](Views/ArangoSearch/Scorers.md) * [Scorers](Views/ArangoSearch/Scorers.md)
* [Analyzers](Analyzers/README.md)
## ADVANCED TOPICS ## ADVANCED TOPICS

View File

@ -2,56 +2,6 @@ ArangoSearch Analyzers
====================== ======================
To simplify query syntax ArangoSearch provides a concept of named analyzers To simplify query syntax ArangoSearch provides a concept of named analyzers
which are merely aliases for type+configuration of IResearch analyzers. In the which are merely aliases for type+configuration of IResearch analyzers. See
future, users will be able to specify their own named analyzers. For now, the [Analyzers](../../Analyzers/README.md) for a description of their usage
ArangoDB comes with the following analyzers: and management.
- `identity`<br/>
treat the value as an atom
- `text_de`<br/>
tokenize the value into case-insensitive word stems as per the German locale,<br/>
do not discard any stopwords
- `text_en`<br/>
tokenize the value into case-insensitive word stems as per the English locale,<br/>
do not discard any stopwords
- `text_es`<br/>
tokenize the value into case-insensitive word stems as per the Spanish locale,<br/>
do not discard any stopwords
- `text_fi`<br/>
tokenize the value into case-insensitive word stems as per the Finnish locale,<br/>
do not discard any stopwords
- `text_fr`<br/>
tokenize the value into case-insensitive word stems as per the French locale,<br/>
do not discard any stopwords
- `text_it`<br/>
tokenize the value into case-insensitive word stems as per the Italian locale,<br/>
do not discard any stopwords
- `text_nl`<br/>
tokenize the value into case-insensitive word stems as per the Dutch locale,<br/>
do not discard any stopwords
- `text_no`<br/>
tokenize the value into case-insensitive word stems as per the Norwegian<br/>
locale, do not discard any stopwords
- `text_pt`<br/>
tokenize the value into case-insensitive word stems as per the Portuguese<br/>
locale, do not discard any stopwords
- `text_ru`<br/>
tokenize the value into case-insensitive word stems as per the Russian locale,<br/>
do not discard any stopwords
- `text_sv`<br/>
tokenize the value into case-insensitive word stems as per the Swedish locale,<br/>
do not discard any stopwords
- `text_zh`<br/>
tokenize the value into word stems as per the Chinese locale

View File

@ -1,9 +1,9 @@
# ArangoSearch Views powered by IResearch # ArangoSearch Views powered by IResearch
ArangoSearch is a natively integrated AQL extension making use of the ## What is ArangoSearch
IResearch library.
ArangoSearch allows one to: ArangoSearch is a natively integrated AQL extension making use of the
[IResearch library](https://github.com/iresearch-toolkit/iresearch).
- join documents located in different collections to one result list - join documents located in different collections to one result list
- filter documents based on AQL boolean expressions and functions - filter documents based on AQL boolean expressions and functions
@ -11,7 +11,7 @@ ArangoSearch allows one to:
A concept of value "analysis" that is meant to break up a given value into A concept of value "analysis" that is meant to break up a given value into
a set of sub-values internally tied together by metadata which influences both a set of sub-values internally tied together by metadata which influences both
the filter and sort stages to provide the most appropriate match for the the search and sort stages to provide the most appropriate match for the
specified conditions, similar to queries to web search engines. specified conditions, similar to queries to web search engines.
In plain terms this means a user can for example: In plain terms this means a user can for example:
@ -21,11 +21,14 @@ In plain terms this means a user can for example:
- request documents where the `name` attribute best matches gender - request documents where the `name` attribute best matches gender
- etc. (via custom analyzers) - etc. (via custom analyzers)
## The IResearch Library See the [Analyzers](../../Analyzers/README.md) for a detailed description of
usage and management of custom analyzers.
IResearch is a cross-platform open source indexing and searching engine written ### The IResearch Library
in modern C++, optimized for speed and memory footprint, with source available
from https://github.com/iresearch-toolkit/iresearch IResearch is a cross-platform open source indexing and searching engine written in C++,
optimized for speed and memory footprint, with source available from:
https://github.com/iresearch-toolkit/iresearch
IResearch is the framework for indexing, filtering and sorting of data. IResearch is the framework for indexing, filtering and sorting of data.
The indexing stage can treat each data item as an atom or use custom "analyzers" The indexing stage can treat each data item as an atom or use custom "analyzers"
@ -33,7 +36,7 @@ to break the data item into sub-atomic pieces tied together with internally
tracked metadata. tracked metadata.
The IResearch framework in general can be further extended at runtime with The IResearch framework in general can be further extended at runtime with
custom implementations of analyzers (used during the indexing and filtering custom implementations of analyzers (used during the indexing and searching
stages) and scorers (used during the sorting stage) allowing full control over stages) and scorers (used during the sorting stage) allowing full control over
the behavior of the engine. the behavior of the engine.

View File

@ -0,0 +1,100 @@
@startDocuBlock delete_api_analyzer
@brief removes an analyzer configuration
@RESTHEADER{DELETE /_api/analyzer/{analyzer-name}, Remove an analyzer}
@RESTURLPARAMETERS
@RESTURLPARAM{analyzer-name,string,required}
The name of the analyzer to remove.
@RESTQUERYPARAMETERS
@RESTQUERYPARAM{force,boolean,optional}
The analyzer configuration should be removed even if it is in-use.
The default value is *false*.
@RESTDESCRIPTION
Removes an analyzer configuration identified by *analyzer-name*.
If the analyzer definition was successfully dropped, an object is returned with
the following attributes:
- *error*: *false*
- *name*: The name of the removed analyzer
@RESTRETURNCODES
@RESTRETURNCODE{200}
The analyzer configuration was removed successfully.
@RESTRETURNCODE{400}
The *analyzer-name* was not supplied or another request parameter was not
valid.
@RESTRETURNCODE{403}
The user does not have permission to remove this analyzer configuration.
@RESTRETURNCODE{404}
Such an analyzer configuration does not exist.
@RESTRETURNCODE{409}
The specified analyzer configuration is still in use and *force* was omitted or
*false* specified.
@EXAMPLES
Removing without *force*:
@EXAMPLE_ARANGOSH_RUN{RestAnalyzerDelete}
var analyzers = require("@arangodb/analyzers");
var db = require("@arangodb").db;
var analyzerName = db._name() + "::testAnalyzer";
analyzers.save(analyzerName, "identity", "test properties");
// removal
var url = "/_api/analyzer/" + encodeURIComponent(analyzerName);
var response = logCurlRequest('DELETE', url);
console.error(JSON.stringify(response));
assert(response.code === 200);
logJsonResponse(response);
@END_EXAMPLE_ARANGOSH_RUN
Removing with *force*:
@EXAMPLE_ARANGOSH_RUN{RestAnalyzerDeleteForce}
var analyzers = require("@arangodb/analyzers");
var db = require("@arangodb").db;
var analyzerName = db._name() + "::testAnalyzer";
analyzers.save(analyzerName, "identity", "test properties");
// create analyzer reference
var url = "/_api/collection";
var body = { name: "testCollection" };
var response = logCurlRequest('POST', url, body);
assert(response.code === 200);
var url = "/_api/view";
var body = {
name: "testView",
type: "arangosearch",
links: { testCollection: { analyzers: [ analyzerName ] } }
};
var response = logCurlRequest('POST', url, body);
// removal (fail)
var url = "/_api/analyzer/" + encodeURIComponent(analyzerName) + "?force=false";
var response = logCurlRequest('DELETE', url);
assert(response.code === 409);
// removal
var url = "/_api/analyzer/" + encodeURIComponent(analyzerName) + "?force=true";
var response = logCurlRequest('DELETE', url);
assert(response.code === 200);
logJsonResponse(response);
db._dropView("testView");
db._drop("testCollection");
@END_EXAMPLE_ARANGOSH_RUN
@endDocuBlock

View File

@ -0,0 +1,48 @@
@startDocuBlock get_api_analyzer
@brief returns an analyzer definition
@RESTHEADER{GET /_api/analyzer/{analyzer-name}, Return the analyzer definition}
@RESTURLPARAMETERS
@RESTURLPARAM{analyzer-name,string,required}
The name of the analyzer to retrieve.
@RESTDESCRIPTION
Retrieves the full definition for the specified analyzer name.
The resulting object contains the following attributes:
- *name*: the analyzer name
- *type*: the analyzer type
- *properties*: the properties used to configure the specified type
- *features*: the set of features to set on the analyzer generated fields
@RESTRETURNCODES
@RESTRETURNCODE{200}
The analyzer definition was retrieved successfully.
@RESTRETURNCODE{404}
Such an analyzer configuration does not exist.
@EXAMPLES
Retrieve an analyzer definition:
@EXAMPLE_ARANGOSH_RUN{RestAnalyzerGet}
var analyzers = require("@arangodb/analyzers");
var db = require("@arangodb").db;
var analyzerName = db._name() + "::testAnalyzer";
analyzers.save(analyzerName, "identity", "test properties");
// retrieval
var url = "/_api/analyzer/" + encodeURIComponent(analyzerName);
var response = logCurlRequest('GET', url);
assert(response.code === 200);
logJsonResponse(response);
analyzers.remove(analyzerName, true);
@END_EXAMPLE_ARANGOSH_RUN
@endDocuBlock

View File

@ -0,0 +1,32 @@
@startDocuBlock get_api_analyzers
@brief returns a listing of available analyzer definitions
@RESTHEADER{GET /_api/analyzer, List all analyzers}
@RESTDESCRIPTION
Retrieves an array of all analyzer definitions.
The resulting array contains objects with the following attributes:
- *name*: the analyzer name
- *type*: the analyzer type
- *properties*: the properties used to configure the specified type
- *features*: the set of features to set on the analyzer generated fields
@RESTRETURNCODES
@RESTRETURNCODE{200}
The analyzer definitions were retrieved successfully.
@EXAMPLES
Retrieve all analyzer definitions:
@EXAMPLE_ARANGOSH_RUN{RestAnalyzersGet}
// retrieval
var url = "/_api/analyzer";
var response = logCurlRequest('GET', url);
assert(response.code === 200);
logJsonResponse(response);
@END_EXAMPLE_ARANGOSH_RUN
@endDocuBlock

View File

@ -0,0 +1,60 @@
@startDocuBlock post_api_analyzer
@brief creates a new analyzer based on the provided definition
@RESTHEADER{POST /_api/analyzer, Create an analyzer with the supplied definition}
@RESTBODYPARAM{name,string,required,string}
The analyzer name.
@RESTBODYPARAM{type,string,required,string}
The analyzer type.
@RESTBODYPARAM{properties,string,optional,string}
The properties used to configure the specified type.
Value may be a string, an object or null.
The default value is *null*.
@RESTBODYPARAM{features,array,optional,string}
The set of features to set on the analyzer generated fields.
The default value is an empty array.
@RESTDESCRIPTION
Creates a new analyzer based on the provided configuration.
@RESTRETURNCODES
@RESTRETURNCODE{200}
An analyzer with a matching name and definition already exists.
@RESTRETURNCODE{201}
A new analyzer definition was successfully created.
@RESTRETURNCODE{400}
One or more of the required parameters is missing or one or more of the parameters
is not valid.
@RESTRETURNCODE{403}
The user does not have permission to create an analyzer with this configuration.
@EXAMPLES
@EXAMPLE_ARANGOSH_RUN{RestAnalyzerPost}
var analyzers = require("@arangodb/analyzers");
var db = require("@arangodb").db;
var analyzerName = db._name() + "::testAnalyzer";
// creation
var url = "/_api/analyzer";
var body = {
name: db._name() + "::testAnalyzer",
type: "identity"
};
var response = logCurlRequest('POST', url, body);
assert(response.code === 201);
logJsonResponse(response);
analyzers.remove(analyzerName, true);
@END_EXAMPLE_ARANGOSH_RUN
@endDocuBlock

View File

@ -0,0 +1,11 @@
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl -X DELETE --header <span class="hljs-string">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer</span>
HTTP/<span class="hljs-number">1.1</span> OK
content-type: application/json; charset=utf<span class="hljs-number">-8</span>
x-content-type-options: nosniff
{
<span class="hljs-string">"error"</span> : <span class="hljs-literal">false</span>,
<span class="hljs-string">"code"</span> : <span class="hljs-number">200</span>,
<span class="hljs-string">"name"</span> : <span class="hljs-string">"_system::testAnalyzer"</span>
}

View File

@ -0,0 +1,33 @@
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl -X POST --header <span class="hljs-string">'accept: application/json'</span> --data-binary @- --dump - http://localhost:8529/_api/collection</span> &lt;&lt;EOF
{
<span class="hljs-string">"name"</span> : <span class="hljs-string">"testCollection"</span>
}
EOF
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl -X POST --header <span class="hljs-string">'accept: application/json'</span> --data-binary @- --dump - http://localhost:8529/_api/view</span> &lt;&lt;EOF
{
<span class="hljs-string">"name"</span> : <span class="hljs-string">"testView"</span>,
<span class="hljs-string">"type"</span> : <span class="hljs-string">"arangosearch"</span>,
<span class="hljs-string">"links"</span> : {
<span class="hljs-string">"testCollection"</span> : {
<span class="hljs-string">"analyzers"</span> : [
<span class="hljs-string">"_system::testAnalyzer"</span>
]
}
}
}
EOF
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl -X DELETE --header <span class="hljs-string">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer?force=<span class="hljs-literal">false</span></span>
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl -X DELETE --header <span class="hljs-string">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer?force=<span class="hljs-literal">true</span></span>
HTTP/<span class="hljs-number">1.1</span> OK
content-type: application/json; charset=utf<span class="hljs-number">-8</span>
x-content-type-options: nosniff
{
<span class="hljs-string">"error"</span> : <span class="hljs-literal">false</span>,
<span class="hljs-string">"code"</span> : <span class="hljs-number">200</span>,
<span class="hljs-string">"name"</span> : <span class="hljs-string">"_system::testAnalyzer"</span>
}

View File

@ -0,0 +1,14 @@
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl --header <span class="hljs-string">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer</span>
HTTP/<span class="hljs-number">1.1</span> OK
content-type: application/json; charset=utf<span class="hljs-number">-8</span>
x-content-type-options: nosniff
{
<span class="hljs-string">"error"</span> : <span class="hljs-literal">false</span>,
<span class="hljs-string">"code"</span> : <span class="hljs-number">200</span>,
<span class="hljs-string">"type"</span> : <span class="hljs-string">"identity"</span>,
<span class="hljs-string">"properties"</span> : <span class="hljs-string">"test properties"</span>,
<span class="hljs-string">"features"</span> : [ ],
<span class="hljs-string">"name"</span> : <span class="hljs-string">"_system::testAnalyzer"</span>
}

View File

@ -0,0 +1,17 @@
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl -X POST --header <span class="hljs-string">'accept: application/json'</span> --data-binary @- --dump - http://localhost:8529/_api/analyzer</span> &lt;&lt;EOF
{
<span class="hljs-string">"name"</span> : <span class="hljs-string">"_system::testAnalyzer"</span>,
<span class="hljs-string">"type"</span> : <span class="hljs-string">"identity"</span>
}
EOF
HTTP/<span class="hljs-number">1.1</span> Created
content-type: application/json; charset=utf<span class="hljs-number">-8</span>
x-content-type-options: nosniff
{
<span class="hljs-string">"name"</span> : <span class="hljs-string">"_system::testAnalyzer"</span>,
<span class="hljs-string">"type"</span> : <span class="hljs-string">"identity"</span>,
<span class="hljs-string">"properties"</span> : <span class="hljs-literal">null</span>,
<span class="hljs-string">"features"</span> : [ ]
}

View File

@ -0,0 +1,11 @@
<span class="hljs-meta">shell&gt;</span><span class="bash"> curl --header <span class="hljs-string">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer</span>
HTTP/<span class="hljs-number">1.1</span> OK
content-type: application/json; charset=utf<span class="hljs-number">-8</span>
x-content-type-options: nosniff
{
<span class="hljs-string">"error"</span> : <span class="hljs-literal">false</span>,
<span class="hljs-string">"code"</span> : <span class="hljs-number">200</span>,
<span class="hljs-string">"result"</span> : [ ]
}

View File

@ -289,7 +289,7 @@ arangodb::RestStatus RestAnalyzerHandler::execute() {
generateError( // generate error generateError( // generate error
arangodb::rest::ResponseCode::BAD, // HTTP code arangodb::rest::ResponseCode::BAD, // HTTP code
TRI_ERROR_BAD_PARAMETER, // code TRI_ERROR_BAD_PARAMETER, // code
std::string("expecting DELETE ") + ANALYZER_PATH + "/<analyzer-name>[#force=true]" // mesage std::string("expecting DELETE ") + ANALYZER_PATH + "/<analyzer-name>[?force=true]" // mesage
); );
return arangodb::RestStatus::DONE; return arangodb::RestStatus::DONE;

View File

@ -410,7 +410,7 @@ void JS_Get(v8::FunctionCallbackInfo<v8::Value> const& args) {
// expecting one argument // expecting one argument
// analyzer(name: <string>); // analyzer(name: <string>);
if (args.Length() != 1 || !args[0]->IsString()) { if (args.Length() != 1 || !args[0]->IsString()) {
TRI_V8_THROW_EXCEPTION_USAGE("analyser(<name>)"); TRI_V8_THROW_EXCEPTION_USAGE("analyzer(<name>)");
} }
PREVENT_EMBEDDED_TRANSACTION(); PREVENT_EMBEDDED_TRANSACTION();

View File

@ -7391,6 +7391,133 @@
"x-hints": "" "x-hints": ""
} }
}, },
"/_api/analyzer": {
"get": {
"description": "\n\nRetrieves a an array of all analyzer definitions.\nThe resulting array contains objects with the following attributes:\n- *name*: the analyzer name\n- *type*: the analyzer type\n- *properties*: the properties used to configure the specified type\n- *features*: the set of features to set on the analyzer generated fields\n\n\n\n\n**Example:**\n Retrieve all analyzer definitions:\n\n<pre><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl --header <span class=\"hljs-string\">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer</span>\n</code><code>\n</code><code>HTTP/<span class=\"hljs-number\">1.1</span> OK\n</code><code>content-type: application/json; charset=utf<span class=\"hljs-number\">-8</span>\n</code><code>x-content-type-options: nosniff\n</code><code>\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"error\"</span> : <span class=\"hljs-literal\">false</span>, \n</code><code> <span class=\"hljs-string\">\"code\"</span> : <span class=\"hljs-number\">200</span>, \n</code><code> <span class=\"hljs-string\">\"result\"</span> : [ ] \n</code><code>}\n</code></pre>\n\n\n\n\n",
"parameters": [],
"responses": {
"200": {
"description": "The analyzer definitions was retrieved successfully.\n\n"
}
},
"summary": "List all analyzers",
"tags": [
"Analyzers"
],
"x-examples": [],
"x-filename": "/home/user/git-root/arangodb-devel.debug/Documentation/DocuBlocks/Rest/Analyzers/get_api_analyzers.md",
"x-hints": ""
},
"post": {
"description": "\n**A JSON object with these properties is required:**\n\n - **features** (string): The set of features to set on the analyzer generated fields.\n The default value is an empty array.\n - **type**: The analyzer type.\n - **name**: The analyzer name.\n - **properties**: The properties used to configure the specified type.\n Value may be a string, an object or null.\n The default value is *null*.\n\n\n\n\nCreates a new analyzer based on the provided configuration.\n\n\n\n\n**Example:**\n \n\n<pre><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl -X POST --header <span class=\"hljs-string\">'accept: application/json'</span> --data-binary @- --dump - http://localhost:8529/_api/analyzer</span> &lt;&lt;EOF\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span class=\"hljs-string\">\"_system::testAnalyzer\"</span>, \n</code><code> <span class=\"hljs-string\">\"type\"</span> : <span class=\"hljs-string\">\"identity\"</span> \n</code><code>}\n</code><code>EOF\n</code><code>\n</code><code>HTTP/<span class=\"hljs-number\">1.1</span> Created\n</code><code>content-type: application/json; charset=utf<span class=\"hljs-number\">-8</span>\n</code><code>x-content-type-options: nosniff\n</code><code>\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span class=\"hljs-string\">\"_system::testAnalyzer\"</span>, \n</code><code> <span class=\"hljs-string\">\"type\"</span> : <span class=\"hljs-string\">\"identity\"</span>, \n</code><code> <span class=\"hljs-string\">\"properties\"</span> : <span class=\"hljs-literal\">null</span>, \n</code><code> <span class=\"hljs-string\">\"features\"</span> : [ ] \n</code><code>}\n</code></pre>\n\n\n\n\n",
"parameters": [
{
"in": "body",
"name": "Json Request Body",
"required": true,
"schema": {
"$ref": "#/definitions/post_api_analyzer"
},
"x-description-offset": 54
}
],
"responses": {
"200": {
"description": "An analyzer with a matching name and definition already exists.\n\n"
},
"201": {
"description": "A new analyzer definition was successfully created.\n\n"
},
"400": {
"description": "One or more of the required parameters is missing or one or more of the parameters\nis not valid.\n\n"
},
"403": {
"description": "The user does not have permission to create and analyzer with this configuration.\n\n"
}
},
"summary": "Create an analyzer with the suppiled definition",
"tags": [
"Analyzers"
],
"x-examples": [],
"x-filename": "/home/user/git-root/arangodb-devel.debug/Documentation/DocuBlocks/Rest/Analyzers/post_api_analyzer.md",
"x-hints": ""
}
},
"/_api/analyzer/{analyzer-name}": {
"delete": {
"description": "\n\nRemoves an analyzer configuration identified by *analyzer-name*.\n\nIf the analyzer definition was successfully dropped, an object is returned with\nthe following attributes:\n- *error*: *false*\n- *name*: The name of the removed analyzer\n\n\n\n\n**Example:**\n Removing without *force*:\n\n<pre><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl -X DELETE --header <span class=\"hljs-string\">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer</span>\n</code><code>\n</code><code>HTTP/<span class=\"hljs-number\">1.1</span> OK\n</code><code>content-type: application/json; charset=utf<span class=\"hljs-number\">-8</span>\n</code><code>x-content-type-options: nosniff\n</code><code>\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"error\"</span> : <span class=\"hljs-literal\">false</span>, \n</code><code> <span class=\"hljs-string\">\"code\"</span> : <span class=\"hljs-number\">200</span>, \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span class=\"hljs-string\">\"_system::testAnalyzer\"</span> \n</code><code>}\n</code></pre>\n\n\n\n\n**Example:**\n Removing with *force*:\n\n<pre><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl -X POST --header <span class=\"hljs-string\">'accept: application/json'</span> --data-binary @- --dump - http://localhost:8529/_api/collection</span> &lt;&lt;EOF\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span class=\"hljs-string\">\"testCollection\"</span> \n</code><code>}\n</code><code>EOF\n</code><code>\n</code><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl -X POST --header <span class=\"hljs-string\">'accept: application/json'</span> --data-binary @- --dump - http://localhost:8529/_api/view</span> &lt;&lt;EOF\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span 
class=\"hljs-string\">\"testView\"</span>, \n</code><code> <span class=\"hljs-string\">\"type\"</span> : <span class=\"hljs-string\">\"arangosearch\"</span>, \n</code><code> <span class=\"hljs-string\">\"links\"</span> : { \n</code><code> <span class=\"hljs-string\">\"testCollection\"</span> : { \n</code><code> <span class=\"hljs-string\">\"analyzers\"</span> : [ \n</code><code> <span class=\"hljs-string\">\"_system::testAnalyzer\"</span> \n</code><code> ] \n</code><code> } \n</code><code> } \n</code><code>}\n</code><code>EOF\n</code><code>\n</code><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl -X DELETE --header <span class=\"hljs-string\">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer?force=<span class=\"hljs-literal\">false</span></span>\n</code><code>\n</code><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl -X DELETE --header <span class=\"hljs-string\">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer?force=<span class=\"hljs-literal\">true</span></span>\n</code><code>\n</code><code>HTTP/<span class=\"hljs-number\">1.1</span> OK\n</code><code>content-type: application/json; charset=utf<span class=\"hljs-number\">-8</span>\n</code><code>x-content-type-options: nosniff\n</code><code>\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"error\"</span> : <span class=\"hljs-literal\">false</span>, \n</code><code> <span class=\"hljs-string\">\"code\"</span> : <span class=\"hljs-number\">200</span>, \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span class=\"hljs-string\">\"_system::testAnalyzer\"</span> \n</code><code>}\n</code></pre>\n\n\n\n\n",
"parameters": [
{
"description": "The name of the analyzer to remove.\n\n",
"format": "string",
"in": "path",
"name": "analyzer-name",
"required": true,
"type": "string"
},
{
"description": "The analyzer configuration should be removed even if it is in-use.\nThe default value is *false*.\n\n",
"in": "query",
"name": "force",
"required": false,
"type": "boolean"
}
],
"responses": {
"200": {
"description": "The analyzer configuration was removed successfully.\n\n"
},
"400": {
"description": "The *analyzer-name* was not supplied or another request parameter was not\nvalid.\n\n"
},
"403": {
"description": "The user does not have permission to remove this analyzer configuration.\n\n"
},
"404": {
"description": "Such an analyzer configuration does not exist.\n\n"
},
"409": {
"description": "The specified analyzer configuration is still in use and *force* was omitted or\n*false* specified.\n\n"
}
},
"summary": "Remove an analyzer",
"tags": [
"Analyzers"
],
"x-examples": [],
"x-filename": "/home/user/git-root/arangodb-devel.debug/Documentation/DocuBlocks/Rest/Analyzers/delete_api_analyzer.md",
"x-hints": ""
},
"get": {
"description": "\n\nRetrieves the full definition for the specified analyzer name.\nThe resulting object contains the following attributes:\n- *name*: the analyzer name\n- *type*: the analyzer type\n- *properties*: the properties used to configure the specified type\n- *features*: the set of features to set on the analyzer generated fields\n\n\n\n\n**Example:**\n Retrieve an analyzer definition:\n\n<pre><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl --header <span class=\"hljs-string\">'accept: application/json'</span> --dump - http://localhost:8529/_api/analyzer/_system%3A%3AtestAnalyzer</span>\n</code><code>\n</code><code>HTTP/<span class=\"hljs-number\">1.1</span> OK\n</code><code>content-type: application/json; charset=utf<span class=\"hljs-number\">-8</span>\n</code><code>x-content-type-options: nosniff\n</code><code>\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"error\"</span> : <span class=\"hljs-literal\">false</span>, \n</code><code> <span class=\"hljs-string\">\"code\"</span> : <span class=\"hljs-number\">200</span>, \n</code><code> <span class=\"hljs-string\">\"type\"</span> : <span class=\"hljs-string\">\"identity\"</span>, \n</code><code> <span class=\"hljs-string\">\"properties\"</span> : <span class=\"hljs-string\">\"test properties\"</span>, \n</code><code> <span class=\"hljs-string\">\"features\"</span> : [ ], \n</code><code> <span class=\"hljs-string\">\"name\"</span> : <span class=\"hljs-string\">\"_system::testAnalyzer\"</span> \n</code><code>}\n</code></pre>\n\n\n\n\n\n",
"parameters": [
{
"description": "The name of the analyzer to retrieve.\n\n",
"format": "string",
"in": "path",
"name": "analyzer-name",
"required": true,
"type": "string"
}
],
"responses": {
"200": {
"description": "The analyzer definition was retrieved successfully.\n\n"
},
"404": {
"description": "Such an analyzer configuration does not exist.\n\n"
}
},
"summary": "Return the analyzer definition",
"tags": [
"Analyzers"
],
"x-examples": [],
"x-filename": "/home/user/git-root/arangodb-devel.debug/Documentation/DocuBlocks/Rest/Analyzers/get_api_analyzer.md",
"x-hints": ""
}
},
"/_api/aqlfunction": { "/_api/aqlfunction": {
"get": { "get": {
"description": "\n\nReturns all registered AQL user functions.\n\nThe call will return a JSON array with status codes and all user functions found under *result*.\n\n\n**HTTP 200**\n*A json document with these Properties is returned:*\n\non success *HTTP 200* is returned.\n\n- **code**: the HTTP status code\n- **result**: All functions, or the ones matching the *namespace* parameter \n - **isDeterministic**: an optional boolean value to indicate whether the function\n results are fully deterministic (function return value solely depends on\n the input value and return value is the same for repeated calls with same\n input). The *isDeterministic* attribute is currently not used but may be\n used later for optimizations.\n - **code**: A string representation of the function body\n - **name**: The fully qualified name of the user function\n- **error**: boolean flag to indicate whether an error occurred (*false* in this case)\n\n\n**HTTP 400**\n*A json document with these Properties is returned:*\n\nIf the user function name is malformed, the server will respond with *HTTP 400*.\n\n- **errorMessage**: a descriptive error message\n- **errorNum**: the server error number\n- **code**: the HTTP status code\n- **error**: boolean flag to indicate whether an error occurred (*true* in this case)\n\n\n\n\n**Example:**\n \n\n<pre><code><span class=\"hljs-meta\">shell&gt;</span><span class=\"bash\"> curl --header <span class=\"hljs-string\">'accept: application/json'</span> --dump - http://localhost:8529/_api/aqlfunction/<span class=\"hljs-built_in\">test</span></span>\n</code><code>\n</code><code>HTTP/<span class=\"hljs-number\">1.1</span> OK\n</code><code>content-type: application/json; charset=utf<span class=\"hljs-number\">-8</span>\n</code><code>x-content-type-options: nosniff\n</code><code>\n</code><code>{ \n</code><code> <span class=\"hljs-string\">\"error\"</span> : <span class=\"hljs-literal\">false</span>, \n</code><code> <span class=\"hljs-string\">\"code\"</span> : 
<span class=\"hljs-number\">200</span>, \n</code><code> <span class=\"hljs-string\">\"result\"</span> : [ ] \n</code><code>}\n</code></pre>\n\n\n\n\n",

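The test file that follows exercises how analyzer names are resolved against the current database: an unprefixed name is looked up in the current database, a name starting with `::` refers to `_system`, and a name of the form `db::name` is used as given. A minimal sketch of that rule, assuming only what the `testAnalyzersPrefix` assertions show, might look like the block below. The helper `qualifyAnalyzerName` is hypothetical and is not part of the ArangoDB API.

```javascript
// Hypothetical helper, NOT an ArangoDB function: it mirrors the name
// resolution that the testAnalyzersPrefix assertions expect.
function qualifyAnalyzerName(name, currentDb) {
  const separator = "::";
  const index = name.indexOf(separator);
  if (index < 0) {
    // No prefix: resolve against the current database.
    return currentDb + separator + name;
  }
  if (index === 0) {
    // Empty prefix ("::name"): resolve against the _system database.
    return "_system" + name;
  }
  // Already fully qualified ("db::name"): use as given.
  return name;
}

console.log(qualifyAnalyzerName("testAnalyzer", "TestDB"));         // TestDB::testAnalyzer
console.log(qualifyAnalyzerName("::testAnalyzer", "TestDB"));       // _system::testAnalyzer
console.log(qualifyAnalyzerName("TestDB::testAnalyzer", "_system")); // TestDB::testAnalyzer
```

The same qualified form appears URL-encoded in the REST examples above, where `_system::testAnalyzer` becomes `_system%3A%3AtestAnalyzer` in the request path.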
View File

@ -1,5 +1,5 @@
/*jshint globalstrict:false, strict:false, maxlen: 500 */
/*global assertUndefined, assertEqual, assertTrue, assertFalse, fail, db._query */
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
@ -100,6 +100,106 @@ function iResearchFeatureAqlTestSuite () {
assertEqual(oldList.length, analyzers.toArray().length);
},
testAnalyzersFeatures: function() {
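// fail() throws, so calling it inside a try block with an empty catch would
// swallow the failure; track the rejection explicitly instead
let rejected = false;
try {
analyzers.save("testAnalyzer", "identity", "test properties", [ "unknown" ]);
} catch (e) {
rejected = true;
}
assertTrue(rejected); // unsupported feature
rejected = false;
try {
analyzers.save("testAnalyzer", "identity", "test properties", [ "position" ]);
} catch (e) {
rejected = true;
}
assertTrue(rejected); // feature with unsatisfied dependency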
// feature with dependency satisfied
analyzers.save("testAnalyzer", "identity", "test properties", [ "frequency", "position" ]);
analyzers.remove("testAnalyzer", true);
},
testAnalyzersPrefix: function() {
let dbName = "TestDB";
try { db._dropDatabase(dbName); } catch (e) {}
db._createDatabase(dbName);
db._useDatabase(dbName);
let oldList = analyzers.toArray();
assertTrue(Array === oldList.constructor);
// creation
db._useDatabase("_system");
analyzers.save("testAnalyzer", "identity", "system properties", [ "frequency" ]);
db._useDatabase(dbName);
analyzers.save("testAnalyzer", "identity", "user properties", [ "norm" ]);
// retrieval (system)
db._useDatabase("_system");
{
let analyzer = analyzers.analyzer("testAnalyzer");
assertTrue(null !== analyzer);
assertEqual(db._name() + "::testAnalyzer", analyzer.name());
assertEqual("identity", analyzer.type());
assertEqual("system properties", analyzer.properties());
assertTrue(Array === analyzer.features().constructor);
assertEqual(1, analyzer.features().length);
assertEqual([ "frequency" ], analyzer.features());
}
{
let analyzer = analyzers.analyzer(dbName + "::testAnalyzer");
assertTrue(null !== analyzer);
assertEqual(dbName + "::testAnalyzer", analyzer.name());
assertEqual("identity", analyzer.type());
assertEqual("user properties", analyzer.properties());
assertTrue(Array === analyzer.features().constructor);
assertEqual(1, analyzer.features().length);
assertEqual([ "norm" ], analyzer.features());
}
// retrieval (dbName)
db._useDatabase(dbName);
{
let analyzer = analyzers.analyzer("testAnalyzer");
assertTrue(null !== analyzer);
assertEqual(db._name() + "::testAnalyzer", analyzer.name());
assertEqual("identity", analyzer.type());
assertEqual("user properties", analyzer.properties());
assertTrue(Array === analyzer.features().constructor);
assertEqual(1, analyzer.features().length);
assertEqual([ "norm" ], analyzer.features());
}
{
let analyzer = analyzers.analyzer("::testAnalyzer");
assertTrue(null !== analyzer);
assertEqual("_system::testAnalyzer", analyzer.name());
assertEqual("identity", analyzer.type());
assertEqual("system properties", analyzer.properties());
assertTrue(Array === analyzer.features().constructor);
assertEqual(1, analyzer.features().length);
assertEqual([ "frequency" ], analyzer.features());
}
// listing
let list = analyzers.toArray();
assertTrue(Array === list.constructor);
assertEqual(oldList.length + 2, list.length);
// removal
analyzers.remove("testAnalyzer", true);
assertTrue(null === analyzers.analyzer("testAnalyzer"));
analyzers.remove("::testAnalyzer", true);
assertTrue(null === analyzers.analyzer("::testAnalyzer"));
assertEqual(oldList.length, analyzers.toArray().length);
db._useDatabase("_system");
db._dropDatabase(dbName);
},
////////////////////////////////////////////////////////////////////////////////
/// @brief IResearchFeature tests
////////////////////////////////////////////////////////////////////////////////