mirror of https://gitee.com/bigwinds/arangodb
100 lines
3.5 KiB
Plaintext
100 lines
3.5 KiB
Plaintext
Distributed Iterative Graph Processing (Pregel)
|
||
======
|
||
|
||
Distributed graph processing enables you to do online analytical processing
|
||
directly on graphs stored into arangodb. This is intended to help you gain analytical insights
|
||
on your data, without having to use external processing sytems. Examples of algorithms
|
||
to execute are PageRank, Vertex Centrality, Vertex Closeness, Connected Components.
|
||
|
||
The processing system inside ArangoDB is based on: [Pregel: A System for Large-Scale Graph Processing](http://www.dcs.bbk.ac.uk/~dell/teaching/cc/paper/sigmod10/p135-malewicz.pdf) – Malewicz et al. (Google) 2010
|
||
This concept enables us to perform distributed graph processing, without the need for distributed global locking.
|
||
|
||
|
||
### Requirements
|
||
|
||
to enable iterative graph processing for your data, you will need to ensure
|
||
that your vertex and edge collections are sharded in a specific way.
|
||
|
||
The pregel computing model requires all edges to be present on the DB Server where
|
||
the vertex document identified by the '_from' value is located.
|
||
This means the vertex collections need to be sharded by '_key' and the edge collection
|
||
will need to be sharded after an attribute which always contains the '_key' of the vertex.
|
||
Our implementation currently requires every edge collection to be sharded after a "vertex" attributes,
|
||
additionally you will need to specify the key ```distributeShardsLike``` to ensure that all shards
|
||
use the same salt when consistent hashing the sharding attributes.
|
||
|
||
|
||
* Create main vertex collection: ```db._create("vertices", {shardKeys:['_key']});```
|
||
* Optionally create additional vertex collections ```db._create("additonal", {distributeShardsLike:"vertices"});```
|
||
* Create (one or more) edge-collections: ```db._createEdgeCollection("edges", {shardKeys:['vertex'], distributeShardsLike:"vertices"});```
|
||
|
||
|
||
You will need to ensure that edge documents contain the proper values in their sharding attribute.
|
||
For a vertex document in collection "vertices" ```{_key:"A", value:0}```
|
||
the edge documents will have look like this:
|
||
|
||
|
||
{_from:"vertices/A", _to: "vertices/B", vertex:"A"}
|
||
{_from:"vertices/A", _to: "vertices/C", vertex:"A"}
|
||
{_from:"vertices/A", _to: "vertices/D", vertex:"A"}
|
||
...
|
||
|
||
|
||
This will ensure that during an insert of an edge document, the document will be placed in a shard
|
||
on the proper DBServer. There is no restriction on additional attributes in your documents.
|
||
|
||
### Arangosh API
|
||
|
||
#### Starting an Algorithm Execution
|
||
|
||
The pregel API is accessible through the ```@arangodb/pregel``` package.
|
||
|
||
var pregel = require("@arangodb/pregel");
|
||
var params = {};
|
||
var execution = pregel.start("<algorithm>", "<graph>", params);
|
||
|
||
The code returned uniquely identifies the algorithm execution.
|
||
|
||
|
||
#### Status of an Algorithm Execution
|
||
|
||
The code returned by the ```pregel.start(...)``` method can be used to
|
||
track the status of your algorithm.
|
||
|
||
|
||
var execution = pregel.start("sssp", "demograph", {source: "vertices/V"});
|
||
var status = pregel.status(execution);
|
||
|
||
|
||
The result will give your some information on the state of the algorithm. The status might
|
||
look something like this:
|
||
|
||
{
|
||
"state" : "running",
|
||
"gss" : 12,
|
||
"totalRuntime" : 123.23,
|
||
"aggregators" : {
|
||
"converged" : false,
|
||
"max" : true,
|
||
"phase" : 2
|
||
},
|
||
"sendCount" : 3240364978,
|
||
"receivedCount" : 3240364975
|
||
}
|
||
|
||
|
||
|
||
### Available Algorithms
|
||
|
||
#### Page Rank
|
||
|
||
var pregel = require("@arangodb/pregel");
|
||
pregel.start("pagerank", "graphname", )
|
||
|
||
|
||
#### Single-Source Shortest Path
|
||
|
||
var pregel = require("@arangodb/pregel");
|
||
pregel.start("sssp", "graphname", {source:"vertices/1337"})
|
||
|