arangodb/arangod/Pregel
Wilfried Goesgens 315d23d03d Feature 3.3/backport openssl windows (#5892) 2018-07-20 17:23:27 +02:00
Algos
examples
Aggregator.h
AggregatorHandler.cpp
AggregatorHandler.h
AlgoRegistry.cpp
AlgoRegistry.h
Algorithm.h
CommonFormats.h
Conductor.cpp
Conductor.h
Graph.h
GraphFormat.h
GraphSerializer.h
GraphStore.cpp
GraphStore.h
IncomingCache.cpp
IncomingCache.h
Iterators.h
MasterContext.h
MessageCombiner.h
MessageFormat.h
OutgoingCache.cpp
OutgoingCache.h
PregelFeature.cpp
PregelFeature.h
README.md
Recovery.cpp
Recovery.h
Statistics.h
TypedBuffer.h
Utils.cpp
Utils.h
VertexComputation.h
Worker.cpp
Worker.h
WorkerConfig.cpp
WorkerConfig.h
WorkerContext.h

Pregel Subsystem

The Pregel subsystem implements a variety of graph algorithms. This README is intended mainly for internal use.

Protocol

Message format between DBServers:

{sender:"someid", executionNumber:1337, globalSuperstep:123, messages: [, , vertexID2, ] }

Any type of slice is supported.
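As a rough illustration of the envelope above, the following sketch checks only the four top-level fields shown in the example. The function name `looksLikePregelMessage` is hypothetical and not part of ArangoDB, and the layout inside `messages` is deliberately not validated because it is not fully spelled out here.

```javascript
// Hypothetical helper, not an ArangoDB API: verifies only the envelope
// fields that appear in the example message (sender, executionNumber,
// globalSuperstep, messages). The contents of `messages` are not inspected.
function looksLikePregelMessage(msg) {
  return (
    msg !== null &&
    typeof msg === "object" &&
    typeof msg.sender === "string" &&
    Number.isInteger(msg.executionNumber) &&
    Number.isInteger(msg.globalSuperstep) &&
    Array.isArray(msg.messages)
  );
}

// Example envelope using the values from the README; `messages` is left
// empty on purpose, since its inner layout is an open question here.
const exampleMessage = {
  sender: "someid",
  executionNumber: 1337,
  globalSuperstep: 123,
  messages: []
};
```

Calling `looksLikePregelMessage(exampleMessage)` returns `true`, while an object that lacks any of the four fields is rejected.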

Useful Commands

Import a graph, e.g. https://github.com/arangodb/example-datasets/tree/master/Graphs/1000. First rename the columns to '_key', '_from' and '_to'; arangoimp will keep those.

In arangosh:

db._create('vertices', {numberOfShards: 2});
db._createEdgeCollection('alt_edges');
db._createEdgeCollection('edges', {numberOfShards: 2, shardKeys:["_vertex"], distributeShardsLike:'vertices'});

arangoimp --file generated_vertices.csv --type csv --collection vertices --overwrite true --server.endpoint http+tcp://127.0.0.1:8530

Or, to generate the vertices in arangosh instead of importing them:

for (var i = 0; i < 5000; i++) db.vertices.save({_key: i + ""});

arangoimp --file generated_edges.csv --type csv --collection alt_edges --overwrite true --from-collection-prefix "vertices" --to-collection-prefix "vertices" --convert false --server.endpoint http+tcp://127.0.0.1:8530

AQL script to copy the edge collection into one with a '_vertex' attribute:

FOR doc IN alt_edges
  INSERT {
    _vertex: SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1),
    _from: doc._from,
    _to: doc._to
  } IN edges

LET values = (
  FOR s IN vertices
    RETURN s.result
)
RETURN SUM(values)
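To make the `_vertex` expression easier to follow, here is a minimal JavaScript sketch of the same string manipulation. The function names `vertexKeyFromId` and `toShardedEdge` are hypothetical and used only for illustration; the sketch does not touch the database.

```javascript
// Mirrors SUBSTRING(id, FIND_FIRST(id, "/") + 1): returns everything after
// the first "/" of a document ID such as "vertices/42". If there is no "/",
// indexOf yields -1 and the whole string is returned.
function vertexKeyFromId(id) {
  const slash = id.indexOf("/");
  return id.substring(slash + 1);
}

// Builds the document that the INSERT statement would write for one edge
// of alt_edges (hypothetical helper, plain object only).
function toShardedEdge(doc) {
  return {
    _vertex: vertexKeyFromId(doc._from),
    _from: doc._from,
    _to: doc._to
  };
}
```

For an edge `{_from: "vertices/42", _to: "vertices/7"}` the resulting `_vertex` is `"42"`, i.e. the key of the source vertex.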

AWK Scripts

Make a CSV file with unique IDs:

cat edges.csv | tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++' > vertices.csv
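The pipeline above splits the edge list on whitespace, drops empty entries and keeps the first occurrence of every ID. A small JavaScript sketch of the same logic, assuming the file content is already available as a string (the function name `uniqueIds` is hypothetical):

```javascript
// Splits the input on any whitespace, skips empty tokens and returns each
// ID once, in order of first appearance (like awk '!seen[$0]++').
function uniqueIds(text) {
  const seen = new Set();
  const ids = [];
  for (const token of text.split(/\s+/)) {
    if (token === "" || seen.has(token)) continue;
    seen.add(token);
    ids.push(token);
  }
  return ids;
}
```

Joining the result with newlines gives the content that the shell pipeline writes to vertices.csv.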

Make a CSV file with ArangoDB-compatible edges:

cat edges.csv | awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}' >> arango-edges.csv
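For reference, the transformation that the awk command applies to each line can be sketched in JavaScript as follows. The function name `toArangoEdgeLine` and its `collection` parameter are hypothetical; the default `"profiles"` is taken from the command above.

```javascript
// Converts one whitespace-separated edge line ("<from> <to>") into a
// tab-separated line: "<collection>/<from>\t<collection>/<to>\t<from>".
// The source ID is repeated as the third column, as in the awk script.
function toArangoEdgeLine(line, collection = "profiles") {
  const [from, to] = line.trim().split(/\s+/);
  return `${collection}/${from}\t${collection}/${to}\t${from}`;
}
```

For the input line `1 2` this produces `profiles/1`, `profiles/2` and `1`, separated by tabs.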

arangoimp --file vertices.csv --type csv --collection twitter_v --overwrite true --convert false --server.endpoint http+tcp://127.0.0.1:8530 -c none

arangoimp --file arango-edges.csv --type csv --collection twitter_e --overwrite true --convert false --separator "\t" --server.endpoint http+tcp://127.0.0.1:8530 -c none