arangodb/arangod/Pregel
Simon Grätzer 27098e9e4f Revert "Changing PageRank"
This reverts commit 93a03c923a.
2017-03-01 23:20:38 +01:00


Pregel Subsystem

Protocol

Message format between DBServers:

{sender:"someid", executionNumber:1337, globalSuperstep:123, messages: [<vertexID1>, <slice1>, <vertexID2>, <slice2>] }

Any type of slice is supported.
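The message layout above can be sketched in JavaScript as follows. This is only an illustration, assuming the `messages` array is a flat list that alternates vertex IDs and payloads; the helper names `packMessages` and `unpackMessages` are hypothetical and are not part of the Pregel code.

```javascript
// Hypothetical helpers illustrating the assumed wire layout:
// messages = [vertexID1, payload1, vertexID2, payload2, ...]

// Build a message body from an array of [vertexId, payload] pairs.
function packMessages(sender, executionNumber, globalSuperstep, pairs) {
  const messages = [];
  for (const [vertexId, payload] of pairs) {
    messages.push(vertexId, payload);
  }
  return { sender, executionNumber, globalSuperstep, messages };
}

// Recover the [vertexId, payload] pairs from a message body.
function unpackMessages(body) {
  if (body.messages.length % 2 !== 0) {
    throw new Error("messages must hold (vertexID, payload) pairs");
  }
  const pairs = [];
  for (let i = 0; i < body.messages.length; i += 2) {
    pairs.push([body.messages[i], body.messages[i + 1]]);
  }
  return pairs;
}

// Payloads may be any JSON-compatible value (number, object, array, ...).
const exampleBody = packMessages("someid", 1337, 123, [
  ["v1", 0.25],
  ["v2", { rank: 0.5 }],
]);
console.log(JSON.stringify(exampleBody));
```

A flat array avoids one wrapper object per message, at the cost of requiring an even number of entries, which `unpackMessages` checks.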

Useful Commands

Import a graph, e.g. https://github.com/arangodb/example-datasets/tree/master/Graphs/1000. First rename the columns to '_key', '_from', and '_to'; arangoimp will keep those.

In arangosh:

db._create('vertices', {numberOfShards: 2});
db._createEdgeCollection('alt_edges');
db._createEdgeCollection('edges', {numberOfShards: 2, shardKeys:["_vertex"], distributeShardsLike:'vertices'});

arangoimp --file generated_vertices.csv --type csv --collection vertices --overwrite true --server.endpoint http+tcp://127.0.0.1:8530

Or, to generate the vertices directly in arangosh:

for (var i = 0; i < 5000; i++) db.vertices.save({_key: i + ""});

arangoimp --file generated_edges.csv --type csv --collection alt_edges --overwrite true --from-collection-prefix "vertices" --to-collection-prefix "vertices" --convert false --server.endpoint http+tcp://127.0.0.1:8530

AQL script to copy edge collection into one with '_vertex':

FOR doc IN alt_edges
  INSERT {_vertex: SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1), _from: doc._from, _to: doc._to} IN edges

AQL script to sum the 'result' attribute over all vertices:

LET values = ( FOR s IN vertices RETURN s.result )
RETURN SUM(values)
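To make the '_vertex' expression in the copy query easier to follow, here is a small JavaScript sketch of the same string manipulation. The function names `vertexKeyFromId` and `toShardedEdge` are hypothetical; the sketch only mirrors what `SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1)` computes and does not talk to a database.

```javascript
// Hypothetical helper: strip the collection prefix from a document ID,
// e.g. "vertices/42" becomes "42". Like the AQL expression, an ID without
// "/" is returned unchanged, because the search yields -1 and -1 + 1 = 0.
function vertexKeyFromId(id) {
  return id.substring(id.indexOf("/") + 1);
}

// Build the edge document that the copy query inserts into 'edges'.
function toShardedEdge(doc) {
  return {
    _vertex: vertexKeyFromId(doc._from),
    _from: doc._from,
    _to: doc._to,
  };
}

console.log(toShardedEdge({ _from: "vertices/42", _to: "vertices/7" }));
```

The resulting '_vertex' value is the key of the source vertex, which is the attribute the 'edges' collection above uses as its shard key.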

AWK Scripts

Make CSV file with unique IDs

cat edges.csv | tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++' > vertices.csv

Make CSV file with arango compatible edges

cat edges.csv | awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}' >> arango-edges.csv

arangoimp --file vertices.csv --type csv --collection twitter_v --overwrite true --convert false --server.endpoint http+tcp://127.0.0.1:8530 -c none

arangoimp --file arango-edges.csv --type csv --collection twitter_e --overwrite true --convert false --separator "\t" --server.endpoint http+tcp://127.0.0.1:8530 -c none
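For readers who want to check what the two shell pipelines in this section produce, the following JavaScript sketch applies the same transformations to an in-memory sample. It is an illustration under assumptions, not a replacement for the commands: the function names and the sample data are hypothetical, and the input is assumed to be a whitespace-separated edge list with one edge per line.

```javascript
// Hypothetical equivalent of: tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++'
// Splits on any whitespace, drops empty tokens, keeps first occurrences in order.
function uniqueIds(edgeText) {
  const tokens = edgeText.split(/\s+/).filter((t) => t.length > 0);
  return [...new Set(tokens)]; // a Set preserves insertion order
}

// Hypothetical equivalent of:
// awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}'
// Blank lines are skipped here; the awk one-liner would emit a row for them.
function toArangoEdgeRows(edgeText, collection = "profiles") {
  return edgeText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [from, to] = line.split(/\s+/);
      return `${collection}/${from}\t${collection}/${to}\t${from}`;
    });
}

const sampleEdges = "1 2\n2 3\n1 3\n";
console.log(uniqueIds(sampleEdges));
console.log(toArangoEdgeRows(sampleEdges));
```

Each generated row has three tab-separated columns: the source document ID, the target document ID, and the bare source key.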