Pregel Subsystem
The Pregel subsystem implements a variety of graph algorithms. This README is intended mainly for internal use.
Protocol
Message format between DBServers:
{sender:"someid", executionNumber:1337, globalSuperstep:123, messages: [<vertexID1>, <slice1>, <vertexID2>, <slice2>] }
Any type of slice is supported.
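As a rough illustration of the envelope above, the sketch below groups a flat `messages` array into (vertex ID, payload) pairs. It assumes the array alternates a vertex ID with the slice addressed to that vertex; that layout is inferred from the format line and is not confirmed by the source. `pairMessages` and the sample payloads are hypothetical names and values, not part of the Pregel code.

```javascript
// Hypothetical helper, not part of the Pregel subsystem.
// Assumption: messages is a flat array [id, payload, id, payload, ...].
function pairMessages(envelope) {
  const flat = envelope.messages;
  if (flat.length % 2 !== 0) {
    throw new Error("messages must contain an even number of entries");
  }
  const pairs = [];
  for (let i = 0; i < flat.length; i += 2) {
    // Each payload may be any JSON-like value, mirroring "any type of slice".
    pairs.push({ vertexId: flat[i], payload: flat[i + 1] });
  }
  return pairs;
}

// Sample envelope with made-up payloads of two different types.
const envelope = {
  sender: "someid",
  executionNumber: 1337,
  globalSuperstep: 123,
  messages: ["vertexID1", 0.25, "vertexID2", { rank: 0.5 }]
};

const pairs = pairMessages(envelope);
console.log(JSON.stringify(pairs));
```

A flat array avoids one nested object per message, which is presumably why the envelope is laid out this way; the pairing step only makes the structure easier to inspect.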
Useful Commands
Import a graph, e.g. https://github.com/arangodb/example-datasets/tree/master/Graphs/1000. First rename the columns to '_key', '_from' and '_to'; arangoimport will keep those.
In arangosh:
db._create('vertices', {numberOfShards: 2});
db._createEdgeCollection('alt_edges');
db._createEdgeCollection('edges', {numberOfShards: 2, shardKeys:["_vertex"], distributeShardsLike:'vertices'});
arangoimport --file generated_vertices.csv --type csv --collection vertices --overwrite true --server.endpoint http+tcp://127.0.0.1:8530
Or: for(var i=0; i < 5000; i++) db.vertices.save({_key:i+""});
arangoimport --file generated_edges.csv --type csv --collection alt_edges --overwrite true --from-collection-prefix "vertices" --to-collection-prefix "vertices" --convert false --server.endpoint http+tcp://127.0.0.1:8530
AQL script to copy edge collection into one with '_vertex':
FOR doc IN alt_edges
  INSERT {_vertex: SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1), _from: doc._from, _to: doc._to} IN edges

A separate query that sums the result attribute over all vertices:
LET values = ( FOR s IN vertices RETURN s.result )
RETURN SUM(values)
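The `_vertex` expression in the copy query above strips the collection prefix from `_from`, leaving only the key of the source vertex. A minimal JavaScript sketch of the same string operation follows; `vertexKeyOf` is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper mirroring
//   SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1)
// Both FIND_FIRST and indexOf are zero-based and return -1 when "/" is absent,
// in which case the whole string is returned unchanged.
function vertexKeyOf(from) {
  return from.substring(from.indexOf("/") + 1);
}

const key = vertexKeyOf("vertices/42");
console.log(key); // prints "42"
```

Since `edges` is created with `shardKeys:["_vertex"]` and `distributeShardsLike:'vertices'`, storing the source key in `_vertex` is presumably meant to place each edge on the same shard as its source vertex.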
AWK Scripts
Make a CSV file of unique IDs:
cat edges.csv | tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++' > vertices.csv
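For readers less familiar with the `awk '!seen[$0]++'` idiom, the sketch below shows the same idea in JavaScript: split the input on whitespace, drop empty tokens, and keep only the first occurrence of each ID in input order. `uniqueIds` and the sample data are hypothetical and only illustrate the pipeline; they are not a replacement for it.

```javascript
// Hypothetical equivalent of: tr ... | grep -v "^\s*$" | awk '!seen[$0]++'
function uniqueIds(text) {
  const seen = new Set();
  const ids = [];
  for (const token of text.split(/\s+/)) {
    // Skip blank tokens (the grep step) and repeats (the awk step).
    if (token === "" || seen.has(token)) continue;
    seen.add(token);
    ids.push(token);
  }
  return ids;
}

// Made-up edge list: one "source target" pair per line.
const sampleEdges = "1 2\n2 3\n1 3\n";
const ids = uniqueIds(sampleEdges);
console.log(ids.join("\n"));
```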
Make a CSV file with ArangoDB-compatible edges:
cat edges.csv | awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}' >> arango-edges.csv
arangoimport --file vertices.csv --type csv --collection twitter_v --overwrite true --convert false --server.endpoint http+tcp://127.0.0.1:8530 -c none
arangoimport --file arango-edges.csv --type csv --collection twitter_e --overwrite true --convert false --separator "\t" --server.endpoint http+tcp://127.0.0.1:8530 -c none
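The awk edge conversion above turns each whitespace-separated "source target" line into three tab-separated columns: the prefixed source, the prefixed target, and the bare source ID. A small JavaScript sketch of that per-line transformation follows; `toArangoEdgeRow` is a hypothetical name, and the `profiles/` prefix is copied from the awk command as-is.

```javascript
// Hypothetical per-line equivalent of:
//   awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}'
function toArangoEdgeRow(line) {
  // Split on runs of whitespace, as awk does with a single-space separator.
  const [from, to] = line.trim().split(/\s+/);
  return `profiles/${from}\tprofiles/${to}\t${from}`;
}

const row = toArangoEdgeRow("12 34");
console.log(row);
```

The third column repeats the source ID, which matches the `_vertex` shard key idea used for the `edges` collection earlier in this README.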