arangodb/arangod/Pregel
Jan 70c31da560 port change from devel by @danielhlarkin (#9773) 2019-08-21 13:07:43 +03:00
Algos port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
examples Various fixes 2017-03-06 15:41:27 +01:00
Aggregator.h Bug fix/pregel stuff (#8733) 2019-04-11 15:58:28 +02:00
AggregatorHandler.cpp Bug fix/pregel stuff (#8733) 2019-04-11 15:58:28 +02:00
AggregatorHandler.h big reformat 2018-12-26 00:54:03 +01:00
AlgoRegistry.cpp add shutdown protection for PregelFeature (#8628) 2019-03-29 19:36:10 +01:00
AlgoRegistry.h add shutdown protection for PregelFeature (#8628) 2019-03-29 19:36:10 +01:00
Algorithm.h big reformat 2018-12-26 00:54:03 +01:00
CommonFormats.h remove some containers from common.h (#9223) 2019-06-07 13:27:24 +02:00
Conductor.cpp port change from devel by @danielhlarkin (#9773) 2019-08-21 13:07:43 +03:00
Conductor.h Bug fix/pregel micro improvements (#9179) 2019-06-04 09:31:48 +02:00
Graph.h port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
GraphFormat.h port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
GraphSerializer.h big reformat 2018-12-26 00:54:03 +01:00
GraphStore.cpp [3.5] Check scheduler queue return value (#9759) 2019-08-20 12:57:51 +03:00
GraphStore.h fix windows warnings / compile errors (#9215) 2019-06-06 18:08:44 +02:00
IncomingCache.cpp port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
IncomingCache.h remove some containers from common.h (#9223) 2019-06-07 13:27:24 +02:00
IndexHelpers.cpp port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
IndexHelpers.h port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
Iterators.h add shardKeyAttribute to pregel start parameters (#9149) 2019-06-05 14:25:47 +02:00
MasterContext.h Bug fix/pregel stuff (#8733) 2019-04-11 15:58:28 +02:00
MessageCombiner.h big reformat 2018-12-26 00:54:03 +01:00
MessageFormat.h big reformat 2018-12-26 00:54:03 +01:00
OutgoingCache.cpp port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
OutgoingCache.h port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
PregelFeature.cpp [3.5] Check scheduler queue return value (#9759) 2019-08-20 12:57:51 +03:00
PregelFeature.h forwardport changes from 3.4 (#8894) 2019-05-08 14:34:25 +02:00
README.md forwardport changes from 3.4 (#8894) 2019-05-08 14:34:25 +02:00
Recovery.cpp [3.5] Check scheduler queue return value (#9759) 2019-08-20 12:57:51 +03:00
Recovery.h remove some containers from common.h (#9223) 2019-06-07 13:27:24 +02:00
Statistics.h apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
TypedBuffer.h fix windows warnings / compile errors (#9215) 2019-06-06 18:08:44 +02:00
Utils.cpp Forward port some changes (#8949) 2019-05-09 19:42:06 +02:00
Utils.h forwardport changes from 3.4 (#8894) 2019-05-08 14:34:25 +02:00
VertexComputation.h port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
Worker-templates-algorithms.cpp fixed typos, removed unneeded includes (#8547) 2019-03-25 12:09:37 +01:00
Worker-templates-native-types.cpp fix windows build (#3855) 2017-12-06 16:35:45 +01:00
Worker.cpp [3.5] Check scheduler queue return value (#9759) 2019-08-20 12:57:51 +03:00
Worker.h port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
WorkerConfig.cpp Forward port some changes (#8949) 2019-05-09 19:42:06 +02:00
WorkerConfig.h remove some containers from common.h (#9223) 2019-06-07 13:27:24 +02:00
WorkerContext.h big reformat 2018-12-26 00:54:03 +01:00

README.md


Pregel Subsystem

The Pregel subsystem implements a variety of graph algorithms. This README is intended mainly for internal use.

Protocol

Message format between DBServers:

{sender:"someid", executionNumber:1337, globalSuperstep:123, messages: [<vertexID1>, <slice1>, <vertexID2>, <slice2>]}

Any type of slice is supported.
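A minimal sketch of how a receiver could unpack such a packet, assuming the `messages` array is a flat list in which vertex IDs and payload slices alternate. The helper name `decodeMessages` and the returned object shape are hypothetical and are not taken from the Pregel sources.

```javascript
// Hypothetical helper: split the flat `messages` array into
// { vertexId, payload } pairs. Assumes IDs and payloads alternate.
function decodeMessages(packet) {
  const { messages } = packet;
  if (!Array.isArray(messages) || messages.length % 2 !== 0) {
    throw new Error("messages must be an array of (vertexID, slice) pairs");
  }
  const pairs = [];
  for (let i = 0; i < messages.length; i += 2) {
    pairs.push({ vertexId: messages[i], payload: messages[i + 1] });
  }
  return pairs;
}

// Example packet with made-up vertex IDs and payloads.
const packet = {
  sender: "someid",
  executionNumber: 1337,
  globalSuperstep: 123,
  messages: ["v1", 0.25, "v2", { rank: 0.5 }],
};
console.log(decodeMessages(packet));
```

Because any slice type is allowed, the sketch leaves each payload untouched and only pairs it with its target vertex.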

Useful Commands

Import a graph, e.g. https://github.com/arangodb/example-datasets/tree/master/Graphs/1000

First rename the columns to '_key', '_from' and '_to'; arangoimport will keep those.

In arangosh:

db._create('vertices', {numberOfShards: 2});
db._createEdgeCollection('alt_edges');
db._createEdgeCollection('edges', {numberOfShards: 2, shardKeys:["_vertex"], distributeShardsLike:'vertices'});

arangoimport --file generated_vertices.csv --type csv --collection vertices --overwrite true --server.endpoint http+tcp://127.0.0.1:8530

Or generate the vertices in arangosh:

for(var i=0; i < 5000; i++) db.vertices.save({_key:i+""});

arangoimport --file generated_edges.csv --type csv --collection alt_edges --overwrite true --from-collection-prefix "vertices" --to-collection-prefix "vertices" --convert false --server.endpoint http+tcp://127.0.0.1:8530

AQL script to copy edge collection into one with '_vertex':

FOR doc IN alt_edges
  INSERT {_vertex: SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1), _from: doc._from, _to: doc._to} IN edges

AQL query to sum the 'result' attribute over all vertices:

LET values = (FOR s IN vertices RETURN s.result)
RETURN SUM(values)
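To make the `_vertex` expression easier to follow, here is a small plain-JavaScript sketch of what `SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1)` computes: everything after the first `/` in the `_from` value, i.e. the key of the source vertex. The helper names `vertexKeyFromId` and `toShardedEdge` are hypothetical and exist only for illustration.

```javascript
// Mirrors SUBSTRING(id, FIND_FIRST(id, "/") + 1): keep what follows the
// first "/". If there is no "/", indexOf returns -1 and the whole string
// is kept, just as FIND_FIRST returning -1 would do in the AQL version.
function vertexKeyFromId(id) {
  return id.slice(id.indexOf("/") + 1);
}

// Hypothetical helper: the document shape written by the INSERT above.
function toShardedEdge(doc) {
  return {
    _vertex: vertexKeyFromId(doc._from),
    _from: doc._from,
    _to: doc._to,
  };
}

console.log(toShardedEdge({ _from: "vertices/17", _to: "vertices/42" }));
```

Since `edges` is created with `shardKeys:["_vertex"]` and `distributeShardsLike:'vertices'`, the extracted key is the value the edge is sharded by.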

AWK Scripts

Make CSV file with unique IDs

cat edges.csv | tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++' > vertices.csv
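The pipeline above puts every whitespace-separated token on its own line, drops blank lines, and keeps only the first occurrence of each token. A rough JavaScript equivalent, operating on an in-memory string rather than a file, might look like this; the function name `uniqueIds` and the sample input are assumptions made for illustration.

```javascript
// Hypothetical helper: return each whitespace-separated token once,
// in order of first appearance (same effect as awk '!seen[$0]++').
function uniqueIds(text) {
  const seen = new Set();
  const ids = [];
  for (const token of text.split(/\s+/)) {
    // Empty tokens correspond to the blank lines removed by grep -v.
    if (token === "" || seen.has(token)) continue;
    seen.add(token);
    ids.push(token);
  }
  return ids;
}

// Made-up edge list: one "source target" pair per line.
console.log(uniqueIds("1 2\n2 3\n1 4\n").join("\n"));
```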

Make CSV file with arango compatible edges

cat edges.csv | awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}' >> arango-edges.csv
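As a reading aid, the awk program can be sketched in JavaScript as follows: each input line holds a source and a target ID, and the output row contains the two IDs prefixed with the collection name plus the bare source ID, separated by tabs. The function name `toArangoEdgeRow`, its `collection` parameter and the sample input are hypothetical.

```javascript
// Hypothetical helper mirroring the awk program for one input line.
// awk -F" " splits on runs of whitespace, so the sketch does the same.
function toArangoEdgeRow(line, collection = "profiles") {
  const [from = "", to = ""] = line.trim().split(/\s+/);
  // Columns: source document ID, target document ID, bare source key.
  return `${collection}/${from}\t${collection}/${to}\t${from}`;
}

// Made-up input: one "source target" pair per line.
const input = "12 34\n12 56\n";
const rows = input
  .split("\n")
  .filter((line) => line.trim() !== "") // assumption: skip blank lines
  .map((line) => toArangoEdgeRow(line));
console.log(rows.join("\n"));
```

Note that the shell command appends with `>>`, so running it twice duplicates the rows in arango-edges.csv.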

arangoimport --file vertices.csv --type csv --collection twitter_v --overwrite true --convert false --server.endpoint http+tcp://127.0.0.1:8530 -c none

arangoimport --file arango-edges.csv --type csv --collection twitter_e --overwrite true --convert false --separator "\t" --server.endpoint http+tcp://127.0.0.1:8530 -c none