arangodb/arangod/Pregel
Wilfried Goesgens 315d23d03d Feature 3.3/backport openssl windows (#5892) 2018-07-20 17:23:27 +02:00
Algos Active Failover for Foxx Services (3.3) (#4593) 2018-02-15 09:36:25 +01:00
examples Various fixes 2017-03-06 15:41:27 +01:00
Aggregator.h Added single server support 2017-03-08 18:20:36 +01:00
AggregatorHandler.cpp Pregel: Fix concurrent creation of aggregator 2017-03-24 15:14:12 +01:00
AggregatorHandler.h Fixing jslint errors 2017-04-18 12:15:26 +02:00
AlgoRegistry.cpp Little changes for SLPA support 2017-05-15 10:42:25 +02:00
AlgoRegistry.h
Algorithm.h Fixed SLPA 2017-05-17 11:50:24 +02:00
CommonFormats.h Various changes 2017-05-16 10:58:15 +02:00
Conductor.cpp backport Scheduler from devel (#5533) 2018-06-28 13:26:04 +02:00
Conductor.h backport Scheduler from devel (#5533) 2018-06-28 13:26:04 +02:00
Graph.h Pregel improvements 2017-05-22 11:10:28 +02:00
GraphFormat.h Fixed SLPA 2017-05-17 11:50:24 +02:00
GraphSerializer.h Added single server support 2017-03-08 18:20:36 +01:00
GraphStore.cpp Feature 3.3/backport openssl windows (#5892) 2018-07-20 17:23:27 +02:00
GraphStore.h Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
IncomingCache.cpp fixed typos 2017-04-18 14:54:48 +02:00
IncomingCache.h Pregel: code reformatting 2017-03-17 16:21:25 +01:00
Iterators.h Pregel: code reformatting 2017-03-17 16:21:25 +01:00
MasterContext.h DMID improvements 2017-05-18 23:27:54 +02:00
MessageCombiner.h
MessageFormat.h
OutgoingCache.cpp Bug fix/adjust agency comm timeouts (#2765) 2017-07-13 00:44:28 +02:00
OutgoingCache.h Pregel: Coverty Scan fixes 2017-03-24 11:04:15 +01:00
PregelFeature.cpp backport Scheduler from devel (#5533) 2018-06-28 13:26:04 +02:00
PregelFeature.h Geo index update, renaming 2017-05-11 13:19:51 +02:00
README.md Feature/auth context (#2704) 2017-07-02 23:15:57 +02:00
Recovery.cpp Pregel: Changed scheduler post call 2017-03-20 12:22:16 +01:00
Recovery.h Added single server support 2017-03-08 18:20:36 +01:00
Statistics.h added pregel vertex / edge count checks 2017-06-07 17:18:59 +02:00
TypedBuffer.h fix potential duplicate closing of typed buffer (#3587) 2017-11-07 10:27:51 +01:00
Utils.cpp try to not fail hard when a collection is dropped while the WAL is tailed (#4225) 2018-01-04 16:31:17 +01:00
Utils.h Pregel: code reformatting 2017-03-17 16:21:25 +01:00
VertexComputation.h Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Worker.cpp backport Scheduler from devel (#5533) 2018-06-28 13:26:04 +02:00
Worker.h Fixing Pregel module (#3557) 2017-10-31 20:15:51 +01:00
WorkerConfig.cpp Pregel: code reformatting 2017-03-17 16:21:25 +01:00
WorkerConfig.h Pregel: code reformatting 2017-03-17 16:21:25 +01:00
WorkerContext.h Pregel: Coverty Scan fixes 2017-03-24 11:04:15 +01:00

README.md

Pregel Subsystem

The Pregel subsystem implements a variety of graph algorithms. This README is intended mainly for internal use.

Protocol

Message format between DBServers:

{sender:"someid",
 executionNumber:1337,
 globalSuperstep:123,
 messages: [<vertexID1>, <slice1>, <vertexID2>, <slice2>]
}

Any type of slice is supported.
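
To make the layout concrete, here is a minimal JavaScript sketch of how such a message could be built and read back. It assumes that `messages` is a flat array alternating between a vertex ID and its payload (slice). The helpers `buildMessage` and `decodeMessages` are hypothetical names for illustration and are not part of ArangoDB.

```javascript
// Hypothetical helpers; not part of ArangoDB's code base.
// Assumption: `messages` is a flat array of alternating vertex IDs and payloads.

function buildMessage(sender, executionNumber, globalSuperstep, pairs) {
  const messages = [];
  for (const [vertexId, payload] of pairs) {
    messages.push(vertexId, payload); // flatten each pair into the array
  }
  return { sender, executionNumber, globalSuperstep, messages };
}

function decodeMessages(message) {
  const { messages } = message;
  if (messages.length % 2 !== 0) {
    throw new Error("messages must contain (vertexID, payload) pairs");
  }
  // Group payloads by receiving vertex, preserving arrival order.
  const byVertex = new Map();
  for (let i = 0; i < messages.length; i += 2) {
    const vertexId = messages[i];
    const payload = messages[i + 1];
    if (!byVertex.has(vertexId)) byVertex.set(vertexId, []);
    byVertex.get(vertexId).push(payload);
  }
  return byVertex;
}

const msg = buildMessage("someid", 1337, 123, [
  ["vertexID1", 0.25],
  ["vertexID2", { rank: 0.5 }], // payloads may be of any type
  ["vertexID1", 0.75],
]);
const decoded = decodeMessages(msg);
console.log(decoded.get("vertexID1")); // [ 0.25, 0.75 ]
```

Grouping by vertex ID on the receiving side is only one possible design; a real worker might instead hand each pair to a message combiner as it arrives.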

Useful Commands

Import a graph, e.g. https://github.com/arangodb/example-datasets/tree/master/Graphs/1000. First rename the CSV columns to '_key', '_from' and '_to'; arangoimp will keep those.

In arangosh:

db._create('vertices', {numberOfShards: 2});
db._createEdgeCollection('alt_edges');
db._createEdgeCollection('edges', {numberOfShards: 2, shardKeys:["_vertex"], distributeShardsLike:'vertices'});

arangoimp --file generated_vertices.csv --type csv --collection vertices --overwrite true --server.endpoint http+tcp://127.0.0.1:8530

Or generate the vertices in arangosh:

for (var i = 0; i < 5000; i++) db.vertices.save({_key: i + ""});

arangoimp --file generated_edges.csv --type csv --collection alt_edges --overwrite true --from-collection-prefix "vertices" --to-collection-prefix "vertices" --convert false --server.endpoint http+tcp://127.0.0.1:8530

AQL script to copy the edge collection into one with '_vertex':

FOR doc IN alt_edges
  INSERT {
    _vertex: SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1),
    _from: doc._from,
    _to: doc._to
  } IN edges

Separate AQL query to sum the 'result' attribute over all vertices:

LET values = (
  FOR s IN vertices
    RETURN s.result
)
RETURN SUM(values)
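
The copy query sets `_vertex` to the part of `_from` after the first "/", i.e. the key of the source vertex, which is the attribute the 'edges' collection is sharded by. The following JavaScript sketch mirrors that expression so the result can be checked outside the database. The function names `vertexKeyOf` and `toShardedEdge` are hypothetical.

```javascript
// Mirrors SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1).
// If there is no "/", indexOf returns -1 and the whole string is kept,
// which matches FIND_FIRST returning -1 in AQL.
function vertexKeyOf(documentId) {
  return documentId.substring(documentId.indexOf("/") + 1);
}

// Hypothetical helper: builds the document the INSERT would write.
function toShardedEdge(edge) {
  return {
    _vertex: vertexKeyOf(edge._from),
    _from: edge._from,
    _to: edge._to,
  };
}

const shardedEdge = toShardedEdge({ _from: "vertices/42", _to: "vertices/7" });
console.log(shardedEdge);
// { _vertex: '42', _from: 'vertices/42', _to: 'vertices/7' }
```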

AWK Scripts

Make a CSV file with unique IDs:

cat edges.csv | tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++' > vertices.csv
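
For readers less familiar with the shell pipeline, here is a rough JavaScript equivalent: split the edge list on whitespace, drop empty tokens, and keep the first occurrence of each ID. This is a sketch for illustration; `uniqueIds` is a hypothetical name, and JavaScript's `\s` matches slightly more characters than the POSIX `[:space:]` class.

```javascript
// Rough equivalent of: tr '[:space:]' '\n' | grep -v '^\s*$' | awk '!seen[$0]++'
function uniqueIds(edgeListText) {
  const seen = new Set(); // a Set keeps insertion order, like awk's first-seen filter
  for (const token of edgeListText.split(/\s+/)) {
    if (token !== "") seen.add(token);
  }
  return [...seen];
}

const sampleEdges = "1 2\n2 3\n1 3\n";
console.log(uniqueIds(sampleEdges).join("\n"));
// prints 1, 2 and 3 on separate lines
```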

Make a CSV file with ArangoDB-compatible edges:

cat edges.csv | awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}' >> arango-edges.csv
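
The awk command turns each "source target" pair into three tab-separated columns: the source ID and target ID, each prefixed with "profiles/", followed by the bare source key. A JavaScript sketch of the same transformation is shown here. `toArangoEdgeLine` and `convertEdges` are hypothetical names, and unlike awk this version skips blank lines instead of emitting empty records.

```javascript
// Sketch of: awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}'
function toArangoEdgeLine(line, collection = "profiles") {
  const [from, to] = line.trim().split(/\s+/);
  return `${collection}/${from}\t${collection}/${to}\t${from}`;
}

function convertEdges(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "") // assumption: blank lines are dropped
    .map((line) => toArangoEdgeLine(line));
}

const converted = convertEdges("1 2\n2 3\n");
console.log(converted.join("\n"));
```

The third column carries the source vertex key, so it can serve as the value of a shard-key attribute such as '_vertex' once the importer is given matching column names.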

arangoimp --file vertices.csv --type csv --collection twitter_v --overwrite true --convert false --server.endpoint http+tcp://127.0.0.1:8530 -c none

arangoimp --file arango-edges.csv --type csv --collection twitter_e --overwrite true --convert false --separator "\t" --server.endpoint http+tcp://127.0.0.1:8530 -c none