arangodb/arangod/Pregel
jsteemann 44c7b1b476 remove tabstops 2018-07-16 15:00:12 +02:00
Algos remove tabstops 2018-07-16 15:00:12 +02:00
examples
Aggregator.h
AggregatorHandler.cpp
AggregatorHandler.h
AlgoRegistry.cpp backport: use vocbase reference instead of pointer in arangodb::pregel::GraphStore 2018-04-13 11:23:34 +03:00
AlgoRegistry.h backport: use vocbase reference instead of pointer in arangodb::pregel::GraphStore 2018-04-13 11:23:34 +03:00
Algorithm.h Seeded pagerank (#5491) 2018-06-08 16:44:23 +02:00
CommonFormats.h Various changes 2017-05-16 10:58:15 +02:00
Conductor.cpp Converting Pregel AQL function to c++ and fixing a bug (#5620) 2018-06-28 10:46:16 +02:00
Conductor.h Converting Pregel AQL function to c++ and fixing a bug (#5620) 2018-06-28 10:46:16 +02:00
Graph.h fixed minor several compiler complaints (#5406) 2018-05-23 11:50:00 +02:00
GraphFormat.h Seeded pagerank (#5491) 2018-06-08 16:44:23 +02:00
GraphSerializer.h
GraphStore.cpp Feature/make the boolean true again (#5646) 2018-06-27 08:49:51 +02:00
GraphStore.h backport: use vocbase reference instead of pointer in arangodb::pregel::GraphStore 2018-04-13 11:23:34 +03:00
IncomingCache.cpp remove TRI_usleep and TRI_sleep, and use std::this_thread::sleep_for … (#3817) 2017-12-06 18:43:49 +01:00
IncomingCache.h Bug fix/remove most of aql js (#5223) 2018-04-30 11:17:11 +02:00
Iterators.h
MasterContext.h Using asio::io_context::strands instead of locks (#5266) 2018-05-07 15:58:19 +02:00
MessageCombiner.h add missing override specifiers, add final specifiers 2018-05-04 09:01:50 +02:00
MessageFormat.h
OutgoingCache.cpp Bug fix/adjust agency comm timeouts (#2765) 2017-07-13 00:44:28 +02:00
OutgoingCache.h Bug fix/remove most of aql js (#5223) 2018-04-30 11:17:11 +02:00
PregelFeature.cpp Feature/feature phases (#5272) 2018-07-16 14:09:36 +02:00
PregelFeature.h backport: use vocbase reference instead of pointer in arangodb::pregel::GraphStore 2018-04-13 11:23:34 +03:00
README.md Renamed arangoimp to arangoimport (with alias for compatibility.) (#4040) 2017-12-14 21:31:21 +01:00
Recovery.cpp issue 344.6: remove some redundant functions (#4842) 2018-03-15 11:03:35 +01:00
Recovery.h
Statistics.h added pregel vertex / edge count checks 2017-06-07 17:18:59 +02:00
TypedBuffer.h Feature/misc spelling corrections (#5164) 2018-07-13 13:06:20 +02:00
Utils.cpp remove TRI_ERROR_ARANGO_VIEW_NOT_FOUND, rename TRI_ERROR_ARANGO_COLLECTION_NOT_FOUND to TRI_ERROR_ARNANGO_DATA_SOURCE_NOT_FOUND 2018-03-17 19:36:14 +03:00
Utils.h
VertexComputation.h Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Worker-templates-algorithms.cpp fix windows build (#3855) 2017-12-06 16:35:45 +01:00
Worker-templates-native-types.cpp fix windows build (#3855) 2017-12-06 16:35:45 +01:00
Worker.cpp Converting Pregel AQL function to c++ and fixing a bug (#5620) 2018-06-28 10:46:16 +02:00
Worker.h Converting Pregel AQL function to c++ and fixing a bug (#5620) 2018-06-28 10:46:16 +02:00
WorkerConfig.cpp
WorkerConfig.h
WorkerContext.h Bug fix/remove most of aql js (#5223) 2018-04-30 11:17:11 +02:00

README.md

Pregel Subsystem

The Pregel subsystem implements a variety of graph algorithms. This README is intended mainly for internal use.

Protocol

Message format between DBServers:

{sender:"someid", executionNumber:1337, globalSuperstep:123, messages: [, , vertexID2, ] }

Any type of slice is supported.
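
As a rough illustration of the envelope above, the following JavaScript sketch builds and unpacks such a message. It assumes that `messages` is a flat array alternating a target vertex ID with its payload. That layout and the helper names `buildMessage` and `unpackMessages` are assumptions made for illustration; they are not taken from the Pregel sources.

```javascript
// Hypothetical helpers. The alternating [vertexID, payload, ...] layout
// of `messages` is an assumption, not confirmed by the Pregel code.
function buildMessage(sender, executionNumber, globalSuperstep, pairs) {
  const messages = [];
  for (const [vertexId, payload] of pairs) {
    messages.push(vertexId, payload);
  }
  return { sender, executionNumber, globalSuperstep, messages };
}

// Turns the flat array back into [vertexID, payload] pairs.
function unpackMessages(envelope) {
  if (envelope.messages.length % 2 !== 0) {
    throw new Error("messages must contain vertexID/payload pairs");
  }
  const pairs = [];
  for (let i = 0; i < envelope.messages.length; i += 2) {
    pairs.push([envelope.messages[i], envelope.messages[i + 1]]);
  }
  return pairs;
}

const envelope = buildMessage("someid", 1337, 123, [
  ["vertexID1", 0.25],            // numeric payload
  ["vertexID2", { rank: 0.5 }],   // object payload
]);
console.log(JSON.stringify(envelope));
```

The payloads are deliberately of different types to reflect the note that any type of slice is supported.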

Useful Commands

Import a graph, e.g. https://github.com/arangodb/example-datasets/tree/master/Graphs/1000. First rename the columns to '_key', '_from' and '_to'; arangoimport will keep those.

In arangosh:

db._create('vertices', {numberOfShards: 2});
db._createEdgeCollection('alt_edges');
db._createEdgeCollection('edges', {numberOfShards: 2, shardKeys:["_vertex"], distributeShardsLike:'vertices'});

arangoimport --file generated_vertices.csv --type csv --collection vertices --overwrite true --server.endpoint http+tcp://127.0.0.1:8530

Or create the vertices directly in arangosh:

for (var i = 0; i < 5000; i++) db.vertices.save({_key: i + ""});

arangoimport --file generated_edges.csv --type csv --collection alt_edges --overwrite true --from-collection-prefix "vertices" --to-collection-prefix "vertices" --convert false --server.endpoint http+tcp://127.0.0.1:8530

AQL script to copy the edge collection into one with a '_vertex' attribute:

FOR doc IN alt_edges
  INSERT {_vertex: SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1), _from: doc._from, _to: doc._to} IN edges

AQL script to sum the 'result' attribute over all vertices:

LET values = (
  FOR s IN vertices
    RETURN s.result
)
RETURN SUM(values)
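
To make the '_vertex' computation in the copy query concrete, here is a small JavaScript sketch of the same string logic. The function names `vertexKeyFromId` and `toShardedEdge` are hypothetical. The sketch only mirrors the SUBSTRING/FIND_FIRST expression; the reading that '_vertex' (the shard key of 'edges') exists so that an edge lands on the same shard as its source vertex is an assumption based on the distributeShardsLike setting above.

```javascript
// Mirrors SUBSTRING(doc._from, FIND_FIRST(doc._from, "/") + 1):
// everything after the first "/" of a document ID such as "vertices/17".
function vertexKeyFromId(documentId) {
  return documentId.slice(documentId.indexOf("/") + 1);
}

// Hypothetical helper: the document the query would insert into `edges`.
function toShardedEdge(doc) {
  return {
    _vertex: vertexKeyFromId(doc._from),
    _from: doc._from,
    _to: doc._to,
  };
}

const edge = toShardedEdge({ _from: "vertices/17", _to: "vertices/42" });
console.log(edge);
```

For an edge starting at 'vertices/17', `_vertex` becomes '17', which equals the `_key` of the source vertex.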

AWK Scripts

Make a CSV file with unique IDs:

cat edges.csv | tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++' > vertices.csv
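
The pipeline above splits the edge list on whitespace, drops empty entries and keeps the first occurrence of each ID. A minimal JavaScript sketch of the same steps follows; the function name `uniqueIds` and the sample input are made up for illustration.

```javascript
// Mirrors: tr '[:space:]' '[\n*]' | grep -v "^\s*$" | awk '!seen[$0]++'
function uniqueIds(edgeListText) {
  // Split on any run of whitespace and drop empty tokens.
  const tokens = edgeListText.split(/\s+/).filter((t) => t.length > 0);
  // A Set keeps insertion order, so the first occurrence of each ID wins.
  return [...new Set(tokens)];
}

// Hypothetical edge list: one "source target" pair per line.
const sample = "1 2\n2 3\n1 3\n";
console.log(uniqueIds(sample).join("\n"));
```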

Make a CSV file with ArangoDB-compatible edges:

cat edges.csv | awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}' >> arango-edges.csv
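
The awk command above turns each "source target" line into three tab-separated columns: the source ID, the target ID (both prefixed with the collection name) and the bare source key. The JavaScript sketch below shows the same row transformation; `toArangoEdgeRow` is a hypothetical name, and unlike awk it is only meant for non-empty lines.

```javascript
// Mirrors: awk -F" " '{print "profiles/" $1 "\tprofiles/" $2 "\t" $1}'
// With -F" " awk splits on runs of whitespace, which the regex reproduces.
function toArangoEdgeRow(line, collection = "profiles") {
  const [from, to] = line.trim().split(/\s+/);
  return `${collection}/${from}\t${collection}/${to}\t${from}`;
}

// Hypothetical input line: an edge from ID 1 to ID 2.
const row = toArangoEdgeRow("1 2");
console.log(JSON.stringify(row));
```

The third column carries the same value that the '_vertex' attribute holds in the 'edges' collection described earlier.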

arangoimport --file vertices.csv --type csv --collection twitter_v --overwrite true --convert false --server.endpoint http+tcp://127.0.0.1:8530 -c none

arangoimport --file arango-edges.csv --type csv --collection twitter_e --overwrite true --convert false --separator "\t" --server.endpoint http+tcp://127.0.0.1:8530 -c none