1
0
Fork 0
Commit Graph

2340 Commits

Author SHA1 Message Date
Jan a5db298c92
fix buffer overrun, remove unused variable (#7302) 2018-11-13 14:18:50 +01:00
Jan dbf8d582d5
added missing change to clusterinfo (#7294) 2018-11-13 11:37:48 +01:00
Dan Larkin-York 48c3fd3b7f Fix nullptr dereference in SynchronizeShard. (#7268) 2018-11-08 14:13:00 +01:00
Max Neunhöffer a74330250f
Port bug fix 3.4/cluster comm threads start stop (#6939) to devel. (#7253)
* Start ClusterComm threads in `ClusterFeature::start`. Stop ClusterComm threads in `ClusterFeature::stop`.

* Do not free objects in `Scheduler::shutdown`. Let the `unique_ptr` do their job. Stop ClusterComm threads in `ClusterFeature::stop`, but free instance in `ClusterFeature::unprepare`.

* `io_context` may contains lambdas that hold `shared_ptr`s to `Tasks` the required a functional `VocBase` in their destructor.
2018-11-07 21:42:34 +01:00
Simon 386fc0e9ad Simplify dropDatabaseCoordinator & fix some bugs (#7211) 2018-11-06 15:26:33 +01:00
Vasiliy 68953ae33a issue 496.4.1: move StorageEngine-specific flag out of the genric API and closer to the storage engine (#7212) 2018-11-04 16:52:28 +03:00
Simon 40f54aebc8 Fix a crash in DBServerAgencySync (#7207)
(cherry picked from commit 99ba608be98a44b8ce1d0e681107271a22f42761)
2018-11-03 20:18:15 +01:00
Jan 1973022d00
Bug fix/refactor find emplace (#7197) 2018-11-02 17:18:47 +01:00
Max Neunhöffer 37359821cb
Fix arangorestore by adjusting timeouts in write ops. (#7083)
* Improve logging on coordinator when doing `arangorestore`.

* Return more error information in `mergeResults`.

* Longer timeout for communication coordinator -> leader for writes.

This is taking into account possible write stops from followers needed
to get in sync.

* Fix compilation.

* Get rid of numbers in exception log messages.

* Fix a typo.

* Fix compilation.
2018-10-31 14:39:58 +01:00
Vasiliy 8f44afb6cf issue 496.1: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion (#7101)
* issue 496.1: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion

* backport: address test failures

* backport: ensure arangosearch links get exported in the dump

* backport: ensure view is created during restore on the coordinator

* Updates for ArangoSearch DDL tests, IResearchView unregistration and known issues

* Add fix for internal issue 483
2018-10-30 12:50:35 +03:00
Simon 5b71dff64f RocksDB replication thread safety (#7088) 2018-10-29 18:09:46 +01:00
Simon c72818a9dc Make ensureIndexOnCoordinator more robust (#7110) 2018-10-29 17:45:46 +01:00
Jan 5719a7fc99
remove unneeded nullptr checks (#7121) 2018-10-29 11:51:27 +01:00
Matthew Von-Maszewski 97ba8ca2be Bugfix: More 3.4 scheduler changes backported (#7091) 2018-10-26 17:09:20 +02:00
Simon 10dc287eb3 Silence Tsan warnings (#7075) 2018-10-25 15:50:39 +02:00
Jan 221d036d5d
Bug fix/fix catch test issues (#7044) 2018-10-25 11:39:55 +02:00
Simon d23aaa2198 Better agency pool update (#7040) 2018-10-24 16:23:21 +02:00
Simon 4c1e8819c2 Add engine specific collection APIs (#6977) 2018-10-19 17:46:33 +02:00
Simon 8b7a4099b8 Properly compare velocypack objects in Agency operations (#6921)
* Properly compare velocypack objects in Agency operations

* Add changelog

* added option for VPackDumper
2018-10-17 20:03:53 +02:00
Simon cb4c07e0ed Replace engine equality feature (#6931)
* replace engine equality feature

* remove pointless code
2018-10-17 14:41:47 +02:00
Vasiliy 78567bef09 update iresearch to codebase as of 20181011 (#6858)
* update iresearch to codebase as of 20181011

* backport: address cluster test failures

* backport: address dump test failures

* backport: address discrepency in view creation between single-server and cluster

* backport: address test failure on cluster (revert change)

* backport: address test failures

* backport: address MSVC build issues

* backport: address issue with LogicalDatasource destructing after TRI_vocbase_t

* Revert "backport: address issue with LogicalDatasource destructing after TRI_vocbase_t"

This reverts commit 4f9880bbaa22194dfbb604b5a54658de1d447ac1.
2018-10-12 21:07:12 +03:00
Jan 2dc05429fe
Bug fix/fixes 110918 2 (#6848) 2018-10-12 12:59:10 +02:00
Jan c7cd0262aa
suppress some of these dreaded error messages (#6786) 2018-10-11 10:46:12 +02:00
Dan Larkin-York 4644d2b023 Fix issue with colleciton/view name conflict checking in cluster. (#6796) 2018-10-11 10:45:29 +02:00
Tobias Gödderz 102d17de89 Rework move shards with view test (#6773)
* Fixed testSetup(). Reduced redundant code.

* Reworked assertions in moving-shards-with-arangosearch-view-cluster.js

* Added changes from review

* Removed debug output / fixed jslint error
2018-10-11 10:25:22 +02:00
Max Neunhöffer 282a1a7193
Fix a bug when getting in sync and old requests are still lingering. (#6788) 2018-10-10 16:30:05 +02:00
Jan 165bf3bd1b
fix arangojs issue 573 (#6767) 2018-10-10 09:19:24 +02:00
Matthew Von-Maszewski e2bc7e10e3 port of 3.4 libcurl connection management to devel (#6775) 2018-10-10 09:10:34 +02:00
Max Neunhöffer 79bade7e6b
This is porting from 3.4 a cleanup in Current (follower removed from plan). (#6718)
* Fix cleanup of Current entry in case a follower is removed from Plan. (#6623)
* Properly remove unplanned followers in leader and Current.
* Add a catch test.
* Fix tests.
* Fix a bug with a temporary object.
* Protect against exception from getCollection not found.
* New Maintenance test data.
2018-10-09 15:29:42 +02:00
Max Neunhöffer 2452dcc5d0
Remove a relic from early days in /Target/FailedServers. (#6690)
* Remove a relic from early days in /Target/FailedServers.
* Fix a test.
2018-10-09 13:52:32 +02:00
Jan 46efcff7d7
micro improvements (#6674) 2018-10-05 10:25:13 +02:00
jsteemann cc21a938c7 fixed typos 2018-10-02 18:19:12 +02:00
jsteemann 56147843c2 Merge branch 'feature/additional-logging' of https://github.com/arangodb/arangodb into devel 2018-10-02 17:53:31 +02:00
Jan c06f2d77da
Feature/velocypack update (#6678) 2018-10-02 14:04:14 +02:00
Dan Larkin-York 1f63f16396 Move some logging off of general topic. 2018-10-01 13:28:11 -04:00
Max Neunhöffer ea377b0806
Add more timeout because in Jenkins dbservers can be slow. (#6667) 2018-10-01 16:55:19 +02:00
Kaveh Vahedipour 3fd1375db5 Feature/detailed get maintenance feature (#6668)
* add local state output to _admin/actions
* test data generated from maintennace feature
* coordinators not needed for maintenance tests
2018-10-01 14:51:14 +02:00
Lars Maier 14d1487710 Catch all exceptions to prevent maintenance workers from crashing. (#6645)
* Catch all exceptions to prevent maintenance workers from crashing.
* Please don't free this.
* Unified code paths.
* Remove dub comment.
* Removed debug output.
* Deleted unneeded constructors.
* Assignment operator deleted.
2018-09-28 17:10:44 +02:00
Wilfried Goesgens a477df49cf Feature/windows utf16 fileaccess (#6534) 2018-09-24 19:41:17 +02:00
Simon b16af5ac71 Fix superfluous QueryRegistry::close, cleanup (#6579) 2018-09-24 13:10:07 +02:00
Simon 912f109968 Add simple Future library (#6464) 2018-09-21 16:14:17 +02:00
Simon 0fa7f01c66 Resilience test failure points (#6539) 2018-09-20 01:05:10 +02:00
Jan c38051519e
Bug fix/simplify things (#6516) 2018-09-18 17:47:01 +02:00
Simon aa21ffdb7a Properly check syncer erros, catch more exceptions (#6520) 2018-09-17 16:39:23 +02:00
Kaveh Vahedipour 8bd834bcf7 Maintenance delayed by incomplete hashing maintenance actions (#6448) 2018-09-14 17:44:32 +02:00
Simon 82aa24ad7e Copy installation files on startup (#6491) 2018-09-14 11:15:21 +02:00
Jan e18e9158d8
fix cluster selectivity estimates (#6488) 2018-09-14 00:22:01 +02:00
Max Neunhöffer 84735955ea Add advertised endpoints. (#6104) 2018-09-13 16:30:55 +02:00
Simon 22b9c31c13 Removing ClusterComm ClientTransactionID (#6294) 2018-09-12 22:15:16 +02:00
jsteemann 125cbf00cc Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2018-09-12 21:35:06 +02:00
jsteemann ec6c6a5e68 pass variables by const reference 2018-09-12 21:34:37 +02:00
Jan 6b32d2d9b2
fix init-order fiasco with static strings (#6475)
* fix init-order fiasco with static strings

* try to work around compile errors
2018-09-12 21:30:49 +02:00
Simon c278fd23b2 Fix ReadOnly Mode without auth-enabled (#6478) 2018-09-12 20:02:57 +02:00
jsteemann 0ab9bdd398 fix typo in string 2018-09-12 16:35:34 +02:00
Jan 3b16913b1b
fix cluster index selectivity (#6467) 2018-09-12 14:35:39 +02:00
jsteemann 9c75d15287 fix init-order fiasco 2018-09-12 14:31:15 +02:00
Kaveh Vahedipour 6b2733625c Feature/static const strings cleanup (#6352)
* AgentConfiguration cleanup
* static strings in maintenance / agency
* more strings unified
* fix windows build
2018-09-11 13:40:03 +02:00
Lars Maier 95345fff8f Dedicated thread for Phase 1&2 - devel (#6412)
* First draft of dedicated thread for phase 1 and phase 2.
* Added comments and removed old code.
2018-09-07 14:46:01 +02:00
Max Neunhöffer bdf8c7d1a4
Wait 2s after switching server mode before answering. (#6390)
This is needed because the change is propagated via the agency and the
heartbeat, which only happens once per second.
2018-09-05 17:04:27 +02:00
Kaveh Vahedipour eda8dac7f9 typo (#6384) 2018-09-05 15:11:08 +02:00
Jan 17ea2d4ec9
suppress some messages which are expected on shutdown (#6381) 2018-09-05 14:15:35 +02:00
Vasiliy 5329f34771 issue 465.2.2: remove redudnant heap allocations and simplify API (#6349)
* issue 465.2.2: remove redudnant heap allocations and simplify API

* address merge issue

* address more merge issues

* address more merge issues

* address review comments

* do not deallocate non-allocated instances
2018-09-05 13:37:37 +03:00
Jan 09bf296545
Bug fix/cache fullcount in query cache (#6364) 2018-09-04 16:33:13 +02:00
Kaveh Vahedipour 89f96c00d1 fixed typo in limiting the thread num (#6347) 2018-09-03 16:09:04 +02:00
Simon 0661a4c1fe Hide Links from getIndexes() (#6325) 2018-09-03 15:17:24 +02:00
Kaveh Vahedipour 85fb0c1776 Fast lane action workers (#6317)
* fast tracking in maintenance
* Maintenance workers and actions have options for matching
* corrected findAction
* added fast track test
* matches should match all labels
2018-09-03 14:27:10 +02:00
Jan 07abfca588
Bug fix/cleanup 020918 (#6338) 2018-09-03 12:56:41 +02:00
Kaveh Vahedipour c7bb7a6f44 Feature/dont get local system collections alltogether (#6250)
* don't need to look at local _system collections in maintenace
* Use system() API to determine if a collection is a system coll.
2018-09-03 09:23:06 +02:00
Jan cb19878fad more explicit order for SystemDatabaseFeature (#6335) 2018-09-01 22:54:03 +02:00
Vasiliy e862efdc3b issue 458.4: retrieve the system database via the SystemDatabaseFeature (#6299) 2018-08-31 19:45:10 +02:00
Jan 5873f63a72
Bug fix/fixes 2908 (#6279) 2018-08-31 17:26:54 +02:00
Lars Maier 66bb45c9c8 Bugfix/No Maintenance threads default to low (#6310)
* Fixed number of maintenance threads to low by default.
* Fixed types for `std::max`.
2018-08-31 15:28:44 +02:00
Lars Maier 63d9cfa081 Maintenance Fixes (#6284)
* Clean up for `FIXMEMAINTENANCE` comments: removed race condition, added errors and `notify()`s.
* Removed dublicated code.
* Added requested changes. Added error reporting for `UpdateCollection`.
* Make it compile. Add missing `notify()`.
* `CreateCollection` generates errors in all code paths.
* Fixed catch test.
2018-08-31 15:24:29 +02:00
Kaveh Vahedipour 679c6904f4 fixed a condition, when the local leader dropped the shard, before leadership resignation had happened (#6282) 2018-08-31 13:20:35 +02:00
Simon 1afe3bce98 Remove header from trx::methods (#6271)
* do not create header here

* move headers up
2018-08-28 17:31:00 +02:00
jsteemann 44eae59dc0 remove functions that are not called anymore 2018-08-28 01:00:40 +02:00
Jan 5022ccc24d
Bug fix/fixes 2508 (#6254) 2018-08-27 21:36:39 +02:00
Lars Maier 5555bd2fad Schmutz++ Improved (#6259)
* Fixed startup order. Don't start maintenance threads in single-server or agent.
Added range check for `--server.maintenance-threads`.
Fixed invalid array access, when shard exists locally but not in plan.
* Removed unused header imports.
* Added CHANGELOG entry
* Fixed shutdown bug. Startup fixed.
* Fixed catch test.
* Add Maintenance improvements to NewFeature34.md.
2018-08-27 20:25:09 +02:00
Vasiliy 5d14775de8 issue 459.3: ensure collection permissions are checked before updating/dropping an IResearch view (#6253)
* issue 459.3: ensure collection permissions are checked before updating/dropping an IResearch view

* backport: ensure collection permissions are checked before updating/dropping an IResearch view on cluster

* backport: address test failures

* backport: address more test failures

* reuse existing classes for scoping ExecContext
2018-08-26 18:00:16 +03:00
jsteemann ebba4fd55a fix memory errors and crashes 2018-08-25 20:17:59 +02:00
jsteemann a14940df54 fix memleak 2018-08-25 11:32:16 +02:00
jsteemann 0d767fcabe removed unused type 2018-08-25 11:28:58 +02:00
jsteemann 08ee458608 blind attempt to fix MacOS compile error 2018-08-24 13:57:33 +02:00
jsteemann bd7352e88f fix compile warnings 2018-08-24 12:30:37 +02:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
 - Fix index locking bug in mmfiles
 - Fix a bug in mmfiles with silent option and repsert
 - Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Simon 948820e484 Various small changes (#6234) 2018-08-24 09:39:03 +02:00
Simon 229c09d434 Allow dirty-reads from passive (#6136) 2018-08-20 16:26:14 +02:00
Matthew Von-Maszewski 86ea784372 bugfix: establish unique function name & implementation for communication retry status (#6150)
* initial checkin of isRetryOK().  Includes fixes to known code that has previously hung shutdowns by performing infinite retries.

* slight help on getting out of a loop faster during shutdown.  not essential.
2018-08-17 14:57:12 +02:00
Kaveh Vahedipour 54cc026d34 servers should be retrying registration until successful (#5919) 2018-08-17 10:57:45 +02:00
Jan 10800572d4
do not mange errors in ClusterCommResult (#5933) 2018-08-17 08:44:45 +02:00
Vasiliy 6fd541d110 issue 427.5: use ApplicationServer reference instead of pointer (#6145)
* issue 427.5: use ApplicationServer reference instead of pointer

* address MSVC build failure
2018-08-15 12:16:02 +03:00
Frank Celler a688dc0962
Feature/remove job queue thread (#5986)
limiting V8 calls in flight
2018-08-10 12:17:43 +02:00
Dan Larkin-York 5f87f57cd0 Improved sharding algorithms (#6089) 2018-08-09 19:03:32 +02:00
Tobias Gödderz de4f5587ae Gharial rewrite in C++ (#5631)
* Built a C++ skeleton REST handler for gharial, with fallback to the JS handler

* Moved aql::Graph to graph::Graph

* Added complete edge definitions to Graph

Also:
- some cleanup
- used forward-declarations in headers
- use Graph in graph rest handler

* Handle graph lookup failures according to the test suite

* Added GET vertex

* Bugfixes in ResultT

- Added missing #include
- Fixed move semantics

* Move central code of readVertex to GraphOperations

* ResultT fixes and complements

* Implemented a graph cache

* Added and used graph cache to the rest handler

* Added GET edge

* Added DELETE edge

* Extracted some code

* Added PATCH and PUT for both edge and vertex

* Moved update/replace transaction code to GraphOperations

* Added stub routes for POST and a TODO note

* Added a test checking that deleting a vertex removes all incident edges as well

* Added a test checking that deleting a vertex does not remove edges in non-graph collections

* fixed compiler warnings and errors

* Began work on DELETE vertex

For this, added a V8Context to allow for AQL queries to use subtransactions

* Continued work on DELETE vertex (still WIP)

* prep for graph post routes

* fixed removeVertex operation (aql)

* added post vertex and post edge gharial routes

* wasSynchronoues flag changed

* gharial post c++ handler, naming conventions

* added gharial tests

* temporary disabled cache (because not completed), added graph property read functions

* added c++ gharial list vertex collections

* added c++ gharial graph config

* added c++ gharial list graphs

* added graph manager class

* first implementation of create graph in c++, WIP

* changed error messages

* added etag to create graph api, still multiple edge definition check missing

* finished POST /_api/gharial/<graph>

* WIP - DELETE /_api/gharial/<graphName>

* added DELETE /_api/gharial/<graphName> validation, still missing correct response

* gharial delete

* fixed delete gharial lock

* finished DELETE /_api/gharial/<graphName>

* added routes for graph based vertices and edge definitions

* improved delete route

* added add new edge definition to existing graph

* patch edge definition in a graph, still <WIP>

* finished edit edge definition route

* code changes due to devel code changes

* added remove edge definition route

* added vertex delete function

* added todo note regarding one drop collection issue

* add oprhan collection to graph route implemented

* Added a test

* Updated a comment

* Several minor changes

* Minor changes during review

* Changes during review

* Changes during review

* Bugfix: orphans may be null or omitted

* Bugfix: resolve externals

* minor code changes

* seperated graph class to independent component classes

* seperated graph class to independent component classes

* removed log output

* fixed create collection behaviour in a cluster environment

* fixed enterprise graphs behaviour in c++ gharial api

* removed log output

* formatting

* improved error handling, fixed a linux compile bug

* more result refactoring

* more result type cleanup

* fixed wrongly defined test

* result handling

* error handling

* more refactoring

* Bugfix: avoid race condition in cluster when creating collections

* updated graph documentation

* added graph related static strings

* static strings, new method to create options for gharial created collections

* Some minor cleanup

* more use of static strings

* minor code changes, review

* added missing parseint

* removed gharial foxx, added js common module, added v8 general graph module

* correct use of virtual method

* more v8, js general graph, broken state

* more v8 graph functions

* fixed editEdgeDefinition, added drop function

* fixed drop behaviour

* added _list, _exists

* added c++ rename graph collections, added v8 + graph module function

* Added a regression test

* added graph._deleteEdgeDefinition, v8, server

* more v8g

* added _removeVertexCollection

* added _extendEditDefinitions

* todo, need to add a helper sort method for a local defined relation

* fixed test

* fixed lots of tests, added more client functions, _addVertexCollection on client module is still broken

* added more client graph functions, all tests green

* more client functions

* add del edge def route

* Fix use after move

* Minor changes in client general-graph.js module

* Make a copy before sorting (don't touch the argument)

* Minor changes and some additional asserts in graph tests

* Consistently set parameter defaults

* Renamed static strings

* Remove superfluous function

* Made comment more verbose

* Minor changes in general-graph-common.js

* Added missing template arguments

* Fixed community build

* Cleanup in editEdgeDefinition

* Regression test & bugfix: comparison of edge definitions didn' order from and to

* Fixed errors introduced by merge

* Minor changes in v8-general-graph.cpp

* Fixed test failure due to wrong error code in CE

* added missing id field

* Added permission checks for graph._create

* Removed assertion that is no longer valid

* Moved removeGraph from GraphOperations to GraphManager

* Allow C++ implementation of graph._drop to handle smart graphs

* Flush js client db cache after creating/dropping collections via the general graph module

* Added _deleteEdgeDefinition to the general graph client module

* WIP: Added permission checks for drop graph

* Fixed permission checks for drop graph

* Added permission checks for other graph operations

* Bugfix: assert edge definitions are returned in order

* Some cleanup

* Removed unused method

* Minor improvements in GraphManager

* Fixed a type in general-graph common module

* Most useful fix of all times ever: Do not auto cast from bool to int and alternate error/noerror by this

* Added the initial keyword to StaticStrings

* Added a new error code, used whenever a user tries to inject a documentcollection as a relation into the graph, which is invalid

* Some GraphManager/Ops/Graph cleanup. Less Slice parsing, more usage of GraphObjects

* Test edgeDefinitions in graphs with a defined ordering

* GraphClass Layout cleanup

* Do not test error messages, use codes instead

* Recreated backwards compatibility of Graph Creation Permission errors

* Changed error-code if edgedefinition is used twice

* Added a StaticString for the GraphName

* Renamed graphToVpack => graphForClient

* Partly fixed graph-api test to work with better error messages. Still red: The edgeDefinitions are now sorted, the test is supposed to sort his own list, but appearently does not do so. Under investigation

* Added a new error code that rejects injection of differently sharded smart collection into smartgraph. Should be more helpful to our users

* graph createCollectionOptions now require an open object to be cross-called from enterprise. Made enterprise switch for creation of graph more elegant.

* Updated graphs.cpp

* Massive refactoring. Made Factories for graphs to make SmartGraph much more transparent. Also reduced amount of multiple implementations of the same stuff. Killed vocbase/graphs use GraphManager instead. Removed usage of GraphCache, was not completely implemented anyway and only partially used, which is bad at the moment. Option for later improvement never the less

* Adapted JS code to now really use c++ variants. ALso included 3 Classes: Graph, SmartGraph and GraphModule.

* Fixed undefined behaviour in Remove Vertex. Fixed smartgraph sharding if one collection already exists.

* Removed DEBUG output

* Removed DEBUG logs

* Removed dead code

* Fixed Graph EdgeDefinition test, they now have a different ordering.

* Added a test when adding a vertexCollection that it is actually valid in the graph

* Client Graph API now correctly sends `orphanCollections` and not `orphans`

* Let GraphOperations modify the graph in-place. It should now properly handle edgeDefinitions.

* Added initial cid StaticString

* Included the vocbase in fromPersistence creation of Graphs. Only required to enhance 3.3 SmartGraphs on the fly.

* Fixed internal error message

* Fixed compiler isses originiated from merging

* Removed unused imports

* Regenerated generated file
2018-08-09 09:30:04 +02:00
Jan 93222b15d4
track last used keys in cluster key generators, track key on cluster document insert (#6101) 2018-08-08 14:32:16 +02:00
Kaveh Vahedipour fd60b359b6 fixed parallel creation of indexes in cluster (#6088)
* fixed parallel creation of indexes in cluster

* added tests
2018-08-07 10:00:15 +02:00
Jan 5e96f2777c
add missing mutex for _deadThreads handling (#6087)
log "HeartbeatThread ok" message in debug log level only
2018-08-07 09:14:12 +02:00
Jan 4d4135d25c
Feature/add dbserver as an alias for primary (#6072)
* add "DBSERVER" as an alias for "PRIMARY"

This allows specifying the value "DBSERVER" for `--cluster.my-role`.
"DBSERVER" is only treated as an alias for "PRIMARY", because several
other parts of the code and APIs use the string "PRIMARY".
Changing these from "PRIMARY" to "DBSERVER" would make the change
downwards-incompatible, which we do not want.

The downside of this alias-only solution is that even when specifying
a role value of "DBSERVER", the server will still report its role as
"PRIMARY", which may be a bit confusing. The server will also generate
its id as "PRMR-XXXX" as before:

    2018-08-03T15:23:09Z [9584] INFO {cluster} Starting up with role PRIMARY
    2018-08-03T15:23:09Z [9584] INFO {cluster} Cluster feature is turned on. Agency version: {"server":"arango","version":"3.4.devel","license":"enterprise"}, Agency endpoints: http+tcp://[::]:4001, server id: 'PRMR-f655b728-4cea-44ac-88e9-8b34baa80958', internal address: tcp://[::1]:8629, role: PRIMARY

* adjusted documentation to use "DBSERVER" instead of "PRIMARY"

* api doc

- secondary role not used anymore. stated.
- primary database is not clear. replaced with dbserver
- brief referenced only dbserver and coordinator - better to provide wider description, in line with what is described below, as other roles can be returned

* typo

* typo

* added starting from 3.4

* additional warning

* cited in the release note
2018-08-06 17:20:50 +02:00
Jan 1c8f6a75dd
Bug fix/fix issue 6076 (#6082) 2018-08-06 14:27:25 +02:00
Jan b278d6874a
allow master & slave to work in parallel for RocksDB WAL tailing (#6059) 2018-08-03 13:37:53 +02:00
Dan Larkin-York b2db010969 Register version and storage engine info in agency to facilitate rolling upgrades (#6061) 2018-08-03 09:46:04 +02:00
Simon 42cabb858a Fix dumping of views in the cluster (#6024) 2018-08-02 17:25:49 +02:00
Jan 29ab2fdbe3
Bug fix/3107 (#6046) 2018-08-01 12:50:55 +02:00
Jan e4d7f1c5f0
Bug fix/wenn der shard mann 2mal klingelt (#5890) 2018-07-26 15:37:40 +02:00
Simon 2dd8593609 View Replication (#5915) 2018-07-26 10:28:46 +02:00
Jan 21064144c8
Bug fix/replication improvements (#5962) 2018-07-25 09:04:50 +02:00
Jan a5bb50b0bf
remove methods from VelocyPackHelper that are also in VPackSlice (#5946) 2018-07-25 09:01:29 +02:00
Jan 3319829432
do not run storage engine equality check too early (#5935) 2018-07-24 09:34:11 +02:00
jsteemann 39021d008d make engine equality check feature abort the startup when there are different storage engines used in a cluster 2018-07-17 14:08:01 +02:00
jsteemann 36f05c07e0 cleanup of server options 2018-07-16 21:38:35 +02:00
jsteemann 44c7b1b476 remove tabstops 2018-07-16 15:00:12 +02:00
Michael Hackstein 7a95c5e675
Feature/feature phases (#5272)
* Added feature phases

* BasicsPhase and DatabasePhase to the required files. Server now has Feature circles and does not boot. Will be sorted out later on.

* Added ClusterPhase to features

* Added V8Phase to the required features

* Added AQLPhase to the affected features

* Added ServerPhase to Features

* Added FoxxPhase to the relevant features

* Added AgencyPhase to the relevant features

* Moved registration from local variable SYS_SYSTEM_REPLICATION_FACTOR from cluster to V8 as their ordering is now vice versa

* Moved Bootstrap feature into FoxxPhase. It could be moved to ServerPhase easily if the FoxxQueue dependency would be removed

* Final movement of Startup Phases. Now solved all circles.

* Removed merge conflict

* Moved ReplicationTimeout into cluster phase and fixed cross-phase requirements

* Added greetings phase. This phase separates the Basics Phase and is the first to be run. Includes Logger and Hello/Goodbye

* Added the GreetingsPhase in the corresponding features. Now all BasicsPhase features start after greetings Phase. There is some issue in this branch which prevents the Agency from Gossipping right now. Will be fixed next

* Moved creation of the Agent into the prepare phase of the feature. THereby it is guaranteed that agents at least exists before the GeneralServer is activating endpoints

* Recovery needs to be started after the ServerID

* Moved log output of FeaturePhases to DEBUG instead of ERROR.

* Added feature phases for clients

* ClusterFeature now does not directly require AgencyFeature any more

* Added requirement of TravEngineRegistryFeature in AQL feature. Otherwise shutdown may be undefined

* The ApplicationServer can now handout the list of ordered features. Used for testing purposes

* Fixed IResearchVew Tests Setup to honor new feature ordering

* Fixed IResearchViewDBServer Tests Setup to honor new feature ordering

* Started fixing IResearchView Coordinator tests with startup ordering. Not finished yet

* Added startup phases to ViewCoordinator test

* Disabled expected logoutput in ClusterRepairsTest

* Fixed indention in test code

* LinkCoordinator now honors startup ordering

* Link meta now honors startup rdering

* Supress expected cluster logs in ViewTest

* Removed '#' accidentially added.
2018-07-16 14:09:36 +02:00
Max Neunhöffer 014c3f7f53
Only load Plan and Current in ClusterInfo when actually needed. (#5649)
* Only update Plan and Current from Agency if not already done.
* Add read protection for getPlanVersion and getCurrentVersion.
* Add a further check to loadPlan and loadCurrent.
* Fix tests to new behaviour.
* Try to increase Plan/Version and Current/Version with every change.
* Add two more increments of Plan/Version
* Add missing increments in tests for Plan/Version.
* Add changelog entry.
2018-07-16 12:20:13 +02:00
Jan 2c83454066
Bug fix/fixes 1307 (#5878) 2018-07-13 20:48:21 +02:00
Simran 34ec56d421 Feature/misc spelling corrections (#5164) 2018-07-13 13:06:20 +02:00
Jan 208f1297e1
Bug fix/fixes 1007 (#5815) 2018-07-10 13:47:04 +02:00
Tobias Gödderz fc3e11dbbc Async AQL (#5806)
* Modified header to new initializeCursor API

* Adapted initializeCursor to DONE/WAITING API. Compiles but not tested and no one reacts to WAITING state, it is not returned anywhere yet

* Subqueries now expect a WAITING return from initilize cursor. However they will just return a nullptr and pretend the query is empty, this will be fixed later

* First attempt to simulate thread waiting over information within the query

* Small fix to allow for isDirect handlers to go to sleep.

* Waiting in the necessary places now for the async request to be send.

* Thank you auto-casting compiler, you are totally right i absolutely wanted to use this bool value as an index in may Array. How could i possibly not want to use it here?

* Include cond-var header

* Fixed mutex/cond_var usage

* Added oldAPI wrappers in AQL Blocks for get/skip some variants. This Commit compiles but is NOT tested

* Let getSome now return unique_ptr of AqlItemsBlocks. Also implemented the async variant of getSome in subqueries.

* Removed all references to OLD implementations in AQL. only the base wrappers are allowed to call OLD functions from now on. Now the testing part starts

* Fixed endless virtual recursion

* Implemented new getOrSkip API in SortBlock

* Implemented new getOrSkip API in LimitBlock

* Initilaize all variables

* Fixed logic bug in SubqueryBlock

* getBlock in ExecutionBlock now returns a state. All blocks need to handle this properly!

* Createad a wrapper getBlockOld that servers the old sync api and is used now in AQL. To be replaced overtime.

* Added IndexBlock::skipSome and IndexBlock::getSome

* getBlock now returns its old return value along with the state

* Switch from getBlockOld to getBlock in IndexBlock::skipSome

* Switch from getBlockOld to getBlock in IndexBlock::getSome

* ShortestPathBlock::skipSome is not implemented! Added a regression test

* Attempt to fix SubQueryResult memory management

* Fixed LIMIT Block

* Moved from ShortestPathBlock::getSomeOld to ::getSome

* Implemented ASYNC api on SingletonBlock

* Adapted EnumerateCollectionBlock to new async API

* Fixed FilterBlock and adapted return block to async API

* Adapted NORESULTS block to async AQL api.

* Adapted Modification Blocks to async API

* Fixed some initialize cursor functions to reset values required during get/skipSome

* First steps to adapt ClusterNodes to Async AQL api. Not there yet, need to implement the core still

* Added asnyc implementation for xxxForShard in ClusterBlocks. This commit changes internal logic of _doneForShard. Needs additional testing as soon as everything is in place.

* Adapted CalculationBlock to async API

* Adapted TraversalBlocks to ASYNC Aql. This is not optimal yet, we need a better decission if we are DONE or not on RETURN

* Adapted EnumerateListBlock to Async AQL api

* Adapted RemoteBlock to ASYNC API in getSome/skipSome. The whole thing is now LIVE in the cluster. Exetensive testing to be started now

* Fixed IndexBlock WAITING behaviour if Waiting occurs during a index processing

* Adapted IReasearchViewBlock to ASYNC AQL API

* Fixed SortingGatherBlock in WAITING state.

* Adapted IResearch ExecutionBlockMock to Async API

* Unified the HASMORE/DONE distinction. Code is much more readable now and harder to get incorrect 👍

* Implemented tonly heoretically reachable function of non void function.

* Fixed last commit

* Added inline TODO comments

* fix warning

* Fixed a clearing logic bug in RemoveNodes

* Fixed Error Handling in RemoteBlocks. Also fixed a logic bug (true/false simply has a 50% chance of getting it wrong) in Distribute and Scatter.

* remove unused methods

* Fixed failure test

* implement skipping

* Moved the Query Waiting out of the ExecutionEngine.

* changed one of the collect blocks

* Removed _upstreamState from ExecutionBlockMock, that is in the base-class now

* Added a Test Mock for a an ExecutionBlock that simulates the WAITING/HASMORE/DONE api.

* do not check "hasMore" if not necessary

* Added DistinctCollectBlock::getOrSkipSome from ~Old and changed its return type

(still uses getBlockOld)

* Save state to resume in DistinctCollectBlock::getOrSkipSome

* Extracted redundant code

* fixed some ops

* added one more test

* fix endless blocking

* fix compile error

* fix test

* Refactored HashedCollectBlock::getOrSkipSome

* Return blocks to the manager

* Replaced usage of getBlockOld in HashedCollectBlock::getOrSkipSome

* remove unused shutdown calls, simplify ownership for expressions

* Removed superfluous variable

* Capture const variable by value

* Removed SortedCollectBlock::getOrSkipSomeOld in favour of getOrSkipSome

* Added a working version of SortedCollectBlock::getOrSkipSome

Has yet to be cleaned up

* Removed isTotalAggregation special treatment

* On no input, return a group of nulls (instead of no group at all)

* Bugfixes

* Simplified code

* Move return to the end, eliminate duplicate code

* Corrected skipped count in HashedCollectBlock

* Aligned getNextRow() implementations

* Added comments

* some cleanup

* fix potential memleak

* Bugfix

* Fixed failure tests

* Removed usage of getBlockOld in ExecutionBlock::getOrSkipSome

* Replaced hasMore with an async implementation (mostly)

* Removed getBlockOld()

* Added hasMoreState to the AQL API (and renamed hasMore methods to hasMoreState)

* RemoteBlock now uses the async hasMoreState route

* remove job queue

* options

* Bugfixes in the async implementation of LimitBlock

* LimitBlock::getOrSkipSome now always skips when calculating the fullcount

* fix compile warnings

* restrict threads

* Fixed api of Waiting ExecBlockMock. Unused yet

* Made SortedGatherBlock async-capable

* Removed nonEmptyIndex hack

* Removed duplicate traceGetSome~ calls, moved all to getSome

* Added asserts before replacing getNr*Registers

* Added a TODO note and a comment

* Removed getSomeWithoutRegisterClearoutOld()

* Removed skip()

* Removed common code by using getNr*Registers()

* Use getNr*Registers() in the TraversalBlock as well

* started to add lane

* started to add lane

* added lane

* completed lane

* removed debug output

* fixed merge

* Began working on a test suite for AQL tracing/profiling

* Added more tests and asserts in aql-profiler

* Made some ExecutionBlocks final

* Added a type enum to all blocks and the per-block stats

* Add block type to stats nodes when tracing AQL on block level

* Removed initializeCursor call from instantiateFromPlan

* Avoided additional getSome calls after DONE

* Added more profiler tests

* Refactored ExecutionBlock::getOrSkipSome and fixed two bugs

- set _upstreamState also when skipping
- explicitly use xecutionBlock::getHasMoreState()

* Bugfix: update state

* Reuse parent _skipped wherever possible; rename where not (LimitBlock)

* Simplified SortedCollectBlock::getOrSkipSome and reused general pattern & code

* Implemented missing virtual function (with USE_FAILURE)

* Reset neccessary values during initializeCursor

* Simplified code in EnumerateListBlock a little

* Added a test for DistinctCollectBlock in aql-profiler

* Avoid redundant getSome calls in DistinctCollectBlock

* fix compilation

* Fixed DistinctCollectBlock profiler test

* Added a second profiler test for the DistinctCollectBlock

* Added a profiler test for EnumerateCollectionBlock

* Bugfix in EnumerateListBlock

* added --server.fifoN-size

* Simplified EnumerateCollectionBlock::getSome

* Simplified EnumerateCollectionBlock::getSome, and return HASMORE less often when DONE

* Fix testEnumerateCollectionBlock1 for mmfiles

* do not pass by reference

* Fixed compile error

* fixed merge conflicts

* Added profiler tests for EnumerateCollectionBlock

* Test fix for mmfiles

* Fixed IResearch tests

* Bugfix in DistinctCollectBlock and a regression test

* Updated comment

* Bugfix for query statistics in cluster

* Check plan in distinct test

* Fix aql-profiler tests in cluster

* Removed unused line / bugfix for single server test runs

* This commit implements waking up of AQL queries. (#5651)

* Non-compiling intermediate commit for handover.

* Make branch compile again

* Started implementation of continueable rest cursor handler by moving the callbacks to the outer part. This is not yet fully tested!

* Made finalizeExecute noexcept. We cannot react to this errors as the response was potentially written before. Also introduced continueExecution in the RestHandler engine.

* First successful query wakeup.

* The wakeup callback now posts on the scheduler directly. A resthandler only needs to provide a callback that encapsulates the continueExecution call on this handler

* renamed finalizeExecute to shutdownExecute

* Added a differentiation between Handler and Callback in Query continuation. Handler will be posted in IO service. Callback will be executed directly

* fix audit log

* Removed callback from deleteQueryCursor. This cannot be waiting

* use CONDITION_LOCKER

* removed yet another thread-local variable

* Fixed forward declaration

* Made RestAqlHandler repeatable

* Use defer to close the query in RestAqlHandler. Now waiting will close the query as well.

* Added a mutex in the RestHandlers to make sure if the callback over network is too fast that there is only one Thread running in the RestHandler

* Captured the GeneralCommTask if it is posted to a RestHandler. This is necessary in the PAUSED case

* Refactoring of _noLockHeader responsibilities. Now the BaseHandler selects them and resets them after it is done. Only Coordinators are allowed to define them if a query is loaded.

* Removed reaction to existing nolockheaders in Coordinator Query Planning Phase

* Removed incorrect assertion.

* Further refactoring of NoLockHeaders. Now there is a wrapper class around it which allows for debugging and logging. The state now seems to be better. Also all non-rest-handler triggered queries clean up the NoLockHeaders properly.

* Fixed UserManager, now deletes nolock headers properly

* Swing to the Symphony of Destruction

* Forgot about community build...

* Fixed compiling of Catch tests

* Fixed community build

* need thread for size

* Made the restSimpleHndler repeatable

* Implemented dump and dumpSync in Cursors, Sync will block a thread, dump allows to wait, only relevant for Streaming cursor

* Reactivated StreamingCursors

* Removed debug output.

* Fixed false query continuation

* Reset thread output to non-debug

* Added missing return statements

* Allow some CollectionMethods to hand-in a context that may contain a transaction. This is meant to honor nolock headers.

* Fixed hidden merge conflict

* Bugfix in aql-profiler.js: use plan.nodes order, not stats

* Added two profiler tests for filter

* Avoid too many getBlock calls in the FilterBlock

* Removed debug output

* RemoteBlock API will now send a done(bool) flag whenever we request documents from remote Servers. It is possible that we are DONE and have a result. The pre 3.4.0 API uses exhausted which is exclusive to a result. This API is still implemented for beckwards compatibility.

* Implemented an executeSync function in AqlQuery. This will block the thread until query execution is complete

* Added another test for FILTER, and one test for the HashedCollectBlock

* Added more tests for HashedCollectBlock; avoid unneccessary getSome calls

* Added an profiler IndexBlock test

* IndexBlock: avoid redundant getSome calls, added missing traceGetSomeEnd calls

* Added a second test profiling IndexBlock

* Added a third test for IndexBlock

* Moved general code to module

* Moved noncluster tests into a separate file

* Split aql-profiler testsuite into three files

* Added profiler tests for LimitBlock

* Added a test for NoResultsBlock

* Added profiler tests for TraversalBlock

* Shutdown of an AQL query is now asynchronous. However in Error-Cases it will be executed in a blocking way still

* Optimized TraversalBlock getSome calls due to new (nightly) test results

* Fixed std::min calls I broke

* Let shutdown calls in AQL wait, if the query is executed successfully.

* Fixed queryResult going out of scope

* fix compile error through merge conflict with devel

* Fixed compiler warning "mismatching tags"

* Removed debug log output

* Added TODO notes

* Fixed test fail due to devel merge

* Fixed some invalid sync waiting implementations

* Added a profiler test for SortBlock

* Added profiler tests for SortedCollectBlock

* Fixed bug introduced by devel merge

* Fixed Remoteblocks ignoring errors!

* Added some more continue Callbacks in used places. And removed debug log

* Removed debug log output

* Suppress clang warnings

* Bugfix: use of invalid stack pointer

* Bugfix: RemoteBlock::shutdown now sends code as int, not string

* Revert "Suppress clang warnings"

This reverts commit 05591649c59743c992edd5e78814edc8ca2a83e0.

* Bugfix: cleanup state in RemoteBlock ::shutdown, ::getSome and ::skipSome

* Bugfix in Subquery shutdown: don't skip subquery shutdown when main query shutdown failed

* Allow copy elision
2018-07-09 14:24:10 +02:00
Jan 04d19ccc1b
do not dereference potential nullptrs (#5729)
in ClusterComm::fireAndForgetRequests but use safer accessor methods
2018-07-05 09:50:56 +02:00
Dan Larkin-York 8b0cb1c657 Restrict cursors to generating user (#5744) 2018-07-03 17:44:15 +02:00
Dan Larkin-York 21e16a8a24 Add load balancer awareness for cursor API (#5682) 2018-07-03 14:29:09 +02:00
Simon 545561e9a9 Read only server (#5652) 2018-07-03 09:58:16 +02:00
Vasiliy 7aaeab50fb issue 402.1: share sync thread between IResearchView and IResearchViewDBServer (#5733) 2018-07-02 15:03:00 +03:00
Jan 5e8e1e71c4
don't hang in shutdown while creating a collection or an index (#5710) 2018-06-29 11:21:00 +02:00
jsteemann 4e5c4413ea constify API 2018-06-26 23:37:25 +02:00
Simon ec0d2a1b7b Remove Coordinator DBs (#5661) 2018-06-25 19:18:11 +02:00
Andrey Abramov 5eef6cd618
Feature/test iresearch (#5610)
* start implementing arangosearch cluster tests.

* backport: ensure view lookup is done via collectionNameResover, ensure updateProperties returns current view properties

* first attempt to fix failing tests

* refactor cluster wide view creation logic

* if view is not found in the new plan then check the old plan too

* ensure the cluster-wide view is looked up in vocbase as well on startup/recovery

* do not store cluster-wide IResearchView in vocbase

* move stale view cleanup to the shared pointer deleter, address test failures

* do not print warning

* enable arangosearch tests by default

* fix catch tests

* address icorrect return value for cluster-wide links

* address some issues with test failures due to cluster-view allocated within TRI_vocbase_t

* simplify per-cid view name, address 'catch' test failures

* ensure IResearchViewNode volatility is properly calculated in cluster

* invoke callbacks directly in AgencyMock instead of waiting for timeout

* ensure view updates via JavaScript always use the latest view definition

* pass a list of shards to `IResearchViewDBServer::snapshot`

* extend cluster aql tests

* fixes after merge

* fix class/struct inconsistencies

* comment failing tests

* remove debug logging

* add debug function

* tests cleanup

* simplify upcoming merge: pass resolver from a side

* backport: move all transaction status callback logic to Methods

* add changes missed from previous commit

* fix js and ruby tests

* more tests for IResearchViewNode

* pass transaction to IResearchViewDBServer::snapshot, address IResearchViewDBServer tests segfault

* pass transaction to IResearchView::snapshot instead of transaction state

* temporarily add trace log output to tests to try to find the cause of the core dump on Jenkins

* add more temporary debug output to trace down the segfault on Jenkins

* add even more temporary debug output to trace down the segfault on Jenkins

* ensure Vieew related maps are cleared during shutdown

* reset ClusterInfo::instance() before DatabaseFeature::unprepare()

* remove extraneous debug output

* missed line from previous commit

* uncomment required line

* add nullptr checks to RocksDBIndexFactory::prepareIndexes(...) similar to the ones in MMFilesIndexFactory::prepareIndexes(...)

* attempt to fix deadlock in tests

* add comment as per reviewer request

* fix aql test suite name

* add some debug logging

* address deadlock between ClusterInfo::loadPlan() and CollectionNameResolver::localNameLookup(...)

* eplicitly state which index definition failed in the log message

* use vocbase from shard-view isntead just in case

* explicitly state which index definition failed in the log message

* do not create shard-view instances from cluster-link instances (only register existing ones)

* add some tests
2018-06-21 20:35:04 +03:00
Jan 8a25558e62
fix test (#5618) 2018-06-15 15:53:23 +02:00
Wilfried Goesgens c4c711f372 fix windows compile (#5611) 2018-06-14 19:18:54 +02:00
Simon 3bec336aff TransactionState::addCollection refactoring (#5606) 2018-06-14 15:34:58 +02:00
Jan 08b12942bd
fix AQL DOCUMENT lookup in case a collection has multiple shards and custom shard keys (#5602) 2018-06-14 12:20:06 +02:00
Jan 448a435713
clean up key generators a bit (#5573) 2018-06-12 11:28:38 +02:00
Vasiliy 4253dca6aa issue 381.5: ensure the LogicalView definition that is persisted to the Agency matches the definition that gets created (#5518)
* issue 381.5: ensure the LogicalView definition that is persisted to the Agency matches the definition that gets created

* backport: correct comment
2018-06-02 17:21:55 +03:00
jsteemann bc87837778 simplifications 2018-05-27 19:47:48 +02:00
jsteemann a14b67f584 use nullptr 2018-05-25 18:55:37 +02:00
Jan Christoph Uhde a2dcb6cc5d WIP - start adding optional overwrite to insert operation (RepSert) (#5268) 2018-05-24 19:47:15 +02:00
Simon 332a7958f5 Cleanup cluster selectivity (#5440) 2018-05-23 18:00:14 +02:00
Jan 8e6d5df129
fixed minor several compiler complaints (#5406) 2018-05-23 11:50:00 +02:00
Vasiliy 94ddd7803d issue 389.10: refactor CollectionNameResolver to use TRI_vocbase_t& (#5424) 2018-05-23 00:59:08 +03:00
Simon 35992ad67b Coordinator storage engine (#5405) 2018-05-22 19:30:27 +02:00
Vasiliy d9cda9666f issue 389.8: remove redundant function from Methods, convert Syncer API to user TRI_ocbase_t& wherever possible (#5408) 2018-05-22 16:10:24 +03:00
Kaveh Vahedipour c2c104d6b1 agency pool size had to be larger 1 (#5379) 2018-05-17 11:54:41 +02:00
Vasiliy 843e584746 issue 389.5: refactor StandaloneContext to be constructed with a TRI_vocbase_t& (#5370)
* issue 389.5: refactor StandaloneContext to be constructed with a TRI_vocbase_t&

* backport: address build issues
2018-05-17 01:15:50 +03:00
Vasiliy 6a53154160 issue 389.2: use static strings for Index definition json attributes, use TRI_vocbase_t references instead of pointers in V8Context, use TRI_vocbase_t references instead of pointers in DatabaseInitialSyncer (#5344) 2018-05-14 19:06:24 +03:00
Simon 17b1a2aafb Rest middleware refactoring (#5332) 2018-05-14 17:43:10 +02:00
Simon f2b952134f Fixing agency pool update (#5316) 2018-05-14 14:56:19 +02:00
jsteemann cdfac2049a Merge branch 'bug-fix/remove-unused-class-in-cluster-selectivity-estimate' of https://github.com/arangodb/arangodb into devel 2018-05-09 18:15:54 +02:00
Vasiliy 0d1cf45097 issue 373.3: use TRI_vocbase_t& for Upgrade tasks, remove redundant checks for null TRI_vocbase_t (#5301) 2018-05-09 15:54:07 +03:00
Kaveh Vahedipour 6580ba11cd removal of unused class Cluster/SelectivityEstimate 2018-05-08 12:16:21 +02:00
Michael Hackstein 4ede998186
Bug fix/compile issues on mac (#5280)
* Mvoved unnecessary std::move

* Replaced size_t by uint64_t

* Fixed usage of <array>
2018-05-08 11:07:16 +02:00
Tobias Gödderz 8c87f51429 Feature/fix inconsistent distribute shards like job (#4743) 2018-05-07 16:53:08 +02:00
Simon fdee0544b7 Using asio::io_context::strands instead of locks (#5266)
* initial try adding strands

* working, stable amount of threads

* improve shell_client cluster

* Fixing some accounting for the scheduler

* Fix accounting

* Fixing wrong strand usage

* add missing return

* Fixing thread accounting

* More scheduler accounting issues

* Fixing various things

* Fixing some stuff

* Fixing some stuff

* Some more subtle bugfixes

* Some cleanup code

* fixing some stuff

* adding some more fixes

* Fixing possible issues

* Fixing missing _storeResult

* Fixing some stuff

* Reducing lambda stack, perhaps fixing hangups

* Fix writeunlocker

* Fixing possible issues

* adding some debugging stuff

* refactor sockets

* possible fixes

* Adding more job guards

* Fixin possible bug

* cleaning up some stuff

* working impl

* Remove debugging output

* Fixing build

* fixing import

* Fixing another bug

* removing debug log

* Removing examples

* Reverting scheduler working code

* Cleanup

* Addressing review comments
2018-05-07 15:58:19 +02:00
Andrey Abramov b69b5bdfdf
Bug fix/issue #5186 (#5269)
* do not persist view on startup

* small refactoring

* ensure view is being opened after creation
2018-05-06 20:38:32 +03:00
Simon 828f1d423c S2 based Geo-Spatial index (#5249) 2018-05-02 23:54:41 +02:00
Matthew Von-Maszewski 01cc6d2159 adjust thread crash logging for simple db server case. (#5234) 2018-05-02 22:35:20 +02:00
Andrey Abramov 4649b40b96
Coordinator ArangoSearch view + Execution nodes + AgencyMock (#5160)
* add initial implementation of scatter view rule and node

* add tests for `IResearchViewNode` and `IResearchViewScatterNode`

* add missing check

* modify IResearch execution nodes to use references instead of pointers

* use view id in searialized `ExecutionNode` representation instead of the name

* add cluster mode stubs and checks

* very first attempt to distribute IResearchViewNode

* further implementation of cluster-wide arangosearch views

* fix invalid json format

* add tests for coordinator iresearch view

* allow to retrieve a list of existing views on a coordinator

* more tests for coordinator iresearch view

* some fixes to enable query explanation

* remove Collection dependency from RemoteNode

* remove unnecessary remote ArangoSearch view scatter

* fix explanation appearance

* add some assertions

* minor fixes

* implement IResearchViewCoordinator::updateProperties

* fix view DDL issues

* handle link modifications in DDL operations

* add coordinator implementation of iresearch view links

* fix tests

* further coordinator based view DDL implementation

* further IResearchViewCoordinator implementation

* add initial implementation of AgencyMock

* fix some tests

* code cleanup

* extend test + some fixes

* more tests for IResearchViewCoordinator

* fix tests for IResearchLinkCoordinator

* some fixes after merge

* fix tests

* remove declaration of nonexistent (previously removed) method

* some fixes after review

* remove string duplication

* more tests and fixes

* more fixes and tests

* more tests

* one more test

* fix 'use-after-free' asan error

* fix non-enterprise tests issues
2018-05-02 00:15:11 +03:00
Simon a1416e1067 Make v8 optional on startup (#5220) 2018-04-30 12:48:57 +02:00
Jan 30b12e311b
Bug fix/remove most of aql js (#5223) 2018-04-30 11:17:11 +02:00
Simon 2e0e2574d4 Port from 3.3 (#5213) 2018-04-27 17:05:18 +02:00
Matthew Von-Maszewski a67df088b0 correct race condition leading to infinite job execution (#5201)
* fix infinite loop by setting _lastSyncTime within runBackgroundJob().  add code to make agency callback ignore _lastSyncTime limit.
* create change notes for this PR and previous PR 5114
2018-04-27 13:43:22 +02:00
Wilfried Goesgens ea507bb27c use std::chrono and std::date for date / time formatting (#5194) 2018-04-24 18:23:04 +02:00
Simon 468231efc5 AQL Profiling code (#5165)
* initial start of profiling

* adding profiling code

* Fixing remote block tracing, fixing width and units

* Fixing some tests

* Various fixes

* adressing review comments
2018-04-24 16:17:30 +02:00
Wilfried Goesgens 7d6e580780 Refactoring & code cleanup (#5138) (#5142) 2018-04-24 14:42:23 +02:00
Matthew Von-Maszewski a84f7805ad Feature/mv thread death logging (#5111)
* Initial low level interface for thread crash reporting (and management).
* Add a member version of isClusterRole()
* isolate heartbeat thread creation to new StartHeartbeatThread().  create heartbeat thread even if not a cluster or if an agent.
* update runDBServer() and runCoordinator() to shutdown more quickly by polling isStopping() at additional locations.
* copying updates from different branch / PR
* basic thread crash logging.  Not yet tied into Agency arangod or have any specific threads posting crashes
* make Supervision thread a CriticalThread
* sandwich CriticalThread between Thread and other classes to create long term, repeating thread crash reporting.
* restore code lost upon branch update relating to new startHeartbeatThread() function
* add CriticalThread.cpp to build
* add new runAgentServer() function to loop for Agents.  Make Heartbeat thread derive from CriticalThread.
* remove debug line
2018-04-23 15:50:14 +02:00
Vasiliy 012aaa9469 issue 383.4: push vocbase validity check up from Query constructor out into arangodb::consensus::State, StatisticsWorker and AQLUserFunctions calls (#5177) 2018-04-23 14:52:42 +03:00
Simon 45fbed497b Supervision Job for Active Failover (#5066) 2018-04-23 12:49:41 +02:00
Jan 2b84348b77
remove call to requiresElevatedPrivileges with default value (#5172) 2018-04-23 11:28:24 +02:00
Vasiliy 9062c41592 issue 383.3: implement remainder of IResearchViewDBServer tests, use the data-source id (primary key) instead of an arbitrary instance for dropCollection()/dropView(), backport from iresearch upstream: ensure block is flushed if key index is full (#5176) 2018-04-23 00:33:46 +03:00
Matthew Von-Maszewski dd03ca5dd8 shutdown quicker on db server and coordinator heartbeat threads (#5114)
* shutdown quicker on db server and coordinator heartbeat threads

* Adjust the new condition variable usage to follow normal coding patterns.  Probably was ok.  But why take the chance.  And simplify future maintenance.
2018-04-20 15:00:08 +02:00
Kaveh Vahedipour 3d043b35a3 Feature/supervsion maintenance mode (#5108)
* Supervision goes to Maintenance mode, when /arango/Supervision/Maintenance exists
* coordinator route stands
* stop updates in transient, when supervision off
2018-04-20 13:23:22 +02:00
Wilfried Goesgens 2b1ba8c524 fix windows warnings (#5115) 2018-04-17 11:44:45 +02:00
Simon 8be273efb8 Replication cleanup (#5105) 2018-04-17 08:17:42 +02:00
Simon 7677afabf1 Remove copy of request body in rest handlers (#5104) 2018-04-16 14:49:51 +02:00
Andrey Abramov 6eaaf6abd2
Merge pull request #5103 from arangodb/bug-fix/internal-issue-#374.3
issue 374.3: use a reference to vocbase instead of a pointer in DatabaseGuard
2018-04-13 15:52:01 +03:00
jsteemann 690dd2186d remove unused instance variable 2018-04-13 12:34:31 +02:00
Jan 76dcd6ded5
added option `--cluster.require-persisted-id` (#5001) 2018-04-13 11:08:49 +02:00
Vasiliy f392925903 issue 374.3: use a reference to vocbase instead of a pointer in DatabaseGuard 2018-04-13 09:56:49 +03:00
Vasiliy d1ce3a97ef issue 355.7: ensure LogicalDataSource::vocbase() returns a reference 2018-04-09 15:38:24 +03:00
Vasiliy e4368b0991 issue 355.6: remove create() from LogicalView, remove IResearch dependency from IndexFactory, store vocbase reference in LogicalDataSource 2018-04-06 16:38:34 +03:00
Jan a2f8077cd4
fix restoring of smart graph edge collections so it does not time out (#5049) 2018-04-06 14:14:52 +02:00
Vasiliy 99b83ba8c8 issue 355.5: remove more unused methods, move view-related storage engine functionality from vocbase into DBServerLogicalView, address MSVC cmake dependency issue 2018-04-05 16:17:07 +03:00
Andrey Abramov f6c27ce073
Merge pull request #5017 from arangodb/bug-fix/internal-issue-#355.4
issue 355.4: remove redundant methods and code, use 'cp' instead of 'cmake copy_directory' where possible, use vocbase reference instead of pointer
2018-04-05 14:27:44 +03:00
jsteemann 0bef01d85f add missing call to sendState 2018-04-04 17:29:18 +02:00
Vasiliy 635db3b409 issue 355.4: remove redundant methods and code, use 'cp' instead of 'cmake copy_directory' where possible, use vocbase reference instead of pointer 2018-04-04 10:53:48 +03:00
Jan 7cb115a1a9
remove option `--cluster.my-local-info` (#4999) 2018-04-03 17:34:08 +02:00
Andrey Abramov 57b5823fcf some fixes after review 2018-03-30 14:34:15 +03:00
Andrey Abramov 3eadf5aaf9 add convenient factory method for views 2018-03-29 20:19:03 +03:00
Max Neunhoeffer 69d6d2eb0f
Fix compilation. 2018-03-29 16:24:29 +02:00
Max Neunhoeffer fd6fef6b9b
Try to fix createView in cluster. 2018-03-26 12:57:30 +02:00
Max Neunhoeffer 790824fd68
Merge remote-tracking branch 'origin/devel' into feature/arangosearch-cluster-views 2018-03-26 10:50:23 +02:00
Max Neunhoeffer a84d758105
Increase a log level to warn (as in 3.3). 2018-03-23 12:47:40 +01:00
Andrey Abramov 41eb649556
Merge branch 'devel' of https://github.com/arangodb/arangodb into bug-fix/internal-issue-#345 2018-03-21 21:02:17 +03:00
Jan cd219bdfa1
increase timeout for index creation (#4915) 2018-03-21 18:17:38 +01:00
Max Neunhoeffer 0d3e6fc834
Merge remote-tracking branch 'origin/devel' into feature/arangosearch-cluster-views 2018-03-21 09:58:52 +01:00
Andrey Abramov 04bb3da337
Merge branch 'devel' of https://github.com/arangodb/arangodb into bug-fix/internal-issue-#345 2018-03-20 19:04:54 +03:00
Michael Hackstein c1650702bf
Feature/aql server based locking (#4783)
* Started Implementing the ServerBasedlocking. There now is a container that can contain multiple query snippets. It now has to setup the necessary calls to the Servers

* Added backwards linking of QueryEngines, sth. DBServers can contact their Coordinators.

* Added LogTopic AQL

* Made AccessMode::Type Hashable

* Created a Mapping Server => LockLevel => Shard and createad a JSON object containing the Lock information for a complete AQL query per server

* Added code to build coordinator engines

* Finished with first draft of Coordinator-side of new DBServer based locking.

* Added a _api/aql/setup route that creates and locks all snippets/collections for one DBServer in a single go

* Fixed some Coordinator parts

* Index node now gracefully reports if it could not find it's collection when created from vpack. Otherwise it just hardly crashed...

* Modified the Coordinator Snippet collector to be able to handle subqueries properly.

* Started adding GraphNode handling. WIP. Need to deploy engines properly. Coordinator crashes on Graph tests

* Fixed compiler errors

* WIP: EngineInfoContainer

* Separated the EngineInfoContainers for Coordinator and DBServer into different files. They diverged more than anticipated

* Added forgotten files. THe DBServer container now creates the TraverserEngine Mapping and moves it into the Infos. They are not keeping it yet and need to add it to the message as well.

* The DBServer engine infos now persist the TraverserEngine infos. Need to add them to messages though.

* The new aql exec-engine now sends out traverserEngines as well

* Formatting and adding DEBUG level output

* Made the RestAQLHandler aware of the TraverserEngineRegistry. Also created the engines now. Return format changed server-side coordinator side needs fix.

* Adapted the Coordinator side for the DBServer based Shard Locking

* The DBServer based Locking now honors restrictions to certain shards

* Fixed a strange double lock bug in the new AQL Server based locking technique. Add some DEBUG output

* Fixed usage of MAINTAINERMODE macro. The assertion was never active

* Added TestCase for ContainerCoordinatorTest to cmake

* Added -DTEST_VIRTUAL to CMAKE. This is used to define virtual functions for mocking ONLY on test-builds.

* Fixed usage of ENABLE_MAINTAINER_MODE ifdef. CLANG format

* On non-enterprise builds ENTERPRISE_VERT defaults to TEST_VIRTUAL => virtual in test else non-virtual

* Added TEST_VIRTUAL to ExecutionEngine, Query and QueryRegistry

* Added first testcase for EngineInfoContainerCoordinator not yet ready.

* Mode CreateBlock a member function of engine, we have the engine in our hands anyways no need to make it static. Included some more TEST_VIRTUAL functions.

* Fixed clang/MacOs compile error. Added some more TEST_VIRTUAL declarations

* Finally fixed the first buildEngines UnitTest \o/

* Added a unit-test for backward linking of dependencies in CoordinatorPlanner

* Added multi-snippet test for EngineInfoContainerCoordinator

* Removed QueryRegistry.h from central header files and replaced by a forward declartion.

* Added a createBlocks method on the ExecutionEngine. It should be responsible to create all those blocks at once. Adapted the UnitTests as well. Not included Tests for the new createBlocks functionality. Need to mock the options feature first

* Added another test that Coordinator Snippets of queries can be created correctly

* Fixed Coordinator-site cleanup of QueryRegistry, if any of the query creations fails with error, incl UnitTest.

* Added first test for RestAqlHandler::setup. It does only test the setup and gives prepartion for real testing.

* Added a assertion of http return code. Still no creation of queries is tested. Requires a huge amount of mocking.

* fix some deadlocks found by evil lock manager (tm)

* fix duplicate lock

* fix indentation

* ensure proper lock dependencies

* fix lock acquisition

* removed useless comment

* do not lock twice

* create either a V8 transaction context or a standalone transaction context, depending on if we are called from within V8 or not

* AQL micro optimizations

* use explicit constructor

* only use V8DealerFeature's ConditionLocker for acquiring a free V8 context

entering and exiting the selected context is then done later on without having to hold the ConditionLocker

* remove some recursive locks

* Disable custom deadlock detection when Thread Sanitizer is enabled

* Changing ifdef's

* grr

* broke gcc

* Using atomic for ApplicationServer::_server

* fix premature unlock

* add some asserts

* honor collection locking in cluster

* yet one more lock fix

* removed assertion

* Allow the clustercomm to send nolock headers on count. This is used form within AQL

* IsLocked on transactions will now always yield true IF LOCK_NEVER is set. We simply assume someone else holds the lock for us. Also LOCK_NEVER is now set on collection/count if noLock header is send.

* Moved the flag if collections need to be locked into the TraverserEngines.

* Added enterprise-satellite hooks in EngineInfoContainerDBServer

* Removed now obsolete code

* Replaced throwing of Exception by an ResultObject

* Added some more tests and moved adding snippet to query engine more to the outside.

* Added the AQL result type

* Make the branch compile again

* Register WITH collections for Graphs in the new Collector.

* Fixed test code for failing query clone. Idea was to once clone successfully and second time to fail, we verify that first clone is cleaned up properly. However test failed on first clone...

* Removed a double builderClose

* Added Changelog entry

* Removed empty if

* Removed obsolete todo

* Properly initialize the AqlResult with nullptr on error case

* Updated comment

* Simplified Assertion

* Removed debug output object...

* Added additional catch case for std::exception to get some more error info

* Clarified evaluation order for move case

* Added Explicit

* Fixed cleanup of Coordinator if Registry fails to insert query.

* Allow to use other locks than Read/Write for AQL collections. Not yet in API.

* Updated Comments for other Locks on DBSide. Adapted Destruction CatchTests

* Fixed double builderClose and removed unnecessary double commits

* Added a comment to clarify the state

* Moved error output to trace. Leftover from debugging

* Added some tests for complex subquery patterns

* Added a 'fireAndForgetRequests' methods to cluster comm which allows to send out a bunch of messages but does not wait for their results

* Properly cleanup leftovers of queries if the instantiation step already failed

* Added code comment for fireAndForget

* Added indexes to subquery test to make the plan a bit easier

* The cleanup on DBServerEngines in error case now also cleans up traverser engines.

* Removed unnecessary includes

* Removed debug logging

* Fixed hidden merge conflict
2018-03-20 16:52:19 +01:00
Andrey Abramov 3a30c85e49
Merge branch 'devel' of https://github.com/arangodb/arangodb into bug-fix/internal-issue-#345 2018-03-20 15:59:44 +03:00
Jan 71302170c8
fixed internal issue 2102: Segfaults in arangod console when using ag… (#4874) 2018-03-20 09:17:06 +01:00
Max Neunhoeffer d4616a6063
Merge remote-tracking branch 'origin/devel' into feature/arangosearch-cluster-views 2018-03-19 10:08:47 +01:00
Andrey Abramov 01d9baf359 remove TRI_ERROR_ARANGO_VIEW_NOT_FOUND, rename TRI_ERROR_ARANGO_COLLECTION_NOT_FOUND to TRI_ERROR_ARNANGO_DATA_SOURCE_NOT_FOUND 2018-03-17 19:36:14 +03:00
Vasiliy 06eb8ade01 issue 344.7: remove more redundant functions (#4863)
* issue 344.7: remove more redundant functions

* backport: fix missed functions under USE_ENTERPRISE
2018-03-15 17:10:28 +01:00
Max Neunhöffer 793101528f Repair resilience-sharddist test by increasing timeout. (#4857) 2018-03-15 13:29:55 +01:00
Vasiliy 148bdb7158 issue 344.6: remove some redundant functions (#4842) 2018-03-15 11:03:35 +01:00
Max Neunhöffer 52dd3e82ce
Fix second cluster bootstrap hanger. (#4841)
Call loadCurrentDBServers regularly.
2018-03-15 00:44:15 +01:00
Max Neunhoeffer 0f46598200
Merge remote-tracking branch 'origin/devel' into feature/arangosearch-cluster-views 2018-03-14 23:24:41 +01:00
Max Neunhoeffer e72c8f24fb
Fix compilation. 2018-03-14 23:24:07 +01:00
Max Neunhoeffer ce8db24975
Add methods in ClusterInfo to create and drop views. 2018-03-14 23:22:44 +01:00
Kaveh Vahedipour 2e2d947c1c devel: fixed the missed changes to plan after agency callback is registred f… (#4775)
* fixed the missed changes to plan after agency callback is registred for create collection
* Force check in timeout case.
* Sort out RestAgencyHandler behaviour for inquire.
* Take "ongoing" stuff out of AgencyComm.
2018-03-14 12:01:17 +01:00
Jan c9a3a8b5b9
fix a few blind spots in storage engine selection (#4809) 2018-03-12 13:40:16 +01:00
Jan c4696bfed5 do not mask lock timeout or other errors on WAL flush but report them instead of an "internal error" (#4721) 2018-03-12 09:09:59 +01:00
Max Neunhoeffer 33ce6acc2c
Merge remote-tracking branch 'origin/devel' into feature/arangosearch-cluster-views 2018-03-09 11:39:25 +01:00
Max Neunhoeffer 59785b226f
Fix compilation. 2018-03-08 15:26:41 +01:00
Max Neunhoeffer 28be92ec52
Parse Views hierarchy in loadPlan. 2018-03-08 15:07:46 +01:00
Max Neunhoeffer a8a307b532
Report views in ClusterInfo.
This is incomplete as it is, because we do not yet parse the views
we see in the plan.
2018-03-08 14:24:22 +01:00
Simon 272859c5fd Replacing js upgrade logic (#4061) 2018-03-08 13:57:30 +01:00
Mark 292117e3cf Bug fix devel/bfs filter vertices (#4752) 2018-03-08 09:06:15 +01:00
Jan 5a67a048c5
bump version number for all local DDL changes and tell agency (#4685)
this allows other listeners (e.g. for DC2DC) to get notified when
DDL operations are carried out locally and need to be applied remotely
2018-03-05 17:06:34 +01:00
Simon 345fc3c0b7 Refactor Authentication Layer (devel) (#4592)
* Cherry Picking LDAP changes

* Adding missing merges

* Fixing remaining mentions of FeatureCacheFeature

* Fix jslint

* Fixing some failed tests

* Fixing cluster authentication issue, red tests

* Fixing ldap testsuite, adding trace logging

* Fixint ldap tesuite setup and LDAP recognition

(cherry picked from commit 686d28a779)

* Fixing wrong assert

* Adding changelog entry, making requested changes from code review

* Fixing dump_authentication, fix typos

* improvements found during code review

* oops

* more use of sessionstorage

* fix tests

* Fixing broken handling, disallowing adding of local users when disabled

* Fixing testInvalidGrants

* Removing undefined auth level externally

* Fixing previous commit

* added tests for ldap search mode

* intentionally removed `after` methods from tests

because they are executed before the tests start
no cleanup is performed right now after the authentication tests
however, a cleanup is done at start of every test

* ldap tests all modes

* forward port changes from 3.3

* added generated files

* forward port missing changes for web UI

* added generated files

* added generated files
2018-02-28 13:24:28 +01:00
Michael Hackstein 1f3c4105a0
Now smart edge collections also translate the collectionName of distributeShardsLike (#4568) 2018-02-12 14:42:09 +01:00
Simon 35136a89c0 Fix some problems with active failover (#4540) 2018-02-09 15:11:53 +01:00
Jan b2ceb68205
Feature/small misc optimizations (#4504) 2018-02-08 09:25:07 +01:00
Michael Hackstein 7a5a9a620c
Bug fix/distribute shards like (#4415) 2018-01-29 13:07:06 +01:00
Andrey Abramov a1cfb3d72b Feature iresearch (#4105) 2018-01-19 14:23:58 +01:00
Jan 7f860153ba
Bug fix/msvc fixes (#4243) 2018-01-08 11:20:53 +01:00
Jan b2b6c06cbf
Feature/efficiency (#3736) 2018-01-05 16:51:31 +01:00
Jan 25af4d7f69
try to not fail hard when a collection is dropped while the WAL is tailed (#4226) 2018-01-04 16:31:11 +01:00
Kaveh Vahedipour 7715c75c59 let's not miss failedserver removal (#4208)
* let's not miss failedserver removal
* remove resetting of FailedServers in test code
* Only call abortRequestsToFailedServers at most every 3 seconds.
2018-01-03 21:55:40 +01:00
Heiko 61de1b6099 Bug fix/optimize shard distribution api and ui (#3921)
* UI: document/edge editor now remembering their modes (e.g. code or tree)

* changed shardDistribution api behaviour, added PUT route to only fetch collection based shard distribution

* ui: optimized shards view, added missing cleanup function in nodes view

* broken test

* shard distribution tests not fit the new api behaviour

* variables as reference

* CHANGELOG
2018-01-02 12:42:12 +01:00
Matthew Von-Maszewski cb56f0acf1 Have twice seen coordinator go into long loop on shutdown. Added two tests for isStopping() to break the loops. (#4138) 2017-12-21 20:32:16 +01:00
Matthew Von-Maszewski e6f7282e03 Make chooseTimeout() dynamic (#3996)
* add parameter to increase timeouts per 4096 size of total request package

* remove log output

* change initial timeout math from integer to double ... avoids reviewer confusion.
2017-12-15 18:25:18 +01:00
jsteemann fba103fb49 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2017-12-13 18:19:07 +01:00
jsteemann e1a5268fca fix Windows compile error 2017-12-13 18:18:45 +01:00
Jan 73bce34174
potentially fix send request timeout (#4026) 2017-12-13 17:54:12 +01:00
Jan 9c76613e63
fix premature unlock (#3802)
* fix some deadlocks found by evil lock manager (tm)

* fix duplicate lock

* fix indentation

* ensure proper lock dependencies

* fix lock acquisition

* removed useless comment

* do not lock twice

* create either a V8 transaction context or a standalone transaction context, depending on if we are called from within V8 or not

* AQL micro optimizations

* use explicit constructor

* only use V8DealerFeature's ConditionLocker for acquiring a free V8 context

entering and exiting the selected context is then done later on without having to hold the ConditionLocker

* remove some recursive locks

* Disable custom deadlock detection when Thread Sanitizer is enabled

* Changing ifdef's

* grr

* broke gcc

* Using atomic for ApplicationServer::_server

* fix premature unlock

* add some asserts

* honor collection locking in cluster

* yet one more lock fix

* removed assertion

* some more bugfixes

* Fixing assert

(cherry picked from commit 1155df173bfb67303077fbe04ee8d909517bfd21)
2017-12-13 13:27:42 +01:00
Jan 3143805c6f
make replication abortable (#4016) 2017-12-13 12:32:04 +01:00
Simon Grätzer dce677720d Fixing an issue with intermediate commits (#3975) 2017-12-12 09:17:18 +01:00
Jan 635f11884a
Bug fix/tidy up statistics (#3970)
* fix MSVC compile warning

* tidy up statistics

* remove undocumented db.__save V8 method, previously used for the statistics gathering from JavaScript
2017-12-08 16:00:23 +01:00
Jan 282be208cc
remove TRI_usleep and TRI_sleep, and use std::this_thread::sleep_for … (#3817) 2017-12-06 18:43:49 +01:00
Manuel B 857c64c0d9 Move Statistics into c++ (#3184) 2017-12-06 16:36:52 +01:00
Jan 17986ebc08
return error context for "some agency operation failed" (#3760) 2017-12-06 11:16:19 +01:00
Kaveh Vahedipour 11cfa74495 Bug fix/no support for inquiry in send transaction with failover less verbosity (#3847) 2017-11-30 14:54:22 +01:00
Kaveh Vahedipour 7015db790f replaced all AgencyGeneralTransactions by AgencyWriteTransaction (#3841) 2017-11-29 17:33:24 +01:00
Max Neunhöffer 5e7bfaf24b
Improve an error message, add httpStatus and reformat. (#3837) 2017-11-29 11:18:49 +01:00
Kaveh Vahedipour f7b4150b64 no clientId anymore in send/sendWithFailOver SPIs (#3819) 2017-11-28 10:47:36 +01:00
Kaveh Vahedipour 27cd691bbf Bug fix/agencycomm validate methods broken (#3805) 2017-11-27 14:18:25 +01:00
Matthew Von-Maszewski 874e1d8eb0 correct stupid mistake: was erasing iterator in middle of same iterators loop (#3740) 2017-11-20 10:24:53 +01:00
Matthew Von-Maszewski 0558b4372d Improve ClusterComm::wait() (#3722) 2017-11-17 22:46:54 +01:00
Simon Grätzer 0e485f7441 Fixing collection name collection handling in Syncer (#3710) 2017-11-17 16:36:57 +01:00
Jan 4a199e2fe3 make truncate a bit less obstrusive (#3721) 2017-11-16 20:28:33 +01:00
Jan 5abf0c1185 Bug fix/fixes 1511 (#3711) 2017-11-16 14:18:51 +01:00
Michael Hackstein 3911a6a722 Fixed minor issue in shard distribution reporter thanks to @danielhlarkin for spotting (#3695) 2017-11-14 16:47:25 +01:00
Jan 8ece5914a3 do not warn when the user attempts to drop a non-existing index (#3682) 2017-11-13 17:47:57 +01:00
Jan ba9bc41457 fix some typos in code and docs (#3671) 2017-11-13 17:33:36 +01:00
Frank Celler 813f5c8541
fixed output (#3668) 2017-11-12 18:06:56 +01:00
Kaveh Vahedipour 7e816db51e Bug fix/agency restart enhancements (#3619)
* Removed unused active(...) method in Agent
* Inception's restart from persistence allows peer with empty active RAFT list to join
* Agency's UUID is persisted outside of the database comparable to coordinator and db server action.
* Publicized Methods to UUID stuff in ServerState
* Inception method documentation
* added --agency.disaster-recovery-id to allow for specification of known former agency id. this is a very dangerous option potentially.
* Delete a unused methods.
* separate _id and _recoveryId
* populating active list with entire pool
* Improve logging.
* reject gossip from unknown agent, if pool is complete
2017-11-10 23:40:26 +01:00
m0ppers 6cbf7159be Feature/server mode (#3590)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* Fixing some bugs

* Fixing tests

* Canceling ongoing operations

* removing some unused code + some asserts

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* Current work

* proper ifs

* Migrate resthandler to c++ and implement mode

* readonly server mode spec test

* Add changelog

* change code so it expects a full object and not just a string

* Fix jslint
2017-11-10 17:56:21 +01:00
Kaveh Vahedipour b09c329a7c fixed 10 min dead time, when bogus --cluster.my-local-info specification (#3643) 2017-11-10 16:05:03 +01:00
Jan 733f27e997 Bug fix/fix compilation with gxx7 (#3637) 2017-11-10 16:00:57 +01:00
Michael Hackstein 5c633f9fae Bug fix/speedup shard distribution (#3645)
* Added a more sophisticated test for shardDistribution format

* Updated shard distribution test to use request instead of download

* Added a cxx reporter for the shard distribuation. WIP

* Added some virtual functions/classes for Mocking

* Added a unittest for the new CXX ShardDistribution Reporter.

* The ShardDsitributionReporter now reports Plan and Current correctly. However it does not dare to find a good total/current value and just returns a default. Hence these tests are still red

* Shard distribution now uses the cxx variant

* The ShardDistribution reporter now tries to execute count on the shards

* Updated changelog

* Added error case tests. If the servers time out the mechanism will stop bothering after two seconds and just report default values.
2017-11-10 15:17:08 +01:00
Jan 7613bc4314 Bug fix/fixes 0211 (#3568)
* remove some non-unused V8 persistents

* do not throw that many bogus assertions

* do not rely on server role being defined

* slightly better debug output for V8 context debugging

* fix collection ids in inventory response

* simplify bootstrap a bit

* slightly better error handling

* make elapsed time a queryable value

* use less memory for stub collections

* added assertions that will always make sense

* added assertions

* do not garbage-collect while waiting

* less copying of parameters

* do not show "load indexes into memory" buttons for mmfiles engine

  as all indexes are in memory anyway

* when a collection is truncated via the web interface, flush the WAL and rotate all active journals

this will make close all open journals on leader and followers and make them subject to compaction opportunities

* fix invalid server id values being passed from web interface to backend

* introduce afterTruncate method for indexes

* added test case for issue #3447

* updated CHANGELOG

* don't warn about replicationFactor for system collections

* check that the queries actually use the geo index and not some other index

* properly report error in web interface

* fix some internals checks that made truncate fail for bigger collections in maintainer mode

* also run a compact() operation after a serious truncate

in order to make iteration over the truncated range much faster
when the collection is next accessed

* increase default maximum number of V8 contexts to at least 16
2017-11-09 12:48:15 +01:00
Jan f6a90c4879 some cleanup (#3583) 2017-11-07 10:27:21 +01:00
Simon Grätzer ee8209943f Missing things for active / passive (#3578)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* auth = off does not longer make everyone superuser

* Fixing cluster_sync and maybe resilience
2017-11-04 20:30:23 +01:00
m0ppers b552853909 Feature/optional replication factor enforcement (#3570) 2017-11-02 21:41:25 +01:00
Simon Grätzer 64e9377c05 Replacing /_api/collection with RestHandler (#3543) 2017-11-02 14:57:17 +01:00
Jan 4d03d3bd82
Bug fix/fixes 2410 (#3511)
* slightly clean up indexes

* proper stringification of error code

* improve diagnostic messages

* improve memory usage for incremental sync
2017-10-30 22:49:24 +01:00
Simon Grätzer d14710b683 Updating Replication Factor (#3528)
* Changing replication factor in 3.2 (#3513)

* Allow changing replicationFactor on coordinator

* Fixing logic

* Allowing change of replication factor

* Additional input validation

* grrr

* Testing invalid inputs

(cherry picked from commit 47600e1ea1)

* jslint

(cherry picked from commit 4d223597db)

* cherry-pick commits from 3.2: 47600e1ea1 4d223597db d684eaa7f8 a898af3723
2017-10-26 15:28:05 +02:00
Jan e9ade3f02d increase timeout for truncate operation in cluster (#3515) 2017-10-26 10:29:18 +02:00
Jan 9eb5e1545f Bug fix/fix cluster foxx queue startup (#3507)
* slightly clean up indexes

* proper stringification of error code

* do not run AQL queries during Foxx queue startup

* always create keyspace
2017-10-25 18:07:27 +02:00
Michael Hackstein 15d9a4be5f Reactivated the failover of the FoxxMaster, it was not modified anymore after the current master dies (#3510) 2017-10-25 18:03:24 +02:00
Simon Grätzer 3e211f0d9e Fix --cluster.my-local-info (#3481)
* Compatibility for old startup methods.

* removed my-local-info from docs
2017-10-23 12:20:17 +02:00
Jan 720e6df82e Bug fix/fixes 1910 (#3471)
* properly initialize all properties

* use faster comparison

* properly detect and handle "method not allowed"

* code-style

* remove unused variable

* narrow variable scope

* handle non-existance of AuthenticationFeature

* remove dead code

* replace some C string handling with std::strings

* moved assertion to the correct place

* honor number of array members for IN operator

* slightly adjust error messages

* slighty adjust some error messages

* try to fix issue with lingering replication contexts on shutdown

* clean up heartbeat thread a little bit

* small fixes
2017-10-23 09:17:36 +02:00
Simon Grätzer fd3f9d99d9 Fixing webinterface access (#3464)
* intermediate commit

* Refactoring the ExecContext

* Fixing authentication

* Added start script

* some fixes

* fixed access to nullptr

* some c++

* fixed misleading message

* Made DatabaseGuard movable. Also adapted map insertions to _vocbase in Syncer classes, which failed to compile under older GCC versions

* added support for global flag to replication handler

* Started Refactoring in replication-static

* Fixing syncer code

* store applier configuration

* Static replication tests now test replication in a non system Database

* added flags to replication feature

* Adding some extra checks

* Fixing issue with rocksdb rest replication handler

* replication static now runs _system and otherdatabase replication tests.

* Fixing crash on startup

* Replication_sync now tests _system as well as other Database

* Fixing up heartbeat thread, adding global flag to rest handler

* Fixing wrong assert

* some cleanup, probably some tests are broken

* Made non-system db version of replication-ongoing tests

* fix determine-open-transaction

* Fixed ongoing tests. And added a test where we drop a database on slave while replication is still ongoing

* test fixes

* Activated ongoing other db tests. Also added a test that drops the DB on master, while the slave is still syncing.

* some better error reporting

* gradually switch to Result

* createCollection -> create

* re-activate using of collection ids for now

* enable auto-start

* Fixed create collection in replication ongoing test

* Added first draft of a test for global replication

* move to Result

* use system database for global applier

* improved error reporting

* fixed invalid URLs

* add test case filter

* load existing global applier configuration

* improve error reporting

* Added further tests for global replication

* Fixed global replication test, it now properly waits for replication. Timeouts after 10 seconds.

* Removed erronious assertion

* improve error reporting

* intermediate commit

* Added a test-case for global replication where the Master already has some data and the slave is clean

* fix deletion of replication contexts

* Fixed JSLint

* compiling code

* fix typo

* do not fail for global applier when no database is configured

* intermediate commit

* syncer supports switch for 3.3 / 3.2

* fixed errors

* Fixing some replication bugs

* Fixing some assertions

* Fixed missing commit markers

* Fixing assertion on database drop

* Attempt to fix deadlock in applier and assertion

* Fixing some stupid things

* Support for collection parameter

* Acidentally turned off some tests

* Grrr

* Fixing wrong method call

* Fixed startscript

* Fixed assignmet instead of equality check typo

* Added a test far interrupted replication. For now it justs tests basics on _system database.

* Improved index tests on replication.

* properly initialize variable

* fixed some replication problems

* MMFiles wal access support

* fix replication issues

* Started mmfiles replication support

* fixing a bug

* Fixing an issue

* fixing some mmfiles stuff

* fix test

* reload users

* prevent pure virtual method call

* intermediate commit

* Making from exclusive

* do not call getMasterState if child syncer

* some reformatting

* Adding global support for handleCommandSync

* Fixing assertion

* removing some debug logs

* Changing return codes

* Fixing some issues in the rest handler

* Make replication less susceptible to errors

* remove some debug output

* return last log tick

* remove waits from tests

* fix two tests

* changing header for open-transactions call

* some fixes

* fix test

* invalidate cached databases

* merging request and execcontext

* try to fix assertion error

* renamed method

* fix compile warning

* small changes

* Always use execcontext

* Fixing an assert

* fix replication issues

* try to fix collection lookups

* try to fix master/slave start

* Changing comments in heartbeat thread

* fix wrong signature of READ_LOCKER_EVENTUAL

* log server role in testing mode

* Fixed authentication, removed execContext in favor of request context

* Adding cluster rest api

* Fixing cluster rest handler

* Fixing cluster callback

* Some refactoring

* Queue creation is not a single operation

* Allowed for leader redirects

* Setting start of batch

* Disabling 2.8 compat tests

* fix start/stop bugs

* jslint

* various little changes

* add flag for exposing jwt

* indentation

* cleanup

* Some changed to guid

* fixing tcp to http, vst

* changed endpoint header

* small fixes

* Reorder servers by health status

* Higher timeout

* Changing error messages

* update the fromTick when fetching multiple batches from the coordinator

* more debug info

* Reducing copy pasted code

* change uid generation

* reducing logspam

* more exceptions for redirects

* more exceptions

* attempt to fix uniqids in cluster

* centralize printing of HTTP errors in replication

* debug output

* fix messages for authentication

* cleanup

* removing --cluster.my-id, --cluster.my-local-info

* Added leadership race to bootstrap, determine foxxmaster on boostrap, removing obsolete code

* improve error reporting in RestAqlHandler

* Changing heartbeat thread, fixing cluster_sync

* some more debug output

* added master

* attempt to make tests more deterministic

* added logging about indexes

* added some safety checks to the logger

* slighty better error messages

* fix location header for SSL

* fix error message

* try to make tests more deterministic

* change error code from TRI_ERROR_INTERNAL (which we want to avoid) to TRI_ERROR_FAILED

* Fixing broken webinterface access

* reverting groovy change

* Fixing read-only internal users

* Using superuser rights for dashboard now

* Adding mode field to _admin/server/role

* added mode TRYAGAIN

* remove inventory lock (does not seem necessary here)

* remove invalid assertion

* fixing agency bugs

* Removing debug output

* return proper errors in case of "method not allowed"

* Fixed up some info messages

* jslint
2017-10-20 18:06:59 +02:00
Jan 7840d3f824 Bug fix/fixes 1810 (#3460)
* improve error reporting in RestAqlHandler

* added logging about indexes

* added some safety checks to the logger

* slighty better error messages

* fix location header for SSL

* fix error message

* try to make tests more deterministic

* change error code from TRI_ERROR_INTERNAL (which we want to avoid) to TRI_ERROR_FAILED
2017-10-19 11:28:01 +02:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Wilfried Goesgens 88fd685296 Bug fix/fix shard state restore display (#3385)
* don't add  items into the array if we have an empty list member.

* we mustn't display 100% here, since its a rounding error. If, the checkmark would be set.

* rather use string cutting to round so we don't get into the 100% trap.
2017-10-17 14:08:07 +02:00
Max Neunhöffer 9a2385b941 Add host id detection and show in /_admin/cluster/Health. (#3389) 2017-10-11 12:42:44 +02:00
Jan 0561bf45ce Bug fix/isrestore (#3283)
* Make isRestore work in the cluster.

This covers sharded collections with default sharding and non-default
sharding.

* always use locally generate revision ids for storing and looking up documents
2017-10-03 11:53:49 +02:00
Jan 56fab56ff5 Bug fix/issues 2709 (#3333)
* logging improvements

* fixed copy&paste error

* enable assertions in non-release mode

* updated CHANGELOG

* fix --temp.path (was dysfunctional), fix creation of temporary directories and races
2017-10-01 09:43:03 +02:00
Jan 4ba38bf981 try to work around some assertions (#3296)
* remove obsolete values from relative config

* warn about using obsolete config parameters
2017-09-28 09:21:33 +02:00
Frank Celler 111c826026 Bugfix/modify document coordinator (#3266)
* Fixed ModifyDocument on coordinator. It uses the internally optimized function to extract _key values, which fails on Compact objects

* do not crash server when passing invalid document key type
2017-09-15 22:49:20 +02:00
m0ppers f0ae3d3de7 Bug fix/uaf communicator request abort (#3216)
* Fix abortion of requests

* Fix progress callback
2017-09-15 22:42:30 +02:00
Michael Hackstein 0d522a6136 Fixed ModifyDocument on coordinator. It uses the internally optimized function to extract _key values, which fails on Compact objects (#3258) 2017-09-15 14:23:04 +02:00
Jan 5165155ed1 Bug fix/fixes 0609 (#3227)
* do not use V8 variant of AQL functions in early optimization stage when a C++ variant is available

* additionally, simplify AQL function definitions and aliases

* warn when more than 90% of max mappings are in use

* added C++ variant of replication catchup

* added `--log.role` option

* updated CHANGELOG

* removed non-existing scheduler.threads option from config

* removed useless __FILE__, __LINE__ invocations

* updated CHANGELOG

* allow a priority V8 context

* remove TRI_CORE_MEM_ZONE

* try to fix Windows errors & warnings

* cleanup

* removed memory zones altogether

* exclude system collections from collection tests
2017-09-13 16:28:21 +02:00
Simon Grätzer ffc465433a No access collections Improvements (#3190)
* consolidated EdgeDocumentToken

* optimizing cluster traversal

* adding skip collection checks

* API cleanup

* copying AQLValue to avoid use-after-free bugs

* Fixing rocksdb SingleServerEdgeCursor

* Fixing a collection resolving issue
2017-09-07 14:55:07 +02:00
Jan cd2dfdfd0d fix deadlocks in cluster traversals (#3198) 2017-09-04 17:35:24 +02:00
Simon Grätzer 88d01b89b5 Optimizations for Caches and Graph Traversals (#3169) 2017-08-31 18:33:10 +02:00
Max Neunhöffer f3acea797b Feature/cluster inventory version (#3152)
* Get rid of a compiler warning in community edition.

* Teach /_api/replication/clusterInventory to report Plan/Version and readiness.

This is first implemented in the ClusterInfo library.
Then the clusterInventory code uses it and checks readiness.
Readiness of a collection means that it is created and all shards
and all replications have been created and are in sync.
2017-08-30 13:34:23 +02:00
Kaveh Vahedipour 00650e6a3f Bug fix/agency mt fixes (#3158)
* added debugging methods

* try to fix invalid access in case of error

* remove unused members

* bugfixes and comments

* all agency fixes in

* merge bug

* partially unguarded Agent::lead fixed

* all agency fixes in

* added nrBlocked to thread startup eval

* added nrBlocked to thread startup eval

* recombination of cases in State::get

* some maps replaced with unordered_maps

* optimized maps some
2017-08-30 10:43:51 +02:00
Jan 1ace247273 Bug fix/scheduling et al (#3161)
* added V8 context lifetime control options `--javascript.v8-contexts-max-invocations` and `--javascript.v8-contexts-max-age`

* make thread scheduling take into account most of the tasks dispatched via the io service
2017-08-30 10:40:02 +02:00
Kaveh Vahedipour 4c94a1c8ab fixed a bug in create collection in cluster, where transaction result was not checked for success before access (#3137) 2017-08-28 14:58:26 +02:00
Simon Grätzer f3eb0c2ac0 No access collections (#3088)
* Added virtual attributes for enterprise on Methods.cpp

* Working no access collections

* align a comment

* Documentation and test fix

* fixing community build
2017-08-25 13:59:03 +02:00
Kaveh Vahedipour 1d1e0f5a50 Feature/cluster id and extended health (#3046)
* added unique id to cluster, added access to Health

* added agents to health api

* added agents to health api

* added agents to health api

* transaction information for api

* agents listed like other servers

* missing line through merge conflict
2017-08-18 11:13:23 +02:00
m0ppers 930dd8aad2 MSVC is pendantic (but right) (#3047) 2017-08-17 21:25:34 +02:00
Kaveh Vahedipour 0b6d6d9287 Fixed distributeShardsLike bug in creating collections. numberOfShard… (#2895) 2017-08-03 19:38:16 +02:00
Jan Christoph Uhde ed8efe3566 Feature/issue 387 cluster index estimates (#2866) 2017-08-01 09:53:58 +02:00
Max Neunhöffer 2f874249bb Bug fix/adjust agency comm timeouts (#2765)
* Take out 503 timeouts altogether.
* Overhaul of AgencyComm::sendWithFailover loop.
* Let performRequests optionally ignore 404 coll not found.
* Fix error message "database not found" when AgencyComm failed.
* Add log entries in Agency if locks are acquired too slowly.
* Reexecute the javascript cluster sync stuff even if there was no plan/current change...So failed sync jobs can retry later...
* Cover callbacks in Communicator by lock. This fixes https://github.com/arangodb/planning/issues/370
* Put in delay in waiting for leader in agency test.
* Schmutz logging to heartbeat topic.
* Add more lock time diagnostic in agent.
* Switch on agencycomm tracing in coordinator.
2017-07-13 00:44:28 +02:00
Jan 07abf73bd6 fix parallel access to list of failed servers (#2777) 2017-07-12 22:12:25 +02:00
Max Neunhöffer 3d8e590bee Adapt Raft timeouts dynamically and fix create collection timeout race
Various fixes.
2017-07-06 12:51:51 +02:00
Frank Celler bbe7484521 Feature/auth context (#2704)
* added read-only users
2017-07-02 23:15:57 +02:00
Frank Celler 2807ef559c Feature/move shard fix (#2626)
Major overhaul of handling of synchronous replication.
2017-06-26 16:55:01 +02:00
Simon Grätzer 492d832695 Refactored /_api/index and /_api/database (#2582)
* working documents rest handler

* fixed cluster tests

* Consolidating database APIs

* clang-format

* Fixing issue with user creation through db._createDatabase

* replaced and refactored api-index and index api

* fixed cluster

* renaming some files

* added user methods

* removed files intended for later

* Fixed CC build

* Fixed method signature

* Fixing shell_server, shell_client tests
2017-06-19 23:47:40 +02:00