Commit Graph

238 Commits

Author SHA1 Message Date
Heiko a472758986 Bug fix/fix internal issue 4451 (#10540)
* Fix dump_authentication suite

* use the correct attribute name

* properly reload user permissions after _users collection restore

* fixed foxx restore test

* changelog

* changed the order of index creation during restore for _users collection

* changed interface of forwarding target in general server rest handler

* added remove header function to general request class

* implemented forwardingRequest interface changes

* added method to find collection in vocbase through shard id

* added co perm verification

* Revert "added method to find collection in vocbase through shard id"

This reverts commit af28442c01432224cfe4d777b6134d0f685e38ba.

* added shard to name map to cluster info (see the lookup sketch after this entry)

* return ResultT in forwarding

* fixed test

* fixed compile issue

* changelog

* Improved changelog formatting.

* Revert "fixed test"

This reverts commit 3f63d94ff099a94e56addcb432bd0fe733d1bcc6.

* Added authentication dump to single server and handled admin users.

* restore perms
2019-11-30 02:40:28 +01:00
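
The "shard to name map" mentioned above is the key piece of this fix: given a shard id from a forwarded request, the coordinator has to find the owning collection before it can verify permissions. A minimal sketch of such a lookup, with all names invented for illustration (the real map would live in ClusterInfo):

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical shard-id -> collection-name map, a stand-in for the
// bookkeeping ClusterInfo would maintain while loading the Plan.
class ShardMap {
 public:
  void add(std::string shardId, std::string collectionName) {
    _shardToName.emplace(std::move(shardId), std::move(collectionName));
  }

  // Resolve the owning collection of a shard, e.g. to check that the
  // user may actually touch "_users" before forwarding a request.
  std::optional<std::string> collectionForShard(std::string const& shardId) const {
    auto it = _shardToName.find(shardId);
    if (it == _shardToName.end()) {
      return std::nullopt;
    }
    return it->second;
  }

 private:
  std::map<std::string, std::string> _shardToName;
};

int main() {
  ShardMap shards;
  shards.add("s100042", "_users");
  return shards.collectionForShard("s100042").value_or("") == "_users" ? 0 : 1;
}
```
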
Jan 98880f3937
clean up a bit (#10391) 2019-11-11 09:28:18 +01:00
Simon d526805e81 Bug fix/fix suspicous stuff (#10273) 2019-10-17 15:34:22 +02:00
Jan 21b0311d57
rename minReplicationFactor to writeConcern (#10118) 2019-10-07 15:12:15 +02:00
Dan Larkin-York 1d7225b289 Pass connection pool directly to network methods. (#10096) 2019-09-30 12:44:47 +02:00
Dan Larkin-York a83c2323c9 Refactor ApplicationServer stack (#9965) 2019-09-25 17:31:59 +02:00
Jan Christoph Uhde 0b8c75c7b7 one shard db - devel (#9395) 2019-09-23 15:48:37 +02:00
Kaveh Vahedipour dd10909dfc rebootIds instead of boot stamps (#10050)
* rebootIds instead of boot stamps
* noexcept wrong as copies are done
2019-09-20 10:26:35 +02:00
Jan 3a59abd1dc
various issues reported by cppcheck (#9962) 2019-09-09 20:32:04 +02:00
Markus Pfeiffer 753ff4aa67 Feature/atomic database creation 2 (#9826) 2019-09-05 12:38:07 +02:00
Lars Maier 2ec2e1c1bc Background Get Ids (#9474)
* Obtain more ids via a background thread (see the sketch after this entry).
* Wait for thread to stop on shutdown.
* Added scope guard.
* Atomic weapons.
* Fix log level.
* One big lock!
* Added mutex for cleanup.
* Fixed unused variable.
2019-08-16 12:42:22 +02:00
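
A compact illustration of the pattern those bullets describe (background worker, one big lock, waiting for the thread on shutdown). Everything here is invented and only mirrors the shape of the change; the real code fetches id batches from the agency:

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Invented stand-in for the background id fetcher: a worker keeps a
// reserve of ids topped up so callers of next() rarely have to wait.
class IdPrefetcher {
 public:
  ~IdPrefetcher() {
    {
      std::lock_guard<std::mutex> guard(_mutex);
      _stop = true;  // wait for thread to stop on shutdown
    }
    _cv.notify_all();
    _worker.join();
  }

  std::uint64_t next() {
    std::unique_lock<std::mutex> lock(_mutex);  // one big lock!
    _cv.wait(lock, [this] { return _available > 0; });
    --_available;
    return _nextId++;
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(_mutex);
    while (!_stop) {
      if (_available < 10) {
        _available += 100;  // stand-in for fetching a batch of ids
        _cv.notify_all();
      }
      _cv.wait_for(lock, std::chrono::milliseconds(10));
    }
  }

  std::mutex _mutex;
  std::condition_variable _cv;
  bool _stop = false;
  std::uint64_t _nextId = 1;
  std::uint64_t _available = 0;
  std::thread _worker{[this] { run(); }};  // started last, so the members above are ready
};

int main() {
  IdPrefetcher ids;
  return ids.next() == 1 ? 0 : 1;
}
```
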
Tobias Gödderz 9cd332b958 Feature/rebootid notice changes (#9523)
* Consolidated _servers and _serverAdvertisedEndpoints, added rebootId, prepared change notifications

* Cleanup

* Added a RebootId type

* Began implementing RebootTracker (still WIP)

* Moved RebootId operators into the class

* Removed RebootId operator<< again

* Added tests, added CallbackGuard, removed/commented old RebootTracker code

* Fix: do not try to call unset callbacks

* Split one test, added another

* Added more tests

* Renamed tests, added more tests

* Fixed missing variable declarations

* Let MockServer appear to be started

* Reordered test, fixed naming

* Implemented callMeOnChange()

* Re-implemented RebootTracker (not yet working)

* Resolved a TODO, updated a test, added comments

* Call old callbacks immediately

* Fixed tests

* Use EXPECT_* instead of ASSERT_*

* Suppress a log message

* Resolved TODOs

* Reverted changes on reading ServersRegistered

* Update RebootTracker

* Introduce `rebootId` into ServerState for Cluster

 * A server *boots* if it is started on a previously non-existing data
   directory and hence does not have a UUID yet.
 * A server *reboots* if it is started on a pre-existing data directory

We keep the rebootId in the cluster's agency under
Current/ServersKnown/$uuid/rebootId.

When rebooting (and subsequently re-joining a cluster), the server increments
its rebootId in Phase 2 of registration. This way it can be detected within the
cluster whether a server was restarted.

This information will later be used to handle cases where server restarts can
lead to problems, for example with transactions or in-progress queries (a
minimal sketch of this bookkeeping follows this commit entry).

* Move rebootId into Current/ServersKnown/

* Fixed typo

* Fixed log ids

* Add deletion of ServersKnown/UUID from agency

* Add deletion of Current/ServersKnown/UUID to removeServer

* Clean up readRebootIdFromAgency and add retry loop around it

* Bugfix

* Added nolint comments

* Fixed initialization order

* Fixed ClusterInfo-test

* Added log messages

* Revert "Fixed ClusterInfo-test"

This reverts commit d983596979.

* Disabled assertion for google tests

* Ignore windows compile warning

* Always call loadServers in loadCurrent

* Fix really subtle bug when not returning a value

* Fixed compile error due to forbidden implicit cast

* Fixed compile error on windows

* Fixed compile error due to devel merge

* Removed dead comment

* Removed TODO note

* Extended comment

* Removed TODO note

* Fixed using an invalidated iterator

* Copy string only if necessary

* Fixed compile error
2019-08-12 09:33:22 +02:00
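
A minimal sketch of the rebootId bookkeeping described in this commit, assuming a flat key/value stand-in for the agency (`AgencyStore` and the helper names are invented; the real store is the raft-backed agency):

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for the agency: a flat key/value store.
using AgencyStore = std::map<std::string, std::uint64_t>;

// The agency key described in the commit message:
// Current/ServersKnown/$uuid/rebootId
std::string rebootIdKey(std::string const& uuid) {
  return "Current/ServersKnown/" + uuid + "/rebootId";
}

// A server that *boots* (fresh data directory, new UUID) starts at 1;
// a server that *reboots* (pre-existing data directory) increments the
// stored value during Phase 2 of registration, so the rest of the
// cluster can detect the restart by comparing rebootIds.
std::uint64_t registerServer(AgencyStore& agency, std::string const& uuid) {
  std::string key = rebootIdKey(uuid);
  auto it = agency.find(key);
  std::uint64_t rebootId = (it == agency.end()) ? 1 : it->second + 1;
  agency[key] = rebootId;
  return rebootId;
}

int main() {
  AgencyStore agency;
  registerServer(agency, "PRMR-1234");            // boot:   rebootId 1
  auto id = registerServer(agency, "PRMR-1234");  // reboot: rebootId 2
  std::cout << "rebootId after restart: " << id << "\n";
  return id == 2 ? 0 : 1;
}
```

Comparing a stored rebootId against the current one is then enough to detect that a server restarted in between.
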
Lars Maier ed496fe5dd Feature/hotbackup devel (#9495)
Hotbackup
2019-08-02 11:39:46 +02:00
Jan 7d829de89e added internal function getResponsibleServers() (#9604)
* added internal function getResponsibleServers()

* forgot to commit

* honor review comments

* Update arangod/Cluster/ClusterInfo.cpp

Potentially Fixed Unique logID usage. (let Jenkins test it)
2019-07-31 10:18:37 +02:00
Michael Hackstein 987ad41364
Forward Port of changes in 3.5 review (#9544)
* Bug fix 3.5/min replication factor (#9524)

* Cherry-pick minReplicationFactor

* Bug fix/failover with min replication factor (#9486)

* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and no one cares about it

* Added some assertions on minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor can now be changed via the collection properties route

* fixed wrong assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minReplicationFactor

* modified debug logging

* Fixed compile issues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as I cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second piece of information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance now reports the previous version to follower info instead of lying by itself. The FollowerInfo now gets a failover-safe mode to report in-sync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed iresearch tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the side effect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more precise now

* Fixed failover candidates list

* Disable write on dropping too many followers

* Allow running updateFailoverCandidates multiple times with the same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documentation

* Removed merge conflict leftovers 0o, I should not trust the IDE

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/Architecture/Replication/README.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update CHANGELOG

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/ReleaseNotes/NewFeatures35.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Collections/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Graph/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Addressed review requests, thanks for finding!

* Removed unnecessary const

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Moved initialization of variable further down

* Apply lock before notify_all()

* Remove documentation except DocuBlocks, covered by PR in docs repo

* Remove accidental indent

* Removed leftover merge conflict in documentation block
2019-07-23 13:14:38 +02:00
Jan cdbe63fa6e
Bug fix/fix races in collection creation (#9506) 2019-07-19 15:11:08 +02:00
Michael Hackstein 36b1d290a9
Bug fix/failover with min replication factor (#9486)
* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and no one cares about it

* Added some assertions on minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor can now be changed via the collection properties route

* fixed wrong assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minReplicationFactor

* modified debug logging

* Fixed compile issues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as I cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second piece of information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance now reports the previous version to follower info instead of lying by itself. The FollowerInfo now gets a failover-safe mode to report in-sync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed iresearch tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the side effect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more precise now

* Fixed failover candidates list

* Disable write on dropping too many followers

* Allow running updateFailoverCandidates multiple times with the same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documentation
2019-07-19 15:00:30 +02:00
Michael Hackstein cbcf561450
Feature/min replication factor (#9433)
* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and no one cares about it

* Added some assertions on minReplicationFactor

* Transaction API will now reject writes as soon as the minimal replication factor is NOT fulfilled (see the sketch after this entry)

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor can now be changed via the collection properties route

* fixed wrong assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minReplicationFactor

* modified debug logging

* Fixed compile issues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as I cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second piece of information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance now reports the previous version to follower info instead of lying by itself. The FollowerInfo now gets a failover-safe mode to report in-sync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* fixed iresearch tests

* fixed wrong assert

* properly check uint

* repl factor attr check

* adjusted test to be more precise now

* Fixed race on atomics comparison

* Fixed invalid number type

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* More review fixes
2019-07-19 13:02:28 +02:00
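
The write-rejection rule that the three minReplicationFactor commits above circle around fits in a few lines. A sketch, assuming the leader counts itself toward the replica count, which is exactly where the off-by-one mentioned in the messages can hide (the property was later renamed to writeConcern, see above):

```cpp
#include <cstddef>
#include <stdexcept>

// Leader-side check described in the commit messages: writes are
// rejected as soon as the minimal replication factor is no longer
// fulfilled. The leader itself counts toward the replica count.
// All names here are invented for illustration.
bool canWrite(std::size_t inSyncFollowers, std::size_t minReplicationFactor) {
  std::size_t replicas = inSyncFollowers + 1;  // in-sync followers + leader
  return replicas >= minReplicationFactor;
}

int main() {
  // minReplicationFactor == 2: one in-sync follower suffices ...
  if (!canWrite(1, 2)) {
    throw std::logic_error("unexpected rejection");
  }
  // ... but once the last follower drops out, writes must be refused.
  return canWrite(0, 2) ? 1 : 0;  // exits 0: write correctly refused
}
```
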
Michael Hackstein d135d55d55
Bug fix/collection babies (#9124)
* Bug fix 3.4/collection babies (#9033)

* Prepare API to create multiple collections in a single request to ClusterMethods to speed things up

* Added counter on how many collections are successfully created

* Allow multi collection creation one level higher

* CollectionMethods now allow batch creation of Collections

* Improved array size assertions

* Now a graph is created within a single roundtrip in the agency.

* Added new header files

* Insert collections in the AGENCY with a TTL and an isBuilding flag; collections with this flag should not be visible in the coordinator (see the sketch after this entry)

* Added forgotten C++ file

* Fixed a rare race condition, and the failing IResearch Tests

* re-added callback on DONE, otherwise lists are out of sync

* Fixed assertions to let mocked tests pass...

* Fixed community cluster

* Started fixing IResearch analyzer test, catch-tests are failing ;(

* Solved missed merge-conflict

* Added helper functions in AnalyzerFeature-test

* Refactoring AnalyzerTest Section-Auth

* Refactoring AnalyzerTest Section-Emplace-Duplicates

* Refactoring AnalyzerTest Section-Emplace-Error-Cases. Recovery-Test is now red, it seemed to be green because of an invalid test case before.

* Refactoring AnalyzerTest, split GET test into multiple parts, still left 'cluster simulation'.

* Attempt to extract Coordinator / DBServer tests a little bit. This commit starts to break all Coordinator tests. However, I am convinced that the earlier version did NOT test a cluster situation at all, but some hybrid of a SingleServer with full local storage that got told to be a Coordinator from now on, but without any Coordinator setup...

* Temporarily disabled some tests in AnalyzerFeature, as discussed with @gnusi.

* Fixed include guard.

* Temporarily deactivated failing tests

* You shall save your files before you commit...

* Fixed test asserting on plan version, which is now higher than before
2019-06-03 17:11:22 +02:00
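
The TTL-plus-isBuilding scheme from this commit, reduced to a sketch with invented types: the creating coordinator writes the Plan entry with the flag and an expiry, and only a fully successful creation clears the flag, so a crashed creator leaves nothing permanently half-built behind:

```cpp
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical Plan entry: invisible to coordinators while isBuilding
// is set; `ttl` lets the agency expire entries left behind by a
// coordinator that crashed mid-creation.
struct PlanEntry {
  bool isBuilding;
  Clock::time_point ttl;
};

using Plan = std::map<std::string, PlanEntry>;

// Creation writes the entry with isBuilding set and a TTL ...
void beginCreate(Plan& plan, std::string const& name) {
  plan[name] = PlanEntry{true, Clock::now() + std::chrono::seconds(30)};
}

// ... and only a fully successful creation clears the flag.
void finishCreate(Plan& plan, std::string const& name) {
  plan[name].isBuilding = false;
}

// Coordinators treat still-building entries as nonexistent.
bool isVisible(Plan const& plan, std::string const& name) {
  auto it = plan.find(name);
  return it != plan.end() && !it->second.isBuilding;
}

int main() {
  Plan plan;
  beginCreate(plan, "test");
  bool hiddenWhileBuilding = !isVisible(plan, "test");
  finishCreate(plan, "test");
  return (hiddenWhileBuilding && isVisible(plan, "test")) ? 0 : 1;
}
```
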
Dan Larkin-York d5ecdd143a Convert unit tests to googletest framework (#9034) 2019-05-21 09:17:46 +02:00
Simon 569198a089 Abort el-cheapo transactions if servers fail (#8799) 2019-04-22 19:31:24 +02:00
Kaveh Vahedipour 68178ba165 [devel] supervision bug fix backports (#8314)
* back ports for supervision fixes from 3.4 part 1

* back ports for supervision fixes from 3.4 part 2
2019-03-04 19:27:24 +01:00
Vasiliy 8b94be9bf1 issue 504: return Result instead of int from all ClusterInfo functions (#7954) 2019-01-16 18:07:27 +03:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Andrey Abramov 6674a4282d
avoid calling cluster related functions while instantiating views on … (#7509)
* avoid calling cluster related functions while instantiating views on a db server

* minor cleanup
2018-11-29 15:43:53 +03:00
Max Neunhöffer ae29e5d2ba
Fix index creation in cluster. (#7440)
* Fix index creation in cluster.

Simplify and correct error handling logic in ensureIndexCoordinator.

* After index creation, wait until index appears.

We wait until the Supervision has removed the isBuilding flag and the
coordinator has reloaded the Plan (a poll-loop sketch follows this entry).

* More index handling fixes.

* Directly remove isBuilding in ensureIndexCoordinator (again).

* Fix catch tests by holding mutex shorter.

* Better mutex handling in ClusterInfo.
2018-11-28 16:58:05 +01:00
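
The "wait until the index appears" step is essentially a bounded poll against the Plan. A sketch, with `stillBuilding` as a hypothetical probe standing in for the real Plan lookup:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll until the Supervision has removed the isBuilding flag and the
// coordinator has reloaded the Plan, or give up after a deadline.
// `stillBuilding` is an invented probe standing in for a Plan lookup.
bool waitForIndexReady(std::function<bool()> stillBuilding,
                       std::chrono::milliseconds timeout) {
  auto deadline = std::chrono::steady_clock::now() + timeout;
  while (stillBuilding()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // index did not become ready in time
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
  }
  return true;  // isBuilding flag gone: index is visible and usable
}

int main() {
  int polls = 0;
  // Simulated Plan: the flag disappears after three polls.
  bool ok = waitForIndexReady([&] { return ++polls < 3; },
                              std::chrono::seconds(2));
  return ok ? 0 : 1;
}
```
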
Kaveh Vahedipour 9ec6619b84 Bug fix/index readiness (#6541)
* indexes are marked while still missing in Current
* index handling getCollection
* supervision gets indexes from isBuilding when the coordinator is gone before finishing
* seems right now
* fixed broken views
* remove junk comments
* cleanup
* node / supervision adjustments
* supervision fixes
* neunhoef remarks part i
* neunhoef remarks part ii
* neunhoef remarks part ii
* neunhoef remarks part iii
* collection's current version please
* no need to wait for current once again
* no longer necessary code
* clear comments
* delete leftovers
* dead code revived
2018-11-21 14:42:58 +01:00
Wilfried Goesgens 05a7d4e96e add alternative to ClusterInfo::getCollection() that doesn't throw (#7339) 2018-11-20 16:05:57 +01:00
Simon c72818a9dc Make ensureIndexOnCoordinator more robust (#7110) 2018-10-29 17:45:46 +01:00
Simon 0fa7f01c66 Resilience test failure points (#6539) 2018-09-20 01:05:10 +02:00
Max Neunhöffer 84735955ea Add advertised endpoints. (#6104) 2018-09-13 16:30:55 +02:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
- Fix index locking bug in mmfiles
- Fix a bug in mmfiles with silent option and repsert
- Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Jan e4d7f1c5f0
Bug fix/wenn der shard mann 2mal klingelt (#5890) 2018-07-26 15:37:40 +02:00
Max Neunhöffer 014c3f7f53
Only load Plan and Current in ClusterInfo when actually needed. (#5649)
* Only update Plan and Current from Agency if not already done (version-gate sketch after this entry).
* Add read protection for getPlanVersion and getCurrentVersion.
* Add a further check to loadPlan and loadCurrent.
* Fix tests to new behaviour.
* Try to increase Plan/Version and Current/Version with every change.
* Add two more increments of Plan/Version
* Add missing increments in tests for Plan/Version.
* Add changelog entry.
2018-07-16 12:20:13 +02:00
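
"Only update Plan and Current from Agency if not already done" boils down to a version gate around the expensive agency read. A sketch with invented names:

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

// Sketch of the "only load Plan if not already done" logic: a cached
// version number guards the expensive agency read. Names are invented;
// the real logic lives in ClusterInfo::loadPlan().
class PlanCache {
 public:
  // `agencyVersion` would come from Plan/Version in the agency.
  void loadPlan(std::uint64_t agencyVersion) {
    if (agencyVersion <= _planVersion.load()) {
      return;  // cache is fresh, skip the reload
    }
    std::lock_guard<std::mutex> guard(_mutex);
    if (agencyVersion <= _planVersion.load()) {
      return;  // another thread reloaded while we waited
    }
    // ... fetch and parse the Plan here ...
    _planVersion.store(agencyVersion);
  }

  // Read-protected accessor, as the commit message calls for.
  std::uint64_t getPlanVersion() const { return _planVersion.load(); }

 private:
  std::mutex _mutex;
  std::atomic<std::uint64_t> _planVersion{0};
};

int main() {
  PlanCache ci;
  ci.loadPlan(7);
  ci.loadPlan(7);  // no-op: version unchanged
  return ci.getPlanVersion() == 7 ? 0 : 1;
}
```
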
Dan Larkin-York 21e16a8a24 Add load balancer awareness for cursor API (#5682) 2018-07-03 14:29:09 +02:00
Vasiliy 7aaeab50fb issue 402.1: share sync thread between IResearchView and IResearchViewDBServer (#5733) 2018-07-02 15:03:00 +03:00
Andrey Abramov 5eef6cd618
Feature/test iresearch (#5610)
* start implementing arangosearch cluster tests.

* backport: ensure view lookup is done via collectionNameResolver, ensure updateProperties returns current view properties

* first attempt to fix failing tests

* refactor cluster wide view creation logic

* if view is not found in the new plan then check the old plan too

* ensure the cluster-wide view is looked up in vocbase as well on startup/recovery

* do not store cluster-wide IResearchView in vocbase

* move stale view cleanup to the shared pointer deleter, address test failures

* do not print warning

* enable arangosearch tests by default

* fix catch tests

* address incorrect return value for cluster-wide links

* address some issues with test failures due to cluster-view allocated within TRI_vocbase_t

* simplify per-cid view name, address 'catch' test failures

* ensure IResearchViewNode volatility is properly calculated in cluster

* invoke callbacks directly in AgencyMock instead of waiting for timeout

* ensure view updates via JavaScript always use the latest view definition

* pass a list of shards to `IResearchViewDBServer::snapshot`

* extend cluster aql tests

* fixes after merge

* fix class/struct inconsistencies

* comment failing tests

* remove debug logging

* add debug function

* tests cleanup

* simplify upcoming merge: pass resolver from a side

* backport: move all transaction status callback logic to Methods

* add changes missed from previous commit

* fix js and ruby tests

* more tests for IResearchViewNode

* pass transaction to IResearchViewDBServer::snapshot, address IResearchViewDBServer tests segfault

* pass transaction to IResearchView::snapshot instead of transaction state

* temporarily add trace log output to tests to try to find the cause of the core dump on Jenkins

* add more temporary debug output to trace down the segfault on Jenkins

* add even more temporary debug output to trace down the segfault on Jenkins

* ensure View related maps are cleared during shutdown

* reset ClusterInfo::instance() before DatabaseFeature::unprepare()

* remove extraneous debug output

* missed line from previous commit

* uncomment required line

* add nullptr checks to RocksDBIndexFactory::prepareIndexes(...) similar to the ones in MMFilesIndexFactory::prepareIndexes(...)

* attempt to fix deadlock in tests

* add comment as per reviewer request

* fix aql test suite name

* add some debug logging

* address deadlock between ClusterInfo::loadPlan() and CollectionNameResolver::localNameLookup(...)

* explicitly state which index definition failed in the log message

* use vocbase from shard-view instead, just in case

* explicitly state which index definition failed in the log message

* do not create shard-view instances from cluster-link instances (only register existing ones)

* add some tests
2018-06-21 20:35:04 +03:00
Vasiliy 4253dca6aa issue 381.5: ensure the LogicalView definition that is persisted to the Agency matches the definition that gets created (#5518)
* issue 381.5: ensure the LogicalView definition that is persisted to the Agency matches the definition that gets created

* backport: correct comment
2018-06-02 17:21:55 +03:00
Andrey Abramov 4649b40b96
Coordinator ArangoSearch view + Execution nodes + AgencyMock (#5160)
* add initial implementation of scatter view rule and node

* add tests for `IResearchViewNode` and `IResearchViewScatterNode`

* add missing check

* modify IResearch execution nodes to use references instead of pointers

* use view id in serialized `ExecutionNode` representation instead of the name

* add cluster mode stubs and checks

* very first attempt to distribute IResearchViewNode

* further implementation of cluster-wide arangosearch views

* fix invalid json format

* add tests for coordinator iresearch view

* allow to retrieve a list of existing views on a coordinator

* more tests for coordinator iresearch view

* some fixes to enable query explanation

* remove Collection dependency from RemoteNode

* remove unnecessary remote ArangoSearch view scatter

* fix explanation appearance

* add some assertions

* minor fixes

* implement IResearchViewCoordinator::updateProperties

* fix view DDL issues

* handle link modifications in DDL operations

* add coordinator implementation of iresearch view links

* fix tests

* further coordinator based view DDL implementation

* further IResearchViewCoordinator implementation

* add initial implementation of AgencyMock

* fix some tests

* code cleanup

* extend test + some fixes

* more tests for IResearchViewCoordinator

* fix tests for IResearchLinkCoordinator

* some fixes after merge

* fix tests

* remove declaration of nonexistent (previously removed) method

* some fixes after review

* remove string duplication

* more tests and fixes

* more fixes and tests

* more tests

* one more test

* fix 'use-after-free' asan error

* fix non-enterprise tests issues
2018-05-02 00:15:11 +03:00
Max Neunhoeffer ce8db24975
Add methods in ClusterInfo to create and drop views. 2018-03-14 23:22:44 +01:00
Max Neunhoeffer a8a307b532
Report views in ClusterInfo.
This is incomplete as it is, because we do not yet parse the views
we see in the plan.
2018-03-08 14:24:22 +01:00
Andrey Abramov a1cfb3d72b Feature iresearch (#4105) 2018-01-19 14:23:58 +01:00
Jan 25af4d7f69
try to not fail hard when a collection is dropped while the WAL is tailed (#4226) 2018-01-04 16:31:11 +01:00
Heiko 61de1b6099 Bug fix/optimize shard distribution api and ui (#3921)
* UI: document/edge editor now remembering their modes (e.g. code or tree)

* changed shardDistribution api behaviour, added PUT route to only fetch collection based shard distribution

* ui: optimized shards view, added missing cleanup function in nodes view

* broken test

* adjusted shard distribution tests to fit the new api behaviour

* variables as reference

* CHANGELOG
2018-01-02 12:42:12 +01:00
Jan 17986ebc08
return error context for "some agency operation failed" (#3760) 2017-12-06 11:16:19 +01:00
Michael Hackstein 5c633f9fae Bug fix/speedup shard distribution (#3645)
* Added a more sophisticated test for shardDistribution format

* Updated shard distribution test to use request instead of download

* Added a cxx reporter for the shard distribution. WIP

* Added some virtual functions/classes for Mocking

* Added a unittest for the new CXX ShardDistribution Reporter.

* The ShardDistributionReporter now reports Plan and Current correctly. However, it does not dare to find a good total/current value and just returns a default. Hence these tests are still red

* Shard distribution now uses the cxx variant

* The ShardDistribution reporter now tries to execute count on the shards

* Updated changelog

* Added error case tests. If the servers time out, the mechanism stops trying after two seconds and just reports default values (see the sketch after this entry).
2017-11-10 15:17:08 +01:00
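
The two-second give-up from the last bullet, sketched with standard futures as stand-ins for the per-shard count requests (the -1 default is an invented placeholder):

```cpp
#include <chrono>
#include <future>
#include <vector>

// Sketch of the reporter's fallback: ask each shard for its count,
// but never wait longer than two seconds overall; shards that miss
// the deadline get a default value instead of blocking the report.
std::vector<long> collectCounts(std::vector<std::future<long>>& requests) {
  auto deadline = std::chrono::steady_clock::now() + std::chrono::seconds(2);
  std::vector<long> counts;
  for (auto& req : requests) {
    if (req.wait_until(deadline) == std::future_status::ready) {
      counts.push_back(req.get());
    } else {
      counts.push_back(-1);  // default value: shard did not answer in time
    }
  }
  return counts;
}

int main() {
  std::vector<std::future<long>> requests;
  requests.push_back(std::async(std::launch::async, [] { return 42L; }));
  auto counts = collectCounts(requests);
  return counts.at(0) == 42 ? 0 : 1;
}
```
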
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Jan 0561bf45ce Bug fix/isrestore (#3283)
* Make isRestore work in the cluster.

This covers sharded collections with default sharding and non-default
sharding.

* always use locally generated revision ids for storing and looking up documents
2017-10-03 11:53:49 +02:00
Jan 5165155ed1 Bug fix/fixes 0609 (#3227)
* do not use V8 variant of AQL functions in early optimization stage when a C++ variant is available

* additionally, simplify AQL function definitions and aliases

* warn when more than 90% of max mappings are in use

* added C++ variant of replication catchup

* added `--log.role` option

* updated CHANGELOG

* removed non-existing scheduler.threads option from config

* removed useless __FILE__, __LINE__ invocations

* updated CHANGELOG

* allow a priority V8 context

* remove TRI_CORE_MEM_ZONE

* try to fix Windows errors & warnings

* cleanup

* removed memory zones altogether

* exclude system collections from collection tests
2017-09-13 16:28:21 +02:00
Max Neunhöffer f3acea797b Feature/cluster inventory version (#3152)
* Get rid of a compiler warning in community edition.

* Teach /_api/replication/clusterInventory to report Plan/Version and readiness.

This is first implemented in the ClusterInfo library.
Then the clusterInventory code uses it and checks readiness.
Readiness of a collection means that it is created and all shards
and all replicas have been created and are in sync (see the sketch after this entry).
2017-08-30 13:34:23 +02:00
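
Readiness as defined here reduces to a per-shard check. A sketch with invented types:

```cpp
#include <cstddef>
#include <vector>

// Readiness as the commit message defines it: a collection is ready
// once every shard exists and every shard has all of its planned
// replicas in sync. Types are invented for illustration.
struct ShardStatus {
  bool created;
  std::size_t replicasPlanned;
  std::size_t replicasInSync;
};

bool collectionIsReady(std::vector<ShardStatus> const& shards) {
  for (auto const& s : shards) {
    if (!s.created || s.replicasInSync < s.replicasPlanned) {
      return false;
    }
  }
  return true;
}

int main() {
  std::vector<ShardStatus> shards{{true, 2, 2}, {true, 2, 1}};
  bool ready = collectionIsReady(shards);  // false: one replica lagging
  return ready ? 1 : 0;
}
```
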