Commit Graph

233 Commits

Author SHA1 Message Date
Simon Grätzer ee8209943f Missing things for active / passive (#3578)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* auth = off does not longer make everyone superuser

* Fixing cluster_sync and maybe resilience
2017-11-04 20:30:23 +01:00
Michael Hackstein 15d9a4be5f Reactivated the failover of the FoxxMaster, it was not modified anymore after the current master dies (#3510) 2017-10-25 18:03:24 +02:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Max Neunhöffer 9a2385b941 Add host id detection and show in /_admin/cluster/Health. (#3389) 2017-10-11 12:42:44 +02:00
Kaveh Vahedipour 627f344266 fixed a bug, where when servers failed, when also agency leadership c… (#3189)
* fixed a bug, where when servers failed, when also agency leadership changes

* redid entire design of checkDBServers/checkCoordinators.

* comparison in supervision must be between oldPersisted and newHealth

* UI stuff

* UI stuff

* FailedServer test needed adjustment

* Hopefully final round

* fixed supervision failure detection

* FailedServer tests back to origin devel

* oldNot documented among preconditions in Agency HTTP API docs

* changed only look for status updated

* non action line in api-cluster
2017-09-07 16:10:23 +02:00
Kaveh Vahedipour 00650e6a3f Bug fix/agency mt fixes (#3158)
* added debugging methods

* try to fix invalid access in case of error

* remove unused members

* bugfixes and comments

* all agency fixes in

* merge bug

* partially unguarded Agent::lead fixed

* all agency fixes in

* added nrBlocked to thread startup eval

* added nrBlocked to thread startup eval

* recombination of cases in State::get

* some maps replaced with unordered_maps

* optimized maps some
2017-08-30 10:43:51 +02:00
Andreas Streichardt 8e15412e06 Wait for supervision node to prevent races 2017-06-09 15:52:29 +02:00
jsteemann 2930ab6b57 cppcheck 2017-05-15 22:39:16 +02:00
Andreas Streichardt fe59502848 Fix server health 2017-05-11 12:20:15 +02:00
Kaveh Vahedipour de77b5ec7a getting rid of exceptions in supervision 2017-05-10 17:50:31 +02:00
Kaveh Vahedipour b0e7ce40f0 avoid exceptions in supervision main thread when running without cluster 2017-05-04 14:37:03 +02:00
Kaveh Vahedipour 68efba18e8 keep agencyPrefix, when non set 2017-04-26 15:32:26 +02:00
jsteemann 4289105eb3 fix shutdown issue 2017-04-25 16:09:01 +02:00
Kaveh Vahedipour 09a6888d14 attempt at fixing shutdown bug on mac os x 2017-04-24 10:45:54 +02:00
jsteemann ea8496f1a5 cppcheck 2017-04-21 20:19:36 +02:00
Kaveh Vahedipour 1f81ce28b0 merge in cpp & js from 3.1.18 yet to do tests 2017-04-21 15:41:05 +02:00
Kaveh Vahedipour 4cc830b0df merge from 3.1 2017-02-20 20:05:52 +01:00
jsteemann b3ac54d065 remove global namespace include 2017-02-13 13:03:33 +01:00
jsteemann d024a6d00a remove logging for non-topics 2017-02-10 09:32:50 +01:00
Andreas Streichardt 8349f56e40 Properly check return valiue 2017-02-07 15:15:56 +01:00
Kaveh Vahedipour 8d66d69f83 supervision handles coordinator demise correctly 2017-02-07 11:29:37 +01:00
Kaveh Vahedipour f3cb1307a5 3.1 fixes backported to devel 2017-02-03 10:48:25 +01:00
jsteemann fa917937c4 do not use namespaces in header files 2017-02-01 13:41:31 +01:00
Kaveh Vahedipour 3f3633bd2c supervision to proper preconditioning of jobs on plan 2017-01-27 15:29:22 +01:00
Kaveh Vahedipour c4bff477a6 wrong persistence of status 2017-01-24 12:52:31 +01:00
Kaveh Vahedipour cfbdaff0a8 Back in add follower 2017-01-23 09:39:32 +01:00
Kaveh Vahedipour 163e0158dc before cppcheck enthusiasts start slacking :) 2017-01-20 15:22:30 +01:00
Kaveh Vahedipour d2760f4ef1 pushing avoidServers property 2017-01-20 15:15:03 +01:00
Kaveh Vahedipour bbb45ca397 Correct depiction of servers health status 2017-01-20 09:17:04 +01:00
Kaveh Vahedipour eb661f95f2 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2017-01-18 17:26:54 +01:00
Kaveh Vahedipour f47b3b3c9d transient heartbeats 2017-01-18 17:26:45 +01:00
jsteemann 73da10a7e7 remove unused variable 2017-01-18 13:50:07 +01:00
Kaveh Vahedipour aaee2f9e61 transient heartbeats 2017-01-18 13:43:33 +01:00
Kaveh Vahedipour 879102117d more replicationTest 2017-01-16 15:43:32 +01:00
Kaveh Vahedipour a75b3624de resilience move ok again? 2017-01-16 12:09:21 +01:00
Kaveh Vahedipour d30458b011 Supervision should not exit of empty plan collection 2017-01-10 16:53:24 +01:00
Kaveh Vahedipour 331d074ebe more information from ClusterInfo's dropCollectionCoordinator 2017-01-10 16:25:00 +01:00
Kaveh Vahedipour 90c18e4914 waitFor will report more paranoid 2017-01-10 13:53:31 +01:00
Kaveh Vahedipour 55985ed5de missing prototypes 2017-01-09 10:38:34 +01:00
Kaveh Vahedipour ab6678eb1f need to fix tests first 2016-12-29 16:25:30 +01:00
Kaveh Vahedipour ce687562f2 less rigid expectation on smooth operations through agency comm under worst case scenarios. 2016-12-28 10:32:20 +01:00
Kaveh Vahedipour fcdc7601f3 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2016-12-23 14:06:34 +01:00
Max Neunhoeffer b6ad88d7f8 Do not getUniqueIds when not leading. 2016-12-23 14:02:44 +01:00
Kaveh Vahedipour 770a49d8ca supervision should only collect unique ids in leadership 2016-12-23 13:27:59 +01:00
Kaveh Vahedipour facd58b8b5 replication factor is enforced after failedleader to add new follower 2016-12-22 13:12:16 +01:00
Kaveh Vahedipour 8924cf7852 let's not count failed db servers in replication factor fix 2016-12-21 15:49:56 +01:00
Kaveh Vahedipour 12e54902df agency's supervision must wait grace period after becoming leader before acting on db server failure 2016-12-21 11:17:41 +01:00
Max Neunhoeffer 985ccaeb70 Get rid of Supervision::wakeUp(). 2016-12-20 10:19:24 +01:00
Kaveh Vahedipour 0e29e93816 race condition in agency when leader impaired 2016-12-19 15:00:32 +01:00
jsteemann 350da367bd fixes for Visual Studio 2016-12-08 17:32:46 +01:00
Kaveh Vahedipour 62c3240f8d Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2016-12-08 15:12:11 +01:00
Kaveh Vahedipour c6ef45b64d AddFollower to handle multiple followers at the same time 2016-12-08 15:12:05 +01:00
Andreas Streichardt 76c0789e3c Distribute satellites across cluster 2016-12-08 15:11:22 +01:00
Kaveh Vahedipour 2fadea8332 AddFollower tests 2016-12-07 16:42:35 +01:00
Kaveh Vahedipour b930b23fc2 AddFollower jobs for newly arrived db server to satisfy replication factors 2016-12-07 16:20:47 +01:00
Andreas Streichardt 0c97df2527 Fix compilation 2016-12-06 17:20:28 +01:00
Kaveh Vahedipour 51b279346b redirects to myelf should be hinstory 2016-12-06 17:10:15 +01:00
Kaveh Vahedipour 77c8c51865 FailedFollower and Windows build problmes 2016-11-30 15:39:10 +01:00
Kaveh Vahedipour 4a95e82fa6 ShortName for servers in new ugly UUID world 2016-11-25 15:25:51 +01:00
Kaveh Vahedipour ffe7f9f3ad merged in devel 2016-11-16 14:59:53 +01:00
Kaveh Vahedipour 9a6f605f2f fixed small double / long conversion 2016-10-31 17:00:55 +01:00
Kaveh Vahedipour 0e15a305d2 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2016-10-31 10:01:33 +01:00
Kaveh Vahedipour cc8270a897 Proposal for auto adaption of agency's RAFT timings 2016-10-31 10:01:26 +01:00
jsteemann cfc9ecd198 fix Visual Studio complaints 2016-10-31 09:59:18 +01:00
Frank Celler 62f4acc325 Merge branch 'devel' of github.com:arangodb/arangodb into FMH 2016-10-26 14:49:16 +02:00
Kaveh Vahedipour f8235b9c63 agency locks code review 2016-10-25 15:07:57 +02:00
Frank Celler e4ba82e8e9 rewrite of AgencyComm 2016-10-23 00:46:30 +02:00
Max Neunhoeffer 3a76784af4 Protect memory accesses to _snapshot in Supervision. 2016-10-12 10:23:21 +00:00
Kaveh Vahedipour cf09546d93 fixed erroneous break of supervision agency updates 2016-10-07 11:01:45 +02:00
Kaveh Vahedipour 0d82614fcc fixed erroneous break of supervision agency updates 2016-10-07 09:19:29 +02:00
Kaveh Vahedipour 1f4abf3c36 upgrade 3.0 agency to 3.1 2016-10-06 17:04:29 +02:00
Kaveh Vahedipour cc0a4ffbaa supervision grace period introduced as command line argument. reappeared db servers are removed from failedServers 2016-09-26 16:00:07 +02:00
jsteemann 34f7e27d6c Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-08 09:27:53 +02:00
Frank Celler e394653cae silenced cppcheck warning 2016-09-08 08:43:22 +02:00
jsteemann e1c847b0f6 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-07 09:52:38 +02:00
Andreas Streichardt ee0312e65a Fix double lock during shutdown 2016-09-07 09:45:57 +02:00
jsteemann 8a4b51cb92 fixed compile error 2016-09-07 08:53:35 +02:00
jsteemann f5a595f464 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-07 08:52:07 +02:00
Andreas Streichardt 6786049442 Fix exception 2016-09-06 17:13:26 +02:00
Andreas Streichardt 6396ac4dc7 Implement removeServer job 2016-09-06 16:49:25 +02:00
jsteemann 6ddf8bab54 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-06 11:22:14 +02:00
Kaveh Vahedipour 85ea1d5ff9 clang-format 2016-09-06 10:01:33 +02:00
Kaveh Vahedipour b1d645b841 We can handle dying dbservers during collection creation 2016-09-05 14:06:53 +02:00
jsteemann 4492409d5f Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-02 15:23:15 +02:00
Kaveh Vahedipour b3b7d7c907 failed servers are excluded from new shard creation 2016-09-02 12:37:53 +02:00
jsteemann fa21e70256 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-08-31 17:59:54 +02:00
Kaveh Vahedipour 2550dd22e0 fixed issue with leadership in minority 2016-08-31 17:23:48 +02:00
Kaveh Vahedipour c8178239e1 agency configuration persisted to state machine 2016-08-30 14:41:28 +02:00
jsteemann 07055384b8 Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-24 17:34:59 +02:00
Andreas Streichardt 89ebeefbb9 Proper shutdown 2016-08-24 13:51:23 +02:00
jsteemann f92815b09b Merge branch 'devel' of https://github.com/arangodb/arangodb into engine-vs-velocystream 2016-08-24 09:38:06 +02:00
Andreas Streichardt a9253c7ba5 Coorect agency path (no more / duplicates) 2016-08-23 14:03:41 +02:00
Andreas Streichardt 47a0f8602a Better shutdown handling 2016-08-23 12:51:38 +02:00
jsteemann c5f151da5c Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-19 11:01:15 +02:00
jsteemann d7b2141da0 Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-18 16:22:00 +02:00
Kaveh Vahedipour aee9548308 Merge branch 'devel' of https://github.com/arangodb/arangodb into agency-startup 2016-08-18 15:55:51 +02:00
Kaveh Vahedipour 6edc1ff5fa AgentConfiguration needed to handle its own read/write locks 2016-08-18 15:54:42 +02:00
Andreas Streichardt e7dd194129 Wow...tried waiting for cluster members to shutdown...forgot to set bool flag...fail 2016-08-18 12:25:25 +02:00
jsteemann 53e567f28f Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-18 11:30:07 +02:00
Andreas Streichardt 03b9d97e2f Implement proper cluster shutdown 2016-08-18 11:23:23 +02:00