237 Commits

Author SHA1 Message Date
Kaveh Vahedipour ace06575dd when upgrading from 3.1 LastHeartBeatAcked could also have been missing, when the 3.1 cluster had not run for long enough (#3757) 2017-12-08 15:56:19 +01:00
Kaveh Vahedipour c300eee5f0 minor (#3813) 2017-11-27 18:22:13 +01:00
Kaveh Vahedipour 7b80deb5cc Fixed object assignment operator for agency's key value store (#3701)
* Fixed object assignment operator for agency's key value store
* Node's toJson is now actually toJson. getString should be used for string extractions
* adjust agency's documentation (clarify precondition)
2017-11-17 15:49:40 +01:00
Kaveh Vahedipour 255d90d26a cherry pick from 3.2 pull request for bug-fix/supervision-thread-exists-on-pre3.2-agency (#3709)
This is the HealthRecord upgrade patch.
2017-11-17 10:14:14 +01:00
Simon Grätzer ee8209943f Missing things for active / passive (#3578)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* auth = off does not longer make everyone superuser

* Fixing cluster_sync and maybe resilience
2017-11-04 20:30:23 +01:00
Michael Hackstein 15d9a4be5f Reactivated the failover of the FoxxMaster, it was not modified anymore after the current master dies (#3510) 2017-10-25 18:03:24 +02:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Max Neunhöffer 9a2385b941 Add host id detection and show in /_admin/cluster/Health. (#3389) 2017-10-11 12:42:44 +02:00
Kaveh Vahedipour 627f344266 fixed a bug, where when servers failed, when also agency leadership c… (#3189)
* fixed a bug, where when servers failed, when also agency leadership changes

* redid entire design of checkDBServers/checkCoordinators.

* comparison in supervision must be between oldPersisted and newHealth

* UI stuff

* UI stuff

* FailedServer test needed adjustment

* Hopefully final round

* fixed supervision failure detection

* FailedServer tests back to origin devel

* oldNot documented among preconditions in Agency HTTP API docs

* changed only look for status updated

* non action line in api-cluster
2017-09-07 16:10:23 +02:00
Kaveh Vahedipour 00650e6a3f Bug fix/agency mt fixes (#3158)
* added debugging methods

* try to fix invalid access in case of error

* remove unused members

* bugfixes and comments

* all agency fixes in

* merge bug

* partially unguarded Agent::lead fixed

* all agency fixes in

* added nrBlocked to thread startup eval

* added nrBlocked to thread startup eval

* recombination of cases in State::get

* some maps replaced with unordered_maps

* optimized maps some
2017-08-30 10:43:51 +02:00
Andreas Streichardt 8e15412e06 Wait for supervision node to prevent races 2017-06-09 15:52:29 +02:00
jsteemann 2930ab6b57 cppcheck 2017-05-15 22:39:16 +02:00
Andreas Streichardt fe59502848 Fix server health 2017-05-11 12:20:15 +02:00
Kaveh Vahedipour de77b5ec7a getting rid of exceptions in supervision 2017-05-10 17:50:31 +02:00
Kaveh Vahedipour b0e7ce40f0 avoid exceptions in supervision main thread when running without cluster 2017-05-04 14:37:03 +02:00
Kaveh Vahedipour 68efba18e8 keep agencyPrefix, when non set 2017-04-26 15:32:26 +02:00
jsteemann 4289105eb3 fix shutdown issue 2017-04-25 16:09:01 +02:00
Kaveh Vahedipour 09a6888d14 attempt at fixing shutdown bug on mac os x 2017-04-24 10:45:54 +02:00
jsteemann ea8496f1a5 cppcheck 2017-04-21 20:19:36 +02:00
Kaveh Vahedipour 1f81ce28b0 merge in cpp & js from 3.1.18 yet to do tests 2017-04-21 15:41:05 +02:00
Kaveh Vahedipour 4cc830b0df merge from 3.1 2017-02-20 20:05:52 +01:00
jsteemann b3ac54d065 remove global namespace include 2017-02-13 13:03:33 +01:00
jsteemann d024a6d00a remove logging for non-topics 2017-02-10 09:32:50 +01:00
Andreas Streichardt 8349f56e40 Properly check return value 2017-02-07 15:15:56 +01:00
Kaveh Vahedipour 8d66d69f83 supervision handles coordinator demise correctly 2017-02-07 11:29:37 +01:00
Kaveh Vahedipour f3cb1307a5 3.1 fixes backported to devel 2017-02-03 10:48:25 +01:00
jsteemann fa917937c4 do not use namespaces in header files 2017-02-01 13:41:31 +01:00
Kaveh Vahedipour 3f3633bd2c supervision to proper preconditioning of jobs on plan 2017-01-27 15:29:22 +01:00
Kaveh Vahedipour c4bff477a6 wrong persistence of status 2017-01-24 12:52:31 +01:00
Kaveh Vahedipour cfbdaff0a8 Back in add follower 2017-01-23 09:39:32 +01:00
Kaveh Vahedipour 163e0158dc before cppcheck enthusiasts start slacking :) 2017-01-20 15:22:30 +01:00
Kaveh Vahedipour d2760f4ef1 pushing avoidServers property 2017-01-20 15:15:03 +01:00
Kaveh Vahedipour bbb45ca397 Correct depiction of servers health status 2017-01-20 09:17:04 +01:00
Kaveh Vahedipour eb661f95f2 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2017-01-18 17:26:54 +01:00
Kaveh Vahedipour f47b3b3c9d transient heartbeats 2017-01-18 17:26:45 +01:00
jsteemann 73da10a7e7 remove unused variable 2017-01-18 13:50:07 +01:00
Kaveh Vahedipour aaee2f9e61 transient heartbeats 2017-01-18 13:43:33 +01:00
Kaveh Vahedipour 879102117d more replicationTest 2017-01-16 15:43:32 +01:00
Kaveh Vahedipour a75b3624de resilience move ok again? 2017-01-16 12:09:21 +01:00
Kaveh Vahedipour d30458b011 Supervision should not exit of empty plan collection 2017-01-10 16:53:24 +01:00
Kaveh Vahedipour 331d074ebe more information from ClusterInfo's dropCollectionCoordinator 2017-01-10 16:25:00 +01:00
Kaveh Vahedipour 90c18e4914 waitFor will report more paranoid 2017-01-10 13:53:31 +01:00
Kaveh Vahedipour 55985ed5de missing prototypes 2017-01-09 10:38:34 +01:00
Kaveh Vahedipour ab6678eb1f need to fix tests first 2016-12-29 16:25:30 +01:00
Kaveh Vahedipour ce687562f2 less rigid expectation on smooth operations through agency comm under worst case scenarios. 2016-12-28 10:32:20 +01:00
Kaveh Vahedipour fcdc7601f3 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2016-12-23 14:06:34 +01:00
Max Neunhoeffer b6ad88d7f8 Do not getUniqueIds when not leading. 2016-12-23 14:02:44 +01:00
Kaveh Vahedipour 770a49d8ca supervision should only collect unique ids in leadership 2016-12-23 13:27:59 +01:00
Kaveh Vahedipour facd58b8b5 replication factor is enforced after failedleader to add new follower 2016-12-22 13:12:16 +01:00
Kaveh Vahedipour 8924cf7852 let's not count failed db servers in replication factor fix 2016-12-21 15:49:56 +01:00
Kaveh Vahedipour 12e54902df agency's supervision must wait grace period after becoming leader before acting on db server failure 2016-12-21 11:17:41 +01:00
Max Neunhoeffer 985ccaeb70 Get rid of Supervision::wakeUp(). 2016-12-20 10:19:24 +01:00
Kaveh Vahedipour 0e29e93816 race condition in agency when leader impaired 2016-12-19 15:00:32 +01:00
jsteemann 350da367bd fixes for Visual Studio 2016-12-08 17:32:46 +01:00
Kaveh Vahedipour 62c3240f8d Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2016-12-08 15:12:11 +01:00
Kaveh Vahedipour c6ef45b64d AddFollower to handle multiple followers at the same time 2016-12-08 15:12:05 +01:00
Andreas Streichardt 76c0789e3c Distribute satellites across cluster 2016-12-08 15:11:22 +01:00
Kaveh Vahedipour 2fadea8332 AddFollower tests 2016-12-07 16:42:35 +01:00
Kaveh Vahedipour b930b23fc2 AddFollower jobs for newly arrived db server to satisfy replication factors 2016-12-07 16:20:47 +01:00
Andreas Streichardt 0c97df2527 Fix compilation 2016-12-06 17:20:28 +01:00
Kaveh Vahedipour 51b279346b redirects to myself should be history 2016-12-06 17:10:15 +01:00
Kaveh Vahedipour 77c8c51865 FailedFollower and Windows build problems 2016-11-30 15:39:10 +01:00
Kaveh Vahedipour 4a95e82fa6 ShortName for servers in new ugly UUID world 2016-11-25 15:25:51 +01:00
Kaveh Vahedipour ffe7f9f3ad merged in devel 2016-11-16 14:59:53 +01:00
Kaveh Vahedipour 9a6f605f2f fixed small double / long conversion 2016-10-31 17:00:55 +01:00
Kaveh Vahedipour 0e15a305d2 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2016-10-31 10:01:33 +01:00
Kaveh Vahedipour cc8270a897 Proposal for auto adaption of agency's RAFT timings 2016-10-31 10:01:26 +01:00
jsteemann cfc9ecd198 fix Visual Studio complaints 2016-10-31 09:59:18 +01:00
Frank Celler 62f4acc325 Merge branch 'devel' of github.com:arangodb/arangodb into FMH 2016-10-26 14:49:16 +02:00
Kaveh Vahedipour f8235b9c63 agency locks code review 2016-10-25 15:07:57 +02:00
Frank Celler e4ba82e8e9 rewrite of AgencyComm 2016-10-23 00:46:30 +02:00
Max Neunhoeffer 3a76784af4 Protect memory accesses to _snapshot in Supervision. 2016-10-12 10:23:21 +00:00
Kaveh Vahedipour cf09546d93 fixed erroneous break of supervision agency updates 2016-10-07 11:01:45 +02:00
Kaveh Vahedipour 0d82614fcc fixed erroneous break of supervision agency updates 2016-10-07 09:19:29 +02:00
Kaveh Vahedipour 1f4abf3c36 upgrade 3.0 agency to 3.1 2016-10-06 17:04:29 +02:00
Kaveh Vahedipour cc0a4ffbaa supervision grace period introduced as command line argument. reappeared db servers are removed from failedServers 2016-09-26 16:00:07 +02:00
jsteemann 34f7e27d6c Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-08 09:27:53 +02:00
Frank Celler e394653cae silenced cppcheck warning 2016-09-08 08:43:22 +02:00
jsteemann e1c847b0f6 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-07 09:52:38 +02:00
Andreas Streichardt ee0312e65a Fix double lock during shutdown 2016-09-07 09:45:57 +02:00
jsteemann 8a4b51cb92 fixed compile error 2016-09-07 08:53:35 +02:00
jsteemann f5a595f464 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-07 08:52:07 +02:00
Andreas Streichardt 6786049442 Fix exception 2016-09-06 17:13:26 +02:00
Andreas Streichardt 6396ac4dc7 Implement removeServer job 2016-09-06 16:49:25 +02:00
jsteemann 6ddf8bab54 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-06 11:22:14 +02:00
Kaveh Vahedipour 85ea1d5ff9 clang-format 2016-09-06 10:01:33 +02:00
Kaveh Vahedipour b1d645b841 We can handle dying dbservers during collection creation 2016-09-05 14:06:53 +02:00
jsteemann 4492409d5f Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-09-02 15:23:15 +02:00
Kaveh Vahedipour b3b7d7c907 failed servers are excluded from new shard creation 2016-09-02 12:37:53 +02:00
jsteemann fa21e70256 Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types 2016-08-31 17:59:54 +02:00
Kaveh Vahedipour 2550dd22e0 fixed issue with leadership in minority 2016-08-31 17:23:48 +02:00
Kaveh Vahedipour c8178239e1 agency configuration persisted to state machine 2016-08-30 14:41:28 +02:00
jsteemann 07055384b8 Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-24 17:34:59 +02:00
Andreas Streichardt 89ebeefbb9 Proper shutdown 2016-08-24 13:51:23 +02:00
jsteemann f92815b09b Merge branch 'devel' of https://github.com/arangodb/arangodb into engine-vs-velocystream 2016-08-24 09:38:06 +02:00
Andreas Streichardt a9253c7ba5 Correct agency path (no more / duplicates) 2016-08-23 14:03:41 +02:00
Andreas Streichardt 47a0f8602a Better shutdown handling 2016-08-23 12:51:38 +02:00
jsteemann c5f151da5c Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-19 11:01:15 +02:00
jsteemann d7b2141da0 Merge branch 'devel' of https://github.com/arangodb/arangodb into readcache 2016-08-18 16:22:00 +02:00
Kaveh Vahedipour aee9548308 Merge branch 'devel' of https://github.com/arangodb/arangodb into agency-startup 2016-08-18 15:55:51 +02:00