Lars Maier
642c5fd994
Bug fix 3.3/cleanup lost collections ( #6721 )
* Working draft: clean lost collections in supervision.
* Added early exit as in spec.
* Finished test. Fixed logging.
* Increase plan version when cleaning out a lost collection.
* Increase the current version rather than the plan version.
* Fixed test for 3.3
2018-10-08 16:35:18 +02:00
Matthew Von-Maszewski
ec5a2f62b8
3.3: Bring two key Agency bug fixes, plus some secondary stuff back to 3.3 ( #6009 )
2018-08-08 10:33:17 +02:00
Kaveh Vahedipour
507418d9a4
stop supervision on demand ( #5109 )
* stop supervision on demand
* adding tests
* Correct an error message.
2018-04-20 11:58:47 +02:00
Kaveh Vahedipour
cce5b2decb
Bug fix 3.3/supervision to delete removed nodes from health ( #4455 )
2018-02-13 15:55:42 +01:00
Kaveh Vahedipour
255d90d26a
cherry pick from 3.2 pull request for bug-fix/supervision-thread-exists-on-pre3.2-agency ( #3709 )
This is the HealthRecord upgrade patch.
2017-11-17 10:14:14 +01:00
Jan
bef52d7dc3
Bug fix/cleanup after cppcheck ( #3639 )
2017-11-10 13:53:28 +01:00
Kaveh Vahedipour
627f344266
fixed a bug, where when servers failed, when also agency leadership c… ( #3189 )
* fixed a bug, where when servers failed, when also agency leadership changes
* redid entire design of checkDBServers/checkCoordinators.
* comparison in supervision must be between oldPersisted and newHealth
* UI stuff
* UI stuff
* FailedServer test needed adjustment
* Hopefully final round
* fixed supervision failure detection
* FailedServer tests back to origin devel
* oldNot documented among preconditions in Agency HTTP API docs
* changed only look for status updated
* non action line in api-cluster
2017-09-07 16:10:23 +02:00
Kaveh Vahedipour
00650e6a3f
Bug fix/agency mt fixes ( #3158 )
* added debugging methods
* try to fix invalid access in case of error
* remove unused members
* bugfixes and comments
* all agency fixes in
* merge bug
* partially unguarded Agent::lead fixed
* all agency fixes in
* added nrBlocked to thread startup eval
* added nrBlocked to thread startup eval
* recombination of cases in State::get
* some maps replaced with unordered_maps
* optimized maps some
2017-08-30 10:43:51 +02:00
Andreas Streichardt
fe59502848
Fix server health
2017-05-11 12:20:15 +02:00
Kaveh Vahedipour
68efba18e8
keep agencyPrefix when none set
2017-04-26 15:32:26 +02:00
Kaveh Vahedipour
1f81ce28b0
merge in cpp & js from 3.1.18 yet to do tests
2017-04-21 15:41:05 +02:00
Kaveh Vahedipour
8d66d69f83
supervision handles coordinator demise correctly
2017-02-07 11:29:37 +01:00
Kaveh Vahedipour
aaee2f9e61
transient heartbeats
2017-01-18 13:43:33 +01:00
Kaveh Vahedipour
55985ed5de
missing prototypes
2017-01-09 10:38:34 +01:00
jsteemann
7359ac44b2
more style cleanup
2017-01-05 10:52:03 +01:00
Kaveh Vahedipour
12e54902df
agency's supervision must wait grace period after becoming leader before acting on db server failure
2016-12-21 11:17:41 +01:00
Max Neunhoeffer
985ccaeb70
Get rid of Supervision::wakeUp().
2016-12-20 10:19:24 +01:00
Kaveh Vahedipour
51b279346b
redirects to myself should be history
2016-12-06 17:10:15 +01:00
Andreas Streichardt
63a173f002
Delete all shard move jobs when server is healthy again
2016-11-22 14:13:09 +01:00
Kaveh Vahedipour
9a6f605f2f
fixed small double / long conversion
2016-10-31 17:00:55 +01:00
Kaveh Vahedipour
f8235b9c63
agency locks code review
2016-10-25 15:07:57 +02:00
Max Neunhoeffer
3a76784af4
Protect memory accesses to _snapshot in Supervision.
2016-10-12 10:23:21 +00:00
Kaveh Vahedipour
1f4abf3c36
upgrade 3.0 agency to 3.1
2016-10-06 17:04:29 +02:00
jsteemann
f5a595f464
Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types
2016-09-07 08:52:07 +02:00
Andreas Streichardt
6396ac4dc7
Implement removeServer job
2016-09-06 16:49:25 +02:00
jsteemann
6ddf8bab54
Merge branch 'devel' of https://github.com/arangodb/arangodb into generic-col-types
2016-09-06 11:22:14 +02:00
Kaveh Vahedipour
85ea1d5ff9
clang-format
2016-09-06 10:01:33 +02:00
Andreas Streichardt
f9fea70c3e
readd method
2016-09-05 15:50:41 +02:00
Kaveh Vahedipour
9808a55a33
some cleaning up
2016-09-05 15:12:46 +02:00
jsteemann
c6efe26198
cppcheck
2016-08-25 14:04:23 +02:00
Andreas Streichardt
89ebeefbb9
Proper shutdown
2016-08-24 13:51:23 +02:00
Andreas Streichardt
47a0f8602a
Better shutdown handling
2016-08-23 12:51:38 +02:00
Andreas Streichardt
03b9d97e2f
Implement proper cluster shutdown
2016-08-18 11:23:23 +02:00
Andreas Streichardt
3f412debf0
Revert futile attempts to implement client resilience tests
2016-08-17 18:12:40 +02:00
Andreas Streichardt
70af1e3647
Implement proper cluster shutdown
2016-08-17 17:25:39 +02:00
Andreas Streichardt
526c8f42c2
Fix foxx issues in cluster
Bootstrap will now be done on the bootstrap coordinator.
queues will now be executed by the "foxxmaster"
2016-07-29 16:06:31 +02:00
jsteemann
f21561b25f
use nullptr, don't include Thread.h when unnecessary
2016-06-15 19:21:53 +02:00
Kaveh Vahedipour
beba4887a3
shrink cluster in supervision
2016-06-10 18:10:37 +02:00
Kaveh Vahedipour
00d6111a3e
server health for aardvark
2016-06-03 14:27:04 +02:00
Kaveh Vahedipour
427453bcc7
server health for aardvark
2016-06-03 12:19:39 +02:00
Kaveh Vahedipour
9957270df6
hunting down exceptions in agency supervision
2016-05-31 21:42:41 +02:00
Max Neunhoeffer
b600ddbeb4
Fix getUniqueIds and updateAgencyPrefix in Supervision.
This prevents some race conditions at cluster startup that crashed the
agency.
2016-05-31 12:38:17 -06:00
Kaveh Vahedipour
7b440f94dc
Moving Job classes out of Supervision
2016-05-31 16:28:54 +02:00
Kaveh Vahedipour
bad7a6a35a
leader fail seems good
2016-05-31 15:21:42 +02:00
Kaveh Vahedipour
68478f530d
visual studio warning
2016-05-30 15:47:08 +02:00
Kaveh Vahedipour
318a073068
finish cleans up blocks
2016-05-27 16:27:38 +02:00
Kaveh Vahedipour
1846a3c4f7
finished jobs. clean out server, failed leader, move shard
2016-05-25 17:45:28 +02:00
Kaveh Vahedipour
00d3587e9a
Supervision moves shards
2016-05-24 15:57:08 +02:00
Kaveh Vahedipour
3d0ebeab13
Some spurious warnings.
2016-05-23 17:34:52 +02:00
Kaveh Vahedipour
6110773fdb
Redone job design in supervision to simpler interface.
2016-05-23 17:07:35 +02:00