* Do not schedule Coordinators in Plan.
* Finish failed server when server is no longer in health.
* Fix removeServer checks.
Check that server is no longer in use before removing it. Give 60s
waiting time for condition to be met. Also observe agency lock.
* Finish FailedFollower job if server no longer follower.
This can happen because RemoveFollower was faster.
* Only use GOOD servers as replacement followers.
* Fix AddFollower for satellite collections.
* Fix RemoveServer for satellite collections.
* MoveShard handles moves from leader to followers
* Prepare CleanoutServer and FailedServer for satellite collections.
* More sorting out of AddFollower and RemoveFollower.
* Fix RemoveFollower job w.r.t. choice of follower to remove.
* Fix message.
* kill your own sub jobs, please
* Added preconditions to payloads for supervision's job finishers
* Improve logging.
* Add agency diagnostics to failed move shard test, start.
* Add coordinator agency diagnostics.
* Remove warning.
* Add changelog entry.
* Add agency diagnostics if things go sour with move shard.
* Add agency diags when things go wrong 2.
* API /_api/agency/state: back to old format.
* Fix Windows compilation.
* handle aborts in supervision and wait for the last Raft log to be committed
* tests compiling, 2 failing for valid reasons
* Correctly report TRI_ERROR_CLUSTER_CONNECTION_LOST as 503.
* FailedLeader/FailedFollower cannot continue when aborting blocks
* backport of test data generation for maintenance from devel
* 3.4 working
* fixing index use in cluster while still being built
* fixed broken views
* correct 200 for ensureIndex
* merge with 3.4
* agency comm to handle replace in array
* supervision changes
* cluster info's ensureIndex
* 3.4 ready
* timeout
* missing files from origin
* neunhoef complaints
* bogus entry
* no need to wait for current once again
* no longer necessary. done in IndexFactory now
* correct comments
* left overs
* dead code revived
* Move CHANGELOG entry to the right place.
- Schmutz now called "Maintenance" and completely implemented in C++
- Fix index locking bug in mmfiles
- Fix a bug in mmfiles with silent option and repsert
- Slightly increase supervision okperiod and graceperiod
* Initial low level interface for thread crash reporting (and management).
* Add a member version of isClusterRole()
* isolate heartbeat thread creation to new StartHeartbeatThread(). create heartbeat thread even if not a cluster or if an agent.
* update runDBServer() and runCoordinator() to shut down more quickly by polling isStopping() at additional locations.
* copying updates from different branch / PR
* basic thread crash logging. Not yet tied into Agency arangod, and no specific threads are posting crashes yet
* make Supervision thread a CriticalThread
* sandwich CriticalThread between Thread and other classes to create long term, repeating thread crash reporting.
* restore code lost upon branch update relating to new startHeartbeatThread() function
* add CriticalThread.cpp to build
* add new runAgentServer() function to loop for Agents. Make Heartbeat thread derive from CriticalThread.
* remove debug line
* Supervision goes to Maintenance mode when /arango/Supervision/Maintenance exists
* coordinator route stands
* stop updates in transient, when supervision off
* fixed a bug that occurred when servers failed while agency leadership also changed
* redid entire design of checkDBServers/checkCoordinators.
* comparison in supervision must be between oldPersisted and newHealth
* UI stuff
* UI stuff
* FailedServer test needed adjustment
* Hopefully final round
* fixed supervision failure detection
* FailedServer tests back to origin devel
* oldNot documented among preconditions in Agency HTTP API docs
* changed only look for status updated
* non action line in api-cluster
* added debugging methods
* try to fix invalid access in case of error
* remove unused members
* bugfixes and comments
* all agency fixes in
* merge bug
* partially unguarded Agent::lead fixed
* all agency fixes in
* added nrBlocked to thread startup eval
* added nrBlocked to thread startup eval
* recombination of cases in State::get
* some maps replaced with unordered_maps
* optimized maps some