1
0
Fork 0
Commit Graph

10 Commits

Author SHA1 Message Date
Lars Maier a1bae63cf1 [3.4] Verbose Abort Reason (#8878)
* Added reason to job abort method.

* Additional abort that is not in devel.
2019-05-01 13:54:47 +02:00
Max Neunhöffer 46e479376d
Further supervision fixes. (#8259)
* Do not schedule Coordinators in Plan.

* Finish failed server when server is no longer in health.

* Fix removeServer checks.

Check that server is no longer in use before removing it. Give 60s
waiting time for condition to be met. Also observer agency lock.

* Finish FailedFollower job if server no longer follower.

This can happen because RemoveFollower was faster.

* Only use GOOD servers as replacement followers.

* Fix AddFollower for satellite collections.

* Fix RemoveServer for satellite collections.

* MoveShard handles moves from leader to followers

* Prepare CleanoutServer and FailedServer for satellite collections.

* More sorting out of AddFollower and RemoveFollower.

* Fix RemoveFollower job w.r.t. choice of follower to remove.

* Fix message.

* kill you own sub jobs, please

* Added preconditions to payloads for supervision's job finishers

* Improve logging.

* Add agency diagnostics to failed move shard test, start.

* Add coordinator agency diagnostics.

* Remove warning.

* Add changelog entry.

* Add agency diagnostics if things go sour with move shard.

* Add agency diags when things go wrong 2.

* API /_api/agency/state: back to old format.

* Fix Windows compilation.

* handle aborts in supervision and wait for the last Raft log to be committed

* tests compiling, 2 failing for valid reasons

* Correctly report TRI_ERROR_CLUSTER_CONNECTION_LOST as 503.

* FailedLeader /FailedFollower cannot continue, when aborting blocks
2019-03-04 11:43:35 +01:00
Frank Celler 9477af198b big reformat 2018-12-26 00:57:05 +01:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
 - Fix index locking bug in mmfiles
 - Fix a bug in mmfiles with silent option and repsert
 - Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Kaveh Vahedipour 1f81ce28b0 merge in cpp & js from 3.1.18 yet to do tests 2017-04-21 15:41:05 +02:00
Kaveh Vahedipour 85ea1d5ff9 clang-format 2016-09-06 10:01:33 +02:00
Kaveh Vahedipour 382ac052d4 resilience green 2016-06-08 18:27:59 +02:00
Kaveh Vahedipour 2e87f59218 Testing supervision 2016-06-02 12:17:35 +02:00
Kaveh Vahedipour cc23d0df99 Cleaning out server 2016-06-01 13:44:27 +02:00
Kaveh Vahedipour b6e15313c3 Moving Job classes out of Supervision 2016-05-31 16:45:23 +02:00