214 Commits

Author SHA1 Message Date
Jan c52f2a8315
refactoring (#9411) 2019-07-09 11:15:52 +02:00
Max Neunhöffer d6d362bd3b
Fix agency election lock step bug. (#9351)
* Fix agency election lockstep bug.

Reset the base point for the random election timeout to now whenever we have
cast a vote, be it for us or for some other server.

* CHANGELOG.

* Fix compilation.
2019-06-27 22:06:26 +02:00
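The election fix above can be pictured with a minimal sketch. This is not the ArangoDB code: the class `ElectionTimer`, its members, and the fixed timeout value are hypothetical and only illustrate the idea of moving the base point of the timeout to "now" whenever a vote is cast, so that servers which voted in the same round do not necessarily time out together.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical illustration only; not the ArangoDB Constituent code.
using Clock = std::chrono::steady_clock;

class ElectionTimer {
 public:
  ElectionTimer(Clock::time_point start, std::chrono::milliseconds timeout)
      : _base(start), _timeout(timeout) {}

  // Called whenever this server casts a vote, for itself or for another
  // server: from now on the timeout is measured from this moment.
  void onVoteCast(Clock::time_point now) { _base = now; }

  // True once the (already drawn) random timeout has elapsed since the base.
  bool expired(Clock::time_point now) const { return now - _base >= _timeout; }

 private:
  Clock::time_point _base;
  std::chrono::milliseconds _timeout;
};
```

Passing the current time in as a parameter, rather than reading the clock inside the class, is a choice made here only to keep the sketch easy to test.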
Jan c6d3f8e052
Bug fix/pass on error messages (#8690) 2019-04-10 12:34:25 +02:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
Manuel Pöter ecf4d9d62a Fix race conditions in thread management. (#8032) 2019-01-28 15:44:46 +01:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Simon 22b9c31c13 Removing ClusterComm ClientTransactionID (#6294) 2018-09-12 22:15:16 +02:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
 - Fix index locking bug in mmfiles
 - Fix a bug in mmfiles with silent option and repsert
 - Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Tobias Gödderz fc3e11dbbc Async AQL (#5806)
* Modified header to new initializeCursor API

* Adapted initializeCursor to DONE/WAITING API. Compiles but not tested and no one reacts to WAITING state, it is not returned anywhere yet

* Subqueries now expect a WAITING return from initialize cursor. However, they will just return a nullptr and pretend the query is empty; this will be fixed later

* First attempt to simulate thread waiting over information within the query

* Small fix to allow for isDirect handlers to go to sleep.

* Waiting in the necessary places now for the async request to be sent.

* Thank you auto-casting compiler, you are totally right, I absolutely wanted to use this bool value as an index in my array. How could I possibly not want to use it here?

* Include cond-var header

* Fixed mutex/cond_var usage

* Added oldAPI wrappers in AQL Blocks for get/skip some variants. This Commit compiles but is NOT tested

* Let getSome now return unique_ptr of AqlItemsBlocks. Also implemented the async variant of getSome in subqueries.

* Removed all references to OLD implementations in AQL. only the base wrappers are allowed to call OLD functions from now on. Now the testing part starts

* Fixed endless virtual recursion

* Implemented new getOrSkip API in SortBlock

* Implemented new getOrSkip API in LimitBlock

* Initialize all variables

* Fixed logic bug in SubqueryBlock

* getBlock in ExecutionBlock now returns a state. All blocks need to handle this properly!

* Created a wrapper getBlockOld that serves the old sync API and is now used in AQL. To be replaced over time.

* Added IndexBlock::skipSome and IndexBlock::getSome

* getBlock now returns its old return value along with the state

* Switch from getBlockOld to getBlock in IndexBlock::skipSome

* Switch from getBlockOld to getBlock in IndexBlock::getSome

* ShortestPathBlock::skipSome is not implemented! Added a regression test

* Attempt to fix SubQueryResult memory management

* Fixed LIMIT Block

* Moved from ShortestPathBlock::getSomeOld to ::getSome

* Implemented ASYNC api on SingletonBlock

* Adapted EnumerateCollectionBlock to new async API

* Fixed FilterBlock and adapted return block to async API

* Adapted NORESULTS block to async AQL api.

* Adapted Modification Blocks to async API

* Fixed some initialize cursor functions to reset values required during get/skipSome

* First steps to adapt ClusterNodes to Async AQL api. Not there yet, need to implement the core still

* Added async implementation for xxxForShard in ClusterBlocks. This commit changes internal logic of _doneForShard. Needs additional testing as soon as everything is in place.

* Adapted CalculationBlock to async API

* Adapted TraversalBlocks to ASYNC Aql. This is not optimal yet, we need a better decision if we are DONE or not on RETURN

* Adapted EnumerateListBlock to Async AQL api

* Adapted RemoteBlock to ASYNC API in getSome/skipSome. The whole thing is now LIVE in the cluster. Extensive testing to be started now

* Fixed IndexBlock WAITING behaviour if Waiting occurs during a index processing

* Adapted IResearchViewBlock to ASYNC AQL API

* Fixed SortingGatherBlock in WAITING state.

* Adapted IResearch ExecutionBlockMock to Async API

* Unified the HASMORE/DONE distinction. Code is much more readable now and harder to get incorrect 👍

* Implemented only theoretically reachable function of non-void function.

* Fixed last commit

* Added inline TODO comments

* fix warning

* Fixed a clearing logic bug in RemoveNodes

* Fixed Error Handling in RemoteBlocks. Also fixed a logic bug (true/false simply has a 50% chance of getting it wrong) in Distribute and Scatter.

* remove unused methods

* Fixed failure test

* implement skipping

* Moved the Query Waiting out of the ExecutionEngine.

* changed one of the collect blocks

* Removed _upstreamState from ExecutionBlockMock, that is in the base-class now

* Added a Test Mock for an ExecutionBlock that simulates the WAITING/HASMORE/DONE api.

* do not check "hasMore" if not necessary

* Added DistinctCollectBlock::getOrSkipSome from ~Old and changed its return type

(still uses getBlockOld)

* Save state to resume in DistinctCollectBlock::getOrSkipSome

* Extracted redundant code

* fixed some ops

* added one more test

* fix endless blocking

* fix compile error

* fix test

* Refactored HashedCollectBlock::getOrSkipSome

* Return blocks to the manager

* Replaced usage of getBlockOld in HashedCollectBlock::getOrSkipSome

* remove unused shutdown calls, simplify ownership for expressions

* Removed superfluous variable

* Capture const variable by value

* Removed SortedCollectBlock::getOrSkipSomeOld in favour of getOrSkipSome

* Added a working version of SortedCollectBlock::getOrSkipSome

Has yet to be cleaned up

* Removed isTotalAggregation special treatment

* On no input, return a group of nulls (instead of no group at all)

* Bugfixes

* Simplified code

* Move return to the end, eliminate duplicate code

* Corrected skipped count in HashedCollectBlock

* Aligned getNextRow() implementations

* Added comments

* some cleanup

* fix potential memleak

* Bugfix

* Fixed failure tests

* Removed usage of getBlockOld in ExecutionBlock::getOrSkipSome

* Replaced hasMore with an async implementation (mostly)

* Removed getBlockOld()

* Added hasMoreState to the AQL API (and renamed hasMore methods to hasMoreState)

* RemoteBlock now uses the async hasMoreState route

* remove job queue

* options

* Bugfixes in the async implementation of LimitBlock

* LimitBlock::getOrSkipSome now always skips when calculating the fullcount

* fix compile warnings

* restrict threads

* Fixed api of Waiting ExecBlockMock. Unused yet

* Made SortedGatherBlock async-capable

* Removed nonEmptyIndex hack

* Removed duplicate traceGetSome~ calls, moved all to getSome

* Added asserts before replacing getNr*Registers

* Added a TODO note and a comment

* Removed getSomeWithoutRegisterClearoutOld()

* Removed skip()

* Removed common code by using getNr*Registers()

* Use getNr*Registers() in the TraversalBlock as well

* started to add lane

* started to add lane

* added lane

* completed lane

* removed debug output

* fixed merge

* Began working on a test suite for AQL tracing/profiling

* Added more tests and asserts in aql-profiler

* Made some ExecutionBlocks final

* Added a type enum to all blocks and the per-block stats

* Add block type to stats nodes when tracing AQL on block level

* Removed initializeCursor call from instantiateFromPlan

* Avoided additional getSome calls after DONE

* Added more profiler tests

* Refactored ExecutionBlock::getOrSkipSome and fixed two bugs

- set _upstreamState also when skipping
- explicitly use ExecutionBlock::getHasMoreState()

* Bugfix: update state

* Reuse parent _skipped wherever possible; rename where not (LimitBlock)

* Simplified SortedCollectBlock::getOrSkipSome and reused general pattern & code

* Implemented missing virtual function (with USE_FAILURE)

* Reset necessary values during initializeCursor

* Simplified code in EnumerateListBlock a little

* Added a test for DistinctCollectBlock in aql-profiler

* Avoid redundant getSome calls in DistinctCollectBlock

* fix compilation

* Fixed DistinctCollectBlock profiler test

* Added a second profiler test for the DistinctCollectBlock

* Added a profiler test for EnumerateCollectionBlock

* Bugfix in EnumerateListBlock

* added --server.fifoN-size

* Simplified EnumerateCollectionBlock::getSome

* Simplified EnumerateCollectionBlock::getSome, and return HASMORE less often when DONE

* Fix testEnumerateCollectionBlock1 for mmfiles

* do not pass by reference

* Fixed compile error

* fixed merge conflicts

* Added profiler tests for EnumerateCollectionBlock

* Test fix for mmfiles

* Fixed IResearch tests

* Bugfix in DistinctCollectBlock and a regression test

* Updated comment

* Bugfix for query statistics in cluster

* Check plan in distinct test

* Fix aql-profiler tests in cluster

* Removed unused line / bugfix for single server test runs

* This commit implements waking up of AQL queries. (#5651)

* Non-compiling intermediate commit for handover.

* Make branch compile again

* Started implementation of continueable rest cursor handler by moving the callbacks to the outer part. This is not yet fully tested!

* Made finalizeExecute noexcept. We cannot react to these errors as the response was potentially written before. Also introduced continueExecution in the RestHandler engine.

* First successful query wakeup.

* The wakeup callback now posts on the scheduler directly. A resthandler only needs to provide a callback that encapsulates the continueExecution call on this handler

* renamed finalizeExecute to shutdownExecute

* Added a differentiation between Handler and Callback in Query continuation. Handler will be posted in IO service. Callback will be executed directly

* fix audit log

* Removed callback from deleteQueryCursor. This cannot be waiting

* use CONDITION_LOCKER

* removed yet another thread-local variable

* Fixed forward declaration

* Made RestAqlHandler repeatable

* Use defer to close the query in RestAqlHandler. Now waiting will close the query as well.

* Added a mutex in the RestHandlers to make sure if the callback over network is too fast that there is only one Thread running in the RestHandler

* Captured the GeneralCommTask if it is posted to a RestHandler. This is necessary in the PAUSED case

* Refactoring of _noLockHeader responsibilities. Now the BaseHandler selects them and resets them after it is done. Only Coordinators are allowed to define them if a query is loaded.

* Removed reaction to existing nolockheaders in Coordinator Query Planning Phase

* Removed incorrect assertion.

* Further refactoring of NoLockHeaders. Now there is a wrapper class around it which allows for debugging and logging. The state now seems to be better. Also all non-rest-handler triggered queries clean up the NoLockHeaders properly.

* Fixed UserManager, now deletes nolock headers properly

* Swing to the Symphony of Destruction

* Forgot about community build...

* Fixed compiling of Catch tests

* Fixed community build

* need thread for size

* Made the restSimpleHandler repeatable

* Implemented dump and dumpSync in Cursors, Sync will block a thread, dump allows to wait, only relevant for Streaming cursor

* Reactivated StreamingCursors

* Removed debug output.

* Fixed false query continuation

* Reset thread output to non-debug

* Added missing return statements

* Allow some CollectionMethods to hand-in a context that may contain a transaction. This is meant to honor nolock headers.

* Fixed hidden merge conflict

* Bugfix in aql-profiler.js: use plan.nodes order, not stats

* Added two profiler tests for filter

* Avoid too many getBlock calls in the FilterBlock

* Removed debug output

* RemoteBlock API will now send a done(bool) flag whenever we request documents from remote Servers. It is possible that we are DONE and have a result. The pre 3.4.0 API uses exhausted which is exclusive to a result. This API is still implemented for backwards compatibility.

* Implemented an executeSync function in AqlQuery. This will block the thread until query execution is complete

* Added another test for FILTER, and one test for the HashedCollectBlock

* Added more tests for HashedCollectBlock; avoid unnecessary getSome calls

* Added a profiler IndexBlock test

* IndexBlock: avoid redundant getSome calls, added missing traceGetSomeEnd calls

* Added a second test profiling IndexBlock

* Added a third test for IndexBlock

* Moved general code to module

* Moved noncluster tests into a separate file

* Split aql-profiler testsuite into three files

* Added profiler tests for LimitBlock

* Added a test for NoResultsBlock

* Added profiler tests for TraversalBlock

* Shutdown of an AQL query is now asynchronous. However in Error-Cases it will be executed in a blocking way still

* Optimized TraversalBlock getSome calls due to new (nightly) test results

* Fixed std::min calls I broke

* Let shutdown calls in AQL wait, if the query is executed successfully.

* Fixed queryResult going out of scope

* fix compile error through merge conflict with devel

* Fixed compiler warning "mismatching tags"

* Removed debug log output

* Added TODO notes

* Fixed test fail due to devel merge

* Fixed some invalid sync waiting implementations

* Added a profiler test for SortBlock

* Added profiler tests for SortedCollectBlock

* Fixed bug introduced by devel merge

* Fixed Remoteblocks ignoring errors!

* Added some more continue Callbacks in used places. And removed debug log

* Removed debug log output

* Suppress clang warnings

* Bugfix: use of invalid stack pointer

* Bugfix: RemoteBlock::shutdown now sends code as int, not string

* Revert "Suppress clang warnings"

This reverts commit 05591649c59743c992edd5e78814edc8ca2a83e0.

* Bugfix: cleanup state in RemoteBlock ::shutdown, ::getSome and ::skipSome

* Bugfix in Subquery shutdown: don't skip subquery shutdown when main query shutdown failed

* Allow copy elision
2018-07-09 14:24:10 +02:00
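The Async AQL commit above describes blocks whose getSome/getBlock calls return a state (WAITING, HASMORE or DONE) together with their result. A minimal sketch of that calling convention follows. The state names come from the commit message; everything else (`Block`, `MockSource`, `drain`, and the rule that every other call reports WAITING) is a hypothetical stand-in and not the ArangoDB API. In the real system a WAITING caller would presumably give up its thread and be woken later; the sketch simply retries.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// State names taken from the commit message; all types below are hypothetical.
enum class ExecutionState { DONE, HASMORE, WAITING };

struct Block {
  std::vector<int> rows;
};

using GetSomeResult = std::pair<ExecutionState, std::unique_ptr<Block>>;

// Mock upstream block: reports WAITING (no block) before delivering each batch.
class MockSource {
 public:
  explicit MockSource(std::vector<std::vector<int>> batches)
      : _batches(std::move(batches)) {}

  GetSomeResult getSome() {
    if (!_waited) {
      _waited = true;
      return GetSomeResult(ExecutionState::WAITING, nullptr);
    }
    _waited = false;
    if (_next >= _batches.size()) {
      return GetSomeResult(ExecutionState::DONE, nullptr);
    }
    auto block = std::make_unique<Block>();
    block->rows = _batches[_next++];
    // DONE may be returned together with a final, non-empty block.
    ExecutionState state = _next < _batches.size() ? ExecutionState::HASMORE
                                                   : ExecutionState::DONE;
    return GetSomeResult(state, std::move(block));
  }

 private:
  std::vector<std::vector<int>> _batches;
  std::size_t _next = 0;
  bool _waited = false;
};

// Caller side: retry on WAITING, consume blocks, stop on DONE.
inline std::vector<int> drain(MockSource& source, int& waits) {
  std::vector<int> out;
  waits = 0;
  while (true) {
    GetSomeResult res = source.getSome();
    if (res.first == ExecutionState::WAITING) {
      ++waits;  // a real caller would yield here instead of spinning
      continue;
    }
    if (res.second) {
      out.insert(out.end(), res.second->rows.begin(), res.second->rows.end());
    }
    if (res.first == ExecutionState::DONE) {
      break;
    }
  }
  return out;
}
```

Returning the state alongside the block lets a caller distinguish "nothing yet, ask again" from "nothing more, ever", which a bare null pointer cannot express.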
Vasiliy 843e584746 issue 389.5: refactor StandaloneContext to be constructed with a TRI_vocbase_t& (#5370)
* issue 389.5: refactor StandaloneContext to be constructed with a TRI_vocbase_t&

* backport: address build issues
2018-05-17 01:15:50 +03:00
Wilfried Goesgens 7d6e580780 Refactoring & code cleanup (#5138) (#5142) 2018-04-24 14:42:23 +02:00
Vasiliy 012aaa9469 issue 383.4: push vocbase validity check up from Query constructor out into arangodb::consensus::State, StatisticsWorker and AQLUserFunctions calls (#5177) 2018-04-23 14:52:42 +03:00
Jan b2ceb68205
Feature/small misc optimizations (#4504) 2018-02-08 09:25:07 +01:00
Kaveh Vahedipour 42f543fd10 constituent correctly persisting _votedFor and _term (#4248) 2018-01-16 09:47:25 +01:00
Max Neunhöffer 7bae6980e8
Bug fix/agent lead hanger (#4147)
* Really enforce the hidden option --server.maximal-threads if given.
* Switch off --log.force-direct in scripts/startStandAloneAgency.sh
* Lower the timeout for sending AppendEntriesRPC to 150s.
* Erase _earliestPackage when becoming a leader.
* Challenge leadership in agent main loop.
* Use steady_clock for _earliestPackage.
* Change _lastAcked and _leaderSince to steady_clock as well.
* time difference calculations based on old readSystemClock to steadyClockToDouble
* All system_clock transitioned to steady_clock in Agent. Remaining system_clock are user input / output or timestamps
* Inception system_clock to steady_clock
2017-12-27 16:45:39 +01:00
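The move from system_clock to steady_clock in the commit above matters for interval measurements: a steady clock never jumps when the wall clock is adjusted. A minimal sketch of computing a time difference as a double from two steady_clock readings follows. The helper name `elapsedSeconds` is hypothetical; it is not the `steadyClockToDouble` function mentioned in the commit, whose signature is not shown here.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: seconds elapsed between two steady_clock readings.
// steady_clock is monotonic, so the result is unaffected by wall-clock changes.
inline double elapsedSeconds(std::chrono::steady_clock::time_point from,
                             std::chrono::steady_clock::time_point to) {
  return std::chrono::duration<double>(to - from).count();
}
```

Typical use would be `elapsedSeconds(start, std::chrono::steady_clock::now())`. Timestamps shown to users still need system_clock, as the commit itself notes.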
Kaveh Vahedipour 22e6a68747 Bug fix/integer overflow when calculating waits in constituent (#4050)
* integer overflow in Constituent could cease operation of Agency

* less likely integer overflow on double conversion

* less likely integer overflow on double conversion

* changed comparison to integer comparison as suggested by @neunhoef
2017-12-19 21:40:46 +01:00
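The commit above does not show how the overflow was avoided beyond switching to an integer comparison. As a general illustration of the hazard, converting a large double wait time to an integer is undefined behaviour once the value no longer fits. The sketch below saturates instead; the helper name `toMillisClamped` and the choice of milliseconds are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: convert a wait time in seconds to whole milliseconds,
// saturating instead of overflowing.
inline std::int64_t toMillisClamped(double seconds) {
  if (!(seconds > 0.0)) {  // negative, zero and NaN all map to "no wait"
    return 0;
  }
  double ms = seconds * 1000.0;
  // 2^63 is exactly representable as a double; any value at or above it
  // cannot be converted to int64_t without undefined behaviour.
  if (ms >= 9223372036854775808.0) {
    return std::numeric_limits<std::int64_t>::max();
  }
  return static_cast<std::int64_t>(ms);
}
```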
Jan b4f6ee9273 Feature/improved index api for unique constraints and replication (#3715) 2017-11-16 21:02:01 +01:00
Jan bef52d7dc3
Bug fix/cleanup after cppcheck (#3639) 2017-11-10 13:53:28 +01:00
Max Neunhöffer 3c0ee6908b Bug fix/lead to agent (#3541) 2017-11-09 11:10:09 +01:00
Max Neunhöffer ee96c37237 Fix agency restart problems. (#3493)
* Fix agency restart problems (port from a 3.2 fix).

* Further fixes after Craneware rescue.
2017-10-25 18:05:58 +02:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Max Neunhöffer d86f27bd19 Bug fix/agency leader timeouts (#3373)
* Send out empty heartbeats regardless of non-empty AppendEntriesRPC.
* Also improve logging:
  Note if a log in the empty heartbeat sending takes > 0.01 s.
  Clearly mark places where a leader resigns in logging.
  Log if no empty heartbeat is sent out.
* Make leader more tolerant w.r.t. incoming AppendEntriesRPC responses.
* Add debug logging for _lastAcked and challengeLeadership.
* Remove some unused code. Do not count ourselves in challengeLeadership.
* Removal of entire activation/deactivation mechanisms in agency
* TRI_microtime up to c++11
* added term to response to sendAppendEntries.
2017-10-06 10:11:51 +02:00
Max Neunhoeffer af3f977997
Revert "Send out empty heartbeats regardless of non-empty AppendEntriesRPC."
This reverts commit e974501446.
2017-10-02 15:02:15 +02:00
Max Neunhoeffer e974501446
Send out empty heartbeats regardless of non-empty AppendEntriesRPC.
Also improve logging:
  Note if a log in the empty heartbeat sending takes > 0.01 s.
  Clearly mark places where a leader resigns in logging.
  Log if no empty heartbeat is sent out.
2017-10-02 14:14:41 +02:00
Max Neunhöffer 22e46978a6 Bug fix/sort out agency locks (#3306)
New locking concept in Agency. Ensure empty heartbeats can be sent, answered and processed without long locks. Adjust logging. Fix compaction bugs.
2017-09-27 15:22:30 +02:00
Jan 5165155ed1 Bug fix/fixes 0609 (#3227)
* do not use V8 variant of AQL functions in early optimization stage when a C++ variant is available

* additionally, simplify AQL function definitions and aliases

* warn when more than 90% of max mappings are in use

* added C++ variant of replication catchup

* added `--log.role` option

* updated CHANGELOG

* removed non-existing scheduler.threads option from config

* removed useless __FILE__, __LINE__ invocations

* updated CHANGELOG

* allow a priority V8 context

* remove TRI_CORE_MEM_ZONE

* try to fix Windows errors & warnings

* cleanup

* removed memory zones altogether

* exclude system collections from collection tests
2017-09-13 16:28:21 +02:00
Jan 0abbc3a3c6 fix duplicate mutex (#3215) 2017-09-07 14:38:29 +02:00
Kaveh Vahedipour 00650e6a3f Bug fix/agency mt fixes (#3158)
* added debugging methods

* try to fix invalid access in case of error

* remove unused members

* bugfixes and comments

* all agency fixes in

* merge bug

* partially unguarded Agent::lead fixed

* all agency fixes in

* added nrBlocked to thread startup eval

* added nrBlocked to thread startup eval

* recombination of cases in State::get

* some maps replaced with unordered_maps

* optimized maps some
2017-08-30 10:43:51 +02:00
Frank Celler 545e861829 Bug fix/agency prepare leading bug (#2752) 2017-07-08 17:08:30 +02:00
Max Neunhöffer 3d8e590bee Adapt Raft timeouts dynamically and fix create collection timeout race
Various fixes.
2017-07-06 12:51:51 +02:00
Andreas Streichardt c89c9e44c9 Fix warnings 2017-06-07 14:50:46 +02:00
Kaveh Vahedipour da0cc3490c Squashed commit of the following:
commit 3d9cf792912db1974b9ac5e00ca2b4c9245b7d34
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Fri Jun 2 15:32:43 2017 +0200

    optimise for single writes in agency log

commit 65056ab9026f9b4b211dda0f17c75602b978f2bf
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Fri Jun 2 15:01:15 2017 +0200

    More tests, taking agency log compaction interval into account.

commit 6600d707784e8fd5b62c0c75fd1826af09b8e13f
Merge: cf46882 02f00cc
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Fri Jun 2 14:50:38 2017 +0200

    Merge branch 'agency-log-compaction-overhaul' of ssh://github.com/ArangoDB/ArangoDB into agency-log-compaction-overhaul

commit 02f00cc271d027f02b0625afb76745bfa76bf833
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Fri Jun 2 14:34:41 2017 +0200

    compaction step and keep size defaults for 3.2

commit 03fc8fbff8f0ac701f7d7f94521c0c3152dd6f92
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Fri Jun 2 14:32:46 2017 +0200

    Constituent fatally exits if election ballot cannot be persisted

commit cf4688226fc897e74bb2d9ebdfca3ce4578c3b70
Merge: c727fc4 724bd1e
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Fri Jun 2 13:08:15 2017 +0200

    Merge branch 'agency-log-compaction-overhaul' of ssh://github.com/ArangoDB/ArangoDB into agency-log-compaction-overhaul

commit 724bd1efe19e2e9dbfc14cd819f180816b6d62d0
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Fri Jun 2 13:02:51 2017 +0200

    persistence success in agency state is properly evaluated

commit c727fc48bb93e7b135b3ca929c03221c7bcaddb9
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Fri Jun 2 12:04:55 2017 +0200

    Set default compaction step size in agency to 20000 and 10000 keep size.

commit ded16ae6945e9c1479e99bc2e7ccb4d6feca19a6
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Fri Jun 2 11:11:12 2017 +0200

    Fix help page in startStandAloneAgency.sh for --use-persistence.

commit 13ae9f40f649a8f92eeca4b16bbb5647b540722d
Merge: 834c7c9 aa3e8c1
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu Jun 1 23:41:34 2017 +0200

    Merge remote-tracking branch 'origin/devel' into agency-log-compaction-overhaul

commit 834c7c920d36db3579def66c38fb04870936f8bd
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu Jun 1 16:56:03 2017 +0200

    Handle error in recvAppendEntriesRPC properly.

commit bd9c8d03b76ad25d4078740b5bf994fdba3845d0
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu Jun 1 16:55:35 2017 +0200

    Handle errors in persist() and log() properly.

commit 5b4d2c3d9af078d6a1b5626af20dc9abf2546baa
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu Jun 1 16:25:22 2017 +0200

    Improve error reporting.

commit d60697c5f26d6592eecefc9b9a43e9b699d1773d
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Thu Jun 1 12:16:39 2017 +0200

    Agency must not respond to any requests after startup until leader has RAFT committed up to pre shutdown state

commit 92b8ede5fa022ace1596607abcf8fad1130504c8
Merge: 9340e74 d24455c
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Wed May 31 16:54:45 2017 +0200

    Merge remote-tracking branch 'origin/devel' into agency-log-compaction-overhaul

commit 9340e7461130a4783c09ad8d91e5a07f9500a045
Merge: 7b7ce9d 63a9d60
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Wed May 31 12:13:55 2017 +0200

    agency tests to cover compaction

commit 63a9d604c474eda4302032629dff1f0f69fa0813
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 31 11:59:11 2017 +0200

    Set agency.compaction-keep-size to at least 0.

commit ef842260968a4769d9502a701b7251da32647e52
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 31 11:49:34 2017 +0200

    Fix agent size 1 case (thread already gone).

commit 7b7ce9d79f6e8208c13f153b1b9a395b780d6ce1
Merge: 24e2e7e ff306bf
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Wed May 31 11:39:58 2017 +0200

    Merge branch 'agency-log-compaction-overhaul' of https://github.com/arangodb/arangodb into agency-log-compaction-overhaul

commit ff306bf547bc4f528c9b66e222271ac143029508
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 31 11:11:06 2017 +0200

    Move compaction into the future when we take a snapshot from leader.

commit 24e2e7e00f960928a79ce4008b8031d6b9b07fd9
Merge: 84034ac b3ea17a
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Wed May 31 11:01:13 2017 +0200

    Merge branch 'agency-log-compaction-overhaul' of https://github.com/arangodb/arangodb into agency-log-compaction-overhaul

commit b3ea17a219baa2abd5892819012fb59f440cdeb8
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 31 09:42:59 2017 +0200

    Get rid of double nonsense.

commit 035c8d1b34e1b73a381d5468422adf13b2ebc36a
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 31 09:25:28 2017 +0200

    Sort out Agent::load sequence.

commit 84034ac2809a77145d6b1d23bf44857b3a0c4651
Merge: eb34a2e 3180a9d
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Tue May 30 17:07:20 2017 +0200

    merging in

commit eb34a2e64e6ac8dc6571b92cb853c38b7022c833
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Tue May 30 16:58:05 2017 +0200

    keep persistence for restarts in standalone agency

commit 3180a9d9ce4a4401a55ef02606b020316d43cbe5
Merge: 5d60524 28b9580
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Tue May 30 16:56:56 2017 +0200

    Merge branch 'agency-log-compaction-overhaul' of ssh://github.com/ArangoDB/ArangoDB into agency-log-compaction-overhaul

commit 5d60524429d8ddda4491beecb931c3b9e3cc1d8a
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Tue May 30 16:56:36 2017 +0200

    Implement snapshot sending in AppendEntriesRCV.

commit 28b958054f51c9cb36706df4e4345aa0f726ed15
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Tue May 30 14:20:13 2017 +0200

    state machine should not advance _committed if empty

commit df18f326acea7f5bc2660a37e22f1503952e4b41
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 23:39:47 2017 +0200

    Store term with compaction snapshot, recover again.

commit 2551a48b6fb513c9ea934bce755f8c364dae2f05
Author: Kaveh Vahedipour <kaveh@vahedipour.de>
Date:   Mon May 29 17:45:26 2017 +0200

    indices renamed to closely match RAFT documentation

commit e62dcdecf6e8650cfa5725d91b809d05591b48a4
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 16:37:43 2017 +0200

    More cleanup.

commit 9f4787c46621375f0361138a8961431eb21ce5c0
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 14:50:25 2017 +0200

    Revise loadLastCompactedSnapshot to return 0 without a snapshot.

commit 13285e1d70c8a4ac8c79a08de6f8fbc0f8d242bf
Merge: 3393c43 6c5f23e
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 12:06:20 2017 +0200

    Merge branch 'agency-log-compaction-overhaul' of ssh://github.com/ArangoDB/ArangoDB into agency-log-compaction-overhaul

commit 3393c43c75520c74d20df09c74fbbbd8b1af5976
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 12:03:47 2017 +0200

    More cleanup around Store::apply and friends.

commit 4ccb41d1839748c98e11403fa04f6a7d6af5e95b
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 09:45:41 2017 +0200

    Document apply methods in Store with comments.

commit ea05c4880fedb6fe535e24761ac5cb3c26ccfc20
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 09:45:18 2017 +0200

    Intentionally keep one log entry more to prevent empty log.

commit 67fb62f2259cc3c6368319917c7257ebcc177d3f
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 09:44:42 2017 +0200

    Improve plausibility checks at compaction time.

commit 0bafc368785b15a94f8783c4c929f4208f87d09c
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu May 25 00:51:51 2017 +0200

    Sort out (re-)building of agency K/V store(s) from compaction snapshots.

    This is in case of (re-)start, becoming a leader and when serving
    /_api/agency/store.

commit 46b0750bc6c597ec388aac0cdca32082c0cc54b8
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu May 25 00:50:54 2017 +0200

    Set compaction interval to 200 and keep to 0.

    This is way too small but tests should run with it.
    Will later increase numbers again.

commit 024dc0846ae30248b464dd481a8bbc1134f56983
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 24 23:29:57 2017 +0200

    Add a trivial test for agency log compaction.

commit e12fd3b46833419d7b436eeadd7246304324b891
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 24 23:26:03 2017 +0200

    First part of cleanup of agency log compaction.

    Now the right compaction snapshots are taken and persisted.
    Furthermore, the right log entries and old snapshots are removed
    after compaction, both the volatile and the persisted ones.
    The readDB and spearHead stay unchanged at compaction time as it
    should be.

commit d59901aea0c3ca31ef253299d2adc3353b79e664
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 24 12:18:26 2017 +0200

    Remove unused member variable.

commit 6c5f23eb7b42d9f20d4dadb2932a63add99f9c76
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 09:45:41 2017 +0200

    Document apply methods in Store with comments.

commit 670899f72d215e0fcc0ca0389cea9250a291e83b
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 09:45:18 2017 +0200

    Intentionally keep one log entry more to prevent empty log.

commit 660f61029917bbc2ce1fae3e4fc903095b023297
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Mon May 29 09:44:42 2017 +0200

    Improve plausibility checks at compaction time.

commit e2802e4b36d1f67d8361c1d8b0c92fbff696f439
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu May 25 00:51:51 2017 +0200

    Sort out (re-)building of agency K/V store(s) from compaction snapshots.

    This is in case of (re-)start, becoming a leader and when serving
    /_api/agency/store.

commit 12b43f1b91284a1185390d6dcfbd1e838522d392
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Thu May 25 00:50:54 2017 +0200

    Set compaction interval to 200 and keep to 0.

    This is way too small but tests should run with it.
    Will later increase numbers again.

commit c8b9a37a690b8e7e8bfa1276a3f9ba4b6b5a9c27
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 24 23:29:57 2017 +0200

    Add a trivial test for agency log compaction.

commit cf0c8c1fff666f76411082f87efe685a412ecebb
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 24 23:26:03 2017 +0200

    First part of cleanup of agency log compaction.

    Now the right compaction snapshots are taken and persisted.
    Furthermore, the right log entries and old snapshots are removed
    after compaction, both the volatile and the persisted ones.
    The readDB and spearHead stay unchanged at compaction time as it
    should be.

commit 0a4255359a57b8686133e6014e2b82b8079f36fa
Author: Max Neunhoeffer <max@arangodb.com>
Date:   Wed May 24 12:18:26 2017 +0200

    Remove unused member variable.
2017-06-02 16:13:03 +02:00
jsteemann a4fde59fd2 some refactoring 2017-05-23 13:18:51 +02:00
Kaveh Vahedipour 1635ec3679 attempt at fixing occasional fatal errors accessing clustercomm instance on shutdown 2017-05-05 16:28:35 +02:00
Kaveh Vahedipour e2b5f334a2 this should be last we have seen of the funny fail rotation bug 2017-05-05 10:15:44 +02:00
Kaveh Vahedipour 0218607819 expanding agent pool 2017-05-03 17:50:33 +02:00
Kaveh Vahedipour 54c1183a38 expanding agent pool 2017-05-03 17:40:33 +02:00
Kaveh Vahedipour 7766c44aaa all agency threads shutdown in their destructors if not stopping yet 2017-04-25 09:34:08 +02:00
Kaveh Vahedipour 1f81ce28b0 merge in cpp & js from 3.1.18 yet to do tests 2017-04-21 15:41:05 +02:00
Jan Christoph Uhde b83ae2ab82 refactor some code to make use of arangodb::Result 2017-03-30 09:39:21 +02:00
jsteemann 8e51e3ba50 fix slow queries 2017-03-22 11:20:07 +01:00
Kaveh Vahedipour a87fb6d71e restructured the leadership takeover 2017-03-17 15:44:58 +01:00
Kaveh Vahedipour 870eef2f52 backport of 3.1 bug fixes and resilience improvements 2017-03-13 13:35:19 +01:00
jsteemann 666b2f8da9 renaming 2017-02-27 14:38:27 +01:00
Kaveh Vahedipour 4cc830b0df merge from 3.1 2017-02-20 20:05:52 +01:00
Kaveh Vahedipour 37472ddcdc revisited the appendEntries API 2017-02-14 15:18:07 +01:00
Kaveh Vahedipour 05250b6c7b back port of bug fixes in 3.1 2017-02-14 15:07:51 +01:00
Max Neunhoeffer 883c11ea45 Handle the case that ClusterComm is already shut down gracefully.
This touches every single place where ClusterComm is being used.
2017-02-07 15:31:40 +01:00
Kaveh Vahedipour f3cb1307a5 3.1 fixes backported to devel 2017-02-03 10:48:25 +01:00
jsteemann 00b1632ece factored out AccessMode from transaction.h 2017-01-25 11:57:21 +01:00