1
0
Fork 0
Commit Graph

100 Commits

Author SHA1 Message Date
Dan Larkin-York 663212ba19 [3.5] Check scheduler queue return value (#9759)
* Add to cmake.

* Backport changes from devel.

* added CHANGELOG entry, port adjustments from devel
2019-08-20 12:57:51 +03:00
Jan 79258e072a
Bug fix/remove io task (#9056) 2019-05-22 14:34:49 +02:00
Lars Maier 4fc2790863 [devel] Direct Exec Scheduler (#9004) 2019-05-20 11:38:57 +02:00
Jan 976dc2b726
Bug fix/issues 2019 05 06 (#8913) 2019-05-07 12:17:16 +02:00
Jan e36f7d429e
Bug fix/fix scheduler shutdown task assertion (#8727) 2019-04-10 19:51:45 +02:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
Manuel Pöter ecf4d9d62a Fix race conditions in thread management. (#8032) 2019-01-28 15:44:46 +01:00
Jan fa7de56cf8
upgrade to boost 1.69.0 (#7910) 2019-01-09 17:17:33 +01:00
Lars Maier 423cf7a8d4 cppcheck/Scheduler (#7909) 2019-01-08 16:39:56 +01:00
Lars Maier 12eebb15fe Feature/new server infra (#7733)
* Decoupled IO handling from Scheduler.

* Fixed SSL start up bug.

* Replaced Scheduler with new worker farm implementation.

* Added minimal statistics and info string for Scheduler.

* Added support for timed submissions.

* Updated delayed submission api. Updated code that used timers.

* Extracted new Scheduler into a virtual parent class. The implementation can now depend on the usecase.

* Signal handler now working.

* Changed threads names, `_stop` is atomic, check for failure during thread start + exception handling like old scheduler did.

* Commented on source code and added TODOs.

* Played around with start-stop-conditions

* Play around with start stop condition.

* start stop cond

* Sart Stop Conditions

* Removed bad cv_status check.

* Bug fix: now compare the actual objects instead of pointer values. Setup t1 and t2 depending on the thread id.

* Moved most of the stuff now unrelated to the Scheduler to GeneralServer. Got rid of JobGuard.

* Instead of waiting for a thread to terminate, put it on a clean up list and check for its termination in each supervisor run.

* Allow detaching long running threads.

* Fixed test mock.

* Updated the WorkHandle logic. Removed post functions.

* Fixed crash when obtaining shared_ptr from this in destructor.

* Added lost mutex.

* Fixed memory leak.

* Fixed merge bug.

* Changed a lot of code to optimize the scheduler.

* Fixed bug of invalidated iterator. Dont remove task on shutdown at different places. Let scheduler threads run until queue is empty.

* Only by value calls to queue.

* Added options again.

* Clean up of code.

* UI Request Lane added.

* Bug fixes in Scheduler.

* Applied reformat.

* Use sigaction.
2019-01-08 10:12:02 +01:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
jsteemann b4e888ef13 revert Scheduler changes 2018-11-26 09:57:53 +01:00
Jan c7869f1c46
Bug fix/remove shutdown assertion (#7388) 2018-11-22 15:35:55 +01:00
Matthew Von-Maszewski 0d39ff66f5
Bugfix: backport defensive Communicator change and revert constant change in Scheduler (#7214)
* revert accidental change to MIN_SECONDS

* from bugfix-3.4/mv-communicator-defensive:  simplify lambda usage to static functions.
2018-11-05 15:18:31 -06:00
Simon c72818a9dc Make ensureIndexOnCoordinator more robust (#7110) 2018-10-29 17:45:46 +01:00
Matthew Von-Maszewski 97ba8ca2be Bugfix: More 3.4 scheduler changes backported (#7091) 2018-10-26 17:09:20 +02:00
jsteemann ac19d0a627 pass by reference 2018-10-12 18:21:50 +02:00
Lars Maier fac7b48c74 [3.5] Feature/decoupled io (#6281)
* Decoupled IO from Scheduler.
* Fixed SSL start up bug.
* Updated messages and thread names. Fixed missing code from cherry-pick.
* Reintroduced checks for executing thread to be correct. Modifed default value for io-context depending on cores.
* Fixed memory leak caused by cyclic references.
* Actually distribute endpoints. Move handlers into function and do not copy them for each encapsulation.
* Inserted debug output.
* BUG FIXED! One has to call drain() on every queue as temporary work around.
* Added some flags and output for testing.
* More debug output!!!
* Manuel is right.
* Removed debug output.
2018-10-08 13:05:12 +02:00
Simon 5837291495 Debug logs for ActiveFailover (#6684) 2018-10-02 15:10:50 +02:00
Simon 912f109968 Add simple Future library (#6464) 2018-09-21 16:14:17 +02:00
Dan Larkin-York 0dfabd8f04 Fix several TSan warnings (#6473) 2018-09-14 11:16:45 +02:00
Jan 5873f63a72
Bug fix/fixes 2908 (#6279) 2018-08-31 17:26:54 +02:00
Lars Maier 889ce78dcf Batch Handler V8-lane bug (#6196) 2018-08-21 13:52:14 +02:00
Jan 86204ed0b8
fix memory leaks in arangosh connections (#6160) 2018-08-17 08:48:54 +02:00
Jan d6a3b66e2a
micro optimizations (#6162) 2018-08-16 08:50:16 +02:00
Frank Celler a688dc0962
Feature/remove job queue thread (#5986)
limiting V8 calls in flight
2018-08-10 12:17:43 +02:00
Jan 2cac8b8a51
honor some cppcheck recommendations (#5817) 2018-07-10 13:50:30 +02:00
Tobias Gödderz fc3e11dbbc Async AQL (#5806)
* Modified header to new initializeCursor API

* Adapted initializeCursor to DONE/WAITING API. Compiles but not tested and no one reacts to WAITING state, it is not returned anywhere yet

* Subqueries now expect a WAITING return from initilize cursor. However they will just return a nullptr and pretend the query is empty, this will be fixed later

* First attempt to simulate thread waiting over information within the query

* Small fix to allow for isDirect handlers to go to sleep.

* Waiting in the necessary places now for the async request to be send.

* Thank you auto-casting compiler, you are totally right i absolutely wanted to use this bool value as an index in may Array. How could i possibly not want to use it here?

* Include cond-var header

* Fixed mutex/cond_var usage

* Added oldAPI wrappers in AQL Blocks for get/skip some variants. This Commit compiles but is NOT tested

* Let getSome now return unique_ptr of AqlItemsBlocks. Also implemented the async variant of getSome in subqueries.

* Removed all references to OLD implementations in AQL. only the base wrappers are allowed to call OLD functions from now on. Now the testing part starts

* Fixed endless virtual recursion

* Implemented new getOrSkip API in SortBlock

* Implemented new getOrSkip API in LimitBlock

* Initilaize all variables

* Fixed logic bug in SubqueryBlock

* getBlock in ExecutionBlock now returns a state. All blocks need to handle this properly!

* Createad a wrapper getBlockOld that servers the old sync api and is used now in AQL. To be replaced overtime.

* Added IndexBlock::skipSome and IndexBlock::getSome

* getBlock now returns its old return value along with the state

* Switch from getBlockOld to getBlock in IndexBlock::skipSome

* Switch from getBlockOld to getBlock in IndexBlock::getSome

* ShortestPathBlock::skipSome is not implemented! Added a regression test

* Attempt to fix SubQueryResult memory management

* Fixed LIMIT Block

* Moved from ShortestPathBlock::getSomeOld to ::getSome

* Implemented ASYNC api on SingletonBlock

* Adapted EnumerateCollectionBlock to new async API

* Fixed FilterBlock and adapted return block to async API

* Adapted NORESULTS block to async AQL api.

* Adapted Modification Blocks to async API

* Fixed some initialize cursor functions to reset values required during get/skipSome

* First steps to adapt ClusterNodes to Async AQL api. Not there yet, need to implement the core still

* Added asnyc implementation for xxxForShard in ClusterBlocks. This commit changes internal logic of _doneForShard. Needs additional testing as soon as everything is in place.

* Adapted CalculationBlock to async API

* Adapted TraversalBlocks to ASYNC Aql. This is not optimal yet, we need a better decission if we are DONE or not on RETURN

* Adapted EnumerateListBlock to Async AQL api

* Adapted RemoteBlock to ASYNC API in getSome/skipSome. The whole thing is now LIVE in the cluster. Exetensive testing to be started now

* Fixed IndexBlock WAITING behaviour if Waiting occurs during a index processing

* Adapted IReasearchViewBlock to ASYNC AQL API

* Fixed SortingGatherBlock in WAITING state.

* Adapted IResearch ExecutionBlockMock to Async API

* Unified the HASMORE/DONE distinction. Code is much more readable now and harder to get incorrect 👍

* Implemented tonly heoretically reachable function of non void function.

* Fixed last commit

* Added inline TODO comments

* fix warning

* Fixed a clearing logic bug in RemoveNodes

* Fixed Error Handling in RemoteBlocks. Also fixed a logic bug (true/false simply has a 50% chance of getting it wrong) in Distribute and Scatter.

* remove unused methods

* Fixed failure test

* implement skipping

* Moved the Query Waiting out of the ExecutionEngine.

* changed one of the collect blocks

* Removed _upstreamState from ExecutionBlockMock, that is in the base-class now

* Added a Test Mock for a an ExecutionBlock that simulates the WAITING/HASMORE/DONE api.

* do not check "hasMore" if not necessary

* Added DistinctCollectBlock::getOrSkipSome from ~Old and changed its return type

(still uses getBlockOld)

* Save state to resume in DistinctCollectBlock::getOrSkipSome

* Extracted redundant code

* fixed some ops

* added one more test

* fix endless blocking

* fix compile error

* fix test

* Refactored HashedCollectBlock::getOrSkipSome

* Return blocks to the manager

* Replaced usage of getBlockOld in HashedCollectBlock::getOrSkipSome

* remove unused shutdown calls, simplify ownership for expressions

* Removed superfluous variable

* Capture const variable by value

* Removed SortedCollectBlock::getOrSkipSomeOld in favour of getOrSkipSome

* Added a working version of SortedCollectBlock::getOrSkipSome

Has yet to be cleaned up

* Removed isTotalAggregation special treatment

* On no input, return a group of nulls (instead of no group at all)

* Bugfixes

* Simplified code

* Move return to the end, eliminate duplicate code

* Corrected skipped count in HashedCollectBlock

* Aligned getNextRow() implementations

* Added comments

* some cleanup

* fix potential memleak

* Bugfix

* Fixed failure tests

* Removed usage of getBlockOld in ExecutionBlock::getOrSkipSome

* Replaced hasMore with an async implementation (mostly)

* Removed getBlockOld()

* Added hasMoreState to the AQL API (and renamed hasMore methods to hasMoreState)

* RemoteBlock now uses the async hasMoreState route

* remove job queue

* options

* Bugfixes in the async implementation of LimitBlock

* LimitBlock::getOrSkipSome now always skips when calculating the fullcount

* fix compile warnings

* restrict threads

* Fixed api of Waiting ExecBlockMock. Unused yet

* Made SortedGatherBlock async-capable

* Removed nonEmptyIndex hack

* Removed duplicate traceGetSome~ calls, moved all to getSome

* Added asserts before replacing getNr*Registers

* Added a TODO note and a comment

* Removed getSomeWithoutRegisterClearoutOld()

* Removed skip()

* Removed common code by using getNr*Registers()

* Use getNr*Registers() in the TraversalBlock as well

* started to add lane

* started to add lane

* added lane

* completed lane

* removed debug output

* fixed merge

* Began working on a test suite for AQL tracing/profiling

* Added more tests and asserts in aql-profiler

* Made some ExecutionBlocks final

* Added a type enum to all blocks and the per-block stats

* Add block type to stats nodes when tracing AQL on block level

* Removed initializeCursor call from instantiateFromPlan

* Avoided additional getSome calls after DONE

* Added more profiler tests

* Refactored ExecutionBlock::getOrSkipSome and fixed two bugs

- set _upstreamState also when skipping
- explicitly use xecutionBlock::getHasMoreState()

* Bugfix: update state

* Reuse parent _skipped wherever possible; rename where not (LimitBlock)

* Simplified SortedCollectBlock::getOrSkipSome and reused general pattern & code

* Implemented missing virtual function (with USE_FAILURE)

* Reset neccessary values during initializeCursor

* Simplified code in EnumerateListBlock a little

* Added a test for DistinctCollectBlock in aql-profiler

* Avoid redundant getSome calls in DistinctCollectBlock

* fix compilation

* Fixed DistinctCollectBlock profiler test

* Added a second profiler test for the DistinctCollectBlock

* Added a profiler test for EnumerateCollectionBlock

* Bugfix in EnumerateListBlock

* added --server.fifoN-size

* Simplified EnumerateCollectionBlock::getSome

* Simplified EnumerateCollectionBlock::getSome, and return HASMORE less often when DONE

* Fix testEnumerateCollectionBlock1 for mmfiles

* do not pass by reference

* Fixed compile error

* fixed merge conflicts

* Added profiler tests for EnumerateCollectionBlock

* Test fix for mmfiles

* Fixed IResearch tests

* Bugfix in DistinctCollectBlock and a regression test

* Updated comment

* Bugfix for query statistics in cluster

* Check plan in distinct test

* Fix aql-profiler tests in cluster

* Removed unused line / bugfix for single server test runs

* This commit implements waking up of AQL queries. (#5651)

* Non-compiling intermediate commit for handover.

* Make branch compile again

* Started implementation of continueable rest cursor handler by moving the callbacks to the outer part. This is not yet fully tested!

* Made finalizeExecute noexcept. We cannot react to this errors as the response was potentially written before. Also introduced continueExecution in the RestHandler engine.

* First successful query wakeup.

* The wakeup callback now posts on the scheduler directly. A resthandler only needs to provide a callback that encapsulates the continueExecution call on this handler

* renamed finalizeExecute to shutdownExecute

* Added a differentiation between Handler and Callback in Query continuation. Handler will be posted in IO service. Callback will be executed directly

* fix audit log

* Removed callback from deleteQueryCursor. This cannot be waiting

* use CONDITION_LOCKER

* removed yet another thread-local variable

* Fixed forward declaration

* Made RestAqlHandler repeatable

* Use defer to close the query in RestAqlHandler. Now waiting will close the query as well.

* Added a mutex in the RestHandlers to make sure if the callback over network is too fast that there is only one Thread running in the RestHandler

* Captured the GeneralCommTask if it is posted to a RestHandler. This is necessary in the PAUSED case

* Refactoring of _noLockHeader responsibilities. Now the BaseHandler selects them and resets them after it is done. Only Coordinators are allowed to define them if a query is loaded.

* Removed reaction to existing nolockheaders in Coordinator Query Planning Phase

* Removed incorrect assertion.

* Further refactoring of NoLockHeaders. Now there is a wrapper class around it which allows for debugging and logging. The state now seems to be better. Also all non-rest-handler triggered queries clean up the NoLockHeaders properly.

* Fixed UserManager, now deletes nolock headers properly

* Swing to the Symphony of Destruction

* Forgot about community build...

* Fixed compiling of Catch tests

* Fixed community build

* need thread for size

* Made the restSimpleHndler repeatable

* Implemented dump and dumpSync in Cursors, Sync will block a thread, dump allows to wait, only relevant for Streaming cursor

* Reactivated StreamingCursors

* Removed debug output.

* Fixed false query continuation

* Reset thread output to non-debug

* Added missing return statements

* Allow some CollectionMethods to hand-in a context that may contain a transaction. This is meant to honor nolock headers.

* Fixed hidden merge conflict

* Bugfix in aql-profiler.js: use plan.nodes order, not stats

* Added two profiler tests for filter

* Avoid too many getBlock calls in the FilterBlock

* Removed debug output

* RemoteBlock API will now send a done(bool) flag whenever we request documents from remote Servers. It is possible that we are DONE and have a result. The pre 3.4.0 API uses exhausted which is exclusive to a result. This API is still implemented for beckwards compatibility.

* Implemented an executeSync function in AqlQuery. This will block the thread until query execution is complete

* Added another test for FILTER, and one test for the HashedCollectBlock

* Added more tests for HashedCollectBlock; avoid unneccessary getSome calls

* Added an profiler IndexBlock test

* IndexBlock: avoid redundant getSome calls, added missing traceGetSomeEnd calls

* Added a second test profiling IndexBlock

* Added a third test for IndexBlock

* Moved general code to module

* Moved noncluster tests into a separate file

* Split aql-profiler testsuite into three files

* Added profiler tests for LimitBlock

* Added a test for NoResultsBlock

* Added profiler tests for TraversalBlock

* Shutdown of an AQL query is now asynchronous. However in Error-Cases it will be executed in a blocking way still

* Optimized TraversalBlock getSome calls due to new (nightly) test results

* Fixed std::min calls I broke

* Let shutdown calls in AQL wait, if the query is executed successfully.

* Fixed queryResult going out of scope

* fix compile error through merge conflict with devel

* Fixed compiler warning "mismatching tags"

* Removed debug log output

* Added TODO notes

* Fixed test fail due to devel merge

* Fixed some invalid sync waiting implementations

* Added a profiler test for SortBlock

* Added profiler tests for SortedCollectBlock

* Fixed bug introduced by devel merge

* Fixed Remoteblocks ignoring errors!

* Added some more continue Callbacks in used places. And removed debug log

* Removed debug log output

* Suppress clang warnings

* Bugfix: use of invalid stack pointer

* Bugfix: RemoteBlock::shutdown now sends code as int, not string

* Revert "Suppress clang warnings"

This reverts commit 05591649c59743c992edd5e78814edc8ca2a83e0.

* Bugfix: cleanup state in RemoteBlock ::shutdown, ::getSome and ::skipSome

* Bugfix in Subquery shutdown: don't skip subquery shutdown when main query shutdown failed

* Allow copy elision
2018-07-09 14:24:10 +02:00
Frank Celler efc030ea87 Feature/remove event loop (#5565) 2018-06-11 11:46:17 +02:00
Frank Celler c5ac519d1c Bug fix for Read/Write race [WIP] (#5534)
* added wrapper, added asio_ns
* Temporarily fix condition variable bug in job queue.
* preparation for 3.3 back-port
* clang-format
* removed unecessary check, this is now fixed by stand
* added missing RequestStatistics::SET_READ_END
* cosmetics
2018-06-08 10:51:54 +02:00
Frank Celler baec138715 clang-format 2018-06-07 09:26:11 +02:00
Simon ef87bb04a4 Allow RestHandler to pause execution (#5458) 2018-05-25 12:27:14 +02:00
Jan 8e6d5df129
fixed minor several compiler complaints (#5406) 2018-05-23 11:50:00 +02:00
Simon 17b1a2aafb Rest middleware refactoring (#5332) 2018-05-14 17:43:10 +02:00
jsteemann 7f8a1cc614 Merge branch 'bug-fix/add-missing-overrides-and-final' of https://github.com/arangodb/arangodb into devel 2018-05-07 23:02:46 +02:00
Simon fdee0544b7 Using asio::io_context::strands instead of locks (#5266)
* initial try adding strands

* working, stable amount of threads

* improve shell_client cluster

* Fixing some accounting for the scheduler

* Fix accounting

* Fixing wrong strand usage

* add missing return

* Fixing thread accounting

* More scheduler accounting issues

* Fixing various things

* Fixing some stuff

* Fixing some stuff

* Some more subtle bugfixes

* Some cleanup code

* fixing some stuff

* adding some more fixes

* Fixing possible issues

* Fixing missing _storeResult

* Fixing some stuff

* Reducing lambda stack, perhaps fixing hangups

* Fix writeunlocker

* Fixing possible issues

* adding some debugging stuff

* refactor sockets

* possible fixes

* Adding more job guards

* Fixin possible bug

* cleaning up some stuff

* working impl

* Remove debugging output

* Fixing build

* fixing import

* Fixing another bug

* removing debug log

* Removing examples

* Reverting scheduler working code

* Cleanup

* Addressing review comments
2018-05-07 15:58:19 +02:00
jsteemann 52de92d334 add missing override specifiers, add final specifiers 2018-05-04 09:01:50 +02:00
Jan 9c76613e63
fix premature unlock (#3802)
* fix some deadlocks found by evil lock manager (tm)

* fix duplicate lock

* fix indentation

* ensure proper lock dependencies

* fix lock acquisition

* removed useless comment

* do not lock twice

* create either a V8 transaction context or a standalone transaction context, depending on if we are called from within V8 or not

* AQL micro optimizations

* use explicit constructor

* only use V8DealerFeature's ConditionLocker for acquiring a free V8 context

entering and exiting the selected context is then done later on without having to hold the ConditionLocker

* remove some recursive locks

* Disable custom deadlock detection when Thread Sanitizer is enabled

* Changing ifdef's

* grr

* broke gcc

* Using atomic for ApplicationServer::_server

* fix premature unlock

* add some asserts

* honor collection locking in cluster

* yet one more lock fix

* removed assertion

* some more bugfixes

* Fixing assert

(cherry picked from commit 1155df173bfb67303077fbe04ee8d909517bfd21)
2017-12-13 13:27:42 +01:00
Jan 282be208cc
remove TRI_usleep and TRI_sleep, and use std::this_thread::sleep_for … (#3817) 2017-12-06 18:43:49 +01:00
Jan 057e87f919 fix shutdown in case no threads can be started (#3648) 2017-11-10 10:21:51 +01:00
Simon Grätzer ee8209943f Missing things for active / passive (#3578)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* auth = off does not longer make everyone superuser

* Fixing cluster_sync and maybe resilience
2017-11-04 20:30:23 +01:00
Jan 1ace247273 Bug fix/scheduling et al (#3161)
* added V8 context lifetime control options `--javascript.v8-contexts-max-invocations` and `--javascript.v8-contexts-max-age`

* make thread scheduling take into account most of the tasks dispatched via the io service
2017-08-30 10:40:02 +02:00
Frank Celler 6d08d4f4aa Bug fix/scheduler delete (#3077)
* removed delete call

* cleanup

* lower cpu activity of log thread too

* fix log messages

* do not enter threads into unordered_set, as it is unneeded

* do not compile in calls to disabled plan cache

* moved AQL regex cache from thread local variables to a class of its own

* more sensible thread creation and destruction
2017-08-25 12:00:17 +02:00
Jan 6180fcfdd1 Bug fix/prevent multiple journals (#3027)
* prevent multiple journals

* fix documentation

* remove _nrDesired, as it is not used anymore
2017-08-15 23:02:08 +02:00
jsteemann cbba71bb00 change feature order around 2017-05-10 14:29:20 +02:00
jsteemann 217d41f6f5 fix shutdown races 2017-05-09 10:24:40 +02:00
jsteemann 1d22f7bb61 yield on shutdown 2017-04-28 10:06:27 +02:00
jsteemann aa521d5412 better error messages 2017-04-27 14:47:19 +02:00
Frank Celler 45690bbbdd fixed queue size 2017-04-24 18:47:44 +02:00
Frank Celler 1a4675dfbf added queue size to statistics 2017-04-24 18:47:44 +02:00