1
0
Fork 0
Commit Graph

367 Commits

Author SHA1 Message Date
Jan 70c31da560 port change from devel by @danielhlarkin (#9773) 2019-08-21 13:07:43 +03:00
Dan Larkin-York 663212ba19 [3.5] Check scheduler queue return value (#9759)
* Add to cmake.

* Backport changes from devel.

* added CHANGELOG entry, port adjustments from devel
2019-08-20 12:57:51 +03:00
Michael Hackstein d5840c125a Bug fix 3.5/min replication factor (#9524)
* Cherry-pick minReplicationFactor

* Bug fix/failover with min replication factor (#9486)

* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and noone cares for it

* Added some assertion son minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMOPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor now able to change via collection  properties route

* fixed wrongly assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compileissues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as i cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance reports previous version now to follower info. instead of lying by itself. The Follower Info now gets a failover save mode to report insync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed ires tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the sideeffect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more preciese now

* Fixed failove candidates list

* Disable write on dropping too many followers

* Allow to run updateFailoerCandidates multiple times with same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documenation

* Removed merge conflict leftovers 0o, i should not trust the IDE

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/Architecture/Replication/README.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update CHANGELOG

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/ReleaseNotes/NewFeatures35.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Collections/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Graph/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Adepted review requests, thanks for finding!

* Removed unnecessary const

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Moved initilization of variable more downwards

* Apply lock before notify_all()

* Remove documentation except DocuBlocks, covered by PR in docs repo

* Remove accidental indent
2019-07-22 17:48:34 +03:00
Simon b3ffaff86c Pregel additional test & TSan error fix (#9357) 2019-07-04 09:36:40 +02:00
Tobias Gödderz 09e8aa8911 Fix two int size mismatches (#9226) 2019-06-09 07:35:03 +02:00
Jan Christoph Uhde 3f603f024f remove some containers from common.h (#9223)
* remove some containers from Common.h

* enterprise fixes
2019-06-07 13:27:24 +02:00
Wilfried Goesgens 863257f1d5 fix windows warnings / compile errors (#9215) 2019-06-06 18:08:44 +02:00
Jan Christoph Uhde 12ec86c2b1 add shardKeyAttribute to pregel start parameters (#9149) 2019-06-05 14:25:47 +02:00
Jan 1087c11fa4
Bug fix/pregel micro improvements (#9179) 2019-06-04 09:31:48 +02:00
Jan a4a8bb37c2
improve error message (#9158) 2019-06-03 17:06:51 +02:00
Frank Celler 1a71cc6898 fix for windows 2019-05-30 12:10:54 +02:00
Simon a5a72ef6d4 port Pregel segmented buffers (#9112) 2019-05-28 18:23:20 +02:00
Simon 93b2e64f37 Port pregel fixes (#9022) 2019-05-17 16:32:58 +02:00
Wilfried Goesgens 1907a7211b Bug fix/cleanup system includes (#8962) 2019-05-15 15:12:59 +02:00
Simon 6075fec35e Forward port some changes (#8949) 2019-05-09 19:42:06 +02:00
jsteemann 3c7c54985b fix compile warnings about unused variables/functions 2019-05-08 16:56:01 +02:00
Simon 0502a97abb forwardport changes from 3.4 (#8894) 2019-05-08 14:34:25 +02:00
jsteemann a15dd41113 initialize member 2019-04-11 19:12:53 +02:00
jsteemann c5571858ab fixed log uniqueness violation 2019-04-11 16:16:31 +02:00
Simon 937d743ba6 Bug fix/pregel stuff (#8733) 2019-04-11 15:58:28 +02:00
Jan e36f7d429e
Bug fix/fix scheduler shutdown task assertion (#8727) 2019-04-10 19:51:45 +02:00
Jan 7bf23bf238
add shutdown protection for PregelFeature (#8628) 2019-03-29 19:36:10 +01:00
Jan 9ab9cc7857
disambiguate internal exceptions (#8623) 2019-03-29 15:59:37 +01:00
Jan bccd2f3d58 acquire mutex when serializing conductor state (#8620) 2019-03-29 09:23:34 +01:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
Jan e078f35285
fixed typos, removed unneeded includes (#8547) 2019-03-25 12:09:37 +01:00
Simon 3ada15fc35 The Legendary El Cheapo (#8485) 2019-03-22 11:38:33 +01:00
Jan 3156e481de
fix test (#8402) 2019-03-13 15:24:55 +01:00
Jan 9d3327c6ea
Bug fix/rearm cursors (#8363) 2019-03-12 15:28:33 +01:00
Simon 5b4fd3d989 Nullptr checks (#8263) 2019-02-26 08:53:07 +01:00
Jan 1798036ea0
Bug fix/optimizations 18022019 (#8180) 2019-02-19 19:24:04 +01:00
Wilfried Goesgens 492d05c1f1 Feature/upgrade v8 7.1.302.28 (#8088) 2019-02-19 11:15:34 +01:00
Jan f7d94daba2
port some changes from 3.4 (#8011) 2019-01-29 09:26:57 +01:00
Lars Maier 12eebb15fe Feature/new server infra (#7733)
* Decoupled IO handling from Scheduler.

* Fixed SSL start up bug.

* Replaced Scheduler with new worker farm implementation.

* Added minimal statistics and info string for Scheduler.

* Added support for timed submissions.

* Updated delayed submission api. Updated code that used timers.

* Extracted new Scheduler into a virtual parent class. The implementation can now depend on the usecase.

* Signal handler now working.

* Changed threads names, `_stop` is atomic, check for failure during thread start + exception handling like old scheduler did.

* Commented on source code and added TODOs.

* Played around with start-stop-conditions

* Play around with start stop condition.

* start stop cond

* Sart Stop Conditions

* Removed bad cv_status check.

* Bug fix: now compare the actual objects instead of pointer values. Setup t1 and t2 depending on the thread id.

* Moved most of the stuff now unrelated to the Scheduler to GeneralServer. Got rid of JobGuard.

* Instead of waiting for a thread to terminate, put it on a clean up list and check for its termination in each supervisor run.

* Allow detaching long running threads.

* Fixed test mock.

* Updated the WorkHandle logic. Removed post functions.

* Fixed crash when obtaining shared_ptr from this in destructor.

* Added lost mutex.

* Fixed memory leak.

* Fixed merge bug.

* Changed a lot of code to optimize the scheduler.

* Fixed bug of invalidated iterator. Dont remove task on shutdown at different places. Let scheduler threads run until queue is empty.

* Only by value calls to queue.

* Added options again.

* Clean up of code.

* UI Request Lane added.

* Bug fixes in Scheduler.

* Applied reformat.

* Use sigaction.
2019-01-08 10:12:02 +01:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Jan 59868fdf04
Bug fix/use lock for pregel stats (#7499) 2018-11-28 18:39:27 +01:00
Jan 1973022d00
Bug fix/refactor find emplace (#7197) 2018-11-02 17:18:47 +01:00
Jan 976cc38e7c
remove OpenFilesTracker (#7189) 2018-11-02 11:38:38 +01:00
Matthew Von-Maszewski 97ba8ca2be Bugfix: More 3.4 scheduler changes backported (#7091) 2018-10-26 17:09:20 +02:00
Simon 10dc287eb3 Silence Tsan warnings (#7075) 2018-10-25 15:50:39 +02:00
Simon 6628a4e55a Refactor stuff, add async batch extension task (#6875) (#6880) 2018-10-15 13:18:24 +02:00
Simon 0fa7f01c66 Resilience test failure points (#6539) 2018-09-20 01:05:10 +02:00
Simon 22b9c31c13 Removing ClusterComm ClientTransactionID (#6294) 2018-09-12 22:15:16 +02:00
jsteemann 397a607eb4 remove unused variable 2018-09-12 21:34:08 +02:00
Simon 41c550ce79 Fix Pregel Graph Loading Logic (#6419) 2018-09-10 11:58:14 +02:00
Simon 2a4e327d8c Fix pregel during import (#6355) 2018-09-05 16:26:31 +02:00
Jan 5f0403ed1c
Bug fix/add aql collection count cache (#6227) 2018-08-23 16:05:51 +02:00
Simon 392118bd62 Use RangeDelete where possible (#6121) 2018-08-15 18:52:09 +02:00
Vasiliy 6fd541d110 issue 427.5: use ApplicationServer reference instead of pointer (#6145)
* issue 427.5: use ApplicationServer reference instead of pointer

* address MSVC build failure
2018-08-15 12:16:02 +03:00
Frank Celler a688dc0962
Feature/remove job queue thread (#5986)
limiting V8 calls in flight
2018-08-10 12:17:43 +02:00