1
0
Fork 0
Commit Graph

1360 Commits

Author SHA1 Message Date
Kaveh Vahedipour 737dac3f3d remove 404-ed callbacks from agency (#9709)
* remove 404-ed callbacks from agency
* revert callback documents to published api :)
* array needs be inside so that multiple unobserves to same key are possible
2019-08-23 10:13:17 +02:00
Jan 00bcc4954c
AQL date functions improvements (#9714) 2019-08-22 12:50:08 +02:00
Lars Maier 75411971f5 [devel] Move Shard Bug 4567124 (#9746)
* Fixed abort in moveshard in the case new leader is not in Current.
* Fixed special case where to server is an old follower.
* Updated Changelog.
2019-08-20 11:37:56 +02:00
Max Neunhöffer dc095f70c7
Fix agency bugs. (#9718)
* More logging in agency client inquiry.
* Fix first bug, increase logging for second.
* Handle timeout better in agency tests.
* Fix wrong term bug.
* Fix transact similarly.
* CHANGELOG.
2019-08-16 11:16:28 +02:00
Kaveh Vahedipour e91cc66b9e forward ported node corrections from 3.4 (#9700)
* forward ported node corrections from 3.4
* tests fixed
2019-08-15 15:48:49 +02:00
Joerg Schad 487b8b3a45
Consistent punctuation. (#9689) 2019-08-12 18:10:36 +02:00
Frank Celler aa3d3f8e40
Feature/cleanup ccpcheck (#9665) 2019-08-12 11:11:49 +02:00
Lars Maier 492057d4f4 [devel] Resign Leadership (#9427)
* First version of ResignLeadership Job.
* Port some performance optimizations from CleanOutServerJob.
* Draft of resigning leadership on shutdown.
* Moved code into Maintenance Feature. Fixed beginShutdown.
2019-08-07 15:02:17 +02:00
Jan 5452b01990
Bug fix/optimizations 06 08 2019 (#9641) 2019-08-06 16:04:39 +02:00
Dan Larkin-York 3d0246cb18 Decentralize includes (#9623) 2019-08-06 15:32:09 +02:00
Lars Maier af03f81562 Fail the FailedLeader Job if the new leader fails. (#9456)
* Fail the FailedLeader Job if the new leader fails.
* In case of timeout do not rollback.
* Fixed catch tests.
2019-08-02 11:49:04 +02:00
Lars Maier ed496fe5dd Feature/hotbackup devel (#9495)
Hotbackup
2019-08-02 11:39:46 +02:00
Jan 4ea95bc109
ascii art (#9565) 2019-07-25 13:03:30 +02:00
Michael Hackstein 987ad41364
Forward Port of changes in 3.5 review (#9544)
* Bug fix 3.5/min replication factor (#9524)

* Cherry-pick minReplicationFactor

* Bug fix/failover with min replication factor (#9486)

* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and noone cares for it

* Added some assertion son minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMOPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor now able to change via collection  properties route

* fixed wrongly assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compileissues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as i cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance reports previous version now to follower info. instead of lying by itself. The Follower Info now gets a failover save mode to report insync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed ires tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the sideeffect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more preciese now

* Fixed failove candidates list

* Disable write on dropping too many followers

* Allow to run updateFailoerCandidates multiple times with same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documenation

* Removed merge conflict leftovers 0o, i should not trust the IDE

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/Architecture/Replication/README.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update CHANGELOG

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/ReleaseNotes/NewFeatures35.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Collections/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Graph/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Adepted review requests, thanks for finding!

* Removed unnecessary const

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Moved initilization of variable more downwards

* Apply lock before notify_all()

* Remove documentation except DocuBlocks, covered by PR in docs repo

* Remove accidental indent

* Removed leftover merge conflict in documentation block
2019-07-23 13:14:38 +02:00
Michael Hackstein 36b1d290a9
Bug fix/failover with min replication factor (#9486)
* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and noone cares for it

* Added some assertion son minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMOPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor now able to change via collection  properties route

* fixed wrongly assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compileissues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as i cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance reports previous version now to follower info. instead of lying by itself. The Follower Info now gets a failover save mode to report insync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed ires tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the sideeffect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more preciese now

* Fixed failove candidates list

* Disable write on dropping too many followers

* Allow to run updateFailoerCandidates multiple times with same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documenation
2019-07-19 15:00:30 +02:00
Wilfried Goesgens ca0f2b8b86 All hail to the SI (#9445) 2019-07-19 13:52:12 +02:00
Jan 300b8e58f4
Bug fix/fix duplicate actions (#9452) 2019-07-19 09:17:19 +02:00
Simon e5507d840f Feature/comm task refactor (#9426) 2019-07-16 09:43:25 +02:00
Jan c52f2a8315
refactoring (#9411) 2019-07-09 11:15:52 +02:00
Jan 1d15b50d22
Bug fix/applicationserver stop (#9414) 2019-07-08 20:30:05 +02:00
Andrey Abramov a793a4749f
Bug fix/rename iresearch analyzer feature (#9424)
* rename feature

* ensure 'MMFilesLogFileManager::SafeToUseInstance' is safe to use in multithreaded context

* address mac compilation issue
2019-07-08 15:46:57 +03:00
Jan 1a58cc2213
add VelocyPackHelper::equal method (#9389) 2019-07-03 12:15:11 +02:00
Jan 9cb08ded92
make the comparison functions unambiguous (#9349)
* make the comparison functions unambiguous

* added @kaveh's suggestion
2019-07-01 16:35:28 +02:00
Lars Maier c215e30299 Precs to check if collection exists (#9278)
* Adding preconditions for jobs to check that the collection still exists.

* Make it compile.

* Fixed tests.
2019-07-01 13:24:53 +02:00
Lars Maier eb1aa6e024 Response compression (#9300)
* First draft of response compression.

* Cleanup.

* Removed compression from /_api/version.

* Added ruby test for response compression.
2019-06-28 10:02:48 +02:00
Max Neunhöffer d6d362bd3b
Fix agency election lock step bug. (#9351)
* Fix agency election lockstep bug.

Reset the base point for the random election timeout to now whenever we have
cast a vote, be it for us or for some other server.

* CHANGELOG.

* Fix compilation.
2019-06-27 22:06:26 +02:00
Lars Maier 50ce41e062 Release _to server when abort because of dropped collection or `something serious went wrong`. (#9267) 2019-06-17 15:04:44 +02:00
Jan Christoph Uhde 3f603f024f remove some containers from common.h (#9223)
* remove some containers from Common.h

* enterprise fixes
2019-06-07 13:27:24 +02:00
Kaveh Vahedipour 8a511934df [devel] fix ttl handling for object assigments (#9207)
* fix ttl handling for object assigments
* completed test handling
* CHANGELOG.
2019-06-06 15:48:40 +02:00
Lars Maier 1e94ecf414 Bug fix/supervision fixes4 (#9016)
* Try to fix agency problems with snapshots.

* Abort MoveShards jobs that have the failed server as fromServer.

* Report aborts.

* CHANGELOG.
2019-05-31 17:20:06 +02:00
Kaveh Vahedipour 773f3c8422 [devel] fix state clientlookuptable (#9066) 2019-05-30 04:24:46 +02:00
Jan 59b67cad40
fix various small annoyances (#9079) 2019-05-23 17:36:38 +02:00
Simon 394c070a4f Do not check isEnabled (#8997) 2019-05-17 16:33:17 +02:00
Wilfried Goesgens 1907a7211b Bug fix/cleanup system includes (#8962) 2019-05-15 15:12:59 +02:00
Andrey Abramov 8358b978be disable arangosearch analyzers in agency (#8989)
* disable arangosearch analyzers

* do nothing in `stop()` if feature is disabled
2019-05-14 17:37:38 +02:00
Matthew Von-Maszewski fb157b89ac Add protective barrier to sendWithFailover() in case MANAGER is nullptr (#8998) 2019-05-14 17:35:55 +02:00
Lars Maier d9ff6647d8 [devel] Do not skip the check if there is only a single agent. (#8943)
* Do not skip the check if there is only a single agent.
* Updated changelog.
2019-05-14 15:42:43 +02:00
Lars Maier 49c568e674 Added reason to job abort method. (#8877) 2019-05-14 15:39:53 +02:00
Jan c00442e31c
fix some issues found by cppcheck (#8959) 2019-05-10 16:31:22 +02:00
Jan 976dc2b726
Bug fix/issues 2019 05 06 (#8913) 2019-05-07 12:17:16 +02:00
Lars Maier c99e8e8973 [devel] ClientID Agency Transaction (#8652)
* Changed clientId to format <serverid>:<uuid>.
* Changed behavior if id is not known.
2019-04-30 10:39:23 +02:00
Jan 449ab1ed8e
Bug fix/cppcheck 13042019 (#8752) 2019-04-15 10:13:56 +02:00
Simon 937d743ba6 Bug fix/pregel stuff (#8733) 2019-04-11 15:58:28 +02:00
Max Neunhöffer 80bfb85695
Port agency performance tuning for many shards to devel. (#8647)
* Port agency performance tuning for many shards to devel.
* Add more IDs to LOG_TOPIC calls.
* Even more IDs for LOG_TOPIC.
* Fix a duplicate LOG_TOPIC ID.
* Fix an old merging bug in devel.
* Don't hesitate between phases one and two for small clusters.
2019-04-11 11:14:56 +02:00
Jan c6d3f8e052
Bug fix/pass on error messages (#8690) 2019-04-10 12:34:25 +02:00
Jan 6b9b9b0946
Bug fix/fix test muell (#8703) 2019-04-09 11:27:53 +02:00
Simon 7cd84a785a Remove Obsolete code (#8657) 2019-04-03 13:40:44 +02:00
Max Neunhöffer 02281d3be4
Handle InitDone correctly. (#8552)
* precondition plan / version in compaction / store TTL removal independent of local _ttl set
* Agency init loops break when shutting down.
* assertion failures in store on restarting following agents
* Minor porting fixes from 3.4
2019-04-01 17:01:05 +02:00
Jan d6d3e3daa4
initialize some member variables, added TODOs (#8545) 2019-03-26 12:57:32 +01:00
Jan 80a6e621ee
don't allocate memory so often in ClusterComm requests (#8550) 2019-03-26 00:31:56 +01:00