393 Commits

Author SHA1 Message Date
Andrey Abramov f4e6538edd Bug fix/internal issue #647 (#10292) (#10334)
* Bug fix/internal issue #647 (#10292)

* extend replication tests

* ensure proper replication order

* fix tests

* address review comments

* address test failures

* extend dump tests

* fix analyzers tests

* more fixes

* extend tests

* enhance tests

* adjust tests

* use enum instead of flags (part 1)

* cleanup

* use enum instead of flags (part 2)

* get rid of flags for views

* get rid of flags for collections

* completely get rid of collection flags

* fix replication test

* refactor index flags

* fix tests and move AnalyzerPool out of class scope

* fix tests

* adjust log levels

* add tests

* remove debug logging

* remove noexcept from `equalAanalyzer`

* extend cluster tests

* fix cluster tests

* add tests for views and smart graphs

* address jslint errors

# Conflicts:
#	arangod/Cluster/ClusterMethods.cpp
#	arangod/Cluster/v8-cluster.cpp
#	arangod/IResearch/IResearchAnalyzerFeature.cpp
#	arangod/IResearch/IResearchAnalyzerFeature.h
#	arangod/IResearch/IResearchLinkMeta.cpp
#	arangod/RestHandler/RestAnalyzerHandler.cpp
#	arangod/StorageEngine/PhysicalCollection.cpp
#	arangod/VocBase/Methods/Indexes.cpp
#	tests/IResearch/IResearchAnalyzerFeature-test.cpp
#	tests/IResearch/IResearchFeature-test.cpp
#	tests/IResearch/IResearchLinkHelper-test.cpp
#	tests/IResearch/IResearchLinkMeta-test.cpp
#	tests/IResearch/IResearchQueryOptimization-test.cpp
#	tests/IResearch/IResearchViewDBServer-test.cpp
#	tests/IResearch/IResearchViewSort-test.cpp
#	tests/V8Server/v8-analyzers-test.cpp
#	tests/VocBase/LogicalDataSource-test.cpp

* Update CHANGELOG

* Update CHANGELOG
2019-10-30 17:01:18 +03:00
Lars Maier 5a97acc166 Fixed available. Fixed not found for list. (#10234)
* Fixed available. Fixed not found for list.

* Fixed error reporting.

* Updated changelog.

* Fixed logid.
2019-10-14 16:51:42 +03:00
Max Neunhöffer ff28647627 Improve timings for hotbackup lock. (#10230) 2019-10-11 17:42:12 +03:00
Max Neunhöffer 4a79205894 Fix hotbackup locking. (#10186)
* Fix dbserver locking and releasing for hotbackup.

* Fix fix.
2019-10-11 17:38:13 +03:00
Lars Maier 64ac6d7af3 Bug fix 3.5/backup list with bad dbserver (#10130)
* Added available field to indicate bad backups.

* Added nrPiecesPresent.

* Fix logids.

* Make Windows compilation happy.

* Fix log ids.
2019-10-02 11:24:54 +03:00
Jan 845701de6d fix broken & duplicate logIds (#10134) 2019-10-01 20:15:07 +03:00
Max Neunhöffer c067e17bb6 Port backup size to 3.5. (#10117)
* Port backup size to 3.5.

* Lars' fixes from devel.

* Fix compilation one more time.

* Lars' datetime fix from devel.
2019-10-01 16:45:28 +03:00
Kaveh Vahedipour 932def9cf1 [3.5] unintended multiple unlocks (#10115)
* unintended multiple unlocks

* Update CHANGELOG

* Update CHANGELOG
2019-10-01 16:37:28 +03:00
Kaveh Vahedipour 44232d856e corrected hot backup lock timings (#10075)
* corrected hot backup lock timings

* the lock timeout added to overall unlock timeout

* Update CHANGELOG
2019-09-30 11:43:25 +03:00
Kaveh Vahedipour f15fe22c7c [3.5] coordinator proper wait for dbservers after hot restore (#10049)
* rebootIds instead of boot stamps

* noexcept is of course wrong

* wrong noexcept here. we're copying.

* change log

* Update CHANGELOG
2019-09-30 11:07:04 +03:00
Kaveh Vahedipour 999e4b8873 [3.5] agency lock left behind (#10022)
* short timeout issue and discarded agency lock removal

* short timeout issue and discarded agency lock removal

* no hot backup in 3.5.0
2019-09-17 00:13:54 +03:00
Max Neunhöffer 328f46e3d6 This merges hotbackup and atomic-db-creation into 3.5. (#9968)
* Squashed commit of feature-3.5/hotbackup_devel.

This puts hotbackup into 3.5.

* Port atomic-database-creation-2 to 3.5.

* Remove some wrongly ported code.

* Fix compilation.

* Fix a manual merge error.

* Remove a feature from the mocks which does not exist in 3.5.

* Add some code which was forgotten in manual merge.

* Fix a problem introduced in a manual merge.

* reuse function

* Address some whitespace issues that came up in review

* aardvark should not create the frontend collection

* create _frontend collection from c++

* recheckAndUpdate Callback in CollectionWatcher

* Wrong author ;)

* rm outdated todo

* Update lib/Basics/VelocyPackHelper.h

Co-Authored-By: Michael Hackstein <michael@arangodb.com>

* use logger unique id, use startup logger

* not needed

* optimized vector shardid method

* do not create _modules collection lazily anymore

* Formatting.

* Assert instead of if/TRI_ASSERT(false)

* Don't use exceptions as control structure

* Re-add READ_LOCKER that got lost in translation

* Fix audit log in case database creation fails early.

* legacy sharding

* Add CHANGELOG entry.

* Retry database cancellation indefinitely

* Do not use exceptions in UpgradeTask

* DropCollection is a FAST_LANE action and should not need much time or else retry.

* Remove superfluous addition of LdapFeature

Proudly brought to you by ASAN tests

* Fixed check for distributeShardsLike sharding on _system database

* Fixed compile issue on tests

* Removed assertion that seems to be not correct yet on devel.

* Sort out google cloud storage as remote. (#9918)

* Add successful method to ClusterCommResult.
* Improve error forwarding for cluster internal communication.

* Feature/hotbackup list retries (#9924)

* retry hot backup listing for 2 minutes in cluster before giving up

* Enable api by default.

* fix broken list of non existing id (#9957)

* Fix compilation after manual merge.

* Fix another compilation problem.

* Yet more fixes for compilation.

* More compilation fixes.
2019-09-11 13:13:54 +03:00
Jan 480f0e799a Bug fix 3.5/babies unknown shard (#9871)
* fixing cluster

* added tests for baby operations with custom sharding

* added errorMessage for UX

* updated CHANGELOG

* updated CHANGELOG

* fix error code handling
2019-09-02 10:58:25 +03:00
Michael Hackstein d5840c125a Bug fix 3.5/min replication factor (#9524)
* Cherry-pick minReplicationFactor

* Bug fix/failover with min replication factor (#9486)

* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and no one cares for it

* Added some assertions on minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor can now be changed via the collection properties route

* fixed wrongly assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compile issues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as i cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance reports previous version now to follower info. instead of lying by itself. The Follower Info now gets a failover save mode to report insync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed ires tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the sideeffect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more precise now

* Fixed failover candidates list

* Disable write on dropping too many followers

* Allow to run updateFailoverCandidates multiple times with same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documentation

* Removed merge conflict leftovers 0o, i should not trust the IDE

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/Architecture/Replication/README.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update CHANGELOG

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/ReleaseNotes/NewFeatures35.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Collections/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Graph/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Adapted review requests, thanks for finding!

* Removed unnecessary const

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Moved initialization of variable more downwards

* Apply lock before notify_all()

* Remove documentation except DocuBlocks, covered by PR in docs repo

* Remove accidental indent
2019-07-22 17:48:34 +03:00
Jan 2185cb7309 fix races in cluster collection creation, fix return codes of collection deletion (#9505)
* fix races in cluster collection creation, fix return codes of collection deletion

* honor review comments (partially)

* produce agency dumps only in maintainer mode

* fix unit test failures
2019-07-18 18:33:08 +03:00
Jan c52f2a8315 refactoring (#9411) 2019-07-09 11:15:52 +02:00
Jan 9cb08ded92 make the comparison functions unambiguous (#9349)
* make the comparison functions unambiguous

* added @kaveh's suggestion
2019-07-01 16:35:28 +02:00
Simon cf7cf0131b Try to fix corruption error (#9258) 2019-06-25 10:18:26 +02:00
Michael Hackstein d135d55d55 Bug fix/collection babies (#9124)
* Bug fix 3.4/collection babies (#9033)

* Prepare API to create multiple collections in a single request to ClusterMethods to improve speedup

* Added counter on how many collections are successfully created

* Allow multi collection creation one level higher

* CollectionMethods now allow batch creation of Collections

* Improved array size assertions

* Now a graph is created within a single roundtrip in the agency.

* Added new header files

* Insert collections in the AGENCY with TTL and an isBuilding flag, collections with this flag should not be visible in the coordinator

* Added forgotten C++ file

* Fixed a rare race condition, and the failing IResearch Tests

* readded callback on DONE, otherwise lists are out of sync

* Fixed assertions to let mocked tests pass...

* Fixed community cluster

* Started fixing IResearch analyzer test, catch-tests are failing ;(

* Solved missed merge-conflict

* Added helper functions in AnalyzerFeature-test

* Refactoring AnalyzerTest Section-Auth

* Refactoring AnalyzerTest Section-Emplace-Duplicates

* Refactoring AnalyzerTest Section-Emplace-Error-Cases. Recovery-Test is now red, it seemed to be green because of invalid test case before.

* Refactoring AnalyzerTest, split GET test into multiple parts, still left 'cluster simulation'.

* Attempt to extract Coordinator / DBServer tests a little bit. This commit starts to break all Coordinator tests. However i am convinced that earlier version did NOT test a cluster situation at all, but some hybrid of SingleServer with full local storage that got told to be a Coordinator from now on, but without any Coordinator setup...

* Temporarily disabled some tests in AnalyzerFeature, as discussed with @gnusi.

* Fixed include guard.

* Temporarily deactivated failing tests

* You shall save your files before you commit...

* Fixed test asserting on plan version, which is now higher than before
2019-06-03 17:11:22 +02:00
Jan 0cbdfe9289 Bug fix/vpack update (#8875) 2019-04-30 12:33:26 +02:00
Max Neunhöffer 80bfb85695 Port agency performance tuning for many shards to devel. (#8647)
* Port agency performance tuning for many shards to devel.
* Add more IDs to LOG_TOPIC calls.
* Even more IDs for LOG_TOPIC.
* Fix a duplicate LOG_TOPIC ID.
* Fix an old merging bug in devel.
* Don't hesitate between phases one and two for small clusters.
2019-04-11 11:14:56 +02:00
Jan 9ab9cc7857 disambiguate internal exceptions (#8623) 2019-03-29 15:59:37 +01:00
Simon 417ee266d4 Fuse transaction begin request for non baby operations (#8566) 2019-03-27 11:31:39 +01:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
Jan 39a3f5bc4e reintroduce smart joins after temporarily reverting them in devel (#8543) 2019-03-23 20:36:02 +01:00
Simon 3ada15fc35 The Legendary El Cheapo (#8485) 2019-03-22 11:38:33 +01:00
jsteemann dc381a99df Revert "Feature/ncc1701 (#8440)"
This reverts commit 59ad583796.
2019-03-21 19:18:46 +01:00
Jan 59ad583796 Feature/ncc1701 (#8440) 2019-03-21 15:05:36 +01:00
Dan Larkin-York 2eadab33e7 Index hints (#8431) 2019-03-19 09:14:18 +01:00
Simon 49cc3bcd1e Refactorings from cluster trx improvement branch (#8391) 2019-03-14 23:13:17 +01:00
Dan Larkin-York 413e90508f Named indices (#8370) 2019-03-13 18:20:32 +01:00
Jan 1798036ea0 Bug fix/optimizations 18022019 (#8180) 2019-02-19 19:24:04 +01:00
Jan 44c6a2d732 Feature/ttl index (#8169) 2019-02-19 14:12:21 +01:00
Simon 9622f0d13f Properly translate cluster comm errors (#8151) 2019-02-12 18:07:09 +01:00
Vasiliy 8b94be9bf1 issue 504: return Result instead of int from all ClusterInfo functions (#7954) 2019-01-16 18:07:27 +03:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Lars Maier dd07d74d69 [devel] Bug fix/bad leader report current (#7585)
* Bug fix 3.4/bad leader report current (#7574)
* Initialize theLeader non-empty, thus not assuming leadership.
* Correct ClusterInfo to look into Target/CleanedServers.
* Prevent usage of to be cleaned out servers in new collections.
* After a restart, do not assume to be leader for a shard.
* Do nothing in phaseTwo if leader has not been touched. (#7579)
* Drop follower if it refuses to cooperate.

This is important since a dbserver that is follower for a shard will
after a reboot think that it is a leader, at least for a short amount
of time. If it came back quickly enough, the leader might not have
noticed that it was away.
2018-12-03 10:20:30 +01:00
Wilfried Goesgens 05a7d4e96e add alternative to ClusterInfo::getCollection() that doesn't throw (#7339) 2018-11-20 16:05:57 +01:00
Jan 1973022d00 Bug fix/refactor find emplace (#7197) 2018-11-02 17:18:47 +01:00
Max Neunhöffer 37359821cb Fix arangorestore by adjusting timeouts in write ops. (#7083)
* Improve logging on coordinator when doing `arangorestore`.

* Return more error information in `mergeResults`.

* Longer timeout for communication coordinator -> leader for writes.

This is taking into account possible write stops from followers needed
to get in sync.

* Fix compilation.

* Get rid of numbers in exception log messages.

* Fix a typo.

* Fix compilation.
2018-10-31 14:39:58 +01:00
Simon 4c1e8819c2 Add engine specific collection APIs (#6977) 2018-10-19 17:46:33 +02:00
jsteemann cc21a938c7 fixed typos 2018-10-02 18:19:12 +02:00
Dan Larkin-York 1f63f16396 Move some logging off of general topic. 2018-10-01 13:28:11 -04:00
Simon 22b9c31c13 Removing ClusterComm ClientTransactionID (#6294) 2018-09-12 22:15:16 +02:00
Jan 3b16913b1b fix cluster index selectivity (#6467) 2018-09-12 14:35:39 +02:00
Simon 1afe3bce98 Remove header from trx::methods (#6271)
* do not create header here

* move headers up
2018-08-28 17:31:00 +02:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
- Fix index locking bug in mmfiles
- Fix a bug in mmfiles with silent option and repsert
- Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Dan Larkin-York 5f87f57cd0 Improved sharding algorithms (#6089) 2018-08-09 19:03:32 +02:00
Jan 93222b15d4 track last used keys in cluster key generators, track key on cluster document insert (#6101) 2018-08-08 14:32:16 +02:00
Jan e4d7f1c5f0 Bug fix/wenn der shard mann 2mal klingelt (#5890) 2018-07-26 15:37:40 +02:00