Commit Graph

220 Commits

Author SHA1 Message Date
Michael Hackstein d135d55d55
Bug fix/collection babies (#9124)
* Bug fix 3.4/collection babies (#9033)

* Prepare API to create multiple collections in a single request to ClusterMethods to improve speedup

* Added counter on how many collections are successfully created

* Allow multi collection creation one level higher

* CollectionMethods now allow batch creation of Collections

* Improved array size assertions

* Now a graph is created within a single roundtrip in the agency.

* Added new header files

* Insert collections in the AGENCY with TTL and an isBuilding flag; collections with this flag should not be visible in the coordinator

* Added forgotten C++ file

* Fixed a rare race condition, and the failing IResearch Tests

* readded callback on DONE, otherwise lists are out of sync

* Fixed assertions to let mocked tests pass...

* Fixed community cluster

* Started fixing IResearch analyzer test, catch-tests are failing ;(

* Solved missed merge-conflict

* Added helper functions in AnalyzerFeature-test

* Refactoring AnalyzerTest Section-Auth

* Refactoring AnalyzerTest Section-Emplace-Duplicates

* Refactoring AnalyzerTest Section-Emplace-Error-Cases. Recovery-Test is now red, it seemed to be green because of invalid test case before.

* Refactoring AnalyzerTest, split GET test into multiple parts, still left 'cluster simulation'.

* Attempt to extract Coordinator / DBServer tests a little bit. This commit starts to break all Coordinator tests. However, I am convinced that the earlier version did NOT test a cluster situation at all, but some hybrid of SingleServer with full local storage that got told to be a Coordinator from now on, but without any Coordinator setup...

* Temporarily disabled some tests in AnalyzerFeature, as discussed with @gnusi.

* Fixed include guard.

* Temporarily deactivated failing tests

* You shall save your files before you commit...

* Fixed test asserting on plan version, which is now higher than before
2019-06-03 17:11:22 +02:00
Dan Larkin-York d5ecdd143a Convert unit tests to googletest framework (#9034) 2019-05-21 09:17:46 +02:00
Simon 569198a089 Abort el-cheapo transactions if servers fail (#8799) 2019-04-22 19:31:24 +02:00
Kaveh Vahedipour 68178ba165 [devel] supervision bug fix backports (#8314)
* back ports for supervision fixes from 3.4 part 1

* back ports for supervision fixes from 3.4 part 2
2019-03-04 19:27:24 +01:00
Vasiliy 8b94be9bf1 issue 504: return Result instead of int from all ClusterInfo functions (#7954) 2019-01-16 18:07:27 +03:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Andrey Abramov 6674a4282d
avoid calling cluster related functions while instantiating views on … (#7509)
* avoid calling cluster related functions while instantiating views on a db server

* minor cleanup
2018-11-29 15:43:53 +03:00
Max Neunhöffer ae29e5d2ba
Fix index creation in cluster. (#7440)
* Fix index creation in cluster.

Simplify and correct error handling logic in ensureIndexCoordinator.

* After index creation, wait until index appears.

We wait until the Supervision has removed the isBuilding flag and
the coordinator has reloaded the Plan.

* More index handling fixes.

* Directly remove isBuilding in ensureIndexCoordinator (again).

* Fix catch tests by holding mutex shorter.

* Better mutex handling in ClusterInfo.
2018-11-28 16:58:05 +01:00
Kaveh Vahedipour 9ec6619b84 Bug fix/index readiness (#6541)
* indexes are marked  while still missing in Current
* index handling getCollection
* supervision gets indexes from isbuilding, when coordinator is gone before finishing
* seems right now
* fixed broken views
* remove junk comments
* cleanup
* node / supervision adjustments
* supervision fixes
* neunhoef remarks part i
* neunhoef remarks part ii
* neunhoef remarks part ii
* neunhoef remarks part iii
* collection's current version please
* no need to wait for current once again
* no longer necessary code
* clear comments
* delete left overs
* dead code revived
2018-11-21 14:42:58 +01:00
Wilfried Goesgens 05a7d4e96e add alternative to ClusterInfo::getCollection() that doesn't throw (#7339) 2018-11-20 16:05:57 +01:00
Simon c72818a9dc Make ensureIndexOnCoordinator more robust (#7110) 2018-10-29 17:45:46 +01:00
Simon 0fa7f01c66 Resilience test failure points (#6539) 2018-09-20 01:05:10 +02:00
Max Neunhöffer 84735955ea Add advertised endpoints. (#6104) 2018-09-13 16:30:55 +02:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
- Fix index locking bug in mmfiles
- Fix a bug in mmfiles with silent option and repsert
- Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Jan e4d7f1c5f0
Bug fix/wenn der shard mann 2mal klingelt (#5890) 2018-07-26 15:37:40 +02:00
Max Neunhöffer 014c3f7f53
Only load Plan and Current in ClusterInfo when actually needed. (#5649)
* Only update Plan and Current from Agency if not already done.
* Add read protection for getPlanVersion and getCurrentVersion.
* Add a further check to loadPlan and loadCurrent.
* Fix tests to new behaviour.
* Try to increase Plan/Version and Current/Version with every change.
* Add two more increments of Plan/Version
* Add missing increments in tests for Plan/Version.
* Add changelog entry.
2018-07-16 12:20:13 +02:00
Dan Larkin-York 21e16a8a24 Add load balancer awareness for cursor API (#5682) 2018-07-03 14:29:09 +02:00
Vasiliy 7aaeab50fb issue 402.1: share sync thread between IResearchView and IResearchViewDBServer (#5733) 2018-07-02 15:03:00 +03:00
Andrey Abramov 5eef6cd618
Feature/test iresearch (#5610)
* start implementing arangosearch cluster tests.

* backport: ensure view lookup is done via collectionNameResover, ensure updateProperties returns current view properties

* first attempt to fix failing tests

* refactor cluster wide view creation logic

* if view is not found in the new plan then check the old plan too

* ensure the cluster-wide view is looked up in vocbase as well on startup/recovery

* do not store cluster-wide IResearchView in vocbase

* move stale view cleanup to the shared pointer deleter, address test failures

* do not print warning

* enable arangosearch tests by default

* fix catch tests

* address incorrect return value for cluster-wide links

* address some issues with test failures due to cluster-view allocated within TRI_vocbase_t

* simplify per-cid view name, address 'catch' test failures

* ensure IResearchViewNode volatility is properly calculated in cluster

* invoke callbacks directly in AgencyMock instead of waiting for timeout

* ensure view updates via JavaScript always use the latest view definition

* pass a list of shards to `IResearchViewDBServer::snapshot`

* extend cluster aql tests

* fixes after merge

* fix class/struct inconsistencies

* comment failing tests

* remove debug logging

* add debug function

* tests cleanup

* simplify upcoming merge: pass resolver from a side

* backport: move all transaction status callback logic to Methods

* add changes missed from previous commit

* fix js and ruby tests

* more tests for IResearchViewNode

* pass transaction to IResearchViewDBServer::snapshot, address IResearchViewDBServer tests segfault

* pass transaction to IResearchView::snapshot instead of transaction state

* temporarily add trace log output to tests to try to find the cause of the core dump on Jenkins

* add more temporary debug output to trace down the segfault on Jenkins

* add even more temporary debug output to trace down the segfault on Jenkins

* ensure View related maps are cleared during shutdown

* reset ClusterInfo::instance() before DatabaseFeature::unprepare()

* remove extraneous debug output

* missed line from previous commit

* uncomment required line

* add nullptr checks to RocksDBIndexFactory::prepareIndexes(...) similar to the ones in MMFilesIndexFactory::prepareIndexes(...)

* attempt to fix deadlock in tests

* add comment as per reviewer request

* fix aql test suite name

* add some debug logging

* address deadlock between ClusterInfo::loadPlan() and CollectionNameResolver::localNameLookup(...)

* explicitly state which index definition failed in the log message

* use vocbase from shard-view instead just in case

* explicitly state which index definition failed in the log message

* do not create shard-view instances from cluster-link instances (only register existing ones)

* add some tests
2018-06-21 20:35:04 +03:00
Vasiliy 4253dca6aa issue 381.5: ensure the LogicalView definition that is persisted to the Agency matches the definition that gets created (#5518)
* issue 381.5: ensure the LogicalView definition that is persisted to the Agency matches the definition that gets created

* backport: correct comment
2018-06-02 17:21:55 +03:00
Andrey Abramov 4649b40b96
Coordinator ArangoSearch view + Execution nodes + AgencyMock (#5160)
* add initial implementation of scatter view rule and node

* add tests for `IResearchViewNode` and `IResearchViewScatterNode`

* add missing check

* modify IResearch execution nodes to use references instead of pointers

* use view id in serialized `ExecutionNode` representation instead of the name

* add cluster mode stubs and checks

* very first attempt to distribute IResearchViewNode

* further implementation of cluster-wide arangosearch views

* fix invalid json format

* add tests for coordinator iresearch view

* allow to retrieve a list of existing views on a coordinator

* more tests for coordinator iresearch view

* some fixes to enable query explanation

* remove Collection dependency from RemoteNode

* remove unnecessary remote ArangoSearch view scatter

* fix explanation appearance

* add some assertions

* minor fixes

* implement IResearchViewCoordinator::updateProperties

* fix view DDL issues

* handle link modifications in DDL operations

* add coordinator implementation of iresearch view links

* fix tests

* further coordinator based view DDL implementation

* further IResearchViewCoordinator implementation

* add initial implementation of AgencyMock

* fix some tests

* code cleanup

* extend test + some fixes

* more tests for IResearchViewCoordinator

* fix tests for IResearchLinkCoordinator

* some fixes after merge

* fix tests

* remove declaration of nonexistent (previously removed) method

* some fixes after review

* remove string duplication

* more tests and fixes

* more fixes and tests

* more tests

* one more test

* fix 'use-after-free' asan error

* fix non-enterprise tests issues
2018-05-02 00:15:11 +03:00
Max Neunhoeffer ce8db24975
Add methods in ClusterInfo to create and drop views. 2018-03-14 23:22:44 +01:00
Max Neunhoeffer a8a307b532
Report views in ClusterInfo.
This is incomplete as it is, because we do not yet parse the views
we see in the plan.
2018-03-08 14:24:22 +01:00
Andrey Abramov a1cfb3d72b Feature iresearch (#4105) 2018-01-19 14:23:58 +01:00
Jan 25af4d7f69
try to not fail hard when a collection is dropped while the WAL is tailed (#4226) 2018-01-04 16:31:11 +01:00
Heiko 61de1b6099 Bug fix/optimize shard distribution api and ui (#3921)
* UI: document/edge editor now remembering their modes (e.g. code or tree)

* changed shardDistribution api behaviour, added PUT route to only fetch collection based shard distribution

* ui: optimized shards view, added missing cleanup function in nodes view

* broken test

* shard distribution tests not fit the new api behaviour

* variables as reference

* CHANGELOG
2018-01-02 12:42:12 +01:00
Jan 17986ebc08
return error context for "some agency operation failed" (#3760) 2017-12-06 11:16:19 +01:00
Michael Hackstein 5c633f9fae Bug fix/speedup shard distribution (#3645)
* Added a more sophisticated test for shardDistribution format

* Updated shard distribution test to use request instead of download

* Added a cxx reporter for the shard distribution. WIP

* Added some virtual functions/classes for Mocking

* Added a unittest for the new CXX ShardDistribution Reporter.

* The ShardDistributionReporter now reports Plan and Current correctly. However it does not dare to find a good total/current value and just returns a default. Hence these tests are still red

* Shard distribution now uses the cxx variant

* The ShardDistribution reporter now tries to execute count on the shards

* Updated changelog

* Added error case tests. If the servers time out the mechanism will stop bothering after two seconds and just report default values.
2017-11-10 15:17:08 +01:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Jan 0561bf45ce Bug fix/isrestore (#3283)
* Make isRestore work in the cluster.

This covers sharded collections with default sharding and non-default
sharding.

* always use locally generated revision ids for storing and looking up documents
2017-10-03 11:53:49 +02:00
Jan 5165155ed1 Bug fix/fixes 0609 (#3227)
* do not use V8 variant of AQL functions in early optimization stage when a C++ variant is available

* additionally, simplify AQL function definitions and aliases

* warn when more than 90% of max mappings are in use

* added C++ variant of replication catchup

* added `--log.role` option

* updated CHANGELOG

* removed non-existing scheduler.threads option from config

* removed useless __FILE__, __LINE__ invocations

* updated CHANGELOG

* allow a priority V8 context

* remove TRI_CORE_MEM_ZONE

* try to fix Windows errors & warnings

* cleanup

* removed memory zones altogether

* exclude system collections from collection tests
2017-09-13 16:28:21 +02:00
Max Neunhöffer f3acea797b Feature/cluster inventory version (#3152)
* Get rid of a compiler warning in community edition.

* Teach /_api/replication/clusterInventory to report Plan/Version and readiness.

This is first implemented in the ClusterInfo library.
Then the clusterInventory code uses it and checks readiness.
Readiness of a collection means that it is created and all shards
and all replications have been created and are in sync.
2017-08-30 13:34:23 +02:00
Kaveh Vahedipour 1d1e0f5a50 Feature/cluster id and extended health (#3046)
* added unique id to cluster, added access to Health

* added agents to health api

* added agents to health api

* added agents to health api

* transaction information for api

* agents listed like other servers

* missing line through merge conflict
2017-08-18 11:13:23 +02:00
Jan 07abf73bd6 fix parallel access to list of failed servers (#2777) 2017-07-12 22:12:25 +02:00
Frank Celler 2807ef559c Feature/move shard fix (#2626)
Major overhaul of handling of synchronous replication.
2017-06-26 16:55:01 +02:00
Mop 619eae9be5 Revert "Squashed commit of the following:"
This reverts commit 2252088572.
2017-06-01 18:37:45 +02:00
Andreas Streichardt 2252088572 Squashed commit of the following:
commit f3d0fd6584b0e451b8c97abcb4ba8d9f2fc6f560
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Thu Jun 1 17:31:36 2017 +0200

    fix unittest

commit 7cd3544a39e1b78af9d4175cb3b978799b9bbfff
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Thu Jun 1 17:10:00 2017 +0200

    Remove debug comment

commit fb6b10dac15be49a72dbff80030a7d22abdfc3e0
Merge: 055eb1d269 6b18cc64fe
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Thu Jun 1 17:00:21 2017 +0200

    Merge branch 'devel' into shardorganizer

commit 055eb1d2693a583d21ea59ec8b6ba95ab0db57ac
Merge: 1ff7998ebf 8ea89b7677
Author: Mop <andreas@arangodb.com>
Date:   Thu Jun 1 16:56:30 2017 +0200

    Merge branch 'shardorganizer' of https://github.com/arangodb/arangodb into shardorganizer

commit 8ea89b76777c75b6a77bf695c3f074a0c4643c29
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Thu Jun 1 16:55:41 2017 +0200

    Fix shardmapping bug

commit 1ff7998ebfd691598ec5b455ca5bc2bfd7020fb4
Author: Mop <andreas@arangodb.com>
Date:   Wed May 31 17:26:08 2017 +0200

    more output

commit 68e88aa0e14316c4929d05b2c151bee6421d754d
Merge: 0978ad1d9e 44a6a78ec3
Author: Mop <andreas@arangodb.com>
Date:   Wed May 31 17:03:33 2017 +0200

    Merge branch 'shardorganizer' of https://github.com/arangodb/arangodb into shardorganizer

commit 44a6a78ec338a1e7cabb15464500d96b84c68f1d
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Wed May 31 07:42:43 2017 -0700

    Fix namespace

commit 0978ad1d9e2f01b86204990e74b66958f25eba66
Merge: f98582ccff d74e5989ad
Author: Mop <andreas@arangodb.com>
Date:   Wed May 31 16:40:35 2017 +0200

    Merge branch 'shardorganizer' of https://github.com/arangodb/arangodb into shardorganizer

commit f98582ccff3448f6c2388dab4cc2dc38034271b0
Author: Mop <andreas@arangodb.com>
Date:   Wed May 31 16:39:03 2017 +0200

    Revert "Revert "Next attempt at merging ShardOrganizer...distributeShardsLike fixed""

    This reverts commit fed45b7b10.

commit d74e5989ad478efe7d66d196715c05f4f41c9c29
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Wed May 31 16:31:31 2017 +0200

    Make it an error

commit 0a6a9ef9464df4f24ad205bbab5b9f8ded50054f
Author: Andreas Streichardt <andreas@arangodb.com>
Date:   Wed May 31 12:42:51 2017 +0200

    distributeShardsLike has to be saved as a cidString
2017-06-01 17:32:40 +02:00
Andreas Streichardt 50876d0d3f Do not look at replicationFactor while creating...just look at the plan! it is great! 2017-05-24 16:25:49 +02:00
Andreas Streichardt b5fcd15214 Fix linter 2017-05-24 14:53:06 +02:00
Andreas Streichardt 9472ab821c Fix rolling back of indices 2017-05-15 15:48:01 +02:00
jsteemann 68b4b2f393 fix shutdown order 2017-05-11 20:59:36 +02:00
Andreas Streichardt 2e4f83fc08 Invalidate current coordinators on every 2nd heartbeatrun
needed for foxx resilience stuff
2017-05-11 18:35:33 +02:00
Andreas Streichardt dad5a1429e Add waitForSyncReplication as a _create() option 2017-04-26 09:57:40 +02:00
Max Neunhoeffer dc3c380904 Fix bug found by static analysis. 2017-01-24 12:30:32 +01:00
Max Neunhoeffer f35e3a7aaf Merge branch 'devel' into schmutz-ng 2017-01-16 09:54:09 +01:00
Andreas Streichardt 191f399ce2 Move AgencyCallback stuff to cluster so it is (hopefully) clear that this
is being used within the cluster and not within the agency.
2017-01-13 18:08:27 +01:00
Kaveh Vahedipour fe48bcb982 fixed for short names in frontend shard view 2017-01-11 16:31:19 +01:00
Kaveh Vahedipour 2b9c018817 fixed resilience 2016-12-09 16:35:32 +01:00
Andreas Streichardt 82682f8d25 Wait for synchronous replication to settle 2016-12-07 18:38:15 +01:00
Andreas Streichardt 11bd9381d5 Add satellite collections 2016-12-06 16:40:50 +01:00