539 Commits

Author SHA1 Message Date
Tobias Gödderz 45e28ec5f2 Removed erroneous noexcept specifier (#10060) 2019-09-23 13:46:43 +03:00
Lars Maier e51bc5ca52 [3.4] Background Get Ids (#9785)
* Obtain new unique IDs via a background thread.

* Updated changelog.
2019-09-19 22:33:02 +03:00
Tobias Gödderz dc2e27db6c [3.4] Feature/rebootid notice changes, backport of #9523 (#9685)
* Feature/rebootid notice changes, backport of #9523

* Move to 3.4 and C++11-compatible code (except for test code)

* Backported tests/Mocks/Servers.{h,cpp}

* Rebuilt errorfiles

* Ported CallbackGuardTest from gtest to catch

* Ported RebootTrackerTest from gtest to catch

* Make sure the state method is called, so the overridden method is used during tests

* Fixed test to work with the old scheduler

* release version 3.4.8

* [3.4] fix agency lockup when removing 404-ed callbacks (#9839)

* Update arangod/Cluster/ServerState.cpp

Co-Authored-By: Markus Pfeiffer <markuspf@users.noreply.github.com>

* Instantiate the scheduler during ::prepare()

* Fix test crash introduced during backport

* Fix a compile error on Windows (thanks, V8)
2019-09-19 15:03:39 +03:00
Jan 3219e63381 less copying in ClusterInfo::loadPlan() (#9650) 2019-08-08 10:04:36 +02:00
Jan f95281471c fix lagging agency callbacks (#9621) 2019-08-02 11:43:04 +02:00
Jan 1a812b4b4f Bug fix 3.4/fix races in collection creation (#9504) 2019-07-19 13:29:24 +02:00
Matthew Von-Maszewski 4ef47fc7bd BugFix 3.4: Some error results have messages that are not reporting (#9455) 2019-07-11 13:16:15 +02:00
Max Neunhöffer 75f0a63549 Various error reporting fixes plus Maintenance Current fix. (#9398)
* Cleanup new logging.
* Hand on error message from getLocalCollections.
* Better behaviour if a database was announced but has vanished since.
* Fix catch tests.
* Switch on maintenance debugging output.
* Fix maintenance reporting bugs.
* CHANGELOG.
* No error if follower cannot be dropped.
* Improvement to avoid copying.
* Add preconditions to FollowerInfo agency operations.
* Adjust timeouts.
* Use isEqualString instead of compareString.
* Fix Windows compilation.
2019-07-05 13:38:44 +02:00
Jan 3cedbe4a67 replace potentially unsafe binary comparisons with logical ones (#9380) 2019-07-04 14:56:38 +02:00
Kaveh Vahedipour b97a62c0de if collection is gone in meantime ... (#9331) 2019-06-26 15:11:40 +02:00
Michael Hackstein f06706e53d Bug fix 3.4/create collections better preconditions (#9306)
* Backward port of #9296

* Updated changelog

* Updated changelog
2019-06-24 08:43:46 +02:00
Dan Larkin-York e4cc3ac776 [3.4] Add drop-check for index creation in cluster (#9220)
* Add drop-check for index creation in cluster.

* Fix return.

* Add changelog entry.

* Address review comments.

* Revert change of shared_ptr to plain atomic.
2019-06-12 20:02:01 +02:00
Michael Hackstein b14259b55e Fix missed callbacks race condition (#9183)
* If we miss a callback in CollectionCreation also test if we missed later callbacks.

* Updated changelog

* Fixed callback lockers

* Remove lockers vector altogether, we are protected by mutex anyways

* Fixed include guard

* Do not recheck the Collection that has woken us up

* Fixed recursive lock gathering
2019-06-06 14:07:41 +02:00
Michael Hackstein 4d4c23c302 Bug fix 3.4/collection babies (#9033)
* Prepare API to create multiple collections in a single request to ClusterMethods to improve speedup

* Added counter on how many collections are successfully created

* Allow multi collection creation one level higher

* CollectionMethods now allow batch creation of Collections

* Improved array size assertions

* Now a graph is created within a single roundtrip in the agency.

* Added new header files

* Insert collections in the AGENCY with TTL and an isBuilding flag, collections with this flag should not be visible in the coordinator

* Added forgotten C++ file

* Fixed a rare race condition, and the failing IResearch Tests

* re-added callback on DONE, otherwise lists are out of sync

* Fixed assertions to let mocked tests pass...

* Fixed community cluster
2019-05-21 08:41:12 +02:00
Max Neunhöffer 54f84cab92 Performance tuning for many shards. (#8577) 2019-03-29 21:34:45 +01:00
Max Neunhöffer 46e479376d Further supervision fixes. (#8259)
* Do not schedule Coordinators in Plan.

* Finish failed server when server is no longer in health.

* Fix removeServer checks.

Check that server is no longer in use before removing it. Give 60s
waiting time for condition to be met. Also observe agency lock.

* Finish FailedFollower job if server no longer follower.

This can happen because RemoveFollower was faster.

* Only use GOOD servers as replacement followers.

* Fix AddFollower for satellite collections.

* Fix RemoveServer for satellite collections.

* MoveShard handles moves from leader to followers

* Prepare CleanoutServer and FailedServer for satellite collections.

* More sorting out of AddFollower and RemoveFollower.

* Fix RemoveFollower job w.r.t. choice of follower to remove.

* Fix message.

* kill your own sub jobs, please

* Added preconditions to payloads for supervision's job finishers

* Improve logging.

* Add agency diagnostics to failed move shard test, start.

* Add coordinator agency diagnostics.

* Remove warning.

* Add changelog entry.

* Add agency diagnostics if things go sour with move shard.

* Add agency diags when things go wrong 2.

* API /_api/agency/state: back to old format.

* Fix Windows compilation.

* handle aborts in supervision and wait for the last Raft log to be committed

* tests compiling, 2 failing for valid reasons

* Correctly report TRI_ERROR_CLUSTER_CONNECTION_LOST as 503.

* FailedLeader/FailedFollower cannot continue when aborting blocks
2019-03-04 11:43:35 +01:00
Frank Celler 9477af198b big reformat 2018-12-26 00:57:05 +01:00
Frank Celler 067606da3a Bug fix 3.4/bad leader report current (#7574)
* Initialize theLeader non-empty, thus not assuming leadership.

* Correct ClusterInfo to look into Target/CleanedServers.

* Prevent usage of to be cleaned out servers in new collections.

* After a restart, do not assume to be leader for a shard.
2018-11-30 21:11:48 +01:00
Andrey Abramov 2c36657a9e improve logging in ClusterInfo::loadPlan (#7511) (#7532) 2018-11-29 20:08:11 +01:00
Andrey Abramov e67c2cac06 avoid calling cluster related functions while instantiating views on … (#7509) (#7528)
* avoid calling cluster related functions while instantiating views on a db server

* minor cleanup
2018-11-29 17:18:34 +03:00
Max Neunhöffer 10b6813f01 Fix index creation (port from devel). (#7443)
* Fix index creation in cluster.

Simplify and correct error handling logic in ensureIndexCoordinator.

* After index creation, wait until index appears.

We wait until the Supervision has removed the isBuilding flag and
the coordinator has reloaded the Plan.

* More index handling fixes.

* Explicitly remove isBuilding flag in coordinator (again).

* Fix order of arguments in REPLACE call.

* Take out debugging output again.

* Fix catch tests by holding mutex shorter.

* Better mutex handling in ClusterInfo.
2018-11-28 16:58:27 +01:00
Andrey Abramov 822e15e770 issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes (#7370) (#7451)
* issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes

* backport: add test to ensure views are dropped when database is dropped from plan, fix some issues in ClusterInfo

* optimize primary key lookups in ArangoSearch

* fix test

* Add JS tests

* temporary comment optimizations

# Conflicts:
#	arangod/Cluster/ClusterInfo.cpp
2018-11-26 00:33:58 +03:00
Wilfried Goesgens c50d346453 add alternative to ClusterInfo::getCollection() that doesn't throw (#7413)
* add alternative to ClusterInfo::getCollection() that doesn't throw (#7339)

* handle more potential nullptrs, fix try/catch scope
2018-11-23 11:15:25 +01:00
Kaveh Vahedipour 860fa21219 Bug fix 3.4/index readiness (#6716)
* backport of test data generation for maintenance from devel
* 3.4 working
* fixing index use in cluster while still being built
* fixed broken views
* correct 200 for ensureIndex
* merge with 3.4
* agency comm to handle replace in array
* supervision changes
* cluster info's ensureIndex
* 3.4 ready
* timeout
* missing files from origin
* neunhoef complaints
* bogus entry
* no need to wait for current once again
* no longer necessary. done in IndexFactory now
* correct comments
* left overs
* dead code revived
* Move CHANGELOG entry to the right place.
2018-11-21 14:41:36 +01:00
Max Neunhöffer 805f7a7621 Fix timeout in cluster operation in create and drop collections. (#7300)
* Fix loophole.
* Fix inquiry case of id not found: 404.
* Also handle correctly in AgencyComm.
* Fix agency tests.
* Fix error handling in dropCollectionOnCoordinator.
2018-11-14 10:02:26 +01:00
jsteemann bce1f51b8c simplify conditions 2018-11-12 11:14:19 +01:00
Simon f4a1f15964 Simplify dropDatabaseCoordinator & fix some bugs (#7211) (#7243) 2018-11-07 10:41:02 +01:00
Michael Hackstein b280142efa Revert "fixes some misbehaviour within the coordinator agency callbacks (#7104)" (#7150)
This reverts commit 9ee7a0e955.
2018-10-30 16:48:56 +01:00
Heiko 9ee7a0e955 fixes some misbehaviour within the coordinator agency callbacks (#7104)
* fixes some misbehaviour within the coordinator agency callbacks

* changelog
2018-10-30 16:47:37 +01:00
Simon c073b9dbbe Make ensureIndexOnCoordinator more robust (#7110) (#7130) 2018-10-30 11:25:06 +01:00
Vasiliy e6a6025818 backport: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion (#7106)
* backport: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion

* backport: ensure arangosearch links get exported in the dump

* backport: ensure view is created during restore on the coordinator

* Updates for ArangoSearch DDL tests, IResearchView unregistration and known issues

* Add fix for internal issue 483
2018-10-30 12:50:29 +03:00
Max Neunhoeffer 015275a724 Emergency fix to compile on gcc 8. 2018-10-26 11:13:56 +02:00
Max Neunhöffer 8564a08bbb Try to fix timeout in drop collection. (#7058)
* Try to fix timeout in drop collection.
* Fix compilation.
2018-10-25 16:51:16 +02:00
Simon 8b19d40136 Properly compare velocypack objects in Agency operations (#6922) 2018-10-23 11:52:22 +02:00
Jan 3adaf001c5 velocypack library update (#6850) 2018-10-12 12:46:52 +02:00
Jan 4dacd7c3b3 suppress some of these dreaded error messages (#6787) 2018-10-11 10:46:04 +02:00
Dan Larkin-York ff2ce5c846 Fix issue with collection/view name conflict checking in cluster. (#6779) 2018-10-10 12:40:28 +02:00
Lars Maier c5b67d217d Feature 3.4/static const strings cleanup (#6504)
* AgentConfiguration cleanup
* static strings in maintenance / agency
* fix windows build
* test bogus
* got rid of old inefficient create method
* completed with NonAction
* this works with osx / windows
* map creation can be outside function
* string init order fiasco
* startup init fiasco
* fix init-order fiasco with static strings (#6475)
* try to work around compile errors
* Removed broken and unused strings.
2018-09-21 13:18:37 +02:00
Simon 3c965ee48a Resilience test failure points (#6545) 2018-09-20 01:04:38 +02:00
Kaveh Vahedipour 2041e56f44 advertised endpoints (#6493) 2018-09-14 10:05:46 +02:00
Jan b4e6894830 Bug fix 3.4/fix cluster index estimates (#6487) 2018-09-13 23:14:07 +02:00
Jan a07467e7e0 fix cluster index selectivity estimates (#6470) 2018-09-12 15:55:50 +02:00
Simon 3eed525481 Hide links (#6348) 2018-09-03 15:36:37 +02:00
Lars Maier 63d9cfa081 Maintenance Fixes (#6284)
* Clean up for `FIXMEMAINTENANCE` comments: removed race condition, added errors and `notify()`s.
* Removed duplicated code.
* Added requested changes. Added error reporting for `UpdateCollection`.
* Make it compile. Add missing `notify()`.
* `CreateCollection` generates errors in all code paths.
* Fixed catch test.
2018-08-31 15:24:29 +02:00
Jan 5022ccc24d Bug fix/fixes 2508 (#6254) 2018-08-27 21:36:39 +02:00
Vasiliy 5d14775de8 issue 459.3: ensure collection permissions are checked before updating/dropping an IResearch view (#6253)
* issue 459.3: ensure collection permissions are checked before updating/dropping an IResearch view

* backport: ensure collection permissions are checked before updating/dropping an IResearch view on cluster

* backport: address test failures

* backport: address more test failures

* reuse existing classes for scoping ExecContext
2018-08-26 18:00:16 +03:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
 - Fix index locking bug in mmfiles
 - Fix a bug in mmfiles with silent option and repsert
 - Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Matthew Von-Maszewski 86ea784372 bugfix: establish unique function name & implementation for communication retry status (#6150)
* initial checkin of isRetryOK().  Includes fixes to known code that has previously hung shutdowns by performing infinite retries.

* slight help on getting out of a loop faster during shutdown.  not essential.
2018-08-17 14:57:12 +02:00
Dan Larkin-York 5f87f57cd0 Improved sharding algorithms (#6089) 2018-08-09 19:03:32 +02:00
Kaveh Vahedipour fd60b359b6 fixed parallel creation of indexes in cluster (#6088)
* fixed parallel creation of indexes in cluster

* added tests
2018-08-07 10:00:15 +02:00