arangodb

Commit Graph

Author	SHA1	Message	Date
Lars Maier	4bf2302150	Do nothing in phaseTwo if leader has not been touched. (#7579) * Do nothing in phaseTwo if leader has not been touched. * Drop follower if it refuses to cooperate. This is important since a dbserver that is follower for a shard will after a reboot think that it is a leader, at least for a short amount of time. If it came back quickly enough, the leader might not have noticed that it was away.	2018-12-02 13:14:46 +01:00
Frank Celler	a86fd3dd67	fixed init	2018-11-30 21:17:38 +01:00
Frank Celler	067606da3a	Bug fix 3.4/bad leader report current (#7574) * Initialize theLeader non-empty, thus not assuming leadership. * Correct ClusterInfo to look into Target/CleanedServers. * Prevent usage of to be cleaned out servers in new collections. * After a restart, do not assume to be leader for a shard.	2018-11-30 21:11:48 +01:00
Jan	836954b8e3	allow using UTF8 filenames for UUID directory (#7569)	2018-11-30 17:25:50 +01:00
Andrey Abramov	2c36657a9e	improve logging in ClusterInfo::loadPlan (#7511) (#7532)	2018-11-29 20:08:11 +01:00
Tobias Gödderz	f61ccd4047	Reload Foxx routes during startup (#7531)	2018-11-29 15:31:40 +01:00
Andrey Abramov	e67c2cac06	avoid calling cluster related functions while instantiating views on … (#7509) (#7528) * avoid calling cluster related functions while instantiating views on a db server * minor cleanup	2018-11-29 17:18:34 +03:00
Kaveh Vahedipour	3225a7b16d	[3.4] Feature/engine version added to agent configuration (#7481) * agents' is obtained from leader's configuration * corrections in Supervision for advertised endpoints * change log * Updated Documentation for cluster/health. * Unified naming convention. * Fixed missing update of volatile fields. * Set version in right order. * Removed debug output. * Fixed jslint - missing ;	2018-11-29 12:00:47 +01:00
Max Neunhöffer	804ac13db2	SynchronizeShard's potentially long running while loops yield for shutdown (#7523)	2018-11-29 11:47:16 +01:00
Max Neunhöffer	b74358a3dd	Improve log messages. (#7520)	2018-11-29 11:30:43 +01:00
Max Neunhöffer	10b6813f01	Fix index creation (port from devel). (#7443) * Fix index creation in cluster. Simplify and correct error handling logic in ensureIndexCoordinator. * After index creation, wait until index appears. We wait until the Supervision has removed the isBuilding flag and the coordinator has reloaded the Plan. * More index handling fixes. * Explicitly remove isBuilding flag in coordinator (again). * Fix order of arguments in REPLACE call. * Take out debugging output again. * Fix catch tests by holding mutex shorter. * Better mutex handling in ClusterInfo.	2018-11-28 16:58:27 +01:00
Lars Maier	154d449061	Export Version and Engine in Cluster Health. Additionally export `versionString` in registered Servers. (#7463)	2018-11-27 09:15:38 +01:00
Jan	ffc823e1c8	Bug fix 3.4/backport optimizations (#7434)	2018-11-26 19:16:05 +01:00
Tobias Gödderz	a83300dc29	Fix error handling in case ClusterCommResult.result == nullptr (#7355)	2018-11-26 16:22:43 +01:00
Andrey Abramov	822e15e770	issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes (#7370) (#7451) * issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes * backport: add test to ensure views are dropped when database is dropped from plan, fix some issues in ClusterInfo * optimize primary key lookups in ArangoSearch * fix test * Add JS tests * temporary comment optimizations # Conflicts: # arangod/Cluster/ClusterInfo.cpp	2018-11-26 00:33:58 +03:00
Michael Hackstein	8098bb4eed	Bug fix 3.4/syncing of followers (#7377) * Added some DEBUG output for replication rest handler * Some more debug logging. * Increased the priority of the ReplicationHandler. This way we will not get stuck with locks that cannot be canceled. Also cancel the lock on the correct database. * Added extensive log output for replication thins * Added tombstones to RestReplicationHandler. In a very unlikely case the cancel of a lock can be executed BEFORE the code that actually registers the lock, in this case we will now write a tombstone and do not lock. * Revert "Added extensive log output for replication thins" This reverts commit 6d4e37ea1e59e3b3457336019cc7dbc4c979504d. * Added extensive log output for replication things, now in ERR level instead of MAINTAINER only * Now actually use hours for synchronization * React to errors under soft lock if they show up. * Added a retry loop to increase the read-lock timer. * Added more timeing output in RocksDB collection internals to figure out why the followers are dropped * Tweaked RocksDB options * Revert "Tweaked RocksDB options" This reverts commit 2bf9c43280beda4792c47d079387fe5154cdd896. * Removed debug output * Applied all requested changes by goedderz * Deleted unused variable	2018-11-23 16:08:27 +01:00
Wilfried Goesgens	4bbd6a02bb	Bug fix/less exceptions (#7385) (#7415)	2018-11-23 11:15:36 +01:00
Wilfried Goesgens	c50d346453	add alternative to ClusterInfo::getCollection() that doesn't throw (#7413) * add alternative to ClusterInfo::getCollection() that doesn't throw (#7339) * handle more potential nullptrs, fix try/catch scope	2018-11-23 11:15:25 +01:00
Wilfried Goesgens	d4af8fe287	remove enterprise-gotos (#7375) (#7414)	2018-11-23 11:14:21 +01:00
Kaveh Vahedipour	860fa21219	Bug fix 3.4/index readiness (#6716) * backport of test data generation for maintenance from devel * 3.4 working * fixing index use in cluster while still being built * fixed broken views * correct 200 for ensureIndex * merge with 3.4 * agency comm to handle replace in array * supervision changes * cluster info's exsureIndex * 3.4 ready * timeout * missing files from origin * neunhoef complaints * bogus entry * no need to wait for current once again * no longer necessary. done in IndexFactory now * correct comments * left overs * dead code revived * Move CHANGELOG entry to the right place.	2018-11-21 14:41:36 +01:00
Simon	5124633e6a	Faster index creation (#7348)	2018-11-20 13:41:01 +01:00
Tobias Gödderz	3d1c643e23	[3.4] MMFiles replication: get followers under lock (#7298) * Fix resign order * Fixed a typo * Get followers later, add TODOs * Added a callback parameter to collection insert methods * Get followers under the lock if necessary * Extracted the replication of inserts into a separate method * Move shortcut into replicate method * Added callbacks for remove, replace and update * Added missing overrides * Extracted replication code from modifyLocal and removeLocal * Update followers under lock also during replace, update, remove * Fix changes from the last commit for update/replace * Update comments, add asserts * Remove changes for document-level locks that will be done in another PR * Unify replication * Adapt log messages to the devel ones * Move common methods from its descendants to TransactionCollection, fix Mock on the way * More IResearch test / mock fixes * Relax asserts for nested transactions * Reformat * Fix non-babies remove and modify replication	2018-11-19 13:03:07 +01:00
Max Neunhöffer	c005e0b0f0	Improve error reporting in maintenance. (#7340) * Improve error reporting from maintenance. * Fix compilation. * Tiny polishing fix.	2018-11-16 10:25:55 +01:00
Max Neunhöffer	805f7a7621	Fix timeout in cluster operation in create and drop collections. (#7300) * Fix loophole. * Fix inquiry case of id not found: 404. * Also handle correctly in AgencyComm. * Fix agency tests. * Fix error handling in dropCollectionOnCoordinator.	2018-11-14 10:02:26 +01:00
jsteemann	bce1f51b8c	simplify conditions	2018-11-12 11:14:19 +01:00
Dan Larkin-York	8bd754b9ad	[3.4] Fix nullptr dereference in SynchronizeShard. (#7267)	2018-11-08 14:12:33 +01:00
Simon	f4a1f15964	Simplify dropDatabaseCoordinator & fix some bugs (#7211) (#7243)	2018-11-07 10:41:02 +01:00
Matthew Von-Maszewski	d927e8ebeb	Bugfix 3.4: revert recently added condition variable in ClusterCommThread stop (#7239) * remove recent _activeThreadCondition. it made things worse. moved all ClusterCommThread methods to end of file to ease review. * attempt at avoiding Scheduler io_context being nullptr in late shutdown steps * manually revert last change since bug is realy about devel branch not 3.4 branch	2018-11-06 13:43:51 -06:00
Matthew Von-Maszewski	d4c8b43024	test to verify communication thread has fully exited before saying ClusterComm is stopped. (#7232)	2018-11-05 16:31:33 -06:00
Vasiliy	d644561f1f	issue 496.4.1: backport 3.4: move StorageEngine-specific flag out of the genric API and closer to the storage engine (#7213) * issue 496.4.1: backport 3.4: move StorageEngine-specific flag out of the genric API and closer to the storage engine * address merge issue	2018-11-04 16:52:54 +03:00
Simon	cf86d9bbc8	Fix a crash in DBServerAgencySync (#7204)	2018-11-03 20:19:04 +01:00
Max Neunhöffer	42fd0825ab	Fix timeouts for write operations from coordinator to leader. (#7081) * Improve logging on coordinator when doing `arangorestore`. * Return more error information in `mergeResults`. * Longer timeout for communication coordinator -> leader for writes. This is taking into account possible write stops from followers needed to get in sync. * Fix compilation. * Get rid of numbers in exception log messages. * Fix compilation. * Fix indentation.	2018-10-31 14:39:48 +01:00
Michael Hackstein	b280142efa	Revert "fixes some misbehaviour within the coordinator agency callbacks (#7104)" (#7150) This reverts commit `9ee7a0e955`.	2018-10-30 16:48:56 +01:00
Heiko	9ee7a0e955	fixes some misbehaviour within the coordinator agency callbacks (#7104) * fixes some misbehaviour within the coordinator agency callbacks * changelog	2018-10-30 16:47:37 +01:00
Simon	c073b9dbbe	Make ensureIndexOnCoordinator more robust (#7110) (#7130)	2018-10-30 11:25:06 +01:00
Simon	9271a11441	RocksDB replication thread safety (#7088) (#7131)	2018-10-30 11:24:17 +01:00
Vasiliy	e6a6025818	backport: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion (#7106) * backport: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion * backport: ensure arangosearch links get exported in the dump * backport: ensure view is created during restore on the coordinator * Updates for ArangoSearch DDL tests, IResearchView unregistration and known issues * Add fix for internal issue 483	2018-10-30 12:50:29 +03:00
Tobias Gödderz	e9388ab710	[3.4] Stop curl from trying to POST stdin (#7097) * Stop libcurl from trying to POST stdin * Stop relocking every iteration in wait * Remove unimplemented function * Restrict setting of empty POSTFIELDS to POST requests * Revert locking change	2018-10-29 14:41:23 +01:00
Michael Hackstein	e05880895a	Bug fix 3.4/shorter foot in door (#7084) * Implement `syncCollectionCatchup` in DatabaseTailingSyncer. First stab, might not even compile. * Fixed a typo. * Fix a typo and a compilation problem. * Further compilation fix. * Implement two stage catchup. * Two small corrections. * Unified error messages in Synchronize shard job. * Improved a code comment. * Fixed autocasting bool->double and double->bool issue. That is truely one of the best features ever invented... </irony> * Renamed doHardLock => toSoftLockOnly and inverted default value * Merged soft/hard foot logic with Transaction splits * Use scopeguards to cancel readlocks	2018-10-26 16:16:52 +02:00
Max Neunhoeffer	015275a724	Emergency fix to compile on gcc 8.	2018-10-26 11:13:56 +02:00
Max Neunhöffer	8564a08bbb	Try to fix timeout in drop collection. (#7058) * Try to fix timeout in drop collection. * Fix compilation.	2018-10-25 16:51:16 +02:00
Jan	b903f1f8ff	Bug fix 3.4/fix catch test issues (#7045)	2018-10-25 12:49:00 +02:00
Simon	e87b42a0c3	Silence tsan warnings (#7051)	2018-10-24 23:58:47 +02:00
Simon	6eb9e38b08	Better agency pool update (#7036)	2018-10-24 16:23:10 +02:00
Vasiliy	52e2c97693	backport missed changes (#7016)	2018-10-24 15:43:45 +03:00
Simon	8b19d40136	Properly compare velocypack objects in Agency operations (#6922)	2018-10-23 11:52:22 +02:00
Matthew Von-Maszewski	43016cf04f	Bugfix 3.4: address concerns from prior scheduler PR (#7005)	2018-10-23 11:30:45 +02:00
Simon	c0455e9c60	Add engine specific collection APIs (#6962)	2018-10-19 15:23:55 +02:00
Lars Maier	d7863b4583	Bug fix 3.4/cluster comm threads start stop (#6939) * Start ClusterComm threads in `ClusterFeature::start`. Stop ClusterComm threads in `ClusterFeature::stop`. * Do not free objects in `Scheduler::shutdown`. Let the `unique_ptr` do their job. Stop ClusterComm threads in `ClusterFeature::stop`, but free instance in `ClusterFeature::unprepare`. * `io_context` may contains lambdas that hold `shared_ptr`s to `Tasks` the required a functional `VocBase` in their destructor. * Clean up.	2018-10-19 13:12:51 +02:00
Jan	19e2dd87bd	Replace engine equality feature (#6931) (#6950)	2018-10-17 20:34:19 +02:00


2102 Commits