arangodb

Commit Graph

Author	SHA1	Message	Date
Lars Maier	51af263960	Added precondition to ensure that server is still as seen before. (#10468 )	2019-11-21 09:21:36 +01:00
Jan	46e98d7110	avoid string copies in several cases (#10317 )	2019-10-25 10:47:04 +02:00
Dan Larkin-York	a83c2323c9	Refactor ApplicationServer stack (#9965 )	2019-09-25 17:31:59 +02:00
jsteemann	8a812ec8c0	use StaticString	2019-09-23 18:17:37 +02:00
Markus Pfeiffer	753ff4aa67	Feature/atomic database creation 2 (#9826 )	2019-09-05 12:38:07 +02:00
Jan	7220af9602	cover more cases of "unique constraint violated" issues during replication (#9830 )	2019-08-30 10:37:32 +02:00
Frank Celler	aa3d3f8e40	Feature/cleanup ccpcheck (#9665 )	2019-08-12 11:11:49 +02:00
Jan	1a58cc2213	add VelocyPackHelper::equal method (#9389 )	2019-07-03 12:15:11 +02:00
Jan	9cb08ded92	make the comparison functions unambiguous (#9349 ) * make the comparison functions unambiguous * added @kaveh's suggestion	2019-07-01 16:35:28 +02:00
Lars Maier	1e94ecf414	Bug fix/supervision fixes4 (#9016 ) * Try to fix agency problems with snapshots. * Abort MoveShards jobs that have the failed server as fromServer. * Report aborts. * CHANGELOG.	2019-05-31 17:20:06 +02:00
Kaveh Vahedipour	773f3c8422	[devel] fix state clientlookuptable (#9066 )	2019-05-30 04:24:46 +02:00
Max Neunhöffer	80bfb85695	Port agency performance tuning for many shards to devel. (#8647 ) * Port agency performance tuning for many shards to devel. * Add more IDs to LOG_TOPIC calls. * Even more IDs for LOG_TOPIC. * Fix a duplicate LOG_TOPIC ID. * Fix an old merging bug in devel. * Don't hesitate between phases one and two for small clusters.	2019-04-11 11:14:56 +02:00
Max Neunhöffer	02281d3be4	Handle InitDone correctly. (#8552 ) * precondition plan / version in compaction / store TTL removal independent of local _ttl set * Agency init loops break when shutting down. * assertion failures in store on restarting following agents * Minor porting fixes from 3.4	2019-04-01 17:01:05 +02:00
Jan	d6d3e3daa4	initialize some member variables, added TODOs (#8545 )	2019-03-26 12:57:32 +01:00
Jan Christoph Uhde	c3f7961b88	apply unique log ids (#8561 )	2019-03-25 20:26:51 +01:00
Max Neunhöffer	55706e3c74	Make addfollower jobs less aggressive. (#8490 ) * Make addfollower jobs less aggressive. * CHANGELOG.	2019-03-21 15:24:31 +01:00
Kaveh Vahedipour	5038dfe685	supervision must not copy snapshots into jobs (#8425 ) * supervision must not copy snapshots into jobs * CHANGELOG.	2019-03-20 17:07:54 +01:00
Kaveh Vahedipour	237e079614	leader check needs to sit inside waitfor loop (#8445 ) * leader check needs to sit inside waitfor loop * Do not wait in Supervision for commits of new writes. * CHANGELOG.	2019-03-20 16:34:54 +01:00
Simon	49cc3bcd1e	Refactorings from cluster trx improvement branch (#8391 )	2019-03-14 23:13:17 +01:00
Kaveh Vahedipour	fa98e94d23	Supervision must not waitfor if no longer leading (#8403 ) * Supervision must not waitfor if no longer leading * Supervision must not waitfor if no longer leading	2019-03-13 13:18:10 +01:00
Max Neunhöffer	2a4f606df2	Various agency improvements. (#8380 ) * Ignore satellite collections in shrinkCluster in agency. * Abort RemoveFollower job if not enough in-sync followers or leader failure. * Break quick wait loop in supervision if leadership is lost. * In case of resigned leader, set isReady=false in clusterInventory. * Fix catch tests.	2019-03-12 15:25:16 +01:00
Kaveh Vahedipour	ee751e8ba3	[devel] clear compilation warnings (#8345 )	2019-03-08 10:35:09 +01:00
Kaveh Vahedipour	4b464aeb97	oversight (#8324 ) * oversight of an abort * fix waitFor trap in supervision	2019-03-05 23:31:18 +01:00
Kaveh Vahedipour	68178ba165	[devel] supervision bug fix backports (#8314 ) * back ports for supervision fixes from 3.4 part 1 * back ports for supervision fixes from 3.4 part 2	2019-03-04 19:27:24 +01:00
Manuel Pöter	ecf4d9d62a	Fix race conditions in thread management. (#8032 )	2019-01-28 15:44:46 +01:00
Frank Celler	ac9f375fb5	big reformat	2018-12-26 00:54:03 +01:00
Simon	a2a0b03f43	Rdb index background (preliminary) (#7644 )	2018-12-21 19:24:10 +01:00
Lars Maier	908df47cd7	[devel] Bug fix/cluster health ui timestamp (#7562 )	2018-11-30 16:26:21 +01:00
Lars Maier	52cff7ad55	Feature/engine version added to agent configuration (#7481 ) (#7524 ) * agents' is obtained from leader's configuration * corrections in Supervision for advertised endpoints * change log * Updated Documentation for cluster/health. * Unified naming convention. * Fixed missing update of volatile fields. * Set version in right order. * Removed debug output. * Fixed jslint - missing ;	2018-11-29 14:25:40 +01:00
Lars Maier	f3ade0f860	Version/Engine Cluster Health (#7474 ) * Export Version and Engine in Cluster Health. Additionally export `versionString` in registered Servers. * Updated Changelog.	2018-11-27 14:56:00 +01:00
Kaveh Vahedipour	9ec6619b84	Bug fix/index readiness (#6541 ) * indexes are marked while still missing in Current * index handling getCollection * supervision gets indexes from isbuilding, when coordinator is gone before finishing * seems right now * fixed broken views * remove junk comments * cleanup * node / supervision adjustements * supervision fixes * neunhoef remarks part i * neunhoef remarks part ii * neunhoef remarks part ii * neunhoef remarks part iiI * collection's current version please * no need to wait for current once again * no longer necessary code * clear comments * delete left overs * dead code revived	2018-11-21 14:42:58 +01:00
Jan	7306cdaa03	try not to throw so many exceptions from Supervision (#7227 )	2018-11-07 15:36:41 +01:00
Max Neunhöffer	2452dcc5d0	Remove a relic from early days in /Target/FailedServers. (#6690 ) * Remove a relic from early days in /Target/FailedServers. * Fix a test.	2018-10-09 13:52:32 +02:00
Lars Maier	6546b908be	Bug fix/cleanup lost collection inc plan v (#6720 ) * Increase the current version rather than the plan version.	2018-10-04 15:38:41 +02:00
Lars Maier	14d1487710	Catch all exceptions to prevent maintenance workers from crashing. (#6645 ) * Catch all exceptions to prevent maintenance workers from crashing. * Please don't free this. * Unified code paths. * Remove dub comment. * Removed debug output. * Deleted unneeded constructors. * Assignment operator deleted.	2018-09-28 17:10:44 +02:00
Lars Maier	3dbb0558f3	Clean lost collections in supervision (#6592 ) * Working draft: clean lost collections in supervision. * Added early exit as in spec. * Finished test. Fixed logging.	2018-09-26 16:54:29 +02:00
Simon	0a9afccde5	Fix crash on Agency / DBserver with user JWT tokens (#6594 )	2018-09-26 14:26:35 +02:00
Max Neunhöffer	84735955ea	Add advertised endpoints. (#6104 )	2018-09-13 16:30:55 +02:00
Kaveh Vahedipour	28754cbf15	Feature/schmutz plus plus (#5972 ) - Schmutz now called "Maintenance" and completely implemented in C++ - Fix index locking bug in mmfiles - Fix a bug in mmfiles with silent option and repsert - Slightly increase supervision okperiod and graceperiod	2018-08-24 12:15:35 +02:00
Simon	468231efc5	AQL Profiling code (#5165 ) * initial start of profiling * adding profiling code * Fixing remote block tracing, fixing width and units * Fixing some tests * Various fixes * adressing review comments	2018-04-24 16:17:30 +02:00
Matthew Von-Maszewski	a84f7805ad	Feature/mv thread death logging (#5111 ) * Initial low level interface for thread crash reporting (and management). * Add a member version of isClusterRole() * isolate heartbeat thread creation to new StartHeartbeatThread(). create heartbeat thread even if not a cluster or if an agent. * update runDBServer() and runCoordinator() to shutdown more quickly by polling isStopping() at additional locations. * copying updates from different branch / PR * basic thread crash logging. Not yet tied into Agency arangod or have any specific threads posting crashes * make Supervision thread a CriticalThread * sandwich CriticalThread between Thread and other classes to create long term, repeating thread crash reporting. * restore code lost upon branch update relating to new startHeartbeatThread() function * add CriticalThread.cpp to build * add new runAgentServer() function to loop for Agents. Make Heartbeat thread derive from CriticalThread. * remove debug line	2018-04-23 15:50:14 +02:00
Simon	45fbed497b	Supervision Job for Active Failover (#5066 )	2018-04-23 12:49:41 +02:00
Kaveh Vahedipour	3d043b35a3	Feature/supervsion maintenance mode (#5108 ) * Supervision goes to Maintenance mode, when /arango/Supervision/Maintenance exists * coordinator route stands * stop updates in transient, when supervision off	2018-04-20 13:23:22 +02:00
Matthew Von-Maszewski	c0c149cf5b	Create non-throwing wrappers for Node access in Agency (#4598 ) * safety checkin of Node throw reduction. * final round of Node throw protection. Common accessors now protected to force code to hasAsXXX() functions.	2018-04-17 10:21:14 +02:00
Kaveh Vahedipour	f4edcc7ba8	Bug fix/supervision engine starting early on leadership change (#5062 ) * supervision must not work as long as agent is still preparing * leadersince atomic and pushed to end of leader preparation * More consistent use of integer types. * Slightly change order of events in Supervision loop.	2018-04-10 15:28:26 +02:00
Kaveh Vahedipour	7f9786eb27	builder fixed for agency transaction. worked only for a single server. (#4436 )	2018-02-06 23:14:53 +01:00
Kaveh Vahedipour	7715c75c59	let's not miss failedserver removal (#4208 ) * let's not miss failedserver removal * remove resetting of FailedServers in test code * Only call abortRequestsToFailedServers at most every 3 seconds.	2018-01-03 21:55:40 +01:00
Matthew Von-Maszewski	ae77ff80c2	create independent executeLockedRead and executeLockedWrite to speed performance (#4177 )	2017-12-29 12:02:27 +01:00
Max Neunhöffer	7bae6980e8	Bug fix/agent lead hanger (#4147 ) * Really enforce the hidden option --server.maximal-threads if given. * Switch off --log.force-direct in scripts/startStandAloneAgency.sh * Lower the timeout for sending AppendEntriesRPC to 150s. * Erase _earliestPackage when becoming a leader. * Challenge leadership in agent main loop. * Use steady_clock for _earliestPackage. * Change _lastAcked and _leaderSince to steady_clock as well. * time difference calculations based on old readSystemClock to steadyClockToDouble * All system_clock transitioned to steady_clock in Agent. Remaining system_clock are user input / output or timestamps * Inception system_clock to steady_clock	2017-12-27 16:45:39 +01:00
Matthew Von-Maszewski	8723df7681	Fix supervisor thread crash (#4083 ) * Server short name could arrive too late for first health check. Would lead to supervisor thread crash. Add test for this condition and defense against other unknown throws in health check. * Correct capitalization of ShortName. Add spaces to two Log lines.	2017-12-27 16:10:47 +01:00

237 Commits