Lars Maier
c215e30299
Precs to check if collection exists ( #9278 )
...
* Adding preconditions for jobs to check that the collection still exists.
* Make it compile.
* Fixed tests.
2019-07-01 13:24:53 +02:00
Lars Maier
eb1aa6e024
Response compression ( #9300 )
...
* First draft of response compression.
* Cleanup.
* Removed compression from /_api/version.
* Added ruby test for response compression.
2019-06-28 10:02:48 +02:00
Max Neunhöffer
d6d362bd3b
Fix agency election lockstep bug. ( #9351 )
...
* Fix agency election lockstep bug.
Reset the base point for the random election timeout to now whenever we have
cast a vote, be it for us or for some other server.
* CHANGELOG.
* Fix compilation.
2019-06-27 22:06:26 +02:00
Lars Maier
50ce41e062
Release _to server when abort because of dropped collection or `something serious went wrong`. ( #9267 )
2019-06-17 15:04:44 +02:00
Jan Christoph Uhde
3f603f024f
remove some containers from Common.h ( #9223 )
...
* remove some containers from Common.h
* enterprise fixes
2019-06-07 13:27:24 +02:00
Kaveh Vahedipour
8a511934df
[devel] fix ttl handling for object assignments ( #9207 )
...
* fix ttl handling for object assignments
* completed test handling
* CHANGELOG.
2019-06-06 15:48:40 +02:00
Lars Maier
1e94ecf414
Bug fix/supervision fixes4 ( #9016 )
...
* Try to fix agency problems with snapshots.
* Abort MoveShards jobs that have the failed server as fromServer.
* Report aborts.
* CHANGELOG.
2019-05-31 17:20:06 +02:00
Kaveh Vahedipour
773f3c8422
[devel] fix state clientlookuptable ( #9066 )
2019-05-30 04:24:46 +02:00
Jan
59b67cad40
fix various small annoyances ( #9079 )
2019-05-23 17:36:38 +02:00
Simon
394c070a4f
Do not check isEnabled ( #8997 )
2019-05-17 16:33:17 +02:00
Wilfried Goesgens
1907a7211b
Bug fix/cleanup system includes ( #8962 )
2019-05-15 15:12:59 +02:00
Andrey Abramov
8358b978be
disable arangosearch analyzers in agency ( #8989 )
...
* disable arangosearch analyzers
* do nothing in `stop()` if feature is disabled
2019-05-14 17:37:38 +02:00
Matthew Von-Maszewski
fb157b89ac
Add protective barrier to sendWithFailover() in case MANAGER is nullptr ( #8998 )
2019-05-14 17:35:55 +02:00
Lars Maier
d9ff6647d8
[devel] Do not skip the check if there is only a single agent. ( #8943 )
...
* Do not skip the check if there is only a single agent.
* Updated changelog.
2019-05-14 15:42:43 +02:00
Lars Maier
49c568e674
Added reason to job abort method. ( #8877 )
2019-05-14 15:39:53 +02:00
Jan
c00442e31c
fix some issues found by cppcheck ( #8959 )
2019-05-10 16:31:22 +02:00
Jan
976dc2b726
Bug fix/issues 2019 05 06 ( #8913 )
2019-05-07 12:17:16 +02:00
Lars Maier
c99e8e8973
[devel] ClientID Agency Transaction ( #8652 )
...
* Changed clientId to format <serverid>:<uuid>.
* Changed behavior if id is not known.
2019-04-30 10:39:23 +02:00
Jan
449ab1ed8e
Bug fix/cppcheck 13042019 ( #8752 )
2019-04-15 10:13:56 +02:00
Simon
937d743ba6
Bug fix/pregel stuff ( #8733 )
2019-04-11 15:58:28 +02:00
Max Neunhöffer
80bfb85695
Port agency performance tuning for many shards to devel. ( #8647 )
...
* Port agency performance tuning for many shards to devel.
* Add more IDs to LOG_TOPIC calls.
* Even more IDs for LOG_TOPIC.
* Fix a duplicate LOG_TOPIC ID.
* Fix an old merging bug in devel.
* Don't hesitate between phases one and two for small clusters.
2019-04-11 11:14:56 +02:00
Jan
c6d3f8e052
Bug fix/pass on error messages ( #8690 )
2019-04-10 12:34:25 +02:00
Jan
6b9b9b0946
Bug fix/fix test muell ( #8703 )
2019-04-09 11:27:53 +02:00
Simon
7cd84a785a
Remove Obsolete code ( #8657 )
2019-04-03 13:40:44 +02:00
Max Neunhöffer
02281d3be4
Handle InitDone correctly. ( #8552 )
...
* precondition plan / version in compaction / store TTL removal independent of local _ttl set
* Agency init loops break when shutting down.
* assertion failures in store on restarting following agents
* Minor porting fixes from 3.4
2019-04-01 17:01:05 +02:00
Jan
d6d3e3daa4
initialize some member variables, added TODOs ( #8545 )
2019-03-26 12:57:32 +01:00
Jan
80a6e621ee
don't allocate memory so often in ClusterComm requests ( #8550 )
2019-03-26 00:31:56 +01:00
Jan Christoph Uhde
c3f7961b88
apply unique log ids ( #8561 )
2019-03-25 20:26:51 +01:00
Max Neunhöffer
55706e3c74
Make addfollower jobs less aggressive. ( #8490 )
...
* Make addfollower jobs less aggressive.
* CHANGELOG.
2019-03-21 15:24:31 +01:00
Lars Maier
4d49285754
[devel] agency appends entries with leader's timestamp ( #8478 )
...
* agency's append entries with leader's timestamp
* compatibility with appendEntries protocol without timestamps
* Updated changelog.
2019-03-21 09:52:58 +01:00
Kaveh Vahedipour
5038dfe685
supervision must not copy snapshots into jobs ( #8425 )
...
* supervision must not copy snapshots into jobs
* CHANGELOG.
2019-03-20 17:07:54 +01:00
Kaveh Vahedipour
237e079614
leader check needs to sit inside waitfor loop ( #8445 )
...
* leader check needs to sit inside waitfor loop
* Do not wait in Supervision for commits of new writes.
* CHANGELOG.
2019-03-20 16:34:54 +01:00
Simon
49cc3bcd1e
Refactorings from cluster trx improvement branch ( #8391 )
2019-03-14 23:13:17 +01:00
Kaveh Vahedipour
fa98e94d23
Supervision must not waitfor if no longer leading ( #8403 )
...
* Supervision must not waitfor if no longer leading
* Supervision must not waitfor if no longer leading
2019-03-13 13:18:10 +01:00
Max Neunhöffer
2a4f606df2
Various agency improvements. ( #8380 )
...
* Ignore satellite collections in shrinkCluster in agency.
* Abort RemoveFollower job if not enough in-sync followers or leader failure.
* Break quick wait loop in supervision if leadership is lost.
* In case of resigned leader, set isReady=false in clusterInventory.
* Fix catch tests.
2019-03-12 15:25:16 +01:00
Max Neunhöffer
30adf5e2d9
Fix an agency crash. ( #8381 )
...
* Check if transaction failed before accessing the result.
* FailedFollower has the same bug.
* Add CHANGELOG.
2019-03-12 15:24:37 +01:00
Kaveh Vahedipour
098d2d086c
fix compaction behaviour of followers ( #8348 )
2019-03-08 10:39:49 +01:00
Kaveh Vahedipour
ee751e8ba3
[devel] clear compilation warnings ( #8345 )
2019-03-08 10:35:09 +01:00
Kaveh Vahedipour
4b464aeb97
oversight ( #8324 )
...
* oversight of an abort
* fix waitFor trap in supervision
2019-03-05 23:31:18 +01:00
Kaveh Vahedipour
68178ba165
[devel] supervision bug fix backports ( #8314 )
...
* back ports for supervision fixes from 3.4 part 1
* back ports for supervision fixes from 3.4 part 2
2019-03-04 19:27:24 +01:00
Kaveh Vahedipour
22639f53f1
Failed servers are now added transactionally along with the leader/follower jobs ( #8096 )
...
* failed servers are now added transactionally along with the leader/follower jobs
* one pending still left
* typos
2019-02-05 13:56:27 +01:00
Manuel Pöter
ecf4d9d62a
Fix race conditions in thread management. ( #8032 )
2019-01-28 15:44:46 +01:00
Tobias Gödderz
a1d3bc3e94
Foxx queue jobs hanging after Foxxmaster crash ( #7922 )
...
* Fixed bug where the Foxxmaster doesn't reset jobs after a crash when it should, or a non-master coordinator removes jobs in progress during startup
* Added a regression test
* Updated CHANGELOG
* Fixed non-maintainer compile
2019-01-14 16:08:08 +01:00
Frank Celler
ac9f375fb5
big reformat
2018-12-26 00:54:03 +01:00
Simon
a2a0b03f43
Rdb index background (preliminary) ( #7644 )
2018-12-21 19:24:10 +01:00
Kaveh Vahedipour
2e680a7f9b
Agency not starting in log level trace ( #7684 )
2018-12-10 15:30:57 +01:00
Jan
5bae3742e5
Feature/internal 3306 ( #7683 )
2018-12-06 16:19:28 +01:00
Lars Maier
dd07d74d69
[devel] Bug fix/bad leader report current ( #7585 )
...
* Bug fix 3.4/bad leader report current (#7574 )
* Initialize theLeader non-empty, thus not assuming leadership.
* Correct ClusterInfo to look into Target/CleanedServers.
* Prevent usage of to be cleaned out servers in new collections.
* After a restart, do not assume to be leader for a shard.
* Do nothing in phaseTwo if leader has not been touched. (#7579 )
* Drop follower if it refuses to cooperate.
This is important since a dbserver that is follower for a shard will
after a reboot think that it is a leader, at least for a short amount
of time. If it came back quickly enough, the leader might not have
noticed that it was away.
2018-12-03 10:20:30 +01:00
Lars Maier
908df47cd7
[devel] Bug fix/cluster health ui timestamp ( #7562 )
2018-11-30 16:26:21 +01:00
Lars Maier
52cff7ad55
Feature/engine version added to agent configuration ( #7481 ) ( #7524 )
...
* agents' is obtained from leader's configuration
* corrections in Supervision for advertised endpoints
* change log
* Updated Documentation for cluster/health.
* Unified naming convention.
* Fixed missing update of volatile fields.
* Set version in right order.
* Removed debug output.
* Fixed jslint - missing ;
2018-11-29 14:25:40 +01:00
Lars Maier
f3ade0f860
Version/Engine Cluster Health ( #7474 )
...
* Export Version and Engine in Cluster Health. Additionally export `versionString` in registered Servers.
* Updated Changelog.
2018-11-27 14:56:00 +01:00
Max Neunhöffer
d72e51ed8f
Fix move leader shard. ( #7445 )
...
* Ungreylist move shard test.
* Move leader shard: wait until all but the old leader are in sync.
* Increase moveShard timeout to 10000 seconds.
* Add CHANGELOG entry.
* Fix compilation.
* Fix a misleading comment.
2018-11-26 15:04:04 +01:00
Kaveh Vahedipour
9ec6619b84
Bug fix/index readiness ( #6541 )
...
* indexes are marked while still missing in Current
* index handling getCollection
* supervision gets indexes from isbuilding, when coordinator is gone before finishing
* seems right now
* fixed broken views
* remove junk comments
* cleanup
* node / supervision adjustments
* supervision fixes
* neunhoef remarks part i
* neunhoef remarks part ii
* neunhoef remarks part ii
* neunhoef remarks part iii
* collection's current version please
* no need to wait for current once again
* no longer necessary code
* clear comments
* delete left overs
* dead code revived
2018-11-21 14:42:58 +01:00
Max Neunhöffer
f720703c38
Supervision bug fix to start with clean transient store. ( #7325 )
...
* Supervision bug fix to start with clean transient store.
* Add CHANGELOG entry.
2018-11-15 11:24:34 +01:00
Markus Pfeiffer
39bdebf851
Port bug-fix-3.4/timeout-create-coll to devel ( #7307 )
...
* Fix loophole in error handling.
* Fix inquiry case of id not found: 404.
* Also handle correctly in AgencyComm.
* Fix agency tests.
* Fix error handling in dropCollectionOnCoordinator.
2018-11-14 10:03:55 +01:00
Jan
7306cdaa03
try not to throw so many exceptions from Supervision ( #7227 )
2018-11-07 15:36:41 +01:00
Simon
c72818a9dc
Make ensureIndexOnCoordinator more robust ( #7110 )
2018-10-29 17:45:46 +01:00
Simon
10dc287eb3
Silence Tsan warnings ( #7075 )
2018-10-25 15:50:39 +02:00
Heiko
a13f68bc5b
Bug fix/agency loop wrong credentials ( #7039 )
...
* arangod now exits when wrong credentials are used during the startup process
* CHANGELOG
2018-10-25 14:15:50 +02:00
Simon
d23aaa2198
Better agency pool update ( #7040 )
2018-10-24 16:23:21 +02:00
Simon
8b7a4099b8
Properly compare velocypack objects in Agency operations ( #6921 )
...
* Properly compare velocypack objects in Agency operations
* Add changelog
* added option for VPackDumper
2018-10-17 20:03:53 +02:00
jsteemann
5f951840a9
fix compilation
2018-10-12 17:56:55 +02:00
Kaveh Vahedipour
d524ba616b
fixed hyperventing agent ( #6776 )
...
* fixed hyperventing agent
2018-10-12 17:03:08 +02:00
Max Neunhöffer
2452dcc5d0
Remove a relic from early days in /Target/FailedServers. ( #6690 )
...
* Remove a relic from early days in /Target/FailedServers.
* Fix a test.
2018-10-09 13:52:32 +02:00
Jan
e78d1aa541
Bug fix/even more ldap debugging ( #6736 )
2018-10-08 09:42:11 +02:00
Lars Maier
6546b908be
Bug fix/cleanup lost collection inc plan v ( #6720 )
...
* Increase the current version rather than the plan version.
2018-10-04 15:38:41 +02:00
jsteemann
b067d738e5
fixed indentation a bit
2018-10-03 13:25:32 +02:00
Simon
5837291495
Debug logs for ActiveFailover ( #6684 )
2018-10-02 15:10:50 +02:00
Jan
c06f2d77da
Feature/velocypack update ( #6678 )
2018-10-02 14:04:14 +02:00
Max Neunhöffer
a549dd9264
Increase Plan/Version if follower is removed in MoveShard. ( #6669 )
...
This was forgotten when we added the `remainsFollower` flag.
2018-10-01 16:55:04 +02:00
Lars Maier
14d1487710
Catch all exceptions to prevent maintenance workers from crashing. ( #6645 )
...
* Catch all exceptions to prevent maintenance workers from crashing.
* Please don't free this.
* Unified code paths.
* Remove dub comment.
* Removed debug output.
* Deleted unneeded constructors.
* Assignment operator deleted.
2018-09-28 17:10:44 +02:00
Max Neunhöffer
2fc368028b
Fix a crash found by the agency torturer. ( #6589 )
2018-09-28 15:15:26 +02:00
Kaveh Vahedipour
a73023e512
Bug fix/agency update endpoints ( #6519 )
...
* update endpoints in agency done the RAFT way
* fix mock interface
* tests functioning with new agent interface
* handling non-leader
2018-09-28 15:14:48 +02:00
Lars Maier
3dbb0558f3
Clean lost collections in supervision ( #6592 )
...
* Working draft: clean lost collections in supervision.
* Added early exit as in spec.
* Finished test. Fixed logging.
2018-09-26 16:54:29 +02:00
Simon
0a9afccde5
Fix crash on Agency / DBserver with user JWT tokens ( #6594 )
2018-09-26 14:26:35 +02:00
Simon
b16af5ac71
Fix superfluous QueryRegistry::close, cleanup ( #6579 )
2018-09-24 13:10:07 +02:00
Simon
912f109968
Add simple Future library ( #6464 )
2018-09-21 16:14:17 +02:00
Lars Maier
5929cafaf9
cleanoutServer Bug Fix ( #6537 )
...
* Fixing bug: cleanoutServer will no longer add old leader as follower.
* Fixed rollback.
2018-09-21 10:16:14 +02:00
Simon
aa21ffdb7a
Properly check syncer errors, catch more exceptions ( #6520 )
2018-09-17 16:39:23 +02:00
Dan Larkin-York
0dfabd8f04
Fix several TSan warnings ( #6473 )
2018-09-14 11:16:45 +02:00
Max Neunhöffer
84735955ea
Add advertised endpoints. ( #6104 )
2018-09-13 16:30:55 +02:00
Simon
22b9c31c13
Removing ClusterComm ClientTransactionID ( #6294 )
2018-09-12 22:15:16 +02:00
Kaveh Vahedipour
6b2733625c
Feature/static const strings cleanup ( #6352 )
...
* AgentConfiguration cleanup
* static strings in maintenance / agency
* more strings unified
* fix windows build
2018-09-11 13:40:03 +02:00
Jan
17ea2d4ec9
suppress some messages which are expected on shutdown ( #6381 )
2018-09-05 14:15:35 +02:00
Vasiliy
5329f34771
issue 465.2.2: remove redundant heap allocations and simplify API ( #6349 )
...
* issue 465.2.2: remove redundant heap allocations and simplify API
* address merge issue
* address more merge issues
* address more merge issues
* address review comments
* do not deallocate non-allocated instances
2018-09-05 13:37:37 +03:00
Vasiliy
e862efdc3b
issue 458.4: retrieve the system database via the SystemDatabaseFeature ( #6299 )
2018-08-31 19:45:10 +02:00
Jan
5873f63a72
Bug fix/fixes 2908 ( #6279 )
2018-08-31 17:26:54 +02:00
Lars Maier
63d9cfa081
Maintenance Fixes ( #6284 )
...
* Clean up for `FIXMEMAINTENANCE` comments: removed race condition, added errors and `notify()`s.
* Removed duplicated code.
* Added requested changes. Added error reporting for `UpdateCollection`.
* Make it compile. Add missing `notify()`.
* `CreateCollection` generates errors in all code paths.
* Fixed catch test.
2018-08-31 15:24:29 +02:00
Kaveh Vahedipour
fe9b2fecdc
notifyInactive has been lying around in the agent without being used. It is a relic of the time when we thought that we would have a pool of agents from which we'd draw if an agent failed ( #6290 )
2018-08-31 10:48:39 +02:00
Kaveh Vahedipour
28754cbf15
Feature/schmutz plus plus ( #5972 )
...
- Schmutz now called "Maintenance" and completely implemented in C++
- Fix index locking bug in mmfiles
- Fix a bug in mmfiles with silent option and repsert
- Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Simon
229c09d434
Allow dirty-reads from passive ( #6136 )
2018-08-20 16:26:14 +02:00
Matthew Von-Maszewski
86ea784372
bugfix: establish unique function name & implementation for communication retry status ( #6150 )
...
* initial checkin of isRetryOK(). Includes fixes to known code that has previously hung shutdowns by performing infinite retries.
* slight help on getting out of a loop faster during shutdown. not essential.
2018-08-17 14:57:12 +02:00
Vasiliy
6fd541d110
issue 427.5: use ApplicationServer reference instead of pointer ( #6145 )
...
* issue 427.5: use ApplicationServer reference instead of pointer
* address MSVC build failure
2018-08-15 12:16:02 +03:00
Jan
a5bb50b0bf
remove methods from VelocyPackHelper that are also in VPackSlice ( #5946 )
2018-07-25 09:01:29 +02:00
Jan
ac1d5aac9b
allow starting agency with --console again (requires V8 then) ( #5927 )
2018-07-24 09:34:22 +02:00
Max Neunhoeffer
1c4beb4c34
Keep failed follower in followers list in Plan.
2018-07-23 11:25:10 +02:00
Kaveh Vahedipour
0080498e89
compaction index should not exceed local commit index ( #5900 )
2018-07-17 15:54:20 +02:00
Jan
006995a6a5
Bug fix/dont start v8 for agency ( #5891 )
...
* disable V8 for agency setups
* add missing section declaration (fixes unrelated Windows bug)
2018-07-17 11:24:53 +02:00
jsteemann
a0e9865181
typos
2018-07-16 20:49:22 +02:00
jsteemann
44c7b1b476
remove tabstops
2018-07-16 15:00:12 +02:00