1
0
Fork 0
Commit Graph

1359 Commits

Author SHA1 Message Date
Lars Maier 9a33122c5d [3.5] Added precondition to ensure that server is still as seen before. (#10477)
* Added precondition to ensure that server is still as seen before.

* Removed merge conflicts.
2019-11-20 13:39:57 +01:00
Kaveh Vahedipour 6169907f9a Bug fix 3.5/timestamp assert compatibility (#10354)
* assertion on compatibility, when timestamp missing

* assertion on compatibility, when timestamp missing

* assertion on compatibility, when timestamp missing

* assertion on compatibility, when timestamp missing

* Update CHANGELOG
2019-11-05 18:31:29 +03:00
Kaveh Vahedipour 9044f7de97 [3.5] yet another agency ttl bug (#10242)
* port from devel

* Update CHANGELOG
2019-10-14 16:46:06 +03:00
Jan b74971c9bb Improved performance of some agency helper functions. (#10222) 2019-10-11 18:30:04 +03:00
Jan 845701de6d fix broken & duplicate logIds (#10134) 2019-10-01 20:15:07 +03:00
Matthew Von-Maszewski 249899959f back port PR 9874 to 3.5. Add write amplification stat and option to disable statistics history. (#9947) 2019-09-16 12:57:11 +03:00
Max Neunhöffer 328f46e3d6 This merges hotbackup and atomic-db-creation into 3.5. (#9968)
* Squashed commit of feature-3.5/hotbackup_devel.

This puts hotbackup into 3.5.

* Port atomic-database-creation-2 to 3.5.

* Remove some wrongly ported code.

* Fix compilation.

* Fix a manual merge error.

* Remove a feature from the mocks which does not exist in 3.5.

* Add some code which was forgotten in manual merge.

* Fix a problem introduced in a manual merge.

* reuse function

* Address some whitespace issues that came up in review

* aardvark should not create the frontend collection

* create _frontend collection from c++

* recheckAndUpdate Callback in CollectionWatcher

* Wrong author ;)

* rm outdated todo

* Update lib/Basics/VelocyPackHelper.h

Co-Authored-By: Michael Hackstein <michael@arangodb.com>

* use logger unique id, use startup logger

* not needed

* optimized vector shardid method

* do not create _modules collection lazy anymre

* Formatting.

* Assert instead of if/TRI_ASSERT(false)

* Don't use exceptions as control structure

* Re-add READ_LOCKER that got lost in translation

* Fix audit log in case database creation fails early.

* legacy sharding

* Add CHANGELOG entry.

* Retry database cancellation indefinitely

* Do not use exceptions in UpgradeTask

* DropCollection is a FAST_LANE action and should not need much time or else retry.

* Remove superflous addition of LdapFeature

Proudly brought to you by ASAN tests

* Fixed check for distributShardsLike sharding on _system database

* Fixed compile issue on tests

* Removed assertion that seems to be not correct yet on devel.

* Sort out google cloud storage as remote. (#9918)

* Add successful method to ClusterCommResult.
* Improve error forwarding for cluster internal communication.

* Feature/hotbackup list retries (#9924)

* retry hot backup listing for 2 minutes in cluster before giving up

* Enable api by default.

* fix broken list of non existing id (#9957)

* Fix compilation after manual merge.

* Fix another compilation problem.

* Yet more fixes for compilation.

* More compilation fixes.
2019-09-11 13:13:54 +03:00
Kaveh Vahedipour 10d99b20cb url should be deleted not path (#9882) 2019-09-02 16:41:33 +03:00
Jan 2123fceb7a cover more cases of "unique constraint violated" issues during replication (#9829)
* cover more cases of "unique constraint violated" issues during
replication

* add more testing

* fix compile error
2019-08-30 12:42:58 +03:00
Max Neunhöffer 0599a1c79c
Fix unneeded return code. (#9853) 2019-08-30 08:44:07 +02:00
Kaveh Vahedipour 139c5d3839 [3.5] agency lockup when removing 404ed callbacks and leadership preparation (#9846)
* queue the write
* finishing 3.5
* commenting
2019-08-29 14:32:48 +02:00
Jan 6d53c1e02f Feature 3.5/curl 7.65.3 (#9790)
* upgrade curl to 7.65.3

* re-add modifications to curl

* remove curl 7.63.0

* update LICENSES file

* fix error handling
2019-08-26 15:54:42 +03:00
Kaveh Vahedipour 09d2745625 [3.5] clean up your crap, dbservers. alright, i'll do it. (#9722)
* clean up your crap, dbservers. alright, i'll do it.

* ts ts ts

* body is shared_ptr

* Update CHANGELOG

* revert callback bodies to API specification

* array needs be inside so that multiple unobserves to same key are possible
2019-08-22 14:26:44 +03:00
Lars Maier 87ae6165d7 [3.5] Move Shard Bug 4567124 (#9747)
* Fixed abort in moveshard in the case new leader is not in Current.

* Fixed special case + updated changelog.

* Update CHANGELOG
2019-08-19 19:23:12 +03:00
Max Neunhöffer b753c895e4
Fix an agency bug found in Windows tests. (#9728)
* Fix agency bug found in Windows tests.
* CHANGELOG.
2019-08-16 12:17:09 +02:00
KVS85 e64080e207
Merge 3.5.1 back to 3.5 (#9713)
* Bug fix 3.5/make arangosh reconnect (#9615)

* make arangosh reconnect

* added CHANGELOG entry

* fix lagging AgencyCallbacks (#9620)

* fix lagging AgencyCallbacks

* optimizations, discussed with @mchacki

* fix wording

* updated CHANGELOG

* fix yet another undefined behavior (#9629)

* [3.5.1] Fail the FailedLeader Job if the new leader fails. (#9628)

* Fail the FailedLeader Job if the new leader fails.

* Updated changelog.

* In case of timeout do not rollback.

* Fixed catch tests.

* Changed wording.

* DELETED rollback.

* reduce wait timeouts as a mitigation for notifying waiters without ho… (#9619)

* reduce wait timeouts as a mitigation for notifying waiters without holding the required mutex

this is a quick mitigation only, which reduces maximum wait time from 1
second to 100 milliseconds without changing other behavior.

the main problem of notifying pending writers without successfully
acquiring the required mutex still needs proper addressing.

* adjust timing-dependent test

* [3.5.1] Fast Controlled Leaderchange (#9634)

* First draft of keeping in sync during controlled leader change.

* Test if server is actually the leader in plan.

* Updated changelog.

* Added oldLeader check for set-the-leader request.

* Small fixes.

* Removed LOG_DEVEL.

* less copying, more moving! 🚚 (#9645)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* Port TakeoverShardLeadership from devel to 3.5.1 (#9659)

* Create TakeoverShardLeader job.
* Add TakeoverShardLeadership to Action factory.
* Add log message at level debug.
* Sort out LOG_TOPIC ids.
* Fix unit tests.
* CHANGELOG.

* Bug fix 3.5/hide mmfiles specific info in web ui (#9668)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* hide MMFiles-specific information when we don't need it

* Ported ResignLeadership to 3.5 (#9656)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* Ported ResignLeadership to 3.5

* Add the actual http route.

* Aardvark: Add k Shortest Paths example graph to UI (#9491) (#9661)

* Aardvark: Add k Shortest Paths example graph to UI (#9491)

* Add example graph to UI

* Add kShortestPathsGraph to examples.js

* Update example-graph.js

* Update aardvark.js

* Regenerate UI

* add the ability to have cluster special examples (#9613) (#9663)

* add the ability to have cluster special examples

* Update get_cluster_health.md

* fix abort condition, fix negative filtering for cluster tests

* Test if job fails with unmet assertion

* Remove cluster test example

* germanize

* better skip reasons

* removing superfluous semicolons

* Revert skip reasons, too noisy

* various replication improvements: (#9675)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* various replication improvements:

- better debuggability (more log details)
- shorter minimum wait delay in active failover
- fixed too early pruning of WAL files on leaders

* Bug fix 3.5/fix rocksdb return code (#9692)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* fix return codes for concurrent writes to same documents

* [3.5] Feature/rebootid notice changes, backport of #9523 (#9684)

* Feature/rebootid notice changes, backport of #9523

* Fixed error code to not re-use an old one

* Bug fix 3.5/issue 9679 (#9682)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* fixed issue #9679

* bug-fix/issue-#9660 (#9704) (#9707)

* bug-fix/issue-#9660 (#9704)

* fix issue

* Update tests/js/common/aql/aql-view-arangosearch-cluster.inc

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update tests/js/common/aql/aql-view-arangosearch-noncluster.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* fix cluster tests

* Update CHANGELOG

* [3.5] agency node fixes (#9698)

* node fixes port from 3.4
* fixed change log

* update rocksdb statistics to deliver sums from column family instead of single value from default family. (#9706)

* Feature 3.5/geo functions (#9710)

* Add support for WGS84 on distances (#9672)

* Add area calculations (#9693)

* Update CHANGELOG
2019-08-14 20:24:47 +03:00
Michael Hackstein d5840c125a Bug fix 3.5/min replication factor (#9524)
* Cherry-pick minReplicationFactor

* Bug fix/failover with min replication factor (#9486)

* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and noone cares for it

* Added some assertion son minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMOPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor now able to change via collection  properties route

* fixed wrongly assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compileissues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as i cannot take it any more...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance reports previous version now to follower info. instead of lying by itself. The Follower Info now gets a failover save mode to report insync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed ires tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write, this might have the sideeffect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more preciese now

* Fixed failove candidates list

* Disable write on dropping too many followers

* Allow to run updateFailoerCandidates multiple times with same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documenation

* Removed merge conflict leftovers 0o, i should not trust the IDE

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/Architecture/Replication/README.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update CHANGELOG

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/Books/Manual/ReleaseNotes/NewFeatures35.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Collections/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/graphManagementView.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update Documentation/DocuBlocks/Rest/Graph/1_structs.md

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Adepted review requests, thanks for finding!

* Removed unnecessary const

* Apply suggestions from code review

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Moved initilization of variable more downwards

* Apply lock before notify_all()

* Remove documentation except DocuBlocks, covered by PR in docs repo

* Remove accidental indent
2019-07-22 17:48:34 +03:00
Jan c52f2a8315
refactoring (#9411) 2019-07-09 11:15:52 +02:00
Jan 1d15b50d22
Bug fix/applicationserver stop (#9414) 2019-07-08 20:30:05 +02:00
Andrey Abramov a793a4749f
Bug fix/rename iresearch analyzer feature (#9424)
* rename feature

* ensure 'MMFilesLogFileManager::SafeToUseInstance' is safe to use in multithreaded context

* address mac compilation issue
2019-07-08 15:46:57 +03:00
Jan 1a58cc2213
add VelocyPackHelper::equal method (#9389) 2019-07-03 12:15:11 +02:00
Jan 9cb08ded92
make the comparison functions unambiguous (#9349)
* make the comparison functions unambiguous

* added @kaveh's suggestion
2019-07-01 16:35:28 +02:00
Lars Maier c215e30299 Precs to check if collection exists (#9278)
* Adding preconditions for jobs to check that the collection still exists.

* Make it compile.

* Fixed tests.
2019-07-01 13:24:53 +02:00
Lars Maier eb1aa6e024 Response compression (#9300)
* First draft of response compression.

* Cleanup.

* Removed compression from /_api/version.

* Added ruby test for response compression.
2019-06-28 10:02:48 +02:00
Max Neunhöffer d6d362bd3b
Fix agency election lock step bug. (#9351)
* Fix agency election lockstep bug.

Reset the base point for the random election timeout to now whenever we have
cast a vote, be it for us or for some other server.

* CHANGELOG.

* Fix compilation.
2019-06-27 22:06:26 +02:00
Lars Maier 50ce41e062 Release _to server when abort because of dropped collection or `something serious went wrong`. (#9267) 2019-06-17 15:04:44 +02:00
Jan Christoph Uhde 3f603f024f remove some containers from common.h (#9223)
* remove some containers from Common.h

* enterprise fixes
2019-06-07 13:27:24 +02:00
Kaveh Vahedipour 8a511934df [devel] fix ttl handling for object assigments (#9207)
* fix ttl handling for object assigments
* completed test handling
* CHANGELOG.
2019-06-06 15:48:40 +02:00
Lars Maier 1e94ecf414 Bug fix/supervision fixes4 (#9016)
* Try to fix agency problems with snapshots.

* Abort MoveShards jobs that have the failed server as fromServer.

* Report aborts.

* CHANGELOG.
2019-05-31 17:20:06 +02:00
Kaveh Vahedipour 773f3c8422 [devel] fix state clientlookuptable (#9066) 2019-05-30 04:24:46 +02:00
Jan 59b67cad40
fix various small annoyances (#9079) 2019-05-23 17:36:38 +02:00
Simon 394c070a4f Do not check isEnabled (#8997) 2019-05-17 16:33:17 +02:00
Wilfried Goesgens 1907a7211b Bug fix/cleanup system includes (#8962) 2019-05-15 15:12:59 +02:00
Andrey Abramov 8358b978be disable arangosearch analyzers in agency (#8989)
* disable arangosearch analyzers

* do nothing in `stop()` if feature is disabled
2019-05-14 17:37:38 +02:00
Matthew Von-Maszewski fb157b89ac Add protective barrier to sendWithFailover() in case MANAGER is nullptr (#8998) 2019-05-14 17:35:55 +02:00
Lars Maier d9ff6647d8 [devel] Do not skip the check if there is only a single agent. (#8943)
* Do not skip the check if there is only a single agent.
* Updated changelog.
2019-05-14 15:42:43 +02:00
Lars Maier 49c568e674 Added reason to job abort method. (#8877) 2019-05-14 15:39:53 +02:00
Jan c00442e31c
fix some issues found by cppcheck (#8959) 2019-05-10 16:31:22 +02:00
Jan 976dc2b726
Bug fix/issues 2019 05 06 (#8913) 2019-05-07 12:17:16 +02:00
Lars Maier c99e8e8973 [devel] ClientID Agency Transaction (#8652)
* Changed clientId to format <serverid>:<uuid>.
* Changed behavior if id is not known.
2019-04-30 10:39:23 +02:00
Jan 449ab1ed8e
Bug fix/cppcheck 13042019 (#8752) 2019-04-15 10:13:56 +02:00
Simon 937d743ba6 Bug fix/pregel stuff (#8733) 2019-04-11 15:58:28 +02:00
Max Neunhöffer 80bfb85695
Port agency performance tuning for many shards to devel. (#8647)
* Port agency performance tuning for many shards to devel.
* Add more IDs to LOG_TOPIC calls.
* Even more IDs for LOG_TOPIC.
* Fix a duplicate LOG_TOPIC ID.
* Fix an old merging bug in devel.
* Don't hesitate between phases one and two for small clusters.
2019-04-11 11:14:56 +02:00
Jan c6d3f8e052
Bug fix/pass on error messages (#8690) 2019-04-10 12:34:25 +02:00
Jan 6b9b9b0946
Bug fix/fix test muell (#8703) 2019-04-09 11:27:53 +02:00
Simon 7cd84a785a Remove Obsolete code (#8657) 2019-04-03 13:40:44 +02:00
Max Neunhöffer 02281d3be4
Handle InitDone correctly. (#8552)
* precondition plan / version in compaction / store TTL removal independent of local _ttl set
* Agency init loops break when shutting down.
* assertion failures in store on restarting following agents
* Minor porting fixes from 3.4
2019-04-01 17:01:05 +02:00
Jan d6d3e3daa4
initialize some member variables, added TODOs (#8545) 2019-03-26 12:57:32 +01:00
Jan 80a6e621ee
don't allocate memory so often in ClusterComm requests (#8550) 2019-03-26 00:31:56 +01:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00