239 Commits

Author SHA1 Message Date
Jan 2123fceb7a cover more cases of "unique constraint violated" issues during replication (#9829)
* cover more cases of "unique constraint violated" issues during
replication

* add more testing

* fix compile error
2019-08-30 12:42:58 +03:00
KVS85 e64080e207 Merge 3.5.1 back to 3.5 (#9713)
* Bug fix 3.5/make arangosh reconnect (#9615)

* make arangosh reconnect

* added CHANGELOG entry

* fix lagging AgencyCallbacks (#9620)

* fix lagging AgencyCallbacks

* optimizations, discussed with @mchacki

* fix wording

* updated CHANGELOG

* fix yet another undefined behavior (#9629)

* [3.5.1] Fail the FailedLeader Job if the new leader fails. (#9628)

* Fail the FailedLeader Job if the new leader fails.

* Updated changelog.

* In case of timeout do not rollback.

* Fixed catch tests.

* Changed wording.

* DELETED rollback.

* reduce wait timeouts as a mitigation for notifying waiters without ho… (#9619)

* reduce wait timeouts as a mitigation for notifying waiters without holding the required mutex

this is a quick mitigation only, which reduces maximum wait time from 1
second to 100 milliseconds without changing other behavior.

the main problem of notifying pending writers without successfully
acquiring the required mutex still needs proper addressing.

* adjust timing-dependent test

* [3.5.1] Fast Controlled Leaderchange (#9634)

* First draft of keeping in sync during controlled leader change.

* Test if server is actually the leader in plan.

* Updated changelog.

* Added oldLeader check for set-the-leader request.

* Small fixes.

* Removed LOG_DEVEL.

* less copying, more moving! 🚚 (#9645)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* Port TakeoverShardLeadership from devel to 3.5.1 (#9659)

* Create TakeoverShardLeader job.
* Add TakeoverShardLeadership to Action factory.
* Add log message at level debug.
* Sort out LOG_TOPIC ids.
* Fix unit tests.
* CHANGELOG.

* Bug fix 3.5/hide mmfiles specific info in web ui (#9668)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* hide MMFiles-specific information when we don't need it

* Ported ResignLeadership to 3.5 (#9656)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* Ported ResignLeadership to 3.5

* Add the actual http route.

* Aardvark: Add k Shortest Paths example graph to UI (#9491) (#9661)

* Aardvark: Add k Shortest Paths example graph to UI (#9491)

* Add example graph to UI

* Add kShortestPathsGraph to examples.js

* Update example-graph.js

* Update aardvark.js

* Regenerate UI

* add the ability to have cluster special examples (#9613) (#9663)

* add the ability to have cluster special examples

* Update get_cluster_health.md

* fix abort condition, fix negative filtering for cluster tests

* Test if job fails with unmet assertion

* Remove cluster test example

* germanize

* better skip reasons

* removing superfluous semicolons

* Revert skip reasons, too noisy

* various replication improvements: (#9675)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* various replication improvements:

- better debuggability (more log details)
- shorter minimum wait delay in active failover
- fixed too early pruning of WAL files on leaders

* Bug fix 3.5/fix rocksdb return code (#9692)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* fix return codes for concurrent writes to same documents

* [3.5] Feature/rebootid notice changes, backport of #9523 (#9684)

* Feature/rebootid notice changes, backport of #9523

* Fixed error code to not re-use an old one

* Bug fix 3.5/issue 9679 (#9682)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* fixed issue #9679

* bug-fix/issue-#9660 (#9704) (#9707)

* bug-fix/issue-#9660 (#9704)

* fix issue

* Update tests/js/common/aql/aql-view-arangosearch-cluster.inc

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update tests/js/common/aql/aql-view-arangosearch-noncluster.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* fix cluster tests

* Update CHANGELOG

* [3.5] agency node fixes (#9698)

* node fixes port from 3.4
* fixed change log

* update rocksdb statistics to deliver sums from column family instead of single value from default family. (#9706)

* Feature 3.5/geo functions (#9710)

* Add support for WGS84 on distances (#9672)

* Add area calculations (#9693)

* Update CHANGELOG
2019-08-14 20:24:47 +03:00
Tobias Gödderz 87e5fe7dd2 Bug fix 3.5/clean replication api wal tracking (#9503)
* Use int type for server id

Change serverId to an int

Pass syncerId only for synchronous replication

Added UrlBuilder

structs to classes, reordering

Added Location class, cleanup

Fixed initialization order

Use Location class

Use string for large ints

Documentation

Added clientInfo to ReplicationClientProgressTracker and corresponding rest handlers

Pass clientInfo string in sync replication

Pass clientInfo in addFollower, too

Updated docu

Renamed UrlBuilder to UrlHelper

Updated docu

Try to fix compile error on windows

Fixed a bug and a test

* Implemented @jsteemann's comments
2019-07-18 19:38:31 +03:00
Tobias Gödderz f501e00e9d Bug fix/add shard id to replication client identifier (#9366) 2019-07-08 14:03:42 +02:00
Dan Larkin-York 44a413a9af Miscellaneous fixes for named indices (#9100) 2019-05-31 17:00:56 +02:00
Jan 59b67cad40 fix various small annoyances (#9079) 2019-05-23 17:36:38 +02:00
Jan 79258e072a Bug fix/remove io task (#9056) 2019-05-22 14:34:49 +02:00
Dan Larkin-York b029f67e68 Fix replication index conflicts. (#8659) 2019-04-04 16:41:06 +02:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
Jan 12e11a5197 port of replication improvements from 3.4 (#8308) 2019-03-11 13:37:18 +01:00
Jan 5d2ab0c901 port from 3.4 (#8275) 2019-02-28 14:36:29 +01:00
Lars Maier 12eebb15fe Feature/new server infra (#7733)
* Decoupled IO handling from Scheduler.

* Fixed SSL start up bug.

* Replaced Scheduler with new worker farm implementation.

* Added minimal statistics and info string for Scheduler.

* Added support for timed submissions.

* Updated delayed submission api. Updated code that used timers.

* Extracted new Scheduler into a virtual parent class. The implementation can now depend on the usecase.

* Signal handler now working.

* Changed threads names, `_stop` is atomic, check for failure during thread start + exception handling like old scheduler did.

* Commented on source code and added TODOs.

* Played around with start-stop-conditions

* Play around with start stop condition.

* start stop cond

* Start Stop Conditions

* Removed bad cv_status check.

* Bug fix: now compare the actual objects instead of pointer values. Setup t1 and t2 depending on the thread id.

* Moved most of the stuff now unrelated to the Scheduler to GeneralServer. Got rid of JobGuard.

* Instead of waiting for a thread to terminate, put it on a clean up list and check for its termination in each supervisor run.

* Allow detaching long running threads.

* Fixed test mock.

* Updated the WorkHandle logic. Removed post functions.

* Fixed crash when obtaining shared_ptr from this in destructor.

* Added lost mutex.

* Fixed memory leak.

* Fixed merge bug.

* Changed a lot of code to optimize the scheduler.

* Fixed bug of invalidated iterator. Don't remove task on shutdown at different places. Let scheduler threads run until queue is empty.

* Only by value calls to queue.

* Added options again.

* Clean up of code.

* UI Request Lane added.

* Bug fixes in Scheduler.

* Applied reformat.

* Use sigaction.
2019-01-08 10:12:02 +01:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Simon d0efd95a37 Bug fix/restore index refactor (#7470) 2018-11-27 20:22:36 +01:00
Simon c584527d79 Fix restore of views, add --view option (#7425) 2018-11-22 19:24:24 +01:00
Jan 0dd1776467 Make recovery more reliable (#7297) 2018-11-19 13:59:18 +01:00
Simon 0eb5142df8 Use shared_ptr for LogicalCollection (#7220) 2018-11-05 18:31:57 +01:00
Vasiliy 68953ae33a issue 496.4.1: move StorageEngine-specific flag out of the generic API and closer to the storage engine (#7212) 2018-11-04 16:52:28 +03:00
Vasiliy f088733420 issue 496.3: move more coordinator-related logic out of TRI_vocbase_t, rename some arangosearch view configuration parameters, remove some consolidation policies, update iresearch to revision 6fd9760d81b136f769e277ea5b8f53996ed7a1ca (#7166)
* issue 496.3: move more coordinator-related logic out of TRI_vocbase_t, rename some arangosearch view configuration parameters, remove some consolidation policies, update iresearch to revision 6fd9760d81b136f769e277ea5b8f53996ed7a1ca

* address potential deadlock between link creation and FlushThread

* remove code causing nullptr access

* add back lock around reader reopen

* revert: address potential deadlock between link creation and FlushThread

* invalidate payload for each field in FieldIterator before setting a value
2018-11-01 23:10:01 +03:00
Matthew Von-Maszewski 97ba8ca2be Bugfix: More 3.4 scheduler changes backported (#7091) 2018-10-26 17:09:20 +02:00
Jan a50468f4b1 attempt to fix https://github.com/arangodb/release-3.4/issues/96 (#6954)
* attempt to fix https://github.com/arangodb/release-3.4/issues/96

* address review comments
2018-10-18 14:51:16 +02:00
Simon fd81f52ab4 Allow WAL logger to split up transactions (#6800) (#6866) 2018-10-12 17:50:58 +02:00
Simon fcd7f49293 Fix error in dump test, create views before syncing data (#6636) 2018-09-28 16:59:38 +02:00
Vasiliy 5329f34771 issue 465.2.2: remove redundant heap allocations and simplify API (#6349)
* issue 465.2.2: remove redundant heap allocations and simplify API

* address merge issue

* address more merge issues

* address more merge issues

* address review comments

* do not deallocate non-allocated instances
2018-09-05 13:37:37 +03:00
Simon 568a09f177 Disable JS on DBServer, fix race in UserManager (#6244) 2018-08-24 22:20:49 +02:00
Simon 229c09d434 Allow dirty-reads from passive (#6136) 2018-08-20 16:26:14 +02:00
Simon 392118bd62 Use RangeDelete where possible (#6121) 2018-08-15 18:52:09 +02:00
Frank Celler a688dc0962 Feature/remove job queue thread (#5986)
limiting V8 calls in flight
2018-08-10 12:17:43 +02:00
Jan b278d6874a allow master & slave to work in parallel for RocksDB WAL tailing (#6059) 2018-08-03 13:37:53 +02:00
Jan 2a416f2e33 Bug fix/3007 (#6019) 2018-07-30 17:16:47 +02:00
Vasiliy 62ca1235ac issue 430.3: remove redundant constructor from SingleCollectionTransaction (#5996) 2018-07-26 16:54:53 +03:00
Simon 2dd8593609 View Replication (#5915) 2018-07-26 10:28:46 +02:00
Jan 21064144c8 Bug fix/replication improvements (#5962) 2018-07-25 09:04:50 +02:00
jsteemann c54f14e1d6 don't use an assertion to indicate wrong usage 2018-07-19 22:41:13 +02:00
jsteemann c56f280a71 don't sleep so long while waiting for the result of /_api/replication/dump and /_api/replication/keys
Sleeping for shorter periods increases the chances that we can continue faster...
This will speed up the initial synchronization when collections contain only few or no documents,
but there are lots of collections to sync
2018-07-18 13:01:17 +02:00
Dan Larkin-York 820bfee329 Refactor syncer state and make notes for future parallelization (#5742) 2018-07-03 21:32:16 +02:00
Simon 3bec336aff TransactionState::addCollection refactoring (#5606) 2018-06-14 15:34:58 +02:00
Vasiliy d9cda9666f issue 389.8: remove redundant function from Methods, convert Syncer API to use TRI_vocbase_t& wherever possible (#5408) 2018-05-22 16:10:24 +03:00
Vasiliy 843e584746 issue 389.5: refactor StandaloneContext to be constructed with a TRI_vocbase_t& (#5370)
* issue 389.5: refactor StandaloneContext to be constructed with a TRI_vocbase_t&

* backport: address build issues
2018-05-17 01:15:50 +03:00
Simon 08e355aed8 Simple dump speedup (#5298) 2018-05-09 12:51:04 +02:00
Vasiliy 2ce20e86d7 issue 373.1: move globally-unique id generation from collection into data-source (#5182) 2018-05-07 22:14:40 +03:00
Vasiliy 9062c41592 issue 383.3: implement remainder of IResearchViewDBServer tests, use the data-source id (primary key) instead of an arbitrary instance for dropCollection()/dropView(), backport from iresearch upstream: ensure block is flushed if key index is full (#5176) 2018-04-23 00:33:46 +03:00
Simon 8be273efb8 Replication cleanup (#5105) 2018-04-17 08:17:42 +02:00
Vasiliy f392925903 issue 374.3: use a reference to vocbase instead of a pointer in DatabaseGuard 2018-04-13 09:56:49 +03:00
Andrey Abramov 04bb3da337 Merge branch 'devel' of https://github.com/arangodb/arangodb into bug-fix/internal-issue-#345 2018-03-20 19:04:54 +03:00
Michael Hackstein c1650702bf Feature/aql server based locking (#4783)
* Started Implementing the ServerBasedlocking. There now is a container that can contain multiple query snippets. It now has to setup the necessary calls to the Servers

* Added backwards linking of QueryEngines, so that DBServers can contact their Coordinators.

* Added LogTopic AQL

* Made AccessMode::Type Hashable

* Created a Mapping Server => LockLevel => Shard and created a JSON object containing the Lock information for a complete AQL query per server

* Added code to build coordinator engines

* Finished with first draft of Coordinator-side of new DBServer based locking.

* Added a _api/aql/setup route that creates and locks all snippets/collections for one DBServer in a single go

* Fixed some Coordinator parts

* Index node now gracefully reports if it could not find its collection when created from vpack. Otherwise it just crashed hard...

* Modified the Coordinator Snippet collector to be able to handle subqueries properly.

* Started adding GraphNode handling. WIP. Need to deploy engines properly. Coordinator crashes on Graph tests

* Fixed compiler errors

* WIP: EngineInfoContainer

* Separated the EngineInfoContainers for Coordinator and DBServer into different files. They diverged more than anticipated

* Added forgotten files. The DBServer container now creates the TraverserEngine Mapping and moves it into the Infos. They are not keeping it yet and need to add it to the message as well.

* The DBServer engine infos now persist the TraverserEngine infos. Need to add them to messages though.

* The new aql exec-engine now sends out traverserEngines as well

* Formatting and adding DEBUG level output

* Made the RestAQLHandler aware of the TraverserEngineRegistry. Also created the engines now. Return format changed server-side coordinator side needs fix.

* Adapted the Coordinator side for the DBServer based Shard Locking

* The DBServer based Locking now honors restrictions to certain shards

* Fixed a strange double lock bug in the new AQL Server based locking technique. Add some DEBUG output

* Fixed usage of MAINTAINERMODE macro. The assertion was never active

* Added TestCase for ContainerCoordinatorTest to cmake

* Added -DTEST_VIRTUAL to CMAKE. This is used to define virtual functions for mocking ONLY on test-builds.

* Fixed usage of ENABLE_MAINTAINER_MODE ifdef. CLANG format

* On non-enterprise builds ENTERPRISE_VERT defaults to TEST_VIRTUAL => virtual in test else non-virtual

* Added TEST_VIRTUAL to ExecutionEngine, Query and QueryRegistry

* Added first testcase for EngineInfoContainerCoordinator not yet ready.

* Made CreateBlock a member function of engine; we have the engine in our hands anyway, so there is no need to make it static. Included some more TEST_VIRTUAL functions.

* Fixed clang/MacOs compile error. Added some more TEST_VIRTUAL declarations

* Finally fixed the first buildEngines UnitTest \o/

* Added a unit-test for backward linking of dependencies in CoordinatorPlanner

* Added multi-snippet test for EngineInfoContainerCoordinator

* Removed QueryRegistry.h from central header files and replaced by a forward declaration.

* Added a createBlocks method on the ExecutionEngine. It should be responsible to create all those blocks at once. Adapted the UnitTests as well. Not included Tests for the new createBlocks functionality. Need to mock the options feature first

* Added another test that Coordinator Snippets of queries can be created correctly

* Fixed Coordinator-side cleanup of QueryRegistry if any of the query creations fails with an error, incl. UnitTest.

* Added first test for RestAqlHandler::setup. It does only test the setup and gives preparation for real testing.

* Added an assertion of the HTTP return code. Still no creation of queries is tested. Requires a huge amount of mocking.

* fix some deadlocks found by evil lock manager (tm)

* fix duplicate lock

* fix indentation

* ensure proper lock dependencies

* fix lock acquisition

* removed useless comment

* do not lock twice

* create either a V8 transaction context or a standalone transaction context, depending on if we are called from within V8 or not

* AQL micro optimizations

* use explicit constructor

* only use V8DealerFeature's ConditionLocker for acquiring a free V8 context

entering and exiting the selected context is then done later on without having to hold the ConditionLocker

* remove some recursive locks

* Disable custom deadlock detection when Thread Sanitizer is enabled

* Changing ifdef's

* grr

* broke gcc

* Using atomic for ApplicationServer::_server

* fix premature unlock

* add some asserts

* honor collection locking in cluster

* yet one more lock fix

* removed assertion

* Allow the clustercomm to send nolock headers on count. This is used from within AQL

* IsLocked on transactions will now always yield true IF LOCK_NEVER is set. We simply assume someone else holds the lock for us. Also LOCK_NEVER is now set on collection/count if noLock header is sent.

* Moved the flag if collections need to be locked into the TraverserEngines.

* Added enterprise-satellite hooks in EngineInfoContainerDBServer

* Removed now obsolete code

* Replaced throwing of Exception by a ResultObject

* Added some more tests and moved adding snippet to query engine more to the outside.

* Added the AQL result type

* Make the branch compile again

* Register WITH collections for Graphs in the new Collector.

* Fixed test code for failing query clone. Idea was to once clone successfully and second time to fail, we verify that first clone is cleaned up properly. However test failed on first clone...

* Removed a double builderClose

* Added Changelog entry

* Removed empty if

* Removed obsolete todo

* Properly initialize the AqlResult with nullptr on error case

* Updated comment

* Simplified Assertion

* Removed debug output object...

* Added additional catch case for std::exception to get some more error info

* Clarified evaluation order for move case

* Added Explicit

* Fixed cleanup of Coordinator if Registry fails to insert query.

* Allow to use other locks than Read/Write for AQL collections. Not yet in API.

* Updated Comments for other Locks on DBSide. Adapted Destruction CatchTests

* Fixed double builderClose and removed unnecessary double commits

* Added a comment to clarify the state

* Moved error output to trace. Leftover from debugging

* Added some tests for complex subquery patterns

* Added a 'fireAndForgetRequests' method to cluster comm which allows to send out a bunch of messages but does not wait for their results

* Properly cleanup leftovers of queries if the instantiation step already failed

* Added code comment for fireAndForget

* Added indexes to subquery test to make the plan a bit easier

* The cleanup on DBServerEngines in error case now also cleans up traverser engines.

* Removed unnecessary includes

* Removed debug logging

* Fixed hidden merge conflict
2018-03-20 16:52:19 +01:00
Andrey Abramov 01d9baf359 remove TRI_ERROR_ARANGO_VIEW_NOT_FOUND, rename TRI_ERROR_ARANGO_COLLECTION_NOT_FOUND to TRI_ERROR_ARANGO_DATA_SOURCE_NOT_FOUND 2018-03-17 19:36:14 +03:00
Vasiliy 06eb8ade01 issue 344.7: remove more redundant functions (#4863)
* issue 344.7: remove more redundant functions

* backport: fix missed functions under USE_ENTERPRISE
2018-03-15 17:10:28 +01:00
Vasiliy 148bdb7158 issue 344.6: remove some redundant functions (#4842) 2018-03-15 11:03:35 +01:00
Vasiliy c8739cd3cd manually-merge: cache data-sources in CollectionNameResolver 2018-03-14 10:11:50 +03:00