* added missing return statements
* only spend up to 10 seconds for initially fetching the list of collections in arangosh
Fetching the list of collections is a blocking operation, and the default timeout for it is very high.
If the server is blocked for whatever reason, the shell is unusable until the collections list request returns.
To avoid this, the initial request is limited to 10 seconds, so the shell can be used afterwards.
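As a generic illustration of the pattern (plain C++, not the actual arangosh code, which most likely just lowers the request timeout on the connection), a blocking fetch can be bounded by running it asynchronously and waiting at most 10 seconds for the result; `fetchCollectionList` is a hypothetical placeholder.

```cpp
#include <chrono>
#include <future>
#include <memory>
#include <optional>
#include <string>
#include <thread>
#include <vector>

using CollectionList = std::vector<std::string>;

// Run the blocking fetch on a helper thread and wait at most 10 seconds for it.
std::optional<CollectionList> fetchCollectionsBounded(
    CollectionList (*fetchCollectionList)()) {
  auto prom = std::make_shared<std::promise<CollectionList>>();
  auto fut = prom->get_future();
  std::thread([prom, fetchCollectionList]() {
    prom->set_value(fetchCollectionList());
  }).detach();
  if (fut.wait_for(std::chrono::seconds(10)) == std::future_status::ready) {
    return fut.get();
  }
  return std::nullopt;  // give up after 10 seconds so the shell stays usable
}
```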
* if an index cannot be used for sorting, its sort
cost was previously reported as 0. This in fact favors
indexes that can be used for filtering but not for sorting
over indexes that can be used for both.
This change reports the sort cost for indexes that
cannot be used for sorting as n * log(n), where n is the
number of documents the optimizer expects to come out of the
index after filtering
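A minimal sketch of that estimate (the function and its parameters are illustrative, not the optimizer's actual interface):

```cpp
#include <cmath>
#include <cstddef>

// Sort cost charged to an index: 0 if the index already delivers the requested
// order, otherwise n * log(n) for the n documents expected after filtering.
double estimatedSortCost(bool indexSupportsSort, std::size_t itemsAfterFilter) {
  if (indexSupportsSort || itemsAfterFilter <= 1) {
    return 0.0;
  }
  double n = static_cast<double>(itemsAfterFilter);
  return n * std::log2(n);
}
```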
* Backport active-failover fix for Windows into 3.4
* Backport stop/resume for Windows from devel
* Backport changes from devel into tests also
* Fix tests
* Remove forgotten whitespaces
* Fixed a bug where the Foxxmaster did not reset jobs after a crash when it should have, or a non-master coordinator removed in-progress jobs during startup
* Added a regression test
* Removed foxxmaster test from greylist
* Updated CHANGELOG
* Fixed non-maintainer compile
* added check for empty scheduler
* removed log, old is 1 not 0
* require running in this thread
* test
* added isDirect to callback
* signature fixed
* added drain
* added allowDirectHandling
* disabled for testing
* Add ExecContextScope object to direct call.
* try alternate initialization of ExecContextScope
* remove ExecContextScope, no help. try _fifoSize as part of direct decision.
* strand management to minimize reuse of same strand per listen socket
* blind attempt to address Jenkins shutdown lock up. may remove quickly.
* add filename and line to existing error log message
* Adjust queueOperation() to stop accepting items once isStopping() becomes true.
* revert previous check-in to MMFilesCollectorThread.cpp
* big reformat
* fixed merge conflicts
* Add CHANGELOG entry.
* should not neglect the initial async request for read lock acquisition
* fixed nullptr
* correct timeout
* corrected error handling in getReadLock
* reverted "test fix"
* should remove async request from ClusterComm
* Do nothing in phaseTwo if leader has not been touched.
* Drop follower if it refuses to cooperate.
This is important since a dbserver that is a follower for a shard will,
after a reboot, think that it is the leader, at least for a short amount
of time. If it came back quickly enough, the leader might not have
noticed that it was away.
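A rough sketch of the leader-side reaction described above (all names here are illustrative placeholders, not the actual cluster/transaction code):

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct FollowerReply {
  std::string serverId;
  bool refused;  // follower declined to act as a follower
};

// Drop every follower that refused to cooperate so it has to resync with the
// leader; a freshly rebooted dbserver may briefly believe it is the leader.
void dropUncooperativeFollowers(std::vector<FollowerReply> const& replies,
                                std::vector<std::string>& inSyncFollowers) {
  for (auto const& reply : replies) {
    if (reply.refused) {
      inSyncFollowers.erase(std::remove(inSyncFollowers.begin(),
                                        inSyncFollowers.end(), reply.serverId),
                            inSyncFollowers.end());
    }
  }
}
```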
* Initialize theLeader non-empty, thus not assuming leadership.
* Correct ClusterInfo to look into Target/CleanedServers.
* Prevent usage of to-be-cleaned-out servers in new collections.
* After a restart, do not assume to be leader for a shard.
* agents' is obtained from leader's configuration
* corrections in Supervision for advertised endpoints
* change log
* Updated Documentation for cluster/health.
* Unified naming convention.
* Fixed missing update of volatile fields.
* Set version in right order.
* Removed debug output.
* Fixed jslint - missing ;
* Fix index creation in cluster.
Simplify and correct error handling logic in ensureIndexCoordinator.
* After index creation, wait until index appears.
We wait until the Supervision has removed the isBuilding flag and
the coordinator has reloaded the Plan.
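The waiting pattern can be sketched roughly as below; `indexIsBuilding` stands in for re-reading the Plan and is a hypothetical callback, not the actual ClusterInfo API:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll until the index has lost its isBuilding flag or the deadline passes.
bool waitUntilIndexReady(std::function<bool()> indexIsBuilding,
                         std::chrono::seconds timeout) {
  auto const deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    if (!indexIsBuilding()) {
      return true;  // Supervision removed isBuilding, Plan has been reloaded
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
  return false;  // caller reports a timeout
}
```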
* More index handling fixes.
* Explicitly remove isBuilding flag in coordinator (again).
* Fix order of arguments in REPLACE call.
* Take out debugging output again.
* Fix catch tests by holding mutex shorter.
* Better mutex handling in ClusterInfo.
* issue 506.3: backport 3.4: use camel-case configuration parameter names consistently, add a configuration version property to iresearch view meta
* backport: ensure meta version is supported
* backport: hide 'version' property from non-persistence json
* issue 506.2: backport 3.4: add optimization to not reexecute a primary-key filter if a match was already found
* backport: explicitly check type of instance of the primary-key filter
* backport: return non-null prepared filter and convert check to assert
* Ungreylist move shard test.
* Move leader shard: wait until all but the old leader are in sync.
* Increase moveShard timeout to 10000 seconds.
* Add CHANGELOG.
* Fix compilation.
* Fix a misleading comment.
* issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes
* backport: add test to ensure views are dropped when database is dropped from plan, fix some issues in ClusterInfo
* optimize primary key lookups in ArangoSearch
* fix test
* Add JS tests
* temporarily comment out optimizations
* Added some DEBUG output for replication rest handler
* Some more debug logging.
* Increased the priority of the ReplicationHandler. This way we will not get stuck with locks that cannot be canceled. Also cancel the lock on the correct database.
* Added extensive log output for replication things
* Added tombstones to RestReplicationHandler. In a very unlikely case, the cancel of a lock can be executed BEFORE the code that actually registers the lock; in this case we now write a tombstone and do not lock.
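The tombstone idea can be sketched as follows (a minimal illustration; the class and its methods are hypothetical, not the actual RestReplicationHandler code):

```cpp
#include <mutex>
#include <string>
#include <unordered_set>

class LockRegistry {
  std::mutex _mutex;
  std::unordered_set<std::string> _tombstones;   // cancelled before registered
  std::unordered_set<std::string> _activeLocks;

 public:
  void cancel(std::string const& id) {
    std::lock_guard<std::mutex> guard(_mutex);
    if (_activeLocks.erase(id) == 0) {
      _tombstones.insert(id);  // cancel arrived first: leave a tombstone
    }
  }

  bool tryRegister(std::string const& id) {
    std::lock_guard<std::mutex> guard(_mutex);
    if (_tombstones.erase(id) > 0) {
      return false;  // already cancelled, do not acquire the lock
    }
    _activeLocks.insert(id);
    return true;
  }
};
```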
* Revert "Added extensive log output for replication thins"
This reverts commit 6d4e37ea1e59e3b3457336019cc7dbc4c979504d.
* Added extensive log output for replication things, now in ERR level instead of MAINTAINER only
* Now actually use hours for synchronization
* React to errors under soft lock if they show up.
* Added a retry loop to increase the read-lock timer.
* Added more timing output in RocksDB collection internals to figure out why the followers are dropped
* Tweaked RocksDB options
* Revert "Tweaked RocksDB options"
This reverts commit 2bf9c43280beda4792c47d079387fe5154cdd896.
* Removed debug output
* Applied all changes requested by goedderz
* Deleted unused variable
* backport of test data generation for maintenance from devel
* 3.4 working
* fixing index use in cluster while the index is still being built
* fixed broken views
* correct 200 for ensureIndex
* merge with 3.4
* agency comm to handle replace in array
* supervision changes
* cluster info's ensureIndex
* 3.4 ready
* timeout
* missing files from origin
* addressed neunhoef's complaints
* bogus entry
* no need to wait for current once again
* no longer necessary. done in IndexFactory now
* correct comments
* left overs
* dead code revived
* Move CHANGELOG entry to the right place.
* Fix resign order
* Fixed a typo
* Get followers later, add TODOs
* Added a callback parameter to collection insert methods
* Get followers under the lock if necessary
* Extracted the replication of inserts into a separate method
* Move shortcut into replicate method
* Added callbacks for remove, replace and update
* Added missing overrides
* Extracted replication code from modifyLocal and removeLocal
* Update followers under lock also during replace, update, remove
* Fix changes from the last commit for update/replace
* Update comments, add asserts
* Remove changes for document-level locks that will be done in another PR
* Unify replication
* Adapt log messages to the devel ones
* Move common methods from its descendants to TransactionCollection, fix Mock on the way
* More IResearch test / mock fixes
* Relax asserts for nested transactions
* Reformat
* Fix non-babies remove and modify replication
* defense against the dark arts (nullptr in _ioContext)
* move incQueued() so that we can infer the race state of _ioContext.
* adjust to meet Jan's expectations
* jsteemann noticed that queue count is not considered before shutdown ... bad
* add JobGuard object to manage working count. should hold shutdown a tad longer.
* TEMPORARY HACK: need to validate problem that is randomly occurring in Jenkins automation
* TEMPORARY HACK 2: trying to isolate an acceptable sequence.
* TEMPORARY HACK 3: trying to isolate an acceptable sequence.
* TEMPORARY HACK 4: so close ... seem to have all the moving parts isolated. Come on Jenkins!
* shutdown now finishes in an orderly fashion everything already in the FIFO queues and active on threads, then forces any late requests to execute on the caller's thread.
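Roughly, that shutdown behavior can be sketched like this (queue and handler types are illustrative only, not the actual Scheduler code):

```cpp
#include <deque>
#include <functional>
#include <mutex>

class ShutdownAwareQueue {
  std::mutex _mutex;
  std::deque<std::function<void()>> _fifo;
  bool _stopping = false;

 public:
  // Late submissions after shutdown run directly on the caller's thread.
  void post(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> guard(_mutex);
      if (!_stopping) {
        _fifo.push_back(std::move(job));
        return;
      }
    }
    job();  // stopping: execute inline instead of queueing
  }

  // On shutdown, finish everything already queued before returning.
  void shutdown() {
    std::deque<std::function<void()>> pending;
    {
      std::lock_guard<std::mutex> guard(_mutex);
      _stopping = true;
      pending.swap(_fifo);
    }
    for (auto& job : pending) {
      job();
    }
  }
};
```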
* refactor arangosearch pks
* minor refactoring
* store PK as BigEndian since it leads to more compact index representation
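For illustration, a big-endian encoding of a numeric key looks like the sketch below (the helper is hypothetical); with the most significant byte first, the byte-wise order of the encoded keys matches the numeric order, which is presumably what allows the more compact index representation.

```cpp
#include <array>
#include <cstdint>

// Encode a 64-bit key with its most significant byte first (big-endian).
std::array<uint8_t, 8> encodePrimaryKeyBigEndian(uint64_t key) {
  std::array<uint8_t, 8> out{};
  for (int i = 7; i >= 0; --i) {
    out[i] = static_cast<uint8_t>(key & 0xFFu);
    key >>= 8;
  }
  return out;
}
```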
* force iresearch to not use libbfd
* fix tests
* Fix loophole.
* Fix inquiry case of id not found: 404.
* Also handle correctly in AgencyComm.
* Fix agency tests.
* Fix error handling in dropCollectionOnCoordinator.
* remove recent _activeThreadCondition. it made things worse. moved all ClusterCommThread methods to end of file to ease review.
* attempt at avoiding Scheduler io_context being nullptr in late shutdown steps
* manually revert last change since the bug is really about the devel branch, not the 3.4 branch
* issue 496.5: backport 3.4: minor API cleanup and error reporting enhancements, update iresearch to commit d69f7bd184e009da7bf0a478efd34a0c85b74291
* add workaround for shell-collection-rocksdb-noncluster.js::testSystemSpecial test failure
* fix typo
* issue 496.3: backport 3.4: move more coordinator-related logic out of TRI_vocbase_t, rename some arangosearch view configuration parameters, remove some consolidation policies, update iresearch to revision 6fd9760d81b136f769e277ea5b8f53996ed7a1ca
* address merge issue
* backport: remove code causing nullptr access
* invalidate payload for each field in FieldIterator before setting a value
* address compilation issues
* Improve logging on coordinator when doing `arangorestore`.
* Return more error information in `mergeResults`.
* Longer timeout for communication coordinator -> leader for writes.
This is taking into account possible write stops from followers needed
to get in sync.
* Fix compilation.
* Get rid of numbers in exception log messages.
* Fix compilation.
* Fix indentation.
* Feature/arangosearch speedup removals (#7134)
* speedup document removals and optimize data model
* fix invalid constexpr
* reduce number of heap allocations for removals (#7157)
* open up connect limit to allow one DNS retry
* add CHANGELOG entry for dns retry fix
* update warning message wording
* Add lock line that was missed in recent scheduler merge
* backport: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion
* backport: ensure arangosearch links get exported in the dump
* backport: ensure view is created during restore on the coordinator
* Updates for ArangoSearch DDL tests, IResearchView unregistration and known issues
* Add fix for internal issue 483
* Stop libcurl from trying to POST stdin
* Stop relocking every iteration in wait
* Remove unimplemented function
* Restrict setting of empty POSTFIELDS to POST requests
* Revert locking change
* Implement `syncCollectionCatchup` in DatabaseTailingSyncer.
First stab, might not even compile.
* Fixed a typo.
* Fix a typo and a compilation problem.
* Further compilation fix.
* Implement two stage catchup.
* Two small corrections.
* Unified error messages in Synchronize shard job.
* Improved a code comment.
* Fixed autocasting bool->double and double->bool issue. That is truly one of the best features ever invented... </irony>
* Renamed doHardLock => toSoftLockOnly and inverted default value
* Merged soft/hard lock logic with Transaction splits
* Use scopeguards to cancel readlocks
* Removed incorrect skipping of batches in RocksDB Tailing syncer. This caused issues whenever a transaction was split.
* Added a test for Splitting a large transaction in RocksDB
* Reactivated skipping in RocksDB Wal Tailing (reverts initial fix)
* Actually include lastScannedTick in CollectionFinalize. Proper fix, kudos to @jsteemann.
* Fixed healFollower task in split-large-transaction test
* First attempt to not block the thread that requires the EXCLUSIVE sync-up lock
* Fixed insertion of query into registry in rest replication handler.
* Removed unnecessary / false asserts as suggested in review. Fixed code comments.
* Replaced auto with a correct type as suggested in review
* Added a helper function to validate if a query is in use in the registry
* Fixed logic bug in usage of query registry
* Fixed compile issue
* Implemented optional 'doHardLock' parameter in the replication acquire read-lock call. A hard lock guarantees to stop all writes, a soft lock may not.
* Fixed compile issue
* Automatically transforming int -> bool in an initializer list sucks...
* Inverted boolean logic bug hidden due to int->bool being logically inverted.
* Today it seems that bools are too complicated for my brain.
* Removed failure point, didn't write a test for it, and it is hard to write it in the current test environment. Need to find a better solution in future
* Applied changes requested by @goedderz in review
* Start ClusterComm threads in `ClusterFeature::start`. Stop ClusterComm threads in `ClusterFeature::stop`.
* Do not free objects in `Scheduler::shutdown`. Let the `unique_ptr` do their job. Stop ClusterComm threads in `ClusterFeature::stop`, but free instance in `ClusterFeature::unprepare`.
* `io_context` may contain lambdas that hold `shared_ptr`s to `Tasks` that require a functional `VocBase` in their destructor.
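A compressed sketch of the ownership and ordering described in the three entries above (all names are placeholders, not the real ClusterFeature/Scheduler classes):

```cpp
#include <memory>

struct CommThreads {
  void start() {}
  void beginShutdown() {}
  void join() {}
};

class ClusterFeatureSketch {
  std::unique_ptr<CommThreads> _comm;

 public:
  void start()     { _comm = std::make_unique<CommThreads>(); _comm->start(); }
  // Stop the threads in stop() ...
  void stop()      { if (_comm) { _comm->beginShutdown(); _comm->join(); } }
  // ... but free the instance only in unprepare(), via the unique_ptr.
  void unprepare() { _comm.reset(); }
};
```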
* Clean up.
* issue 485: ensure LogicalDataSource::drop() is called on vocbase drop
* add missed change
* backport: address race between make(...) and async job
* add another missed change
* backport: ensure recursive lock reports itself as locked correctly
* backport: address test failure on mmfiles
* backport: remove redundant lock already held by async task
* backport: reset reader before unlinking directory
* enable the ability to push results processing to threads
* have ClusterComm push libcurl response processing to Scheduler threads
* tuning changes from Matthew and Michael
* give new defaults to minimum thread count
* create multiple ClusterCommThreads, each with own Communicator object
* put PR notes in change log
* correct spelling
* Also drain V8 queue.
* Add prio V8 to switch in canPostDirectly.
* Accept --server.minimal-threads even if maximal threads is not set.
* Reactivate stopping of threads.