1
0
Fork 0
Commit Graph

618 Commits

Author SHA1 Message Date
Jan 616ea94f24
Bug fix/cleanup 31032019 (#8632) 2019-04-01 17:14:11 +02:00
Jan Christoph Uhde c3f7961b88 apply unique log ids (#8561) 2019-03-25 20:26:51 +01:00
Jan 0a1ab07c64
address review concerns (#8544) 2019-03-23 22:10:19 +01:00
Simon 3ada15fc35 The Legendary El Cheapo (#8485) 2019-03-22 11:38:33 +01:00
Simon 49cc3bcd1e Refactorings from cluster trx improvement branch (#8391) 2019-03-14 23:13:17 +01:00
Max Neunhöffer 2a4f606df2
Various agency improvements. (#8380)
* Ignore satellite collections in shrinkCluster in agency.
* Abort RemoveFollower job if not enough in-sync followers or leader failure.
* Break quick wait loop in supervision if leadership is lost.
* In case of resigned leader, set isReady=false in clusterInventory.
* Fix catch tests.
2019-03-12 15:25:16 +01:00
Jan 7596bac39c
improve error messages when restoring from invalid JSON data (#8210) 2019-02-20 18:34:35 +01:00
Jan 1798036ea0
Bug fix/optimizations 18022019 (#8180) 2019-02-19 19:24:04 +01:00
Dan Larkin-York f4c2347fbd Make Result final (#8157) 2019-02-15 20:05:30 +01:00
Vasiliy 8b94be9bf1 issue 504: return Result instead of int from all ClusterInfo functions (#7954) 2019-01-16 18:07:27 +03:00
Lars Maier 12eebb15fe Feature/new server infra (#7733)
* Decoupled IO handling from Scheduler.

* Fixed SSL start up bug.

* Replaced Scheduler with new worker farm implementation.

* Added minimal statistics and info string for Scheduler.

* Added support for timed submissions.

* Updated delayed submission api. Updated code that used timers.

* Extracted new Scheduler into a virtual parent class. The implementation can now depend on the usecase.

* Signal handler now working.

* Changed threads names, `_stop` is atomic, check for failure during thread start + exception handling like old scheduler did.

* Commented on source code and added TODOs.

* Played around with start-stop-conditions

* Play around with start stop condition.

* start stop cond

* Sart Stop Conditions

* Removed bad cv_status check.

* Bug fix: now compare the actual objects instead of pointer values. Setup t1 and t2 depending on the thread id.

* Moved most of the stuff now unrelated to the Scheduler to GeneralServer. Got rid of JobGuard.

* Instead of waiting for a thread to terminate, put it on a clean up list and check for its termination in each supervisor run.

* Allow detaching long running threads.

* Fixed test mock.

* Updated the WorkHandle logic. Removed post functions.

* Fixed crash when obtaining shared_ptr from this in destructor.

* Added lost mutex.

* Fixed memory leak.

* Fixed merge bug.

* Changed a lot of code to optimize the scheduler.

* Fixed bug of invalidated iterator. Dont remove task on shutdown at different places. Let scheduler threads run until queue is empty.

* Only by value calls to queue.

* Added options again.

* Clean up of code.

* UI Request Lane added.

* Bug fixes in Scheduler.

* Applied reformat.

* Use sigaction.
2019-01-08 10:12:02 +01:00
Jan 28528973e3
added arangorestore option `--cleanup-duplicate-attributes` (#7877) 2019-01-04 15:25:45 +01:00
Jan 98ff9621bf
prevent duplicate attributes being generated by AQL queries (#7837) 2019-01-02 12:37:17 +01:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Michael Hackstein 647a29d1ae
Fix a rare deadlock situation in Replication sync phase (#7759) 2018-12-12 22:05:01 +01:00
Michael Hackstein 2d73f04008
Bug fix 3.4/syncing of followers (#7377) (#7535)
* Added some DEBUG output for replication rest handler

* Some more debug logging.

* Increased the priority of the ReplicationHandler. This way we will not get stuck with locks that cannot be canceled. Also cancel the lock on the correct database.

* Added extensive log output for replication thins

* Added tombstones to RestReplicationHandler. In a very unlikely case the cancel of a lock can be executed BEFORE the code that actually registers the lock, in this case we will now write a tombstone and do not lock.

* Revert "Added extensive log output for replication thins"

This reverts commit 6d4e37ea1e59e3b3457336019cc7dbc4c979504d.

* Added extensive log output for replication things, now in ERR level instead of MAINTAINER only

* Now actually use hours for synchronization

* React to errors under soft lock if they show up.

* Added a retry loop to increase the read-lock timer.

* Added more timeing output in RocksDB collection internals to figure out why the followers are dropped

* Tweaked RocksDB options

* Revert "Tweaked RocksDB options"

This reverts commit 2bf9c43280beda4792c47d079387fe5154cdd896.

* Removed debug output

* Applied all requested changes by goedderz

* Deleted unused variable
2018-11-30 14:43:04 +01:00
Jan adc651e338
Bug fix/cxxcheck 28112018 (#7518) 2018-11-29 13:39:04 +01:00
Jan b2924057e7
cleanup (#7507) 2018-11-28 19:42:37 +01:00
Simon d0efd95a37 Bug fix/restore index refactor (#7470) 2018-11-27 20:22:36 +01:00
Michael Hackstein 16d0874da5
Bug fix/synchronous replication catchup (#7146)
* merged fixes from 3.4

* odd fix

* Bug fix 3.4/sync repl release thread (#6784)

* First attempt to not block the thread that requires the EXCLUSIVE sync-up lock

* Fixed insertion of query into registry in rest replication handler.

* Removed unnecessary / false asserts as suggested in review. Fixed code comments.

* Replaced auto with a correct type as suggested in review

* Added a helper function to validate if a query is in use in the registry

* Fixed logic bug in usage of query registry

* Fixed compile issue

* Automaticly transfrom int -> bool in initializerlist sucks...

* Inverted boolen logic bug hidden due to int->bool beeing logically inverted.

* Today it seems that bools are too complicated for my brain.

* Removed failure point, didn't write a test for it, and it is hard to write it in the current test environment. Need to find a better solution in future

* Applied chenges required by @goedderz in review

* Bug fix 3.4/shorter foot in door (#7084)

* Implement `syncCollectionCatchup` in DatabaseTailingSyncer.

First stab, might not even compile.

* Fixed a typo.

* Fix a typo and a compilation problem.

* Further compilation fix.

* Implement two stage catchup.

* Two small corrections.

* Unified error messages in Synchronize shard job.

* Improved a code comment.

* Fixed autocasting bool->double and double->bool issue. That is truely one of the best features ever invented... </irony>

* Renamed doHardLock => toSoftLockOnly and inverted default value

* Merged soft/hard foot logic with Transaction splits

* Use scopeguards to cancel readlocks

* Bug fix 3.4/sync replication allow soft and hard lock (#6864)

* First attempt to not block the thread that requires the EXCLUSIVE sync-up lock

* Fixed insertion of query into registry in rest replication handler.

* Removed unnecessary / false asserts as suggested in review. Fixed code comments.

* Replaced auto with a correct type as suggested in review

* Added a helper function to validate if a query is in use in the registry

* Fixed logic bug in usage of query registry

* Fixed compile issue

* Implemented optional 'doHardLock' parameter in the replication acquire read-lock call. A hard-lock guarntees to stop all writes, a soft-lock may not.

* Fixed compile issue

* Automaticly transfrom int -> bool in initializerlist sucks...

* Inverted boolen logic bug hidden due to int->bool beeing logically inverted.

* Today it seems that bools are too complicated for my brain.

* Removed failure point, didn't write a test for it, and it is hard to write it in the current test environment. Need to find a better solution in future

* Applied chenges required by @goedderz in review

* Renamed doHardLock => toSoftLockOnly and inverted default value
2018-11-23 16:16:34 +01:00
Wilfried Goesgens 05a7d4e96e add alternative to ClusterInfo::getCollection() that doesn't throw (#7339) 2018-11-20 16:05:57 +01:00
Vasiliy 68953ae33a issue 496.4.1: move StorageEngine-specific flag out of the genric API and closer to the storage engine (#7212) 2018-11-04 16:52:28 +03:00
Jan 1973022d00
Bug fix/refactor find emplace (#7197) 2018-11-02 17:18:47 +01:00
Vasiliy f088733420 issue 496.3: move more coordinator-related logic out of TRI_vocbase_t, rename some arangosearch view configuration parameters, remove some consolidation policies, update iresearch to revision 6fd9760d81b136f769e277ea5b8f53996ed7a1ca (#7166)
* issue 496.3: move more coordinator-related logic out of TRI_vocbase_t, rename some arangosearch view configuration parameters, remove some consolidation policies, update iresearch to revision 6fd9760d81b136f769e277ea5b8f53996ed7a1ca

* address potential deadlock between link creation and FlushThread

* remove code causing nullptr access

* add back lock around reader reopen

* revert: address potential deadlock between link creation and FlushThread

* invalidate payload for each field in FieldIterator before setting a value
2018-11-01 23:10:01 +03:00
Max Neunhöffer 37359821cb
Fix arangorestore by adjusting timeouts in write ops. (#7083)
* Improve logging on coordinator when doing `arangorestore`.

* Return more error information in `mergeResults`.

* Longer timeout for communication coordinator -> leader for writes.

This is taking into account possible write stops from followers needed
to get in sync.

* Fix compilation.

* Get rid of numbers in exception log messages.

* Fix a typo.

* Fix compilation.
2018-10-31 14:39:58 +01:00
Vasiliy 21a89c9270 issue 496.2: allow customization of segmentsize lmts on arangosearch view creation, minor code cleanup (#7140)
* issue 496.2: allow customization of segmentsize lmts on arangosearch view creation, minor code clanup

* fix incorrect condition
2018-10-30 23:41:30 +03:00
Vasiliy 8f44afb6cf issue 496.1: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion (#7101)
* issue 496.1: switch scope of responsibility between a TRI_vocbase_t and a LogicalView in respect to view creation/deletion

* backport: address test failures

* backport: ensure arangosearch links get exported in the dump

* backport: ensure view is created during restore on the coordinator

* Updates for ArangoSearch DDL tests, IResearchView unregistration and known issues

* Add fix for internal issue 483
2018-10-30 12:50:35 +03:00
Simon c72818a9dc Make ensureIndexOnCoordinator more robust (#7110) 2018-10-29 17:45:46 +01:00
Simon 10dc287eb3 Silence Tsan warnings (#7075) 2018-10-25 15:50:39 +02:00
Jan 46efcff7d7
micro improvements (#6674) 2018-10-05 10:25:13 +02:00
Dan Larkin-York b922d260bc More cleanup and additional logging. 2018-10-02 07:50:26 -04:00
Jan 44448321e1
Bug fix/optimizations 210918 (#6573) 2018-09-24 20:16:16 +02:00
Simon 0fa7f01c66 Resilience test failure points (#6539) 2018-09-20 01:05:10 +02:00
Simon aa21ffdb7a Properly check syncer erros, catch more exceptions (#6520) 2018-09-17 16:39:23 +02:00
Dan Larkin-York 0dfabd8f04 Fix several TSan warnings (#6473) 2018-09-14 11:16:45 +02:00
Simon 22b9c31c13 Removing ClusterComm ClientTransactionID (#6294) 2018-09-12 22:15:16 +02:00
Lars Maier e2930ccba6 Completely export views dump restore (#6466) 2018-09-12 17:21:22 +02:00
Vasiliy 5329f34771 issue 465.2.2: remove redudnant heap allocations and simplify API (#6349)
* issue 465.2.2: remove redudnant heap allocations and simplify API

* address merge issue

* address more merge issues

* address more merge issues

* address review comments

* do not deallocate non-allocated instances
2018-09-05 13:37:37 +03:00
Jan 4baca1f51d
use less VPackBuilders when restoring data into a collection (#6365) 2018-09-04 16:55:14 +02:00
Jan 5022ccc24d
Bug fix/fixes 2508 (#6254) 2018-08-27 21:36:39 +02:00
Jan 791eaba873 Bug fix/planning 2865 (#6252)
* potential bugfix for planning/#2865

* speed up dump tests setup

* enable authentication for backup tests

* make arangodump provide a "serverId" to the server

this allows the server to track arangodump as an active dump client
so any data required for the dumping may be retained while the
dump is ongoing

* don't log binary stuff into the logfile
2018-08-27 10:21:34 +02:00
Simon 568a09f177 Disable JS on DBServer, fix race in UserManager (#6244) 2018-08-24 22:20:49 +02:00
Jan 5f0403ed1c
Bug fix/add aql collection count cache (#6227) 2018-08-23 16:05:51 +02:00
Simon 229c09d434 Allow dirty-reads from passive (#6136) 2018-08-20 16:26:14 +02:00
Jan a5ef080a8a
attempt to make replication_sync more reliable in tests for MMFiles (#6184) 2018-08-17 14:20:40 +02:00
Simon 2d29232c59 fix restoring iresearch view in cluster (#6159) 2018-08-17 09:28:20 +02:00
Simon 392118bd62 Use RangeDelete where possible (#6121) 2018-08-15 18:52:09 +02:00
Simon 42cabb858a Fix dumping of views in the cluster (#6024) 2018-08-02 17:25:49 +02:00
Vasiliy 62ca1235ac issue 430.3: remove redundant constructor from SingleCollectionTransaction (#5996) 2018-07-26 16:54:53 +03:00
Jan 21064144c8
Bug fix/replication improvements (#5962) 2018-07-25 09:04:50 +02:00