1
0
Fork 0
Commit Graph

444 Commits

Author SHA1 Message Date
Jan d78f4601ba
don't make the syncer fail when it is restarted (#4030) 2017-12-14 10:23:35 +01:00
Jan 9c5893e7a7
fix premature unlock (#3802) (#4027)
* fix some deadlocks found by evil lock manager (tm)

* fix duplicate lock

* fix indentation

* ensure proper lock dependencies

* fix lock acquisition

* removed useless comment

* do not lock twice

* create either a V8 transaction context or a standalone transaction context, depending on if we are called from within V8 or not

* AQL micro optimizations

* use explicit constructor

* only use V8DealerFeature's ConditionLocker for acquiring a free V8 context

entering and exiting the selected context is then done later on without having to hold the ConditionLocker

* remove some recursive locks

* Disable custom deadlock detection when Thread Sanitizer is enabled

* Changing ifdef's

* grr

* broke gcc

* Using atomic for ApplicationServer::_server

* fix premature unlock

* add some asserts

* honor collection locking in cluster

* yet one more lock fix

* removed assertion

* some more bugfixes

* Fixing assert

(cherry picked from commit 1155df173bfb67303077fbe04ee8d909517bfd21)
2017-12-13 18:46:14 +01:00
Jan e7f93d8b9d
potentially fix send request timeout (#4025)
* potentially fix send request timeout

* fix globallyUniqueIds
2017-12-13 18:45:31 +01:00
Simon Grätzer 3287d8706a Use uuid in initalsync to ensure proper collection mapping (#3972) 2017-12-13 14:29:21 +01:00
Jan d080350afa
make replication abortable (#4015) 2017-12-12 23:15:22 +01:00
Jan 3caa4c47bb
add a test for replication applier restarts (#3750) (#3933) 2017-12-08 15:48:15 +01:00
Jan 08f9326eed make sure we do not dereference a nullptr (#3906) 2017-12-07 10:36:13 +01:00
Jan 4397ebf3c7 improve compatibility when replicating from a 3.2 (#3915) 2017-12-07 10:30:27 +01:00
Jan 4104296fd3 fix revision id vs local document id usage in incremental sync (#3916) 2017-12-07 09:24:20 +01:00
Simon Grätzer 0e485f7441 Fixing collection name collection handling in Syncer (#3710) 2017-11-17 16:36:57 +01:00
Jan b4f6ee9273 Feature/improved index api for unique constraints and replication (#3715) 2017-11-16 21:02:01 +01:00
Jan fce593b724 removed `--recycle-ids` option for arangorestore (#3713) 2017-11-16 14:25:54 +01:00
Jan 3b0a8a9cdf make the replication applier auto-start for the RocksDB engine if con… (#3647) 2017-11-15 12:02:37 +01:00
Simon Grätzer c6fe726901 Removing unsafe asserts in wal tailing v3.3 (#3672) 2017-11-13 17:41:57 +01:00
Jan 8a467e74db try to not crash when encountering a collection that is not present (#3660) 2017-11-11 19:30:37 +01:00
Simon Grätzer 87f441753b RocksDB WAL tailing fixes (#3595) 2017-11-10 09:31:53 +01:00
Jan 720e6df82e Bug fix/fixes 1910 (#3471)
* properly initialize all properties

* use faster comparison

* properly detect and handle "method not allowed"

* code-style

* remove unused variable

* narrow variable scope

* handle non-existance of AuthenticationFeature

* remove dead code

* replace some C string handling with std::strings

* moved assertion to the correct place

* honor number of array members for IN operator

* slightly adjust error messages

* slighty adjust some error messages

* try to fix issue with lingering replication contexts on shutdown

* clean up heartbeat thread a little bit

* small fixes
2017-10-23 09:17:36 +02:00
Simon Grätzer fd3f9d99d9 Fixing webinterface access (#3464)
* intermediate commit

* Refactoring the ExecContext

* Fixing authentication

* Added start script

* some fixes

* fixed access to nullptr

* some c++

* fixed misleading message

* Made DatabaseGuard movable. Also adapted map insertions to _vocbase in Syncer classes, which failed to compile under older GCC versions

* added support for global flag to replication handler

* Started Refactoring in replication-static

* Fixing syncer code

* store applier configuration

* Static replication tests now test replication in a non system Database

* added flags to replication feature

* Adding some extra checks

* Fixing issue with rocksdb rest replication handler

* replication static now runs _system and otherdatabase replication tests.

* Fixing crash on startup

* Replication_sync now tests _system as well as other Database

* Fixing up heartbeat thread, adding global flag to rest handler

* Fixing wrong assert

* some cleanup, probably some tests are broken

* Made non-system db version of replication-ongoing tests

* fix determine-open-transaction

* Fixed ongoing tests. And added a test where we drop a database on slave while replication is still ongoing

* test fixes

* Activated ongoing other db tests. Also added a test that drops the DB on master, while the slave is still syncing.

* some better error reporting

* gradually switch to Result

* createCollection -> create

* re-activate using of collection ids for now

* enable auto-start

* Fixed create collection in replication ongoing test

* Added first draft of a test for global replication

* move to Result

* use system database for global applier

* improved error reporting

* fixed invalid URLs

* add test case filter

* load existing global applier configuration

* improve error reporting

* Added further tests for global replication

* Fixed global replication test, it now properly waits for replication. Timeouts after 10 seconds.

* Removed erronious assertion

* improve error reporting

* intermediate commit

* Added a test-case for global replication where the Master already has some data and the slave is clean

* fix deletion of replication contexts

* Fixed JSLint

* compiling code

* fix typo

* do not fail for global applier when no database is configured

* intermediate commit

* syncer supports switch for 3.3 / 3.2

* fixed errors

* Fixing some replication bugs

* Fixing some assertions

* Fixed missing commit markers

* Fixing assertion on database drop

* Attempt to fix deadlock in applier and assertion

* Fixing some stupid things

* Support for collection parameter

* Acidentally turned off some tests

* Grrr

* Fixing wrong method call

* Fixed startscript

* Fixed assignmet instead of equality check typo

* Added a test far interrupted replication. For now it justs tests basics on _system database.

* Improved index tests on replication.

* properly initialize variable

* fixed some replication problems

* MMFiles wal access support

* fix replication issues

* Started mmfiles replication support

* fixing a bug

* Fixing an issue

* fixing some mmfiles stuff

* fix test

* reload users

* prevent pure virtual method call

* intermediate commit

* Making from exclusive

* do not call getMasterState if child syncer

* some reformatting

* Adding global support for handleCommandSync

* Fixing assertion

* removing some debug logs

* Changing return codes

* Fixing some issues in the rest handler

* Make replication less susceptible to errors

* remove some debug output

* return last log tick

* remove waits from tests

* fix two tests

* changing header for open-transactions call

* some fixes

* fix test

* invalidate cached databases

* merging request and execcontext

* try to fix assertion error

* renamed method

* fix compile warning

* small changes

* Always use execcontext

* Fixing an assert

* fix replication issues

* try to fix collection lookups

* try to fix master/slave start

* Changing comments in heartbeat thread

* fix wrong signature of READ_LOCKER_EVENTUAL

* log server role in testing mode

* Fixed authentication, removed execContext in favor of request context

* Adding cluster rest api

* Fixing cluster rest handler

* Fixing cluster callback

* Some refactoring

* Queue creation is not a single operation

* Allowed for leader redirects

* Setting start of batch

* Disabling 2.8 compat tests

* fix start/stop bugs

* jslint

* various little changes

* add flag for exposing jwt

* indentation

* cleanup

* Some changed to guid

* fixing tcp to http, vst

* changed endpoint header

* small fixes

* Reorder servers by health status

* Higher timeout

* Changing error messages

* update the fromTick when fetching multiple batches from the coordinator

* more debug info

* Reducing copy pasted code

* change uid generation

* reducing logspam

* more exceptions for redirects

* more exceptions

* attempt to fix uniqids in cluster

* centralize printing of HTTP errors in replication

* debug output

* fix messages for authentication

* cleanup

* removing --cluster.my-id, --cluster.my-local-info

* Added leadership race to bootstrap, determine foxxmaster on boostrap, removing obsolete code

* improve error reporting in RestAqlHandler

* Changing heartbeat thread, fixing cluster_sync

* some more debug output

* added master

* attempt to make tests more deterministic

* added logging about indexes

* added some safety checks to the logger

* slighty better error messages

* fix location header for SSL

* fix error message

* try to make tests more deterministic

* change error code from TRI_ERROR_INTERNAL (which we want to avoid) to TRI_ERROR_FAILED

* Fixing broken webinterface access

* reverting groovy change

* Fixing read-only internal users

* Using superuser rights for dashboard now

* Adding mode field to _admin/server/role

* added mode TRYAGAIN

* remove inventory lock (does not seem necessary here)

* remove invalid assertion

* fixing agency bugs

* Removing debug output

* return proper errors in case of "method not allowed"

* Fixed up some info messages

* jslint
2017-10-20 18:06:59 +02:00
Jan 7840d3f824 Bug fix/fixes 1810 (#3460)
* improve error reporting in RestAqlHandler

* added logging about indexes

* added some safety checks to the logger

* slighty better error messages

* fix location header for SSL

* fix error message

* try to make tests more deterministic

* change error code from TRI_ERROR_INTERNAL (which we want to avoid) to TRI_ERROR_FAILED
2017-10-19 11:28:01 +02:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Jan 0561bf45ce Bug fix/isrestore (#3283)
* Make isRestore work in the cluster.

This covers sharded collections with default sharding and non-default
sharding.

* always use locally generate revision ids for storing and looking up documents
2017-10-03 11:53:49 +02:00
Jan 5165155ed1 Bug fix/fixes 0609 (#3227)
* do not use V8 variant of AQL functions in early optimization stage when a C++ variant is available

* additionally, simplify AQL function definitions and aliases

* warn when more than 90% of max mappings are in use

* added C++ variant of replication catchup

* added `--log.role` option

* updated CHANGELOG

* removed non-existing scheduler.threads option from config

* removed useless __FILE__, __LINE__ invocations

* updated CHANGELOG

* allow a priority V8 context

* remove TRI_CORE_MEM_ZONE

* try to fix Windows errors & warnings

* cleanup

* removed memory zones altogether

* exclude system collections from collection tests
2017-09-13 16:28:21 +02:00
Simon Grätzer ffc465433a No access collections Improvements (#3190)
* consolidated EdgeDocumentToken

* optimizing cluster traversal

* adding skip collection checks

* API cleanup

* copying AQLValue to avoid use-after-free bugs

* Fixing rocksdb SingleServerEdgeCursor

* Fixing a collection resolving issue
2017-09-07 14:55:07 +02:00
m0ppers 9a0bc716d0 Do not allow replication to create/drop collections (#2898)
In the cluster case the only one who is allowed to do this is the schmutz
2017-07-28 14:24:02 +02:00
m0ppers 589ffd5c59 Feature/improve logging (#2881)
* Improve logging in various places

* Fix jslint
2017-07-28 09:23:18 +02:00
Max Neunhöffer 2f874249bb Bug fix/adjust agency comm timeouts (#2765)
* Take out 503 timeouts altogether.
* Overhaul of AgencyComm::sendWithFailover loop.
* Let performRequests optionally ignore 404 coll not found.
* Fix error message "database not found" when AgencyComm failed.
* Add log entries in Agency if locks are acquired too slowly.
* Reexecute the javascript cluster sync stuff even if there was no plan/current change...So failed sync jobs can retry later...
* Cover callbacks in Communicator by lock. This fixes https://github.com/arangodb/planning/issues/370
* Put in delay in waiting for leader in agency test.
* Schmutz logging to heartbeat topic.
* Add more lock time diagnostic in agent.
* Switch on agencycomm tracing in coordinator.
2017-07-13 00:44:28 +02:00
Frank Celler 2807ef559c Feature/move shard fix (#2626)
Major overhaul of handling of synchronous replication.
2017-06-26 16:55:01 +02:00
Jan ebaeb639b2 Bug fix/incremental replication syncs too much (#2631)
* fix too much synchronization
* fix incremental sync for RocksDB engine
2017-06-21 14:56:33 +02:00
Max Neunhoeffer a97ec1cc29 Try to handle another shutdown hanger. 2017-06-09 00:42:31 +02:00
Max Neunhoeffer 99d6624fd4 Detect a tick zero response in logger-state in replication. 2017-06-08 22:45:24 +02:00
jsteemann 40cf731cd7 fixed typos in error messages 2017-06-08 15:54:57 +02:00
Simon Grätzer 2f2d07ab9a Multihreaded import 2017-05-24 18:37:45 +02:00
jsteemann 0366fba276 try to fix thread joining for replication applier 2017-05-18 21:05:57 +02:00
jsteemann 226920d7fa optimizations 2017-05-17 23:41:16 +02:00
jsteemann 6adebcb893 fix access mode in ReplicationTransaction 2017-05-17 11:53:34 +02:00
jsteemann c5a195d62c attempt to fix races in replication 2017-05-10 14:32:16 +02:00
jsteemann 62aa9a1b0a use exclusive transaction in syncer 2017-05-10 14:29:33 +02:00
Dan Larkin d77efe38dc Added explicit WAL file lifecycle management to reduce space overhead from replication. 2017-05-04 15:16:24 -04:00
jsteemann f2be898664 fix replication tests 2017-05-04 17:46:43 +02:00
Jan Christoph Uhde 3cb1cc7a52 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel
* 'devel' of https://github.com/arangodb/arangodb:
  Fixing index markers
  exclusive locks for indexes
  better incremental sync
  grunt build
  Avoid log spam.
  Removed code paths that wrote objectIds into the Agency. This did break replication.
  WAL: honor tick end value
  WAL fiter after collection
  Reactivated client-side filtering of unnecessary markers
  Added an Assert when persisting an index it's objectId is not allowed to be 0
  The RestReplicationHandler now inserts new ObjectIds when replicating collections
  moved files
  add dependencies for TransactionManager
  fix include
  move engine-specific test into engine test
2017-05-02 16:41:43 +02:00
Jan Christoph Uhde da02fd36c6 move engine specific syncer code into engines 2017-05-02 16:36:27 +02:00
jsteemann 45096dd503 Merge branch 'devel' of https://github.com/arangodb/arangodb into devel 2017-05-02 16:09:35 +02:00
Simon Grätzer e335a29326 better incremental sync 2017-05-02 15:08:05 +02:00
jsteemann 910f5680b0 moved files 2017-05-02 12:34:28 +02:00
Jan Christoph Uhde a9d2b4dc7e redo: remove unnecessary code 2017-04-28 14:59:13 +02:00
Jan Christoph Uhde cb868d36a9 remove unnecessary code 2017-04-28 14:44:04 +02:00
jsteemann 3b48b6bd51 update safeResumeTick as well 2017-04-27 14:47:22 +02:00
Simon Grätzer 14a3168e40 Working initial sync 2017-04-26 17:01:35 +02:00
Simon Grätzer bdd002d6ec Fixing incremental sync deletion 2017-04-26 14:11:51 +02:00
Simon Grätzer 02e41de608 Working on replication 2017-04-26 13:32:12 +02:00