
16952 Commits

Author SHA1 Message Date
Kaveh Vahedipour a8ad22f206 assertion failures in store on restarting following agents (#8562) 2019-03-26 10:38:18 +01:00
Jan Christoph Uhde abeca4568c Feature 3.4/better aql error message (#8559) 2019-03-25 18:29:14 +01:00
Kaveh Vahedipour 799652d6d5 [3.4] precondition plan / version in compaction / store TTL removal (#8526)
* precondition plan / version in compaction / store TTL removal independent of local _ttl set
* Agency init loops break when shutting down.
2019-03-25 08:49:49 +01:00
Dan Larkin-York ea6a854b45 Add RocksDB background error listener. (#8537) 2019-03-23 20:46:07 +01:00
Max Neunhöffer 31b63a007b Avoid duplicate keys in agency callbacks. (#8531) 2019-03-22 18:30:03 +01:00
Jan Christoph Uhde a0394c4d3b better AQL error messages (#8505) 2019-03-22 17:50:14 +01:00
Simon 89a7591d00 Bug fix 3.4/rocksdb truncate asan (#8525) 2019-03-22 17:15:03 +01:00
Jan abd09e2e2c
fix /_api/cluster/endpoints URL for active failover (#8518) 2019-03-22 16:47:32 +01:00
Simon 776c0c0f22 Fix edge index selectivity (#8499) 2019-03-22 05:18:34 +01:00
jsteemann ccf2a1e59d fix two minor issues reported by cppcheck 2019-03-21 19:42:40 +01:00
Jan e8f4446c24
prevent assertion failures when creating smart graphs with invalid options (#8500) 2019-03-21 16:25:02 +01:00
Andrey Abramov 73cf9eb1f3
[3.4] bug-fix/issue-#8294 (#8430)
* fix invalid optimizations for multi-valued attributes

* fix broken optimizations for multivalued attributes

* adjust tests

* add `IN_RANGE` function

* add tests

* add some shallow integration tests

* fix optimization for IN operator
2019-03-21 18:23:12 +03:00
Max Neunhöffer 1365eebfac
Make AddFollower and RemoveFollower less aggressive. (#8477)
* Make AddFollower and RemoveFollower less aggressive.
* Adjust comment
* Early exit in count loop.
* Adjust comment in 2nd place.
* CHANGELOG.
2019-03-21 15:27:22 +01:00
Kaveh Vahedipour 8b851c35e3 [3.4] fix multimap to vpack agency store (#8481)
* fix store dump invalid json bug
* hitting ttl should also remove corresponding node
* fix ttl from disk
2019-03-21 15:21:40 +01:00
Kaveh Vahedipour 6529b38240 Feature 3.4/coordinator do full agency dumps (#8131)
* adding agent functionality
* agency route working
* coordinator route to acquiring agency dumps
* remove logging remains and add superuser requirements to API
* Update State.cpp
* guard route
* change we can believe in
2019-03-21 14:58:38 +01:00
Jan 3f3f0c6fe3
Feature 3.4/ncc1701 (#8441) 2019-03-21 14:53:28 +01:00
Jan dbfa483374
add RocksDB options `--rocksdb.allow-fallocate` and `--rocksdb.limit-open-files-at-startup` (#8492) 2019-03-21 12:05:57 +01:00
Jan db7fcdce7a
don't run compact() on a collection after a truncate() was done in the same transaction (#8471)
running compact() in the same transaction will only increase the data size on disk due to
  RocksDB not being able to remove any documents physically due to the snapshot that is
  taken at transaction start.

  This change also exposes db.<collection>.compact() in the arangosh, in order to manually
  run a compaction on the data range of a collection should it be needed for maintenance.
2019-03-20 17:44:06 +01:00
Kaveh Vahedipour ab3206486d [3.4] job must not copy snapshots (#8406)
* job must not copy snapshots
* Node correct empty children
* checked all hasAsChildren sites
* No copy in operator() for node.
* Don't spam log.
* const operator too
* full path to missing key in agency
* the key is missing
* Another info level to DEBUG from INFO.
* Increase timeouts of MoveShard and CleanOutServer agency jobs.
* CHANGELOG.
2019-03-20 17:03:19 +01:00
Jan 1c54830310
don't attempt to remove non-existing WAL files, because such attempt will trigger unnecessary error log messages in the RocksDB library (#8476) 2019-03-20 17:02:34 +01:00
Kaveh Vahedipour fbb9c3d75d [3.4] leader check needs to sit inside waitfor loop (#8444)
* leader check needs to sit inside waitfor loop
* Do not wait in Supervision for commits of new writes.
* CHANGELOG.
2019-03-20 13:16:00 +01:00
Kaveh Vahedipour 432bc2d26e [3.4] address compile warnings (#8343)
* warnings building
* from 3.4 a merge
2019-03-20 13:07:31 +01:00
Jan f621fe6076
revert a previous change that caused existing system collections on a slave to be truncated instead of being deleted (#8443)
truncating instead of deleting introduced the possibility of the collection's indexes continuing to exist with different ids on the slave than on the master, leading to potential follow-up problems
2019-03-18 20:56:50 +01:00
Wilfried Goesgens 68ce741b13 Bug fix 3.4/arrayindex unique update (#8422) 2019-03-14 18:13:05 +01:00
Jan 02d170e4c2
make geo-index-optimizer rule work with multiple loops (#8353) (#8360) 2019-03-13 13:22:12 +01:00
Kaveh Vahedipour bd8cf1e9ea supervision must check for leadership, when waiting for progress (#8401) 2019-03-13 13:01:14 +01:00
jsteemann 27e23f9236 make sure some tick invariants always hold true 2019-03-12 19:29:52 +01:00
Jan 30ddb98659
try an incremental sync when restarting a follower in active failover mode (#8364) 2019-03-12 15:28:00 +01:00
Max Neunhöffer 54c533c5f6
Some more agency fixes. (#8377)
* Ignore satellite collections in shrinkCluster in agency.
* Abort RemoveFollower job if not enough in-sync followers or leader failure.
* Break quick wait loop in supervision if leadership is lost.
* In case of resigned leader, set isReady=false in clusterInventory.
* Fix catch tests.
2019-03-12 14:09:56 +01:00
Lars Maier dbfcbdfe32 [3.4] Agency Failed leader crash (#8356)
* Check if transaction failed before accessing the result.
* FailedFollower had the same bug.
2019-03-12 11:18:07 +01:00
Jan 13148e661a simplify internal struct for edge index lookups (#8367) 2019-03-11 16:23:14 +01:00
Jan cd5c9edce1
various replication improvements (#8300) 2019-03-11 13:07:43 +01:00
Kaveh Vahedipour 3240185bf6 Bug fix 3.4/agency compaction overwrite (#8344)
* agents need to be able to update a compacted state, when wrong entry exists

* Do not compact a potentially wrong log.
2019-03-08 10:34:53 +01:00
Michael Hackstein fb956ccc67 PRUNE in AQL Traversal (#8286) 2019-03-07 21:13:26 +01:00
Jan 7e1a74c6c0
fix an assertion failure when plans are shut down in an invalid state (#8333) 2019-03-07 13:15:54 +01:00
Jan 6941eb941c
added option `--console.history` to arangosh (#8328) 2019-03-07 13:14:51 +01:00
Kaveh Vahedipour 1be2404297 [3.4] failed follower needs reporting aborts too (#8323)
* forgotten abort
* fix waitFor trap in supervision
2019-03-05 23:30:08 +01:00
Max Neunhöffer 46e479376d
Further supervision fixes. (#8259)
* Do not schedule Coordinators in Plan.

* Finish failed server when server is no longer in health.

* Fix removeServer checks.

Check that server is no longer in use before removing it. Give 60s
waiting time for condition to be met. Also observe agency lock.

* Finish FailedFollower job if server no longer follower.

This can happen because RemoveFollower was faster.

* Only use GOOD servers as replacement followers.

* Fix AddFollower for satellite collections.

* Fix RemoveServer for satellite collections.

* MoveShard handles moves from leader to followers

* Prepare CleanoutServer and FailedServer for satellite collections.

* More sorting out of AddFollower and RemoveFollower.

* Fix RemoveFollower job w.r.t. choice of follower to remove.

* Fix message.

* kill your own sub jobs, please

* Added preconditions to payloads for supervision's job finishers

* Improve logging.

* Add agency diagnostics to failed move shard test, start.

* Add coordinator agency diagnostics.

* Remove warning.

* Add changelog entry.

* Add agency diagnostics if things go sour with move shard.

* Add agency diags when things go wrong 2.

* API /_api/agency/state: back to old format.

* Fix Windows compilation.

* handle aborts in supervision and wait for the last Raft log to be committed

* tests compiling, 2 failing for valid reasons

* Correctly report TRI_ERROR_CLUSTER_CONNECTION_LOST as 503.

* FailedLeader /FailedFollower cannot continue, when aborting blocks
2019-03-04 11:43:35 +01:00
Vasiliy e536f2ab1f issue 525.1: backport 3.4: ensure RocksDB CreateIndex/DropIndex WAL markers are properly written during recovery (#8279)
* issue 525.1: backport 3.4: ensure RocksDB CreateIndex/DropIndex WAL markers are properly written during recovery

* backport: skip writing DropIndex marker in recovery
2019-03-01 15:57:42 +03:00
Simon 9e7eb470b4 Fix Pregel nullptr checks (#8276) 2019-03-01 13:34:05 +01:00
Jan ca53f5b503
abort ongoing transactions in all cases (#8290) 2019-02-28 14:41:22 +01:00
Jan 8e3fb5dfc7
Feature 3.4/improve replication speed (#8268) 2019-02-28 14:37:40 +01:00
Michael Hackstein 3e02a726ee
Bug fix/clustercomm queue cleanup (#8191) (#8277)
Cleanup of unused queues in ClusterComm
2019-02-28 09:11:25 +01:00
Andrey Abramov 5815a7a2a8
Fix bug #8213 (#8262)
* properly process ArangoSearch view nodes located inside subqueries

* extend tests to cover failing case

* add integration tests
2019-02-26 18:50:46 +03:00
Simon a52e6fa3d3 Sync Foxx Queues (#8254) 2019-02-25 17:16:26 +01:00
Max Neunhöffer b87f362f27
The big supervision fix. (#8243)
* Updated CleanoutServerTests. Exclude servers in ToBeCleanedServers. Allow bad servers as new follower.
* Prefer good servers.
* Removed copy, sort and binary_search for a list of ~10 elements.
* Fix move shard bug with compare.
* MoveShard fixes, expansion of doForAllShards
* Count only GOOD servers in actualReplicationFactor.
* Make RemoveFollower remove broken servers.
* Precondition on Plan Version for updating Current as leader.
* CleanupServer to evict server from ToBeCleaned, when aborting
* cleanoutserver with payload in finish
* Use static string for ToBeCleanedOut.
* Fixed typo in log message.
* Change warning level. If a MoveShard job is aborted and we can no longer roll back, then we issue a WARNING rather than a DEBUG log message.
* Another typo and log level.
* Start to fix unit tests.
* Does not make sense for AddFollowerTest to have a FAILED leader.
* Only count GOOD followers in AddFollower.
* Fix AddFollowerTest.
* Report precondition failed in MoveShard follower case.
* Add CHANGELOG.
2019-02-25 08:12:18 -05:00
Jan 30c61a5c82
improve error messages when restoring from invalid JSON data (#8211) 2019-02-20 18:34:43 +01:00
Heiko 1ae99f08be do not allow edge definitions with empty from or to arrays, added als… (#8200) 2019-02-20 18:33:06 +01:00
Jan 68b112e45a
fixed issue #8165 (#8195) 2019-02-19 18:08:39 +01:00
Tobias Gödderz a83d42d0e9 [3.4] Forbid ambiguous casts to and from ResultT (#7368)
* Forbid ambiguous casts to and from ResultT

* Reformat

* Changed enabled_if checks to check for implicit casts to Result

* Added comments
2019-02-19 12:50:39 +01:00
Jan 304948f30d
when creating a new database with an initial user, set the database p… (#8185) 2019-02-18 16:16:23 +01:00
Jan fbc1e5b35f
use PutUntracked (#8126) 2019-02-13 16:04:27 +01:00
Simon 51734e363f Properly translate cluster comm errors (#8152)
(cherry picked from commit 9151921b4524f2775ae5c69cae3f49fa4f9d703b)
2019-02-12 18:07:43 +01:00
Jan e640e3f52a
issue #8137: NULL input field generates U_ILLEGAL_ARGUMENT_ERROR (#8139) 2019-02-12 17:56:38 +01:00
Jan 078e1d9b9d
fixed issue #8108 (#8111) 2019-02-05 15:50:47 +01:00
Kaveh Vahedipour e8d39666fd fixing failedserver/leader/follower chain for mishap (#8089)
* fixing failedserver/leader/follower chain for mishap
* change log mention
2019-02-05 13:55:19 +01:00
Jan 4ee7ff1932
yet some more replication tests adjustments (#8101) 2019-02-04 16:53:15 +01:00
Tobias Gödderz db70847ab1 [3.4] Sorted COLLECT: avoid nullptr deref when skipping and fix non-invalidated input variables (#8037)
* Added 2 regression tests

* Fixed test expected data

* Fix nullptr dereference

* Fix handling of non-invalidated input variables

* Try a less implicit fix

* Updated CHANGELOG
2019-02-04 16:38:26 +01:00
Jan 8a16a4b3ae update velocypack (#8075) 2019-01-31 17:31:54 +01:00
Jan 05bcb3d7d1
Bug fix 3.4/streaming cursors v8 issues (#8080) 2019-01-31 17:25:28 +01:00
Jan 675bb78552
more debugging for replication (#8062) 2019-01-30 21:23:00 +01:00
Jan 4a1f25ed46
use JobGuard when querying users from DB in cluster (#8057)
* use JobGuard when querying users from DB in cluster

* fix test crashes
2019-01-30 12:00:50 +01:00
Frank Celler 84802fdc0f Feature/maskings (#8006) 2019-01-28 15:04:23 +01:00
Simon 1c77bc37d6 Reduce timeout for write-lock (#8036) 2019-01-28 09:23:52 +01:00
Simran 79c3a34d6b Doc - Deprecate --server.jwt-secret startup option (3.4) (#7961) 2019-01-28 09:12:43 +01:00
Jan 75ab4ac3dc
Bug fix 3.4/misc issues (#8012)
* added missing return statements

* only spend up to 10 seconds for initially fetching the list of collections in arangosh

fetching the list of collections is a blocking operation, and the default timeout for this is very high.
If the server is blocked by whatever reason, then the shell is unusable until the collections list request returns.
To avoid this, the initial request is limited to 10 seconds, so the shell can be used afterwards.

* if an index cannot be used for sorting, its sort

cost was previously returned as 0. this will in fact favor
indexes that can be used for filtering but not for sorting
over indexes that can be used for both.

this change is to report the sort cost for indexes that
cannot be used for sorting to n * log(n), where n is the
number of documents that optimizer expects to come out of the
index after filtering
2019-01-28 08:55:06 +01:00
Jan 15852cb491
Bug fix 3.4/address jenkins fails (#7985) 2019-01-22 12:32:17 +01:00
Jan d394a6b988
fix scrambling of AstNodes that were in use multiple times (#7978) 2019-01-19 18:54:08 +01:00
Jan 3c828347dc
do not simplify non-deterministic conditions (#7927) 2019-01-19 18:52:17 +01:00
Jan 244eed0710
added "peakMemoryUsage" in query results figures, (#7981) 2019-01-19 18:50:55 +01:00
Simon 1f86b7a2b5 Fix stream cursor bug (#7958) 2019-01-16 13:27:57 +01:00
KVS85 dfad8906d9 Bug fix/active failover fix windows 3.4 (#7959)
* Backport active-failover fix for Windows into 3.4

* Backport stop/resume for Windows from devel

* Backport changes from devel into tests also

* Fix tests

* Remove forgotten whitespaces
2019-01-16 11:08:48 +01:00
Tobias Gödderz d48495c195 [3.4] Foxx queue jobs hanging after Foxxmaster crash (#7921)
* Fixed bug where the Foxxmaster doesn't reset jobs after a crash when it should, or a non-master coordinator removes jobs in progress during startup

* Added a regression test

* Removed foxxmaster test from greylist

* Updated CHANGELOG

* Fixed non-maintainer compile
2019-01-14 16:06:48 +01:00
Jan d5592def42
fix issue #7933: Regression on ISO8601 string compatibility in AQL (#7934) 2019-01-14 13:50:50 +01:00
Kaveh Vahedipour 7b37922f92 releveling logging in maintenance module (#7925) 2019-01-10 12:15:48 +01:00
Matthew Von-Maszewski 474f0cde31 Bug fix 3.4/scheduler empty reformat (#7872)
* added check for empty scheduler

* removed log, old is 1 not 0

* require running in this thread

* test

* added isDirect to callback

* signature fixed

* added drain

* added allowDirectHandling

* disabled for testing

* Add ExecContextScope object to direct call.

* try alternate initialization of ExecContextScope

* remove ExecContextScope, no help.  try _fifoSize as part of direct decision.

* strand management to minimize reuse of same strand per listen socket

* blind attempt to address Jenkins shutdown lock up.  may remove quickly.

* add filename and line to existing error log message

* Adjust queueOperation() to stop accepting items once isStopping() becomes true.

* revert previous check-in to MMFilesCollectorThread.cpp

* big reformat

* fixed merge conflicts

* Add CHANGELOG entry.
2019-01-08 20:39:42 +01:00
Kaveh Vahedipour 536f5f22a7 [3.4] fix create collection timeouts test agent (#7831)
* should not neglect the initial async request for read lock acquisition

* fixed nullptr

* correct timeout

* corrected error  handling in getReadLock

* reverted "test fix"

* should remove async request from ClusterComm
2019-01-08 17:02:24 +01:00
Lars Maier bc9f9ed14d Bug fix 3.4/jwt base64url encoded (#7904)
* Use base64url encoding and decoding for jwt header and body as specified in the rfc.

* Added changelog.
2019-01-08 16:55:17 +01:00
Jan 9c099ba5da
multiplex REPLICATION-APPLIER-STATE files for RocksDB engine (#7897) 2019-01-08 14:26:09 +01:00
Jan adf76491b0
added AQL function CHECK_DOCUMENT (#7841) 2019-01-04 15:33:20 +01:00
Jan a4a7867451
Bug fix 3.4/arangorestore add cleanup duplicate attributes (#7876) 2019-01-04 15:26:11 +01:00
Lars Maier e1dcad0153 Feature 3.4/jwt keyfile (#7864)
* Added jwt-keyfile option and warning for old option.
* CHANGELOG
* Add trimming to --auth.jwt-secret-keyfile
* Adjust some comments.
2019-01-02 21:45:18 +01:00
Jan a14f6dd573
prevent duplicate attributes being generated by AQL queries (#7836) 2019-01-02 12:37:47 +01:00
Jan 762c0fd7c6
fixed issue #7834 (#7845) 2019-01-02 11:09:06 +01:00
Frank Celler 9477af198b big reformat 2018-12-26 00:57:05 +01:00
Dan Larkin-York 05d158a689 Fix issue with geo iterator reset. (#7838) 2018-12-23 00:41:37 +01:00
Jan 7c42430f95
suppress a warning message about non-optimal MMFiles collection data structures while doing WAL recovery, not just while upgrading. (#7812) 2018-12-20 16:52:02 +01:00
Tobias Gödderz b8c2d0d01a [3.4] Fix heartbeat thread hanging during shutdown (#7678)
* Abort registering agency callbacks when stopping

* Reduce diff

* Reduce diff

* Added two nullptr checks
2018-12-20 16:40:58 +01:00
Simon 1498c08084 fix restrictCollections parameter on database level replication (#7808) 2018-12-19 18:00:10 +01:00
Jan b5e844fe33
fixed issue #7749 (#7797) 2018-12-19 14:17:54 +01:00
Jan 0574393d52
fixed issue #7757 (#7806) 2018-12-19 14:06:11 +01:00
Jan f1b0a803eb
do not use an internal error for JSON parse errors (#7800) 2018-12-19 12:54:20 +01:00
Jan f39c58e06c
fixed issue #7763 (#7796) 2018-12-19 09:56:05 +01:00
Andrey Abramov edb2ca4800
update ArangoSearch consolidation policy (#7801) (#7802) 2018-12-19 02:34:17 +03:00
Andrey Abramov 50bed56766
fix issue and add test (#7790) 2018-12-18 17:33:08 +03:00
Wilfried Goesgens 7fa0cdc41b Feature/drop before win7 support in compiler (#6681) (#7751) 2018-12-17 12:06:53 +01:00
Kaveh Vahedipour 9e83e1696d [3.4] allow for quicker start of actions when previously completed (#7736)
* shard versioning
* one incompatible call fixed
2018-12-17 10:36:50 +01:00
Max Neunhöffer 1c4430afdf
Adjust two error codes (TRI_ERROR_SHUTTING_DOWN). (#7779) 2018-12-17 09:50:53 +01:00
Kaveh Vahedipour 92b7df5a1d [3.4] equalising devel and 34 (#7754)
* equalising devel and 3.4 in agency/cluster
* missing header
2018-12-17 09:06:35 +01:00
Michael Hackstein ddd5226d13
Feature3.4/improve edgeindex covered (#7750)
* Allow the RocksDB edge index to return the already covered opposite vertex.

* Updated tests
2018-12-12 22:03:47 +01:00
Michael Hackstein bc65e79b02
Bug fix 3.4/fix tombstones (#7758)
* fixes a rare situation where replication could require identical lock twice

* Updated changelog
2018-12-12 22:01:27 +01:00
Jan 3fa3170462
fix invalid handling of `_lastValue` in case of multiple coordinators (#7735) 2018-12-12 12:22:52 +01:00
Jan f43cc15bfc
fixed item 3 of issue #7009 (#7746) 2018-12-12 11:38:18 +01:00
Dan Larkin-York 6cc70cd615 [3.4] Ignore invalid geo coordinates when indexing. (#7723) 2018-12-11 09:24:50 +01:00
Jan e6983a35ed
backport missing changes to 3.4 (#7717) 2018-12-10 16:45:01 +01:00
Lars Maier b7313654da Someone (me) forgot to rename the field everywhere. (#7716) 2018-12-10 16:07:51 +01:00
jsteemann b64826065c fix LDAP tests 2018-12-10 15:57:21 +01:00
Kaveh Vahedipour 8557fdaf22 agency failed to start in log level trace (#7687) 2018-12-10 15:31:38 +01:00
Kaveh Vahedipour 1b75220a1b [3.4] Early sort out system collections for maintenance (#7589) 2018-12-10 15:16:35 +01:00
jsteemann f5f059e715 Merge branch '3.4' of https://github.com/arangodb/arangodb into 3.4 2018-12-10 13:35:13 +01:00
jsteemann 538a877c1f accept response code 201 (Created) as a valid response 2018-12-10 13:34:50 +01:00
Tobias Gödderz a1b925c655 Added tests for parseVersion (#7676) 2018-12-10 12:47:20 +01:00
Jan d894986cd9
do not optimize away sort clauses when it is unsafe to do so (#7694) 2018-12-07 16:23:13 +01:00
Jan 1a4a6e7d2b
minor fixes (#7691) 2018-12-07 16:22:32 +01:00
Jan 2f48d03c19
speed up remove ops for RocksDB engine (#7638)
* speed up remove ops for RocksDB engine

* add more tests and improve queries a bit for single remote AQL operations
2018-12-06 17:48:42 +01:00
Jan 677522991e
Feature/internal 3306 (#7683) (#7688) 2018-12-06 17:46:58 +01:00
Jan b43b600a33
try again in case of a conflict when updating users (#7615) 2018-12-06 16:36:05 +01:00
Max Neunhöffer 3a7df19189 Fix super user JWT token behaviour with non-ex. db. (#7656) 2018-12-05 16:50:41 +01:00
Heiko 8d171b2c51 parse version fix (#7652) 2018-12-05 13:01:24 +01:00
Heiko a94a402dde added MultiPolygon GeoJSON constructor function (#7634) 2018-12-04 17:54:34 +01:00
Dan Larkin-York 743ede87ea Disable warning about persistent IDs during upgrade procedure. (#7611) 2018-12-04 09:02:41 +01:00
jsteemann 0b9a8da7af fix segfault 2018-12-04 08:56:40 +01:00
Jan a228644f21 dont keep JS module directory (#7619) 2018-12-03 18:10:22 +01:00
Max Neunhöffer dd5c830f2f
Call license key check. (#7594)
* Call license key check.

* Add CHANGELOG entry.
2018-12-03 17:23:32 +01:00
jsteemann 405de24c54 improve debug messages 2018-12-03 16:40:48 +01:00
Jan 574d9c2d26
dont fail when restoring a cluster dump into a single server (#7596) 2018-12-03 16:18:27 +01:00
Lars Maier 4bf2302150 Do nothing in phaseTwo if leader has not been touched. (#7579)
* Do nothing in phaseTwo if leader has not been touched.

* Drop follower if it refuses to cooperate.

This is important since a dbserver that is follower for a shard will
after a reboot think that it is a leader, at least for a short amount
of time. If it came back quickly enough, the leader might not have
noticed that it was away.
2018-12-02 13:14:46 +01:00
Andrey Abramov de96a89ba7
treat all iresearch scores as float_t for 3.4 (#7573)
* treat all iresearch scores as float_t for 3.4

* cleanup

* attempt to fix tests

* another attempt to fix tests
2018-12-01 01:19:34 +03:00
Frank Celler a86fd3dd67 fixed init 2018-11-30 21:17:38 +01:00
Frank Celler 067606da3a
Bug fix 3.4/bad leader report current (#7574)
* Initialize theLeader non-empty, thus not assuming leadership.

* Correct ClusterInfo to look into Target/CleanedServers.

* Prevent usage of to be cleaned out servers in new collections.

* After a restart, do not assume to be leader for a shard.
2018-11-30 21:11:48 +01:00
Jan 836954b8e3
allow using UTF8 filenames for UUID directory (#7569) 2018-11-30 17:25:50 +01:00
Andrey Abramov 5d8c07286a ensure UnorderedRefKeyMap keys are valid after copying (#7549) (#7558) 2018-11-29 20:08:27 +01:00
Andrey Abramov 2c36657a9e improve logging in ClusterInfo::loadPlan (#7511) (#7532) 2018-11-29 20:08:11 +01:00
Dan Larkin-York dbd59e19cd [3.4] Persist and check default language selection (#7490) 2018-11-29 19:51:19 +01:00
Lars Maier 2ed283ef3c Fixing broken UI. (#7551) 2018-11-29 19:31:45 +01:00
Jan 2c3542d41e
fixed issue #7522 (#7554) 2018-11-29 19:26:42 +01:00
Simon 4132870e49 Document RocksDB exclusive option (#7517) (#7538) 2018-11-29 18:42:43 +01:00
jsteemann 67dc735664 Revert "fixed some asan search stuff (#7543)"
This reverts commit 4ac9fde1d4.
2018-11-29 17:27:15 +01:00
Jan b708edd059
fix assertion (#7526) 2018-11-29 15:43:06 +01:00
Tobias Gödderz f61ccd4047 Reload Foxx routes during startup (#7531) 2018-11-29 15:31:40 +01:00
Andrey Abramov e67c2cac06
avoid calling cluster related functions while instantiating views on … (#7509) (#7528)
* avoid calling cluster related functions while instantiating views on a db server

* minor cleanup
2018-11-29 17:18:34 +03:00
Heiko 4ac9fde1d4 fixed some asan search stuff (#7543) 2018-11-29 15:08:35 +01:00
Simon 933ca8a775 Bug fix/restore index refactor (#7470) (#7491)
(cherry picked from commit d0efd95a37)
2018-11-29 14:08:29 +01:00
Kaveh Vahedipour 3225a7b16d [3.4] Feature/engine version added to agent configuration (#7481)
* agents' is obtained from leader's configuration
* corrections in Supervision for advertised endpoints
* change log
* Updated Documentation for cluster/health.
* Unified naming convention.
* Fixed missing update of volatile fields.
* Set version in right order.
* Removed debug output.
* Fixed jslint - missing ;
2018-11-29 12:00:47 +01:00
Max Neunhöffer 804ac13db2
SynchronizeShard's potentially long running while loops yield for shutdown (#7523) 2018-11-29 11:47:16 +01:00
Max Neunhöffer b74358a3dd
Improve log messages. (#7520) 2018-11-29 11:30:43 +01:00
Jan f1086bac4f
added option `--rocksdb.enforce-block-cache-size-limit` (#7508) 2018-11-28 20:40:20 +01:00
Jan c0d05ce869
Bug fix 3.4/use lock for pregel stats (#7512) 2018-11-28 20:26:32 +01:00
Max Neunhöffer 10b6813f01
Fix index creation (port from devel). (#7443)
* Fix index creation in cluster.

Simplify and correct error handling logic in ensureIndexCoordinator.

* After index creation, wait until index appears.

We wait until the Supervision has removed the isBuilding flag and
the coordinator has reloaded the Plan.

* More index handling fixes.

* Explicitly remove isBuilding flag in coordinator (again).

* Fix order of arguments in REPLACE call.

* Take out debugging output again.

* Fix catch tests by holding mutex shorter.

* Better mutex handling in ClusterInfo.
2018-11-28 16:58:27 +01:00
Jan 2d4b38600f
prevent operations from overtaking each other (#7498) 2018-11-28 14:30:54 +01:00
Andrey Abramov 30fe53e34e
Bug fix/internal issue #502 (#7480) (#7487)
* update iresearch

* update iresearch
2018-11-27 23:53:27 +03:00
Simon 8ddb9a063b Micro-Optimize AQL CXX calls (#7486) 2018-11-27 20:22:18 +01:00
Vasiliy 3caed3eb9a issue 506.3: backport 3.4: issue 506.3: use camel-case configuration parameter names consistently, add a configuration version property to iresearch view meta (#7476)
* issue 506.3: backport 3.4: issue 506.3: use camel-case configuration parameter names consistently, add a configuration version property to iresearch view meta

* backport: ensure meta version is supported

* backport: hide 'version' property from non-persistence json
2018-11-27 18:35:34 +03:00
Vasiliy f701c6d681 issue 506.2: backport 3.4: add optimization to not reexecute a primary-key filter if a match was already found (#7461)
* issue 506.2: backport 3.4: add optimization to not reexecute a primary-key filter if a match was already found

* backport: explicitly check type of instance of the primary-key filter

* backport: return non-null prepared filter and convert check to assert
2018-11-27 18:29:41 +03:00
jsteemann 5fa0de04a9 Merge branch '3.4' of https://github.com/arangodb/arangodb into 3.4 2018-11-27 13:44:35 +01:00
jsteemann f907dcebbd increase shutdown time 2018-11-27 13:44:18 +01:00
Lars Maier 154d449061 Export Version and Engine in Cluster Health. Additionally export `versionString` in registered Servers. (#7463) 2018-11-27 09:15:38 +01:00
Dan Larkin-York 00c060c884 [3.4] Fix end condition (hasMore) for EnumerateViewNode. (#7278)
* Fix end condition (hasMore) for EnumerateViewNode.

* Fix crashes.

* Some more fixes.

* eliminate code duplication
2018-11-27 00:41:29 +03:00
Jan ffc823e1c8
Bug fix 3.4/backport optimizations (#7434) 2018-11-26 19:16:05 +01:00
Tobias Gödderz a83300dc29 Fix error handling in case ClusterCommResult.result == nullptr (#7355) 2018-11-26 16:22:43 +01:00
Vasiliy b0d11022b9 issue 506.1: backport 3.4: address issue with multiple identical document insertions on rocksdb recovery (#7453) 2018-11-26 17:17:47 +03:00
Max Neunhöffer 6bd61760f2
Fix moving of shard leaders. (#7446)
* Ungreylist move shard test.
* Move leader shard: wait until all but the old leader are in sync.
* Increase moveShard timeout to 10000 seconds.
* Add CHANGELOG.
* Fix compilation.
* Fix a misleading comment.
2018-11-26 15:05:56 +01:00
jsteemann 9658300f11 revert Scheduler changes 2018-11-26 09:54:41 +01:00
Simon 96346a12d0 switch default message for requireFromPresent (#7439) (#7450)
(cherry picked from commit f90b48f792)
2018-11-26 09:16:48 +01:00
Andrey Abramov 822e15e770
issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes (#7370) (#7451)
* issue 153: ensure views are dropped in Agency when database is dropped in cluster, minor fixes

* backport: add test to ensure views are dropped when database is dropped from plan, fix some issues in ClusterInfo

* optimize primary key lookups in ArangoSearch

* fix test

* Add JS tests

* temporary comment optimizations

# Conflicts:
#	arangod/Cluster/ClusterInfo.cpp
2018-11-26 00:33:58 +03:00
Michael Hackstein 8098bb4eed
Bug fix 3.4/syncing of followers (#7377)
* Added some DEBUG output for replication rest handler

* Some more debug logging.

* Increased the priority of the ReplicationHandler. This way we will not get stuck with locks that cannot be canceled. Also cancel the lock on the correct database.

* Added extensive log output for replication thins

* Added tombstones to RestReplicationHandler. In a very unlikely case the cancel of a lock can be executed BEFORE the code that actually registers the lock, in this case we will now write a tombstone and do not lock.

* Revert "Added extensive log output for replication thins"

This reverts commit 6d4e37ea1e59e3b3457336019cc7dbc4c979504d.

* Added extensive log output for replication things, now in ERR level instead of MAINTAINER only

* Now actually use hours for synchronization

* React to errors under soft lock if they show up.

* Added a retry loop to increase the read-lock timer.

* Added more timing output in RocksDB collection internals to figure out why the followers are dropped

* Tweaked RocksDB options

* Revert "Tweaked RocksDB options"

This reverts commit 2bf9c43280beda4792c47d079387fe5154cdd896.

* Removed debug output

* Applied all requested changes by goedderz

* Deleted unused variable
2018-11-23 16:08:27 +01:00
Wilfried Goesgens d5f92f87c3 fix none reply 2018-11-23 11:29:00 +01:00
Wilfried Goesgens 4bbd6a02bb Bug fix/less exceptions (#7385) (#7415) 2018-11-23 11:15:36 +01:00
Wilfried Goesgens c50d346453 add alternative to ClusterInfo::getCollection() that doesn't throw (#7413)
* add alternative to ClusterInfo::getCollection() that doesn't throw (#7339)

* handle more potential nullptrs, fix try/catch scope
2018-11-23 11:15:25 +01:00
Wilfried Goesgens d4af8fe287 remove enterprise-gotos (#7375) (#7414) 2018-11-23 11:14:21 +01:00
Simon ebad3c3c83 Fix restore of views, add --view option (#7425) (#7427)
(cherry picked from commit c584527d79)
2018-11-23 09:11:33 +01:00
jsteemann 1b4eef270c make message a debug message instead of an error message 2018-11-22 18:46:42 +01:00
Jan b363372c63
Bug fix 3.4/remove shutdown assertion (#7387) 2018-11-22 15:36:06 +01:00
Jan 19dc2ca0b7
yet more micro optimizations (#7399) 2018-11-21 17:09:16 +01:00
Simon ef239cbe4e Make recovery more reliable (#7297) (#7367) 2018-11-21 16:51:38 +01:00
Jan Christoph Uhde 8441368954 disable RocksDB indexing for some secondary index operations (#7393) 2018-11-21 15:50:36 +01:00
Kaveh Vahedipour 860fa21219 Bug fix 3.4/index readiness (#6716)
* backport of test data generation for maintenance from devel
* 3.4 working
* fixing index use in cluster while still being built
* fixed broken views
* correct 200 for ensureIndex
* merge with 3.4
* agency comm to handle replace in array
* supervision changes
* cluster info's ensureIndex
* 3.4 ready
* timeout
* missing files from origin
* neunhoef complaints
* bogus entry
* no need to wait for current once again
* no longer necessary. done in IndexFactory now
* correct comments
* left overs
* dead code revived
* Move CHANGELOG entry to the right place.
2018-11-21 14:41:36 +01:00
Jan 637ffada86
fix assertion failures for MMFiles operationsQueue state on shutdown (#7389) 2018-11-21 09:33:38 +01:00
Dan Larkin-York a7c9374527 Hide internal link properties (#7209) 2018-11-20 18:00:14 +01:00
Jan a5f4fe4a22
dont update lastProcessedTick too early (#7381) 2018-11-20 17:54:30 +01:00
Jan d27c4cc113
Bug fix 3.4/aql speedup (#7378) 2018-11-20 16:08:16 +01:00
Simon 5124633e6a Faster index creation (#7348) 2018-11-20 13:41:01 +01:00
Jan eda236f968
AuthenticationFeature::isEnabled() is not doing what is expected (#7373) 2018-11-20 11:30:10 +01:00
jsteemann dcf6cde1f4 fix assertion 2018-11-19 14:48:22 +01:00
Tobias Gödderz 3d1c643e23 [3.4] MMFiles replication: get followers under lock (#7298)
* Fix resign order

* Fixed a typo

* Get followers later, add TODOs

* Added a callback parameter to collection insert methods

* Get followers under the lock if necessary

* Extracted the replication of inserts into a separate method

* Move shortcut into replicate method

* Added callbacks for remove, replace and update

* Added missing overrides

* Extracted replication code from modifyLocal and removeLocal

* Update followers under lock also during replace, update, remove

* Fix changes from the last commit for update/replace

* Update comments, add asserts

* Remove changes for document-level locks that will be done in another PR

* Unify replication

* Adapt log messages to the devel ones

* Move common methods from its descendants to TransactionCollection, fix Mock on the way

* More IResearch test / mock fixes

* Relax asserts for nested transactions

* Reformat

* Fix non-babies remove and modify replication
2018-11-19 13:03:07 +01:00
Dan Larkin-York 4210d9eab6 [3.4] Updates to collection versioning. (#7260) 2018-11-19 09:45:09 +01:00
Matthew Von-Maszewski 4362137ba4
Bugfix 3.4: Null pointer defense in Scheduler::post(callback) (#7285)
* defense against the dark arts (nullptr in _ioContext)

* move incQueued() so that we can imply race state of _ioContext.

* adjust to meet Jan's expectations

* jsteemann noticed that queue count is not considered before shutdown ... bad

* add JobGuard object to manage working count.  should hold shutdown a tad longer.

* TEMPORARY HACK:  need to validate problem that is randomly occurring in Jenkins automation

* TEMPORARY HACK 2: trying to isolate an acceptable sequence.

* TEMPORARY HACK 3: trying to isolate an acceptable sequence.

* TEMPORARY HACK 4: so close ... seem to have all the moving parts isolated.  Come on Jenkins!

* shutdown now finishes everything already in fifo queues and active on threads in an orderly way.  Then forces any late requests to execute on the caller's thread.
2018-11-16 12:20:00 -06:00
Max Neunhöffer c005e0b0f0
Improve error reporting in maintenance. (#7340)
* Improve error reporting from maintenance.
* Fix compilation.
* Tiny polishing fix.
2018-11-16 10:25:55 +01:00
Lars Maier ea7b476ab6 [3.4] Endpoints via _system (#7344)
* Allow accessing _api/cluster/endpoints as authenticated user via the _system database.
* Removed start.sh
2018-11-16 10:01:42 +01:00
Jan 812c9223fe
fix queries that refer to COLLECT variables from inside COLLECT (#7333) 2018-11-15 15:10:56 +01:00
Max Neunhöffer dafb0d1a06
supervision bug fix to start with clean transient store (#7323) 2018-11-15 11:24:53 +01:00
Andrey Abramov 2a1542fdc2
Feature/arangosearch pk endianness (#7306) (#7312)
* refactor arangosearch pks

* minor refactoring

* store PK as BigEndian since it leads to more compact index representation

* force iresearch not to use libbfd

* fix tests
2018-11-14 15:56:40 +03:00
Max Neunhöffer 805f7a7621
Fix timeout in cluster operation in create and drop collections. (#7300)
* Fix loophole.
* Fix inquiry case of id not found: 404.
* Also handle correctly in AgencyComm.
* Fix agency tests.
* Fix error handling in dropCollectionOnCoordinator.
2018-11-14 10:02:26 +01:00
Jan ccc064433b
Bug fix 3.4/aql micro optimizations (#7286) 2018-11-13 11:55:04 +01:00
Jan 8bcb1a310c
fix failing non-deterministic query-stream test (#7296) 2018-11-13 11:37:09 +01:00
jsteemann 046dac4234 fix define name 2018-11-12 13:09:48 +01:00
jsteemann bce1f51b8c simplify conditions 2018-11-12 11:14:19 +01:00
Andrey Abramov 6b7ea6cb9b
Feature/arangosearch optimize documents reading (#7280) (#7284)
* optimize reading documents from arangosearch index

* simplify code

* get rid of useless interface

* even more simplifications

* update iresearch to commit 40128bf50cea3546313fbfd71e5a32bb88e418a2

* optimize PK reading

* cleanup

* minor refactoring

* address review comments

* micro optimization
2018-11-09 20:01:13 +03:00
Wilfried Goesgens ccc43cd932 Bug fix 3.4/remove enterprise goto (#7274)
* add more information when timeout failing the index creation tests

* rather use null-pointers than try/catch for control flow
2018-11-08 16:27:51 +01:00
Dan Larkin-York 8bd754b9ad [3.4] Fix nullptr dereference in SynchronizeShard. (#7267) 2018-11-08 14:12:33 +01:00