Commit Graph

2380 Commits

Author SHA1 Message Date
Jan Christoph Uhde 0b8c75c7b7 one shard db - devel (#9395) 2019-09-23 15:48:37 +02:00
Tobias Gödderz aa14af9ae0 Extended the range of a suppressed warning for MSVC (#10039) 2019-09-20 13:09:34 +02:00
jsteemann 85bdf6ef01 fixit 2019-09-19 12:40:33 +02:00
Jan 41b0717bc9 remove unused code (#10043) 2019-09-19 12:25:25 +02:00
jsteemann 8d45b52bc7 Merge branch 'devel' of github.com:arangodb/arangodb into devel 2019-09-19 12:21:56 +02:00
jsteemann 13c98b82b5 added message 2019-09-19 12:21:37 +02:00
Max Neunhöffer 71c8314107 Add more strong references to pthread stuff. (#10038)
* Add more strong references to pthread stuff.

* Circumvent libgcc/libmusl mistake in multi-threaded detection.

* CHANGELOG.
2019-09-19 12:36:15 +03:00
Jan 46f3fa6923 fix graph error message (#10041) 2019-09-19 10:32:36 +02:00
Tobias Gödderz 4b81ed33b7 Interprocedural optimizations (#9596)
* Enable IPO (LTO) via CMake

* Use separate IPO_ENABLED variable

* Enabled IPO for more binaries

* Enabled IPO for arangobackup

* Suppress an MSVC warning (though it is not unjustified...)

* Fix static builds, broken due to Policy CMP0060 introduced in cmake 3.3

* Removed policies superseded by minimum cmake version

* Disable IPO with google tests due to a g++ bug

* Disable IPO with google tests only on AUTO

* Disable warning for correct line

* Enabled IPO for additional libs

* Fixed some problems found with IPO enabled

- Fixed ODR violation in Pregel
- Removed unused code in ClusterMethods
- Avoid possible uninitialized variable usage

* Set IPO globally, except for 3rdParty libs
2019-09-18 11:29:37 +02:00
jsteemann ce6489b761 remove unused function 2019-09-16 19:56:47 +02:00
Jan 84ad504a6c corrected several wrong macro names (#9971) 2019-09-11 16:45:32 +02:00
Max Neunhöffer 7d5313611f Fix a shutdown busy loop after main if two exceptions collide. (#9978) 2019-09-11 14:47:49 +02:00
KVS85 4fc39dd4b3 Debug segfault reimplementation (#9940)
* Changed debugSegfault to debugTerminate

* Fix *nix compilation

* More data for broken reconnect

* Remove circumventCores completely

* Fix forgotten calls
2019-09-09 23:07:45 +03:00
Kaveh Vahedipour ad36adc354 Feature/hotbackup list retries (#9924)
* retry hot backup listing for 2 minutes in cluster before giving up
2019-09-05 16:44:43 +02:00
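The "retry for 2 minutes before giving up" behaviour in the hotbackup listing commit above is a deadline-bounded retry loop. Below is a minimal C++ sketch of that pattern; the function name `listWithRetry`, the `BackupList` alias, and the two-second back-off are illustrative assumptions, not the actual ArangoDB hotbackup code.

```cpp
#include <chrono>
#include <functional>
#include <optional>
#include <string>
#include <thread>
#include <vector>

using BackupList = std::vector<std::string>;

// Deadline-bounded retry: keep calling `list` until it yields a result or
// two minutes have passed. `list` stands in for the cluster listing call.
std::optional<BackupList> listWithRetry(
    std::function<std::optional<BackupList>()> list) {
  using namespace std::chrono;
  auto const deadline = steady_clock::now() + minutes(2);
  while (steady_clock::now() < deadline) {
    if (auto result = list()) {
      return result;                            // cluster answered, done
    }
    std::this_thread::sleep_for(seconds(2));    // back off before retrying
  }
  return std::nullopt;                          // give up after two minutes
}
```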
Markus Pfeiffer 753ff4aa67 Feature/atomic database creation 2 (#9826) 2019-09-05 12:38:07 +02:00
Simon 72396e6b79 Bug fix/windows network (#9895) 2019-09-03 20:22:23 +02:00
jsteemann d1b1b6726f Revert "Feature/resource usage (#9647)"
This reverts commit 76afb5001e.
2019-09-03 12:48:02 +02:00
Frank Celler 575f818498 fixed includes for Mac 2019-09-02 16:57:31 +02:00
Wilfried Goesgens 76afb5001e Feature/resource usage (#9647)
* spawn an arangosh that checks cluster nodes are responsive

* attempting to kill 0 may end badly for us, prohibit it

* don't trip over this optional stuff.

* fix killing of spectator

* only write one line per minute

* fix function name

* print sHitlist of tests, add resource usage

* fix lint

* start refactoring result processing into its own library

* /proc reading only works on linux

* new result processing library

* clean up test loading flow, remove unused blacklist implementation

* measure SUT start/stop + test time

* lint

* more non-test variables

* improve test runner naming

* finish gathering statistics about start/run/stop

* use internal stats code for external processes too

* fix status in sample testcase

* tell that procdump is gone - it seems this happens in reality without coredumps being written

* refactor test result analyzing, add utility to work with analyzers of json dumps from CI systems

* fix testcase error handling

* also run watcher for agency

* lint

* fix default options for tests; added default options

* if arguments occur multiple times, update value to an array

* start implementing some test analyzers

* fix color

* write json report unconditionally

* add analyzer that searches for long setup/teardown tests

* enable thread dump; log error if we fail to acquire the threads

* disable procdump for the agency

* output error if we fail to get process stats

* fix json invocation

* add debug logging

* only add buildType if that directory actually exists

* trap sleepers

* trap sleepers

* trap sleepers

* trap sleepers

* trap sleepers

* disable thread counting on the wintendo

* disable thread counting on the wintendo

* more measurements

* more measurements

* one more place

* one more place

* remove debugging code

* Update js/client/modules/@arangodb/process-utils.js

Co-Authored-By: Dan Larkin-York <danielhlarkin@users.noreply.github.com>

* Update js/client/modules/@arangodb/process-utils.js

Co-Authored-By: Dan Larkin-York <danielhlarkin@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Dan Larkin-York <danielhlarkin@users.noreply.github.com>

* rename as suggested by @dan

* undo debug changes

* lint, make cluster health monitor optional per default

* fix spawning of active failover SUT, fix cluster health monitor shutdown

* use std::find_if (as @dan suggested), fix log ids

* fix scope of before-time
2019-08-30 16:20:07 +03:00
Simon 0ee0cebb11 Non-Blocking inserts (#9823) 2019-08-30 09:17:58 +02:00
Jan 30b36a2a42 fix return value checks (#9852) 2019-08-29 20:38:53 +02:00
Jan 8c13019900 don't create `arangosh_XXXXXX` directories (#9808) 2019-08-27 11:27:44 +02:00
jsteemann df761d2010 remove unused boost includes 2019-08-26 19:39:31 +02:00
Jan 00bcc4954c AQL date functions improvements (#9714) 2019-08-22 12:50:08 +02:00
Jan ec586667e4 downgrade WARN messages to INFO level (#9761) 2019-08-20 18:28:57 +02:00
Dan Larkin-York f7d41ad8fb Check scheduler queue return value (#9754) 2019-08-19 19:31:55 +02:00
jsteemann 2735114c59 Revert "Check scheduler queue return value. (#9701)"
This reverts commit 41c1b571b9.
2019-08-19 10:13:48 +02:00
Dan Larkin-York 41c1b571b9 Check scheduler queue return value. (#9701) 2019-08-19 10:01:08 +02:00
Simon 8af83d5bd4 Auxiliary changes from timeseries branch (#9699) 2019-08-15 10:12:58 +02:00
Tobias Gödderz 6b3e11dd40 Fixed error code to not re-use an old one (#9686) 2019-08-13 15:18:45 +02:00
Frank Celler 68ea717af4 Feature/set environment (#9688) 2019-08-13 09:18:16 +02:00
Frank Celler aa3d3f8e40 Feature/cleanup ccpcheck (#9665) 2019-08-12 11:11:49 +02:00
Tobias Gödderz 9cd332b958 Feature/rebootid notice changes (#9523)
* Consolidated _servers and _serverAdvertisedEndpoints, added rebootId, prepared change notifications

* Cleanup

* Added a RebootId type

* Began implementing RebootTracker (still WIP)

* Moved RebootId operators into the class

* Removed RebootId operator<< again

* Added tests, added CallbackGuard, removed/commented old RebootTracker code

* Fix: do not try to call unset callbacks

* Split one test, added another

* Added more tests

* Renamed tests, added more tests

* Fixed missing variable declarations

* Let MockServer appear to be started

* Reordered test, fixed naming

* Implemented callMeOnChange()

* Re-implemented RebootTracker (not yet working)

* Resolved a TODO, updated a test, added comments

* Call old callbacks immediately

* Fixed tests

* Use EXPECT_* instead of ASSERT_*

* Suppress a log message

* Resolved TODOs

* Reverted changes on reading ServersRegistered

* Update RebootTracker

* Introduce `rebootId` into ServerState for Cluster

 * A server *boots* if it is started on a previously non-existing data
   directory and hence does not have a UUID yet.
 * A server *reboots* if it is started on a pre-existing data directory

We keep the rebootId in the cluster's agency under
Current/ServersKnown/$uuid/rebootId.

When rebooting (and subsequently re-joining a cluster), the server increments
its rebootId in Phase 2 of registration. This way it can be detected within the
cluster whether a server was restarted.

This information will later be used to handle cases where server restarts can
lead to problems, for example with transactions or in-progress queries.

* Move rebootId into Current/ServersKnown/

* Fixed typo

* Fixed log ids

* Add deletion of ServersKnown/UUID from agency

* Add deletion of Current/ServersKnown/UUID to removeServer

* Clean up readRebootIdFromAgency and add retry loop around it

* Bugfix

* Added nolint comments

* Fixed initialization order

* Fixed ClusterInfo-test

* Added log messages

* Revert "Fixed ClusterInfo-test"

This reverts commit d983596979.

* Disabled assertion for google tests

* Ignore windows compile warning

* Always call loadServers in loadCurrent

* Fix really subtle bug when not returning a value

* Introduce `rebootId` into ServerState for Cluster

 * A server *boots* if it is started on a previously non-existing data
   directory and hence does not have a UUID yet.
 * A server *reboots* if it is started on a pre-existing data directory

We keep the rebootId in the cluster's agency under
Current/ServersKnown/$uuid/rebootId.

When rebooting (and subsequently re-joining a cluster), the server increments
its rebootId in Phase 2 of registration. This way it can be detected within the
cluster whether a server was restarted.

This information will later be used to handle cases where server restarts can
lead to problems, for example with transactions or in-progress queries.

* Move rebootId into Current/ServersKnown/

* Add deletion of ServersKnown/UUID from agency

* Add deletion of Current/ServersKnown/UUID to removeServer

* Clean up readRebootIdFromAgency and add retry loop around it

* Fixed compile error due to forbidden implicit cast

* Fixed compile error on windows

* Fixed compile error due to devel merge

* Removed dead comment

* Removed TODO note

* Extended comment

* Removed TODO note

* Fixed using an invalidated iterator

* Copy string only if necessary

* Fixed compile error
2019-08-12 09:33:22 +02:00
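The rebootId commit above describes the bookkeeping: each server's rebootId lives in the agency under Current/ServersKnown/$uuid/rebootId, is incremented when the server restarts, and other servers register callbacks that fire once a restart is detected. The following is a simplified, illustrative C++ sketch of that idea; `SimpleRebootTracker` and `updateServerState` are stand-ins and do not reflect the actual RebootTracker/CallbackGuard interfaces (only `callMeOnChange` is a name taken from the commit).

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using RebootId = std::uint64_t;

// Simplified stand-in for the RebootTracker described in the commit above.
class SimpleRebootTracker {
 public:
  // Register a callback to fire once the given server is seen with a
  // rebootId greater than `observed` (i.e. it has restarted since then).
  void callMeOnChange(std::string const& serverId, RebootId observed,
                      std::function<void()> callback) {
    _callbacks[serverId].push_back({observed, std::move(callback)});
  }

  // Feed in the latest rebootIds (as read from Current/ServersKnown/) and
  // fire callbacks for every server whose rebootId has increased.
  void updateServerState(
      std::unordered_map<std::string, RebootId> const& current) {
    for (auto const& [serverId, rebootId] : current) {
      auto it = _callbacks.find(serverId);
      if (it == _callbacks.end()) continue;
      auto& entries = it->second;
      for (auto e = entries.begin(); e != entries.end();) {
        if (rebootId > e->observed) {
          e->callback();           // server restarted: notify the registrant
          e = entries.erase(e);
        } else {
          ++e;
        }
      }
    }
  }

 private:
  struct Entry {
    RebootId observed;
    std::function<void()> callback;
  };
  std::unordered_map<std::string, std::vector<Entry>> _callbacks;
};
```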
Dan Larkin-York 320c08ee4e Add constructors to fix cppcheck warnings. (#9648) 2019-08-07 08:23:16 +02:00
Dan Larkin-York 3d0246cb18 Decentralize includes (#9623) 2019-08-06 15:32:09 +02:00
Lars Maier ed496fe5dd Feature/hotbackup devel (#9495)
Hotbackup
2019-08-02 11:39:46 +02:00
Matthew Von-Maszewski 91b56a50a3 Feature: Add gzip and encryption to import/export (#9560)
* port of feature-3.4/mv-gzip-export to devel branch

* add explicit namespaces so gcc 6.3.0 would successfully compile

* add conditional cleanup of ENCRYPTION file from another branch

* use _lseek() for Windows build to avoid deprecated warning.

* change from ifdef for lseek variants to TRI_LSEEK.

* force Windows lseek to return Linux expected type.
2019-07-26 07:53:39 -04:00
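The lseek-related items in this commit amount to a small portability shim: on Windows the deprecated POSIX name is avoided and the return value is forced to the type the Linux code expects. Below is a hedged sketch of what such a TRI_LSEEK macro could look like; it uses `_lseeki64` and a 64-bit cast as assumptions (the commit itself mentions `_lseek()`), and the real ArangoDB definition may differ.

```cpp
#include <cstdint>

// Illustrative only; the real TRI_LSEEK in ArangoDB may be defined differently.
#ifdef _WIN32
#include <io.h>
// Avoid the deprecated POSIX name and widen the result to a 64-bit signed
// type, matching what the off_t-based Linux callers expect.
#define TRI_LSEEK(fd, offset, whence) \
  static_cast<std::int64_t>(_lseeki64((fd), (offset), (whence)))
#else
#include <unistd.h>
// On POSIX systems plain lseek already returns off_t.
#define TRI_LSEEK(fd, offset, whence) ::lseek((fd), (offset), (whence))
#endif
```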
Michael Hackstein 36b1d290a9 Bug fix/failover with min replication factor (#9486)
* Improve collection time of IResearchQueryOptimizationTest

* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and no one cares for it

* Added some assertions on minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor can now be changed via the collection properties route

* fixed wrong assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compile issues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as I cannot take it anymore...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second piece of information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance now reports the previous version to follower info instead of lying by itself. The FollowerInfo now gets a failover-safe mode to report in-sync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* Initial commit, old plan, does not yet work

* fixed ires tests

* Included FailoverCandidates key. Not fully implemented

* fixed wrong assert

* unified in sync follower reporting

* fixed compiler errors

* Cleanup locking, and fixed potential deadlocks

* Comments about locking order in FollowerInfo.

* properly check uint

* Keep old leader as potential failover candidate

* Transaction methods now use followerInfo to check if the leader can write; this might have the side effect that 'failoverCandidates' are updated

* Let agency check failoverCandidates if possible

* Initialize member variables

* Use unified follower reporting in DBServerAgencySync

* Removed obsolete variable, collecting it somewhere else

* repl factor attr check

* Reimplemented previous followers, second attempt now. PhaseOne and PhaseTwo can now synchronize on current.

* Fixed assertion, forgot an off-by-one

* adjusted test to be more precise now

* Fixed failover candidates list

* Disable write on dropping too many followers

* Allow running updateFailoverCandidates multiple times with the same leader.

* Final fixes, resilience tests now green, crossing fingers for jenkins

* Fixed race on atomics comparison

* Fixed invalid number type

* added nullptr handling

* added nullptr handling

* Removed invalid assert

* Make takeover of leadership an atomic operation

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* Fixed creation code to use takeoverLeadership

* Update arangod/Cluster/FollowerInfo.h

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Applied review fixes

* There is no timeout

* Moved AQL + Pregel to INTERNAL_AQL lane, which is medium priority, to avoid deadlocks with Sync replication

* More review fixes

* Use difference if you want to compare two vectors...

* Use std::string ...

* Now check if we are in recovery mode

* Added documentation for minReplicationFactor

* Added readme update as well in documentation
2019-07-19 15:00:30 +02:00
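The central rule in this commit (and in the related minReplicationFactor feature commit below) is that a leader rejects writes once it can no longer reach the minimal replication factor with itself plus its in-sync followers. A minimal sketch of that check follows; `allowedToWrite` and the satellite-collection handling of zero are illustrative assumptions, not the actual FollowerInfo/transaction code.

```cpp
#include <cstddef>

// Hypothetical check, not the actual ArangoDB API: a write may proceed only
// if the leader plus its in-sync followers reach the configured minimal
// replication factor.
bool allowedToWrite(std::size_t numInSyncFollowers,
                    std::size_t minReplicationFactor) {
  // Per the commit, satellite collections set the minimum to zero,
  // which here means "no minimum to enforce".
  if (minReplicationFactor == 0) {
    return true;
  }
  // The leader itself counts as one copy (the off-by-one the commit fixed
  // was enforcing one copy too many).
  return 1 + numInSyncFollowers >= minReplicationFactor;
}
```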
Wilfried Goesgens c922c5f133 move deleting of ClusterComm threads to the unprepare of the ClusterFeature (#9369) 2019-07-19 13:53:00 +02:00
Michael Hackstein cbcf561450 Feature/min replication factor (#9433)
* Added a minReplicationFactor field in Collections. It is not possible to modify it yet and no one cares for it

* Added some assertions on minReplicationFactor

* Transaction API will now reject writes as soon as minimal replication factor is NOT fulfilled

* added minReplicationFactor to the user interface, preparation for the collection api changes

* added minReplicationFactor to VocBaseCollection, RestReplicationHandler, RestCollectionHandler, ClusterMethods, ClusterInfo and ClusterCollectionCreationInfo

* added minReplicationFactor usage to tests

* TODO TEMPORARY COMMIT FOR TESTING PLEASE REVERT ME

* minReplicationFactor can now be changed via the collection properties route

* fixed wrong assert

* added minReplicationFactor to the graph management ui

* added minReplicationFactor to the gharial api

* Fixed off-by-one error in minReplicationFactor. We actually enforced one more.

* adjusted description of minReplicationFactor

* FollowerInfo Refactoring

* added gharial api graph creation tests with minimal replication factor

* proper cleanup of shell collection tests, removed lots of duplicate code, preparation for some new tests

* added collection create tests using invalid/valid names, replicationFactor and minReplicationFactor

* Debug logging

* MORE Debug logging

* Included replication fast lane

* Use correct minreplicationfactor

* modified debug logging

* Fixed compile issues

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* MORE Debug logging

* Revert "MORE Debug logging"

This reverts commit dab5af28c0.

* Revert "MORE Debug logging"

This reverts commit 6134b664bd.

* Revert "MORE Debug logging"

This reverts commit 80160bdf3b.

* Revert "MORE Debug logging"

This reverts commit 06aabcdfe1.

* Removed debug output

* Added replication fast lane. Also refactored the commands as I cannot take it anymore...

* Put some requests of RocksDBReplication onto CATCHUP Lane.

* Put some requests of MMFilesReplication onto CATCHUP Lane.

* Adjusted Fast and MED lane usage in Supervised scheduler

* Added changelog entry

* Added new features entry

* A new leader will now keep old followers in case of failover

* Update arangod/Cluster/ClusterCollectionCreationInfo.cpp

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Fixed JSLINT

* Unified lane handling of replication handlers

* Sorry forgotten in last commit

* replaced strings with static strings

* more use of static strings

* optimized min repl description in the ui

* decr initial loop variable

* clean up of the createWithId test

* more use of static strings

* Update js/apps/system/_admin/aardvark/APP/frontend/js/views/collectionsView.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Added some comments on condition, renamed variable as suggested in review

* Added check for min replicationFactor to be non-zero

* Added assertion

* Added function to modify min and max replication factor in one go

* added missing semicolon

* rm log devel

* Added a second piece of information to follower info that can keep track of followers that have been in sync before a failover has taken place

* Maintenance now reports the previous version to follower info instead of lying by itself. The FollowerInfo now gets a failover-safe mode to report in-sync followers

* check replFactor against nr dbservers

* Add lie reporting in CURRENT

* Reverted most of my recent commits about Failover situation. The intended plan simply does not work out

* move replication checks from logical collection to rest collection handler

* added more replication tests

* Include assert only if we are not in gtest

* jslint

* set min repl factor to zero if satellite collection

* check replication attributes in v8 collection

* fixed ires tests

* fixed wrong assert

* properly check uint

* repl factor attr check

* adjusted test to be more precise now

* Fixed race on atomics comparison

* Fixed invalid number type

* Update tests/js/common/shell/shell-cluster-collection.js

Co-Authored-By: Tobias Gödderz <tobias@arangodb.com>

* Review fixes

* More review fixes
2019-07-19 13:02:28 +02:00
Simon e5507d840f Feature/comm task refactor (#9426) 2019-07-16 09:43:25 +02:00
Jan 5c39692c20 remove several outdated error codes (#9402) 2019-07-05 13:36:16 +02:00
Jan 1a58cc2213 add VelocyPackHelper::equal method (#9389) 2019-07-03 12:15:11 +02:00
Jan 11f5f33659 make sure all error code names are prefixed with ERROR_ @fceller @kvs85 (#9384) 2019-07-02 18:07:33 +02:00
Jan 9cb08ded92 make the comparison functions unambiguous (#9349)
* make the comparison functions unambiguous

* added @kaveh's suggestion
2019-07-01 16:35:28 +02:00
Wilfried Goesgens 6561f21536 Feature/make rspec timeouteable (#9354)
* make rspec able to time out (on windows for now)

* add timeout state, fix windows implementation

* abort test run if a timeout in rspec occurred

* lint

* fix scope of procdump stopper

* fix include scope

* fix invocation

* use proper variable to wait for
2019-06-28 15:01:04 +02:00
Lars Maier eb1aa6e024 Response compression (#9300)
* First draft of response compression.

* Cleanup.

* Removed compression from /_api/version.

* Added ruby test for response compression.
2019-06-28 10:02:48 +02:00
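Response compression as introduced in the commit above is, at its core, content negotiation: compress the body only when the client advertises gzip support, and skip exempted endpoints such as /_api/version. A small illustrative sketch of that decision logic follows; the function name `shouldCompressResponse` and the 256-byte size threshold are assumptions, and the compression call itself (e.g. via zlib) is not shown.

```cpp
#include <cstddef>
#include <string>

// Decide whether a response body should be gzip-compressed.
// Illustrative only; the actual ArangoDB handler logic differs.
bool shouldCompressResponse(std::string const& acceptEncodingHeader,
                            std::string const& requestPath,
                            std::size_t bodySize) {
  // Exempted endpoint, per the commit: /_api/version stays uncompressed.
  if (requestPath == "/_api/version") {
    return false;
  }
  // Tiny bodies are not worth compressing (threshold is an assumption).
  if (bodySize < 256) {
    return false;
  }
  // Only compress when the client explicitly accepts gzip.
  return acceptEncodingHeader.find("gzip") != std::string::npos;
}
```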
Jan 32ce797be4 make it possible to limit the number of parallel WAL tailing invocations (#9344) 2019-06-27 18:43:53 +02:00
Jan a5a656adbd only build the status messages if actually necessary (#9318) 2019-06-25 10:16:12 +02:00
Jan Christoph Uhde 3f603f024f remove some containers from common.h (#9223)
* remove some containers from Common.h

* enterprise fixes
2019-06-07 13:27:24 +02:00