80 Commits

Author SHA1 Message Date
Markus Pfeiffer d25ea0e377 Cleanup ServerState.cpp (#9923)
* Remove unused function mkdir()

* Remove some outdated comments

* Add include guards to AgencyStrings.h

* Move check for initialized agency out of registerAtAgencyPhase1

* Whitespace cleanup

* Address two minor comments from review
2019-09-10 10:06:16 +02:00
Jan ec3043dd8f check for duplicate server endpoints on cluster startup (#9860) 2019-08-30 10:29:46 +02:00
Tobias Gödderz 9cd332b958 Feature/rebootid notice changes (#9523)
* Consolidated _servers and _serverAdvertisedEndpoints, added rebootId, prepared change notifications

* Cleanup

* Added a RebootId type

* Began implementing RebootTracker (still WIP)

* Moved RebootId operators into the class

* Removed RebootId operator<< again

* Added tests, added CallbackGuard, removed/commented old RebootTracker code

* Fix: do not try to call unset callbacks

* Split one test, added another

* Added more tests

* Renamed tests, added more tests

* Fixed missing variable declarations

* Let MockServer appear to be started

* Reordered test, fixed naming

* Implemented callMeOnChange()

* Re-implemented RebootTracker (not yet working)

* Resolved a TODO, updated a test, added comments

* Call old callbacks immediately

* Fixed tests

* Use EXPECT_* instead of ASSERT_*

* Suppress a log message

* Resolved TODOs

* Reverted changes on reading ServersRegistered

* Update RebootTracker

* Introduce `rebootId` into ServerState for Cluster

 * A server *boots* if it is started on a previously non-existing data
   directory and hence does not have a UUID yet.
 * A server *reboots* if it is started on a pre-existing data directory

We keep the rebootId in the cluster's agency under
Current/ServersKnown/$uuid/rebootId.

When rebooting (and subsequently re-joining a cluster), the server increments
its rebootId in Phase 2 of registration. This way it can be detected within the
cluster whether a server was restarted.

This information will later be used to handle cases where server restarts can
lead to problems, for example with transactions or in-progress queries.

* Move rebootId into Current/ServersKnown/

* Fixed typo

* Fixed log ids

* Add deletion of ServersKnown/UUID from agency

* Add deletion of Current/ServersKnown/UUID to removeServer

* Clean up readRebootIdFromAgency and add retry loop around it

* Bugfix

* Added nolint comments

* Fixed initialization order

* Fixed ClusterInfo-test

* Added log messages

* Revert "Fixed ClusterInfo-test"

This reverts commit d983596979.

* Disabled assertion for google tests

* Ignore windows compile warning

* Always call loadServers in loadCurrent

* Fix really subtle bug when not returning a value

* Fixed compile error due to forbidden implicit cast

* Fixed compile error on windows

* Fixed compile error due to devel merge

* Removed dead comment

* Removed TODO note

* Extended comment

* Removed TODO note

* Fixed using an invalidated iterator

* Copy string only if necessary

* Fixed compile error
2019-08-12 09:33:22 +02:00
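The rebootId scheme described in commit 9cd332b958 can be illustrated with a minimal sketch. This is not the ArangoDB implementation: `MockAgency`, `registerServer`, `hasRebooted` and `removeServer` are hypothetical names, the agency is modelled as a plain in-memory map, and the choice of 1 as the first rebootId is an assumption. Only the key layout `Current/ServersKnown/$uuid/rebootId` is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the agency: a flat key/value store.
using MockAgency = std::map<std::string, std::uint64_t>;

// Key layout as given in the commit message.
std::string rebootIdKey(std::string const& uuid) {
  assert(!uuid.empty());  // a server needs a UUID before it can register
  return "Current/ServersKnown/" + uuid + "/rebootId";
}

// Called whenever a server joins or re-joins the cluster.
// Assumption: the first registration yields rebootId 1, and every later
// start on the same data directory (same UUID) increments the stored value.
std::uint64_t registerServer(MockAgency& agency, std::string const& uuid) {
  std::uint64_t& id = agency[rebootIdKey(uuid)];  // starts at 0 if absent
  ++id;
  return id;
}

// A peer that remembers an older rebootId can detect a restart.
bool hasRebooted(MockAgency const& agency, std::string const& uuid,
                 std::uint64_t lastSeen) {
  auto it = agency.find(rebootIdKey(uuid));
  return it != agency.end() && it->second > lastSeen;
}

// Mirrors "Add deletion of Current/ServersKnown/UUID to removeServer".
void removeServer(MockAgency& agency, std::string const& uuid) {
  agency.erase(rebootIdKey(uuid));
}
```

In this sketch a peer stores the rebootId it last observed and compares it on each update. A larger stored value means the server restarted in between, which is the signal the commit message says will later be used for transactions and in-progress queries.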
Andrey Abramov e8b38bfa8e bug-fix/internal-issue-#549 (#9086)
* do not persist legacy analyzers into _analyzers table

* fix arangosearch upgrade in cluster

* get rid of Vasiliy's shit

* address review comments

* ensure link is synchronized after creation in upgrade

* fix compilation error

* minor cleanup

* fix tests

* distribute '_analyzers' collection as '_graphs'

* comment out Vasiliy's shit part 2
2019-05-30 22:00:06 +03:00
Lars Maier c99e8e8973 [devel] ClientID Agency Transaction (#8652)
* Changed clientId to format <serverid>:<uuid>.
* Changed behavior if id is not known.
2019-04-30 10:39:23 +02:00
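The `<serverid>:<uuid>` client id format from commit c99e8e8973 could be built and parsed roughly as follows. This is a sketch only: `makeClientId` and `splitClientId` are hypothetical helpers, not functions from the code base, and the sketch assumes that server ids never contain a colon, so the first colon is treated as the separator.

```cpp
#include <cassert>
#include <string>

// Builds "<serverid>:<uuid>".
// Assumption: serverId contains no ':' so the result can be split again.
std::string makeClientId(std::string const& serverId, std::string const& uuid) {
  assert(serverId.find(':') == std::string::npos);
  return serverId + ":" + uuid;
}

// Splits a client id at the first ':'.
// Returns false (leaving the outputs untouched) if the separator is missing
// or either part is empty.
bool splitClientId(std::string const& clientId, std::string& serverId,
                   std::string& uuid) {
  std::string::size_type pos = clientId.find(':');
  if (pos == std::string::npos || pos == 0 || pos + 1 == clientId.size()) {
    return false;
  }
  serverId = clientId.substr(0, pos);
  uuid = clientId.substr(pos + 1);
  return true;
}
```

Keeping the server id as a prefix lets a reader of the id tell which server issued a transaction without a separate lookup. How an unknown id is handled is not specified in the commit message, so it is left out here.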
Simon 49cc3bcd1e Refactorings from cluster trx improvement branch (#8391) 2019-03-14 23:13:17 +01:00
Tobias Gödderz a1d3bc3e94 Foxx queue jobs hanging after Foxxmaster crash (#7922)
* Fixed bug where the Foxxmaster doesn't reset jobs after a crash when it should, or a non-master coordinator removes jobs in progress during startup

* Added a regression test

* Updated CHANGELOG

* Fixed non-maintainer compile
2019-01-14 16:08:08 +01:00
Frank Celler ac9f375fb5 big reformat 2018-12-26 00:54:03 +01:00
Tobias Gödderz ceeae07ffe Reload Foxx routes during startup (#7533) 2018-12-10 12:41:50 +01:00
Simon cb4c07e0ed Replace engine equality feature (#6931)
* replace engine equality feature

* remove pointless code
2018-10-17 14:41:47 +02:00
Max Neunhöffer 84735955ea Add advertised endpoints. (#6104) 2018-09-13 16:30:55 +02:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
 - Fix index locking bug in mmfiles
 - Fix a bug in mmfiles with silent option and repsert
 - Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
jsteemann 39021d008d make engine equality check feature abort the startup when there are different storage engines used in a cluster 2018-07-17 14:08:01 +02:00
Dan Larkin-York 21e16a8a24 Add load balancer awareness for cursor API (#5682) 2018-07-03 14:29:09 +02:00
Simon 545561e9a9 Read only server (#5652) 2018-07-03 09:58:16 +02:00
Jan 8e6d5df129 fixed several minor compiler complaints (#5406) 2018-05-23 11:50:00 +02:00
Matthew Von-Maszewski dd03ca5dd8 shutdown quicker on db server and coordinator heartbeat threads (#5114)
* shutdown quicker on db server and coordinator heartbeat threads

* Adjust the new condition variable usage to follow normal coding patterns. The previous usage was probably fine, but this avoids the risk and simplifies future maintenance.
2018-04-20 15:00:08 +02:00
Simon 8be273efb8 Replication cleanup (#5105) 2018-04-17 08:17:42 +02:00
Jan 76dcd6ded5 added option `--cluster.require-persisted-id` (#5001) 2018-04-13 11:08:49 +02:00
Jan 7cb115a1a9 remove option `--cluster.my-local-info` (#4999) 2018-04-03 17:34:08 +02:00
Jan 5abf0c1185 Bug fix/fixes 1511 (#3711) 2017-11-16 14:18:51 +01:00
Kaveh Vahedipour 7e816db51e Bug fix/agency restart enhancements (#3619)
* Removed unused active(...) method in Agent
* Inception's restart from persistence allows peer with empty active RAFT list to join
* Agency's UUID is persisted outside of the database comparable to coordinator and db server action.
* Publicized Methods to UUID stuff in ServerState
* Inception method documentation
* added --agency.disaster-recovery-id to allow specifying a known former agency id. This is potentially a very dangerous option.
* Delete unused methods.
* separate _id and _recoveryId
* populating active list with entire pool
* Improve logging.
* reject gossip from unknown agent, if pool is complete
2017-11-10 23:40:26 +01:00
m0ppers 6cbf7159be Feature/server mode (#3590)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* Fixing some bugs

* Fixing tests

* Canceling ongoing operations

* removing some unused code + some asserts

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* Current work

* proper ifs

* Migrate resthandler to c++ and implement mode

* readonly server mode spec test

* Add changelog

* change code so it expects a full object and not just a string

* Fix jslint
2017-11-10 17:56:21 +01:00
Simon Grätzer ee8209943f Missing things for active / passive (#3578)
* Switching from ttl to supervision based failover mechanism

* Allowing canceling of ongoing actions

* refactored asyncjobmanager

* refactoring some code

* adding read-only flag

* catching some exceptions to reduce log pollution, removing unnecessary code, removing tests for _changeMode

* fixing "createsANewDatabaseWithAnInvalidUser"

* auth = off no longer makes everyone superuser

* Fixing cluster_sync and maybe resilience
2017-11-04 20:30:23 +01:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Max Neunhöffer 9a2385b941 Add host id detection and show in /_admin/cluster/Health. (#3389) 2017-10-11 12:42:44 +02:00
Simon Grätzer ffc465433a No access collections Improvements (#3190)
* consolidated EdgeDocumentToken

* optimizing cluster traversal

* adding skip collection checks

* API cleanup

* copying AQLValue to avoid use-after-free bugs

* Fixing rocksdb SingleServerEdgeCursor

* Fixing a collection resolving issue
2017-09-07 14:55:07 +02:00
Kaveh Vahedipour c1abc0333d cluster documentation varnish (#2553) 2017-06-12 19:02:11 +02:00
Andreas Streichardt 1bb8f97773 Fix secondaries 2017-02-13 14:00:19 +01:00
Andreas Streichardt fe07f3515f Fixup registering with agency 2017-02-10 19:35:11 +01:00
jsteemann a1b3bfcc80 don't include ServerState when not needed 2017-02-02 10:16:53 +01:00
Kaveh Vahedipour bcfec215b8 tested restart from 3.1 database 2017-01-28 20:32:29 +01:00
Kaveh Vahedipour daa1856aa0 localId overrules persisted UUID 2017-01-28 12:05:31 +01:00
Kaveh Vahedipour 4a95e82fa6 ShortName for servers in new ugly UUID world 2016-11-25 15:25:51 +01:00
Kaveh Vahedipour f553f3460e unique identifiers in cluster ids 2016-11-18 15:28:23 +01:00
Andreas Streichardt 1318fa313b Implement cluster authentication 2016-10-17 13:35:55 +02:00
Andreas Streichardt 87c8c0033a Improve cluster awareness in foxx and foxx queues 2016-08-10 12:26:24 +02:00
Andreas Streichardt 526c8f42c2 Fix foxx issues in cluster
Bootstrap will now be done on the bootstrap coordinator.

queues will now be executed by the "foxxmaster"
2016-07-29 16:06:31 +02:00
Andreas Streichardt f7301bdc7c Implement unregister on shutdown 2016-06-10 18:21:41 +02:00
jsteemann 0da9ac7cdc micro optimizations 2016-04-23 16:23:15 +02:00
Kaveh Vahedipour a315997617 callback redesign 2016-04-07 10:20:08 +02:00
Andreas Streichardt 90862b6081 Proper secondary => Primary failover 2016-03-17 22:39:15 +00:00
Andreas Streichardt dca42efb2e Rework cluster state handling 2016-03-17 14:48:33 +00:00
Andreas Streichardt adce528373 Proper initialization
Also find a fitting spot for our role
2016-02-04 11:29:43 +01:00
Jan Steemann 9046e1831b clang-format 2016-01-27 13:43:46 +01:00
jsteemann 842384016d namespace cleanup 2016-01-21 00:20:22 +01:00
jsteemann 431900f17a changed namespace from triagens to arangodb 2016-01-17 00:44:53 +01:00
Jan Steemann 687d6133f0 comments reformatting 2016-01-11 09:52:39 +01:00
jsteemann 9f0576c65f don't rely so much on namespace std being present 2016-01-08 01:05:06 +01:00
jsteemann 50c0e18d53 removed useless con|destructor comments 2016-01-07 21:19:53 +01:00