1
0
Fork 0
Commit Graph

127 Commits

Author SHA1 Message Date
Lars Maier 821509ef0f Fixed wrong permissions for rebalanceShards. (#10374) 2019-11-12 12:23:23 +03:00
KVS85 e64080e207
Merge 3.5.1 back to 3.5 (#9713)
* Bug fix 3.5/make arangosh reconnect (#9615)

* make arangosh reconnect

* added CHANGELOG entry

* fix lagging AgencyCallbacks (#9620)

* fix lagging AgencyCallbacks

* optimizations, discussed with @mchacki

* fix wording

* updated CHANGELOG

* fix yet another undefined behavior (#9629)

* [3.5.1] Fail the FailedLeader Job if the new leader fails. (#9628)

* Fail the FailedLeader Job if the new leader fails.

* Updated changelog.

* In case of timeout do not rollback.

* Fixed catch tests.

* Changed wording.

* DELETED rollback.

* reduce wait timeouts as a mitigation for notifying waiters without ho… (#9619)

* reduce wait timeouts as a mitigation for notifying waiters without holding the required mutex

this is a quick mitigation only, which reduces maximum wait time from 1
second to 100 milliseconds without changing other behavior.

the main problem of notifying pending writers without successfully
acquiring the required mutex still needs proper addressing.

* adjust timing-dependent test

* [3.5.1] Fast Controlled Leaderchange (#9634)

* First draft of keeping in sync during controlled leader change.

* Test if server is actually the leader in plan.

* Updated changelog.

* Added oldLeader check for set-the-leader request.

* Small fixes.

* Removed LOG_DEVEL.

* less copying, more moving! 🚚 (#9645)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* Port TakeoverShardLeadership from devel to 3.5.1 (#9659)

* Create TakeoverShardLeader job.
* Add TakeoverShardLeadership to Action factory.
* Add log message at level debug.
* Sort out LOG_TOPIC ids.
* Fix unit tests.
* CHANGELOG.

* Bug fix 3.5/hide mmfiles specific info in web ui (#9668)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* hide MMFiles-specific information when we don't need it

* Ported ResignLeadership to 3.5 (#9656)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* Ported ResignLeadership to 3.5

* Add the actual http route.

* Aardvark: Add k Shortest Paths example graph to UI (#9491) (#9661)

* Aardvark: Add k Shortest Paths example graph to UI (#9491)

* Add example graph to UI

* Add kShortestPathsGraph to examples.js

* Update example-graph.js

* Update aardvark.js

* Regenerate UI

* add the ability to have cluster special examples (#9613) (#9663)

* add the ability to have cluster special examples

* Update get_cluster_health.md

* fix abort condition, fix negative filtering for cluster tests

* Test if job fails with unmet assertion

* Remove cluster test example

* germanize

* better skip reasons

* removing superfluous semicolons

* Revert skip reasons, too noisy

* various replication improvements: (#9675)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* various replication improvements:

- better debuggability (more log details)
- shorter minimum wait delay in active failover
- fixed too early pruning of WAL files on leaders

* Bug fix 3.5/fix rocksdb return code (#9692)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* fix return codes for concurrent writes to same documents

* [3.5] Feature/rebootid notice changes, backport of #9523 (#9684)

* Feature/rebootid notice changes, backport of #9523

* Fixed error code to not re-use an old one

* Bug fix 3.5/issue 9679 (#9682)

* attempt to fix load_balancing tests in slow test environments (#9626)

* Bug fix/fix swagger datatype (#9045) (#9602)

* Bug fix/fix swagger datatype (#9045)

* remove http so https arangos will work

* verify that query parameters are proper swagger data types, fix offending documentation files

* return the actual type - not the list of available ones

* check formats

* there is no uint64 in swagger

* Fresh Swagger

* fixed issue #9679

* bug-fix/issue-#9660 (#9704) (#9707)

* bug-fix/issue-#9660 (#9704)

* fix issue

* Update tests/js/common/aql/aql-view-arangosearch-cluster.inc

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* Update tests/js/common/aql/aql-view-arangosearch-noncluster.js

Co-Authored-By: Jan <jsteemann@users.noreply.github.com>

* fix cluster tests

* Update CHANGELOG

* [3.5] agency node fixes (#9698)

* node fixes port from 3.4
* fixed change log

* update rocksdb statistics to deliver sums from column family instead of single value from default family. (#9706)

* Feature 3.5/geo functions (#9710)

* Add support for WGS84 on distances (#9672)

* Add area calculations (#9693)

* Update CHANGELOG
2019-08-14 20:24:47 +03:00
Lars Maier 7d98b5bde4 Info log when server is removed. Store remove date in agency. (#9285) 2019-06-24 14:04:57 +02:00
Lars Maier 2bb96eedac Fixed lost precondition when removing server. (#8987) 2019-05-14 15:38:17 +02:00
Jan 900c23fda7
use _system database in /_admin/cluster/health (#8898) 2019-05-06 09:13:33 +02:00
Jan Christoph Uhde 677a79026c Foxx Security (#8845) 2019-04-25 09:56:29 +02:00
Kaveh Vahedipour 68178ba165 [devel] supervision bug fix backports (#8314)
* back ports for supervision fixes from 3.4 part 1

* back ports for supervision fixes from 3.4 part 2
2019-03-04 19:27:24 +01:00
Simon 1123bc10c0 Check authentication for cluster APIs (#8051) 2019-02-04 14:47:44 +01:00
Lars Maier 52cff7ad55 Feature/engine version added to agent configuration (#7481) (#7524)
* agents' is obtained from leader's configuration
* corrections in Supervision for advertised endpoints
* change log
* Updated Documentation for cluster/health.
* Unified naming convention.
* Fixed missing update of volatile fields.
* Set version in right order.
* Removed debug output.
* Fixed jslint - missing ;
2018-11-29 14:25:40 +01:00
Kaveh Vahedipour 28754cbf15 Feature/schmutz plus plus (#5972)
- Schmutz now called "Maintenance" and completely implemented in C++
 - Fix index locking bug in mmfiles
 - Fix a bug in mmfiles with silent option and repsert
 - Slightly increase supervision okperiod and graceperiod
2018-08-24 12:15:35 +02:00
Simon 18c3069117 Make cluster routes check roles (#6239) 2018-08-24 09:46:27 +02:00
Jan 102f15bece
removed several unused internal APIs (#6193) 2018-08-20 12:57:58 +02:00
Max Neunhöffer a84d9f7335
Add an API to query for status of moveShard and cleanOutServer jobs. (#5593)
This is so far intentionally undocumented, since we want to collect
experience with it first.
2018-06-15 16:27:47 +02:00
Kaveh Vahedipour 3d043b35a3 Feature/supervsion maintenance mode (#5108)
* Supervision goes to Maintenance mode, when /arango/Supervision/Maintenance exists
* coordinator route stands
* stop updates in transient, when supervision off
2018-04-20 13:23:22 +02:00
Jan 5fd0bb7dbf
removed remainders of dysfunctional `/_admin/cluster-test` and `/_admin/clusterCheckPort` API endpoints and removed them from documentation (#4861) 2018-03-18 22:48:09 +01:00
Simon 35136a89c0 Fix some problems with active failover (#4540) 2018-02-09 15:11:53 +01:00
Heiko 61de1b6099 Bug fix/optimize shard distribution api and ui (#3921)
* UI: document/edge editor now remembering their modes (e.g. code or tree)

* changed shardDistribution api behaviour, added PUT route to only fetch collection based shard distribution

* ui: optimized shards view, added missing cleanup function in nodes view

* broken test

* shard distribution tests not fit the new api behaviour

* variables as reference

* CHANGELOG
2018-01-02 12:42:12 +01:00
Andreas Streichardt 5071a77340 Fix jslint 2017-11-21 11:36:25 +01:00
Andreas Streichardt 810ca8a9ef Fix removal of nodes 2017-11-20 17:13:37 +01:00
Michael Hackstein 5c633f9fae Bug fix/speedup shard distribution (#3645)
* Added a more sophisticated test for shardDistribution format

* Updated shard distribution test to use request instead of download

* Added a cxx reporter for the shard distribuation. WIP

* Added some virtual functions/classes for Mocking

* Added a unittest for the new CXX ShardDistribution Reporter.

* The ShardDsitributionReporter now reports Plan and Current correctly. However it does not dare to find a good total/current value and just returns a default. Hence these tests are still red

* Shard distribution now uses the cxx variant

* The ShardDistribution reporter now tries to execute count on the shards

* Updated changelog

* Added error case tests. If the servers time out the mechanism will stop bothering after two seconds and just report default values.
2017-11-10 15:17:08 +01:00
Jan 7613bc4314 Bug fix/fixes 0211 (#3568)
* remove some non-unused V8 persistents

* do not throw that many bogus assertions

* do not rely on server role being defined

* slightly better debug output for V8 context debugging

* fix collection ids in inventory response

* simplify bootstrap a bit

* slightly better error handling

* make elapsed time a queryable value

* use less memory for stub collections

* added assertions that will always make sense

* added assertions

* do not garbage-collect while waiting

* less copying of parameters

* do not show "load indexes into memory" buttons for mmfiles engine

  as all indexes are in memory anyway

* when a collection is truncated via the web interface, flush the WAL and rotate all active journals

this will make close all open journals on leader and followers and make them subject to compaction opportunities

* fix invalid server id values being passed from web interface to backend

* introduce afterTruncate method for indexes

* added test case for issue #3447

* updated CHANGELOG

* don't warn about replicationFactor for system collections

* check that the queries actually use the geo index and not some other index

* properly report error in web interface

* fix some internals checks that made truncate fail for bigger collections in maintainer mode

* also run a compact() operation after a serious truncate

in order to make iteration over the truncated range much faster
when the collection is next accessed

* increase default maximum number of V8 contexts to at least 16
2017-11-09 12:48:15 +01:00
Jan 53ea1ba560 remove dead apis (#3574) 2017-11-03 14:44:21 +01:00
Simon Grätzer 7c31960cf2 Feature/async failover (#3451) 2017-10-18 23:59:29 +02:00
Heiko a03a86fe46 fixed wrong selection of the database inside the internal cluster js api (#3202) 2017-09-13 17:19:18 +02:00
Kaveh Vahedipour 627f344266 fixed a bug, where when servers failed, when also agency leadership c… (#3189)
* fixed a bug, where when servers failed, when also agency leadership changes

* redid entire design of checkDBServers/checkCoordinators.

* comparison in supervision must be between oldPersisted and newHealth

* UI stuff

* UI stuff

* FailedServer test needed adjustment

* Hopefully final round

* fixed supervision failure detection

* FailedServer tests back to origin devel

* oldNot documented among preconditions in Agency HTTP API docs

* changed only look for status updated

* non action line in api-cluster
2017-09-07 16:10:23 +02:00
Kaveh Vahedipour 9cad75e4e8 Feature/cluster id and extended health (#3073)
* added unique id to cluster, added access to Health

* added agents to health api

* added agents to health api

* added agents to health api

* transaction information for api

* agents listed like other servers

* missing line through merge conflict

* fixed git merge glitch
2017-08-18 11:36:53 +02:00
Kaveh Vahedipour 1d1e0f5a50 Feature/cluster id and extended health (#3046)
* added unique id to cluster, added access to Health

* added agents to health api

* added agents to health api

* added agents to health api

* transaction information for api

* agents listed like other servers

* missing line through merge conflict
2017-08-18 11:13:23 +02:00
Kaveh Vahedipour 231a360b3b fixes for secondaries 2017-07-11 14:05:51 +02:00
Andreas Streichardt 7f8ff26f41 Fix jslint 2017-05-11 15:34:56 +02:00
Andreas Streichardt 9ec902ace8 Allow removing dbnodes 2017-05-11 15:18:29 +02:00
hkernbach d7bfe4dfc7 Merge branch 'devel' of github.com:arangodb/arangodb into devel 2017-05-10 14:26:45 +02:00
hkernbach 38a3bab42e added api cluster api routes 2017-05-10 14:26:26 +02:00
Andreas Streichardt 23911d66d0 Finally fix the error where suddenly an array of dbservers are being called 2017-05-10 12:10:09 +02:00
Max Neunhoeffer bf2e5f30ca Port 3.1 fixes to devel, shortName translation. 2017-04-26 10:02:14 +02:00
Max Neunhoeffer c8a205b1aa New /_api/cluster/endpoints.
Also fix documentation (and deprecate) /_api/endpoint.
2017-03-15 13:33:50 +01:00
Andreas Streichardt e345415879 Allow removing dbservers which are no longer in use 2017-02-13 17:23:39 +01:00
Andreas Streichardt 1b92c8e46b Allow removing dbservers 2017-02-13 17:23:39 +01:00
jsteemann 5a1fe3a341 jslint 2017-02-13 14:35:14 +01:00
Andreas Streichardt 1bb8f97773 Fix secondaries 2017-02-13 14:00:19 +01:00
Andreas Streichardt cf39ddd39a Fix jslint 2017-02-10 19:43:04 +01:00
Andreas Streichardt bc1df86ddd remove lock queries 2017-02-10 19:34:29 +01:00
Andreas Streichardt f1da0c54f6 jslint 2017-02-08 15:07:47 +01:00
Andreas Streichardt edd27e7eff more remove checks 2017-02-08 15:07:47 +01:00
Andreas Streichardt ad1c7e7c13 oh jslint 2017-02-08 10:49:36 +01:00
Andreas Streichardt 2e3026db91 Make it possible to remove failed coordinators 2017-02-08 10:26:13 +01:00
hkernbach 4ae274fbd3 fixed cluster parsing issue when node information is not available 2017-01-10 17:07:51 +01:00
Andreas Streichardt 975e7a65b2 Fix cluster internal requests in authentication enabled cluster 2017-01-05 16:20:15 +01:00
Kaveh Vahedipour 805f894368 fixed faulty output when primary db server from changeSecondary api could not be found 2016-11-30 11:27:54 +01:00
Andreas Streichardt 2abc46f3e6 fix shardview...no more endless loops during shard rebalancing...also
fix error management when trying to reach dbservers
2016-11-21 17:47:06 +01:00
jsteemann aa5bd85b4f fix log message 2016-11-21 14:06:10 +01:00