---
layout: default
description: Note
---
Monitoring replication slave
============================

**Note**: This recipe works with ArangoDB 2.5. You need a collectd curl_json plugin with correct boolean type mapping.

Problem
-------

How to monitor the slave status using the collectd `curl_json` plugin.

Solution
--------

Since ArangoDB [reports the replication status in JSON](../http/replications-replication-applier.html#state-of-the-replication-applier),
integrating it with the [collectd curl_json plugin](monitoring-collectd.html)
should be an easy exercise. However, only very recent versions of collectd handle boolean flags correctly.

Our test master/slave setup runs with the master listening on `tcp://127.0.0.1:8529` and the slave (which we query) listening on `tcp://127.0.0.1:8530`.
They replicate a database named `testDatabase`.

Since replication appliers are active per database and our example doesn't use the default `_system` database, we need to specify the database name in the URL like this: `_db/testDatabase`.

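The database-specific path can also be assembled programmatically. The following is a minimal Python sketch; the helper name `applier_state_url` and its parameters are illustrative assumptions and not part of ArangoDB or collectd.

```python
from urllib.parse import quote

APPLIER_STATE_PATH = "/_api/replication/applier-state"


def applier_state_url(base_url, database=None):
    """Build the applier-state URL for one database (hypothetical helper).

    Without a database name the request goes to the default `_system`
    database, so no `_db/<name>` prefix is added.
    """
    base = base_url.rstrip("/")
    if database is None:
        return base + APPLIER_STATE_PATH
    # Percent-encode the name so it is safe as a single path segment.
    return base + "/_db/" + quote(database, safe="") + APPLIER_STATE_PATH


print(applier_state_url("http://localhost:8530", "testDatabase"))
```

For the test setup described above, this yields the same URL that is queried throughout this recipe.
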
We need to parse the document returned by a request like this:

    curl --dump-header - http://localhost:8530/_db/testDatabase/_api/replication/applier-state

If replication is not running, the document looks like this:

```javascript
{
  "state": {
    "running": false,
    "lastAppliedContinuousTick": null,
    "lastProcessedContinuousTick": null,
    "lastAvailableContinuousTick": null,
    "safeResumeTick": null,
    "progress": {
      "time": "2015-11-02T13:24:07Z",
      "message": "applier shut down",
      "failedConnects": 0
    },
    "totalRequests": 1,
    "totalFailedConnects": 0,
    "totalEvents": 0,
    "totalOperationsExcluded": 0,
    "lastError": {
      "time": "2015-11-02T13:24:07Z",
      "errorMessage": "no start tick",
      "errorNum": 1413
    },
    "time": "2015-11-02T13:31:53Z"
  },
  "server": {
    "version": "2.7.0",
    "serverId": "175584498800385"
  },
  "endpoint": "tcp://127.0.0.1:8529",
  "database": "testDatabase"
}
```

A running replication will return something like this:

```javascript
{
  "state": {
    "running": true,
    "lastAppliedContinuousTick": "1150610894145",
    "lastProcessedContinuousTick": "1150610894145",
    "lastAvailableContinuousTick": "1151639153985",
    "safeResumeTick": "1150610894145",
    "progress": {
      "time": "2015-11-02T13:49:56Z",
      "message": "fetching master log from tick 1150610894145",
      "failedConnects": 0
    },
    "totalRequests": 12,
    "totalFailedConnects": 0,
    "totalEvents": 2,
    "totalOperationsExcluded": 0,
    "lastError": {
      "errorNum": 0
    },
    "time": "2015-11-02T13:49:57Z"
  },
  "server": {
    "version": "2.7.0",
    "serverId": "175584498800385"
  },
  "endpoint": "tcp://127.0.0.1:8529",
  "database": "testDatabase"
}
```

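To illustrate which values a monitor needs from these documents, here is a minimal Python sketch that resolves slash-separated key paths such as `state/running` against a parsed response and maps booleans to `0`/`1`. The function names are hypothetical, and the sample is an abridged copy of the running response above; it approximates the idea and is not collectd's actual implementation.

```python
import json

# Abridged copy of the "running" response shown above.
SAMPLE = """
{
  "state": {
    "running": true,
    "totalRequests": 12,
    "totalFailedConnects": 0,
    "totalOperationsExcluded": 0
  },
  "database": "testDatabase"
}
"""


def lookup(document, key_path):
    """Follow a slash-separated path (e.g. "state/running") into nested dicts."""
    value = document
    for part in key_path.split("/"):
        value = value[part]
    return value


def as_metric(value):
    """Map JSON booleans to 1/0 and pass numbers through unchanged."""
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, (int, float)):
        return value
    raise TypeError("not a numeric metric: %r" % (value,))


document = json.loads(SAMPLE)
for path in ("state/running", "state/totalRequests", "state/totalFailedConnects"):
    print(path, as_metric(lookup(document, path)))
```

When the applier is stopped, the tick fields are `null`, so `as_metric` raises a `TypeError` for them; a real monitor would skip or special-case such values.
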
We create a simple collectd configuration in `/etc/collectd/collectd.conf.d/slave_testDatabase.conf` that matches our API:

```
TypesDB "/etc/collectd/collectd.conf.d/slavestate_types.db"
<Plugin curl_json>
  # Adjust the URL so collectd can reach your arangod slave instance:
  <URL "http://localhost:8530/_db/testDatabase/_api/replication/applier-state">
    # Set your authentication to that database here:
    # User "foo"
    # Password "bar"
    <Key "state/running">
      Type "boolean"
    </Key>
    <Key "state/totalOperationsExcluded">
      Type "counter"
    </Key>
    <Key "state/totalRequests">
      Type "counter"
    </Key>
    <Key "state/totalFailedConnects">
      Type "counter"
    </Key>
  </URL>
</Plugin>
```

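To check the endpoint and credentials outside of collectd, the same request can be sketched in Python with the standard library. The helper names, the timeout, and the use of HTTP basic authentication are assumptions for illustration; the `foo`/`bar` credentials in the usage note are the placeholders from the commented-out lines of the configuration.

```python
import base64
import json
import urllib.request

URL = "http://localhost:8530/_db/testDatabase/_api/replication/applier-state"


def build_request(url, user=None, password=None):
    """Create a GET request, adding an HTTP basic auth header if a user is given."""
    request = urllib.request.Request(url)
    if user is not None:
        credentials = "%s:%s" % (user, password or "")
        token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request


def fetch_applier_state(url=URL, user=None, password=None, timeout=5):
    """Fetch and parse the applier state; needs a reachable arangod instance."""
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a reachable slave, `fetch_applier_state(user="foo", password="bar")["state"]["running"]` evaluates to `True` or `False`, which is the flag the `state/running` key reads.
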
To get nice metric names, we specify our own `types.db` file in `/etc/collectd/collectd.conf.d/slavestate_types.db`:

```
boolean value:ABSOLUTE:0:1
```

With this in place, collectd reports `state/running` as `1` while the replication applier is running and as `0` while it is not.

**Author:** [Wilfried Goesgens](https://github.com/dothebart){:target="_blank"}

**Tags:** #monitoring #foxx #json