mirror of https://gitee.com/bigwinds/arangodb
Bug fix/collection babies race timeout (#9185)
* Fixed include guard.
* Forward port of 3.4 bug-fix.
* Removed lockers altogether; we are already secured by the mutex.
* Fixed recursive lock gathering.
This commit is contained in:
parent
cc125b377c
commit
2c78e2471b
CHANGELOG | 21
CHANGELOG
@@ -1,6 +1,9 @@
 devel
 -----
 
+* Speed up collection creation process in cluster, if not all agency callbacks are
+  delivered successfully.
+
 * increased performance of document inserts, by reducing the number of checks in unique / primary indexes
 
 * fixed a callback function in the web UI where the variable `this` was out of scope.
@@ -34,7 +37,7 @@ devel
 v3.5.0-rc.3 (2019-05-31)
 ------------------------
 
 * fix issue #9106: Sparse Skiplist Index on multiple fields not used for FILTER + SORT query
 
   Allow AQL query optimizer to use sparse indexes in more cases, specifically when
@@ -52,7 +55,7 @@ v3.5.0-rc.3 (2019-05-31)
 * Bugfix for smart graph traversals with uniqueVertices: path, which could
   sometimes lead to erroneous traversal results
 
 * Pregel algorithms can be run with the option "useMemoryMaps: true" to be
   able to run algorithms on data that is bigger than the available RAM.
 
 * fix a race in TTL thread deactivation/shutdown
@@ -80,7 +83,7 @@ v3.5.0-rc.2 (2019-05-23)
   and uncompressed data blocks not fitting into the block cache
 
   The error can only occur for collection or index scans with the RocksDB storage engine
   when the RocksDB block cache is used and set to a very small size, plus its maximum size is
   enforced by setting the `--rocksdb.enforce-block-cache-size-limit` option to `true`.
 
   Previously these incomplete reads could have been ignored silently, making collection or
@@ -88,7 +91,7 @@ v3.5.0-rc.2 (2019-05-23)
 
 * fixed internal issue #3918: added optional second parameter "withId" to AQL
   function PREGEL_RESULT
 
   this parameter defaults to `false`. When set to `true` the results of the Pregel
   computation run will also contain the `_id` attribute for each vertex and not
   just `_key`. This allows distinguishing vertices from different vertex collections.
@@ -99,9 +102,9 @@ v3.5.0-rc.2 (2019-05-23)
 
 * internally switch unit tests framework from catch to gtest
 
 * disable selection of index types "hash" and "skiplist" in the web interface when
   using the RocksDB engine. The index types "hash", "skiplist" and "persistent" are
   just aliases of each other with the RocksDB engine, so there is no need to offer all
   of them. After initially only offering "hash" indexes, we decided to only offer
   indexes of type "persistent", as it is technically the most
   appropriate description.
@@ -619,7 +622,7 @@ v3.4.6 (2019-05-21)
   and uncompressed data blocks not fitting into the block cache
 
   The error can only occur for collection or index scans with the RocksDB storage engine
   when the RocksDB block cache is used and set to a very small size, plus its maximum size is
   enforced by setting the `--rocksdb.enforce-block-cache-size-limit` option to `true`.
 
   Previously these incomplete reads could have been ignored silently, making collection or
@@ -627,7 +630,7 @@ v3.4.6 (2019-05-21)
 
 * fixed internal issue #3918: added optional second parameter "withId" to AQL
   function PREGEL_RESULT
 
   this parameter defaults to `false`. When set to `true` the results of the Pregel
   computation run will also contain the `_id` attribute for each vertex and not
   just `_key`. This allows distinguishing vertices from different vertex collections.
arangod/Cluster/AgencyCallback.cpp
@@ -125,7 +125,7 @@ bool AgencyCallback::execute(std::shared_ptr<VPackBuilder> newData) {
   return result;
 }
 
-void AgencyCallback::executeByCallbackOrTimeout(double maxTimeout) {
+bool AgencyCallback::executeByCallbackOrTimeout(double maxTimeout) {
   // One needs to acquire the mutex of the condition variable
   // before entering this function!
   if (!_cv.wait(static_cast<uint64_t>(maxTimeout * 1000000.0)) &&
@@ -134,5 +134,7 @@ void AgencyCallback::executeByCallbackOrTimeout(double maxTimeout) {
         << "Waiting done and nothing happended. Refetching to be sure";
     // mop: watches have not triggered during our sleep...recheck to be sure
     refetchAndUpdate(false, true);  // Force a check
+    return true;
   }
+  return false;
 }
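The signature change above turns a fire-and-forget wait into one that reports how it ended: executeByCallbackOrTimeout now returns true when the wait ran into maxTimeout (after forcing a refetch), and false when the condition variable was signalled first. A minimal, self-contained sketch of the same semantics, using std::condition_variable instead of ArangoDB's ConditionVariable and CONDITION_LOCKER; all names here are illustrative, not taken from the patch:

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the AgencyCallback wait logic.
class Waiter {
 public:
  // Precondition, as in the patch: the caller already holds the lock.
  // Returns true if we ran into the timeout, false if notify() woke us.
  bool waitOrTimeout(std::unique_lock<std::mutex>& lock, double maxTimeoutSec) {
    return cv_.wait_for(lock, std::chrono::duration<double>(maxTimeoutSec)) ==
           std::cv_status::timeout;
  }

  void notify() { cv_.notify_all(); }

  std::mutex mutex;

 private:
  std::condition_variable cv_;
};

int main() {
  Waiter w;
  std::thread t([&] {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    w.notify();  // simulates the agency callback arriving
  });
  std::unique_lock<std::mutex> lock(w.mutex);
  bool timedOut = w.waitOrTimeout(lock, 1.0);
  std::cout << (timedOut ? "timeout" : "woken by callback") << std::endl;
  t.join();
  return 0;
}

Like the patched function, this sketch treats any wakeup before the deadline as a notification; the surrounding retry loop in the coordinator is what makes that simplification safe.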
arangod/Cluster/AgencyCallback.h
@@ -112,9 +112,12 @@ class AgencyCallback {
 
   //////////////////////////////////////////////////////////////////////////////
   /// @brief wait until a callback is received or a timeout has happened
+  ///
+  /// @return true => if we got woken up after maxTimeout
+  ///         false => if someone else ringed the condition variable
   //////////////////////////////////////////////////////////////////////////////
 
-  void executeByCallbackOrTimeout(double);
+  bool executeByCallbackOrTimeout(double);
 
   //////////////////////////////////////////////////////////////////////////////
   /// @brief private members
arangod/Cluster/ClusterInfo.cpp
@@ -1977,13 +1977,9 @@ Result ClusterInfo::createCollectionsCoordinator(std::string const& databaseName
 
     if (nrDone->load(std::memory_order_acquire) == infos.size()) {
       {
-        // We need to lock all condition variables
-        std::vector<::arangodb::basics::ConditionLocker> lockers;
-        for (auto& cb : agencyCallbacks) {
-          CONDITION_LOCKER(locker, cb->_cv);
-        }
+        // We do not need to lock all condition variables
+        // we are save by cacheMutex
         cbGuard.fire();
-        // After the guard is done we can release the lockers
       }
       // Now we need to remove TTL + the IsBuilding flag in Agency
       opers.clear();
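This is the "removed lockers" part of the commit message: the section is already serialized by ClusterInfo's cache mutex (hence the new comment "we are save by cacheMutex"), and the old per-callback locking never did what its comment claimed anyway, since a scoped CONDITION_LOCKER constructed in a loop body is released again at the end of each iteration while the lockers vector stayed empty. A small sketch demonstrating both points with plain std::mutex; the names are illustrative stand-ins, not ArangoDB types:

#include <mutex>
#include <vector>

int main() {
  std::vector<std::mutex> callbackLocks(3);  // stand-ins for the per-callback _cv mutexes

  // Old pattern, simplified: each scoped locker dies at the end of its own
  // loop iteration, so by the time the guard would fire, nothing is held.
  for (auto& m : callbackLocks) {
    std::lock_guard<std::mutex> locker(m);  // unlocked again at the closing brace
  }

  // Demonstration: every lock is free again right after the loop.
  for (auto& m : callbackLocks) {
    bool wasFree = m.try_lock();  // succeeds: the loop above held nothing
    if (wasFree) m.unlock();
  }

  // New pattern: one outer mutex (the cache mutex in the patch) serializes
  // the whole section, so the per-callback loop is dropped entirely and
  // cbGuard.fire() runs under that mutex alone.
  std::mutex cacheMutex;
  std::lock_guard<std::mutex> guard(cacheMutex);
  // ... the cbGuard.fire() equivalent would run here ...
  return 0;
}

The same simplification is applied a second time in the next hunk.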
@@ -2009,13 +2005,9 @@ Result ClusterInfo::createCollectionsCoordinator(std::string const& databaseName
     }
     if (tmpRes > TRI_ERROR_NO_ERROR) {
       {
-        // We need to lock all condition variables
-        std::vector<::arangodb::basics::ConditionLocker> lockers;
-        for (auto& cb : agencyCallbacks) {
-          CONDITION_LOCKER(locker, cb->_cv);
-        }
+        // We do not need to lock all condition variables
+        // we are save by cacheMutex
         cbGuard.fire();
-        // After the guard is done we can release the lockers
       }
 
       // report error
@@ -2047,9 +2039,22 @@ Result ClusterInfo::createCollectionsCoordinator(std::string const& databaseName
     TRI_ASSERT(agencyCallbacks.size() == infos.size());
     for (size_t i = 0; i < infos.size(); ++i) {
       if (infos[i].state == ClusterCollectionCreationInfo::INIT) {
-        // This one has not responded, wait for it.
-        CONDITION_LOCKER(locker, agencyCallbacks[i]->_cv);
-        agencyCallbacks[i]->executeByCallbackOrTimeout(interval);
+        bool wokenUp = false;
+        {
+          // This one has not responded, wait for it.
+          CONDITION_LOCKER(locker, agencyCallbacks[i]->_cv);
+          wokenUp = agencyCallbacks[i]->executeByCallbackOrTimeout(interval);
+        }
+        if (wokenUp) {
+          ++i;
+          // We got woken up by waittime, not by callback.
+          // Let us check if we skipped other callbacks as well
+          for (; i < infos.size(); ++i) {
+            if (infos[i].state == ClusterCollectionCreationInfo::INIT) {
+              agencyCallbacks[i]->refetchAndUpdate(true, false);
+            }
+          }
+        }
         break;
       }
     }
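Combined with the AgencyCallback change, the loop now waits on the first pending callback only; if that wait ended by timeout rather than by notification, it does not sit through a full timeout for every remaining entry but force-refreshes them all and breaks back into the outer retry loop. A condensed, self-contained sketch of that control flow, with hypothetical Callback/State types and std:: primitives standing in for ArangoDB's:

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <vector>

enum class State { INIT, DONE };

struct Callback {  // hypothetical stand-in for AgencyCallback
  std::mutex mutex;
  std::condition_variable cv;

  // true => timeout, false => notified (mirrors executeByCallbackOrTimeout)
  bool waitOrTimeout(std::unique_lock<std::mutex>& lock, double seconds) {
    return cv.wait_for(lock, std::chrono::duration<double>(seconds)) ==
           std::cv_status::timeout;
  }

  void refetchAndUpdate() { /* would re-read the agency state here */ }
};

// Mirrors the patched loop: wait for the first pending entry; on timeout,
// eagerly refresh all other pending entries instead of waiting on each.
void waitForPending(std::vector<State>& states, std::vector<Callback>& cbs,
                    double interval) {
  for (size_t i = 0; i < states.size(); ++i) {
    if (states[i] == State::INIT) {
      bool wokenUp = false;
      {
        std::unique_lock<std::mutex> lock(cbs[i].mutex);
        wokenUp = cbs[i].waitOrTimeout(lock, interval);
      }
      if (wokenUp) {
        // Timeout: notifications may have been lost, so recheck the rest.
        for (++i; i < states.size(); ++i) {
          if (states[i] == State::INIT) {
            cbs[i].refetchAndUpdate();
          }
        }
      }
      break;  // as in the patch: at most one wait per outer retry iteration
    }
  }
}

int main() {
  std::vector<State> states(2, State::INIT);
  std::vector<Callback> cbs(2);
  waitForPending(states, cbs, 0.01);  // times out, then refetches cbs[1]
  return 0;
}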