centralized more of the sharding strategies code (#6140)

Jan 2018-08-15 14:37:01 +02:00 committed by GitHub
parent dc2bca23cf
commit 2bc672cebd
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
14 changed files with 410 additions and 102 deletions

View File

@ -112,6 +112,36 @@ included in the list of attribute paths for the index:
| a, b, c | a, b | not allowed |
| a, b, c | a, b, c | allowed |
Sharding strategy
-----------------
The *shardingStrategy* attribute specifies the name of the sharding
strategy to use for the collection. Since ArangoDB 3.4 there are
different sharding strategies to select from when creating a new
collection. The selected *shardingStrategy* value will remain
fixed for the collection and cannot be changed afterwards. This is
important to make the collection keep its sharding settings and
always find documents already distributed to shards using the same
initial sharding algorithm.
The available sharding strategies are:
- `community-compat`: default sharding used by ArangoDB community
versions before ArangoDB 3.4
- `enterprise-compat`: default sharding used by ArangoDB enterprise
versions before ArangoDB 3.4
- `enterprise-smart-edge-compat`: default sharding used by smart edge
collections in ArangoDB enterprise versions before ArangoDB 3.4
- `hash`: default sharding used by ArangoDB 3.4 for new collections
(excluding smart edge collections)
- `enterprise-hash-smart-edge`: default sharding used by ArangoDB 3.4
for new smart edge collections
If no sharding strategy is specified, the default will be `hash` for
all collections, and `enterprise-hash-smart-edge` for all smart edge
collections (requires the *Enterprise Edition* of ArangoDB).
Manually overriding the sharding strategy does not yet provide a
benefit, but it may later in case other sharding strategies are added.
Moving/Rebalancing _shards_
---------------------------

View File

@ -20,7 +20,7 @@ these documents are marked as 'dead' with a deletion marker.
Over time the number of dead documents may rise, and we don't want to use the previously mentioned
resources, plus the disk space should be given back to the system.
Thus several journal files can be combined to one, ommitting the dead documents.
Thus several journal files can be combined to one, omitting the dead documents.
Combining several of these data files into one is called compaction. The compaction process reads
the alive documents from the original data files, and writes them into new data file.

View File

@ -165,11 +165,39 @@ to the [naming conventions](../NamingConventions/README.md).
performance on these collections.
- *distributeShardsLike* distribute the shards of this collection
cloning the shard distribution of another. If this value is set
it will copy *replicationFactor* and *numberOfShards* from the
other collection, the attributes in this collection will be
ignored and can be omitted.
cloning the shard distribution of another. If this value is set,
it will copy the attributes *replicationFactor*, *numberOfShards* and
*shardingStrategy* from the other collection.
- *shardingStrategy* (optional): specifies the name of the sharding
strategy to use for the collection. Since ArangoDB 3.4 there are
different sharding strategies to select from when creating a new
collection. The selected *shardingStrategy* value will remain
fixed for the collection and cannot be changed afterwards. This is
important to make the collection keep its sharding settings and
always find documents already distributed to shards using the same
initial sharding algorithm.
The available sharding strategies are:
- `community-compat`: default sharding used by ArangoDB community
versions before ArangoDB 3.4
- `enterprise-compat`: default sharding used by ArangoDB enterprise
versions before ArangoDB 3.4
- `enterprise-smart-edge-compat`: default sharding used by smart edge
collections in ArangoDB enterprise versions before ArangoDB 3.4
- `hash`: default sharding used by ArangoDB 3.4 for new collections
(excluding smart edge collections)
- `enterprise-hash-smart-edge`: default sharding used by ArangoDB 3.4
for new smart edge collections
If no sharding strategy is specified, the default will be `hash` for
all collections, and `enterprise-hash-smart-edge` for all smart edge
collections (requires the *Enterprise Edition* of ArangoDB).
Manually overriding the sharding strategy does not yet provide a
benefit, but it may later in case other sharding strategies are added.
In single-server mode, the *shardingStrategy* attribute is meaningless
and will be ignored.
`db._create(collection-name, properties, type)`

View File

@ -39,22 +39,22 @@ For example you might create your collections like this:
```javascript
// Create main vertex collection:
db._create("vertices", {
shardKeys:['_key'],
numberOfShards: 8
});
shardKeys: ['_key'],
numberOfShards: 8
});
// Optionally create arbitrary additional vertex collections
db._create("additonal", {
distributeShardsLike:"vertices",
numberOfShards:8
});
distributeShardsLike: "vertices",
numberOfShards: 8
});
// Create (one or more) edge-collections:
db._createEdgeCollection("edges", {
shardKeys:['vertex'],
distributeShardsLike:"vertices",
numberOfShards:8
});
shardKeys: ['vertex'],
distributeShardsLike: "vertices",
numberOfShards: 8
});
```
You will need to ensure that edge documents contain the proper values in their sharding attribute.
@ -62,9 +62,9 @@ For a vertex document with the following content ```{_key:"A", value:0}```
the corresponding edge documents would have to look like this:
```
{_from:"vertices/A", _to: "vertices/B", vertex:"A"}
{_from:"vertices/A", _to: "vertices/C", vertex:"A"}
{_from:"vertices/A", _to: "vertices/D", vertex:"A"}
{"_from":"vertices/A", "_to": "vertices/B", "vertex":"A"}
{"_from":"vertices/A", "_to": "vertices/C", "vertex":"A"}
{"_from":"vertices/A", "_to": "vertices/D", "vertex":"A"}
...
```

View File

@ -179,7 +179,7 @@ the `SORT` clause of the query in the same order as they appear in the index def
Skiplist indexes are always created in ascending order, but they can be used to access
the indexed elements in both ascending or descending order. However, for a combined index
(an index on multiple attributes) this requires that the sort orders in a single query
as specified in the `SORT` clause must be either all ascending (optionally ommitted
as specified in the `SORT` clause must be either all ascending (optionally omitted
as ascending is the default) or all descending.
For example, if the skiplist index is created on attributes `value1` and `value2`

View File

@ -175,8 +175,16 @@ HTTP REST API
The following incompatible changes were made in context of ArangoDB's HTTP REST
APIs:
- The following, partly undocumented REST APIs have been removed in ArangoDB 3.4:
- `GET /_admin/test`
- `GET /_admin/clusterCheckPort`
- `GET /_admin/cluster-test`
- `GET /_admin/statistics/short`
- `GET /_admin/statistics/long`
- `GET /_api/index` will now return type `geo` for geo indexes, not type `geo1`
or `geo2`.
or `geo2` as previous versions did.
For geo indexes, the index API will not return the attributes `constraint` and
`ignoreNull` anymore. These attributes were initially deprecated in ArangoDB 2.5
@ -203,12 +211,6 @@ APIs:
AQL user functions on the top level of the response.
Each AQL user function description now also contains the 'isDeterministic' attribute.
- `GET /_admin/status` now returns the attribute `operationMode` in addition to
`mode`. The attribute `writeOpsEnabled` is now also represented by the new an
attribute `readOnly`, which is has an inverted value compared to the original
attribute. In future releases the old attributes will be deprecated in favor
of the new ones.
- if authentication is turned on, requests to databases by users with insufficient
access rights will be answered with HTTP 401 (forbidden) instead of HTTP 404 (not found).
@ -246,6 +248,44 @@ The following APIs have been added or augmented:
}
```
- `GET /_admin/status` now returns the attribute `operationMode` in addition to
`mode`. The attribute `writeOpsEnabled` is now also represented by the new
attribute `readOnly`, which has an inverted value compared to the original
attribute. In future releases the old attributes will be deprecated in favor
of the new ones.
- `POST /_api/collection` will now process the optional `shardingStrategy`
attribute in the request body in cluster mode.
This attribute specifies the name of the sharding strategy to use for the
collection. Since ArangoDB 3.4 there are different sharding strategies to
select from when creating a new collection. The selected *shardingStrategy*
value will remain fixed for the collection and cannot be changed afterwards.
This is important to make the collection keep its sharding settings and
always find documents already distributed to shards using the same initial
sharding algorithm.
The available sharding strategies are:
- `community-compat`: default sharding used by ArangoDB community
versions before ArangoDB 3.4
- `enterprise-compat`: default sharding used by ArangoDB enterprise
versions before ArangoDB 3.4
- `enterprise-smart-edge-compat`: default sharding used by smart edge
collections in ArangoDB enterprise versions before ArangoDB 3.4
- `hash`: default sharding used by ArangoDB 3.4 for new collections
(excluding smart edge collections)
- `enterprise-hash-smart-edge`: default sharding used by ArangoDB 3.4
for new smart edge collections
If no sharding strategy is specified, the default will be `hash` for
all collections, and `enterprise-hash-smart-edge` for all smart edge
collections (requires the *Enterprise Edition* of ArangoDB).
Manually overriding the sharding strategy does not yet provide a
benefit, but it may later in case other sharding strategies are added.
In single-server mode, the *shardingStrategy* attribute is meaningless and
will be ignored.
- APIs for view management have been added at endpoint `/_api/view`.
- The REST APIs for modifying graphs at endpoint `/_api/gharial` now support returning
@ -262,14 +302,6 @@ The following APIs have been added or augmented:
The exception from this is that the HTTP DELETE verb for these APIs does not
support `returnOld` because that would make the existing API incompatible.
The following, partly undocumented REST APIs have been removed in ArangoDB 3.4:
- `GET /_admin/test`
- `GET /_admin/clusterCheckPort`
- `GET /_admin/cluster-test`
- `GET /_admin/statistics/short`
- `GET /_admin/statistics/long`
AQL
---

View File

@ -48,6 +48,8 @@ In a cluster setup, the result will also contain the following attributes:
- *replicationFactor*: contains how many copies of each shard are kept on different DBServers.
- *shardingStrategy*: the sharding strategy selected for the collection.
@RESTRETURNCODES
@RESTRETURNCODE{400}

View File

@ -83,7 +83,7 @@ Not used for other key generator types.
The following values for *type* are valid:
- *2*: document collection
- *3*: edges collection
- *3*: edge collection
@RESTBODYPARAM{indexBuckets,integer,optional,int64}
The number of buckets into which indexes using a hash
@ -127,14 +127,41 @@ copies take over, usually without an error being reported.
@RESTBODYPARAM{distributeShardsLike,string,optional,string}
(The default is *""*): in an Enterprise Edition cluster, this attribute binds
the specifics of sharding for the newly created collection to follow that of a
the specifics of sharding for the newly created collection to follow that of a
specified existing collection.
**Note**: Using this parameter has consequences for the prototype
collection. It can no longer be dropped, before sharding imitating
collection. It can no longer be dropped, before the sharding-imitating
collections are dropped. Equally, backups and restores of imitating
collections alone will generate warnings, which can be overridden,
collections alone will generate warnings (which can be overridden)
about missing sharding prototype.
@RESTBODYPARAM{shardingStrategy,string,optional,string}
This attribute specifies the name of the sharding strategy to use for
the collection. Since ArangoDB 3.4 there are different sharding strategies
to select from when creating a new collection. The selected *shardingStrategy*
value will remain fixed for the collection and cannot be changed afterwards.
This is important to make the collection keep its sharding settings and
always find documents already distributed to shards using the same
initial sharding algorithm.
The available sharding strategies are:
- *community-compat*: default sharding used by ArangoDB community
versions before ArangoDB 3.4
- *enterprise-compat*: default sharding used by ArangoDB enterprise
versions before ArangoDB 3.4
- *enterprise-smart-edge-compat*: default sharding used by smart edge
collections in ArangoDB enterprise versions before ArangoDB 3.4
- *hash*: default sharding used by ArangoDB 3.4 for new collections
(excluding smart edge collections)
- *enterprise-hash-smart-edge*: default sharding used by ArangoDB 3.4
for new smart edge collections
If no sharding strategy is specified, the default will be *hash* for
all collections, and *enterprise-hash-smart-edge* for all smart edge
collections (requires the *Enterprise Edition* of ArangoDB).
Manually overriding the sharding strategy does not yet provide a
benefit, but it may later in case other sharding strategies are added.
@RESTQUERYPARAMETERS
@RESTQUERYPARAM{waitForSyncReplication,integer,optional}

View File

@ -54,6 +54,10 @@ In a cluster setup, the result will also contain the following attributes:
* *replicationFactor*: determines how many copies of each shard are kept
on different DBServers.
* *shardingStrategy*: the sharding strategy selected for the collection.
This attribute will only be populated in cluster mode and is not populated
in single-server mode.
`collection.properties(properties)`
Changes the collection properties. *properties* must be an object with
@ -73,13 +77,9 @@ one or more of the following attribute(s):
different DBServers, valid values are integer numbers
in the range of 1-10 *(Cluster only)*
*Note*: it is not possible to change the journal size after the journal or
datafile has been created. Changing this parameter will only effect newly
created journals. Also note that you cannot lower the journal size to less
then size of the largest document already stored in the collection.
**Note**: some other collection properties, such as *type*, *isVolatile*,
or *keyOptions* cannot be changed once the collection is created.
*keyOptions*, *numberOfShards* or *shardingStrategy* cannot be changed once
the collection is created.
@EXAMPLES

View File

@ -54,14 +54,19 @@ void ShardingFeature::prepare() {
ShardingStrategyCommunityCompat::NAME, [](ShardingInfo* sharding) {
return std::make_unique<ShardingStrategyCommunityCompat>(sharding);
});
registerFactory(ShardingStrategyHash::NAME, [](ShardingInfo* sharding) {
return std::make_unique<ShardingStrategyHash>(sharding);
});
#ifdef USE_ENTERPRISE
// note: enterprise-compat is always there so users can downgrade from
// enterprise edition to community edition
registerFactory(
ShardingStrategyEnterpriseCompat::NAME, [](ShardingInfo* sharding) {
return std::make_unique<ShardingStrategyEnterpriseCompat>(sharding);
});
registerFactory(
ShardingStrategyHash::NAME, [](ShardingInfo* sharding) {
return std::make_unique<ShardingStrategyHash>(sharding);
});
#ifdef USE_ENTERPRISE
// the following sharding strategies are only available in the enterprise
// edition
registerFactory(
ShardingStrategyEnterpriseSmartEdgeCompat::NAME,
[](ShardingInfo* sharding) {
@ -74,6 +79,18 @@ void ShardingFeature::prepare() {
return std::make_unique<ShardingStrategyEnterpriseHashSmartEdge>(
sharding);
});
#else
// in the community-version register some stand-ins for the sharding
// strategies only available in the enterprise edition
// note: these standins will actually not do any sharding, but always
// throw an exception telling the user that the selected sharding
// strategy is only available in the enterprise edition
for (auto const& name : std::vector<std::string>{"enterprise-smart-edge-compat", "enterprise-hash-smart-edge"}) {
registerFactory(name,
[name](ShardingInfo* sharding) {
return std::make_unique<ShardingStrategyOnlyInEnterprise>(name);
});
}
#endif
}
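The registration logic in this hunk can be illustrated with a minimal, self-contained sketch. It is not the ArangoDB implementation: `Strategy`, `HashStrategy`, `OnlyInEnterprise`, `Registry` and `makeCommunityRegistry` are hypothetical, simplified stand-ins, and the shard computation uses `std::hash` purely for illustration. The sketch only shows the pattern: every strategy name maps to a factory, and in a community build the enterprise-only names map to a stand-in that can be created but throws as soon as it is asked for a shard.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified interface (the real ShardingStrategy differs).
struct Strategy {
  virtual ~Strategy() = default;
  virtual std::string const& name() const = 0;
  virtual int responsibleShard(std::string const& key, int numberOfShards) const = 0;
};

// Illustrative hash strategy; std::hash is an assumption, not ArangoDB's hash.
struct HashStrategy final : Strategy {
  std::string const& name() const override {
    static std::string const n("hash");
    return n;
  }
  int responsibleShard(std::string const& key, int numberOfShards) const override {
    return static_cast<int>(std::hash<std::string>{}(key) %
                            static_cast<std::size_t>(numberOfShards));
  }
};

// Stand-in for strategies that exist only in the enterprise edition:
// construction succeeds, but any attempt to shard a document throws.
struct OnlyInEnterprise final : Strategy {
  explicit OnlyInEnterprise(std::string name) : _name(std::move(name)) {}
  std::string const& name() const override { return _name; }
  int responsibleShard(std::string const&, int) const override {
    throw std::runtime_error("sharding strategy '" + _name +
                             "' is only available in the enterprise edition");
  }
  std::string const _name;
};

class Registry {
 public:
  using Factory = std::function<std::unique_ptr<Strategy>()>;

  void registerFactory(std::string const& name, Factory factory) {
    if (!_factories.emplace(name, std::move(factory)).second) {
      throw std::logic_error("duplicate sharding strategy '" + name + "'");
    }
  }

  std::unique_ptr<Strategy> create(std::string const& name) const {
    auto it = _factories.find(name);
    if (it == _factories.end()) {
      throw std::invalid_argument("unknown sharding strategy '" + name + "'");
    }
    return it->second();
  }

 private:
  std::unordered_map<std::string, Factory> _factories;
};

// Mirrors the #else branch of the hunk: real strategies plus stand-ins.
Registry makeCommunityRegistry() {
  Registry registry;
  registry.registerFactory("hash", [] { return std::make_unique<HashStrategy>(); });
  for (auto const& name : std::vector<std::string>{"enterprise-smart-edge-compat",
                                                   "enterprise-hash-smart-edge"}) {
    registry.registerFactory(
        name, [name] { return std::make_unique<OnlyInEnterprise>(name); });
  }
  return registry;
}
```

Registering stand-ins rather than leaving the names unknown lets a community build report a specific "only in enterprise" error instead of a generic "unknown strategy" one.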
@ -147,6 +164,8 @@ std::string ShardingFeature::getDefaultShardingStrategy(
// on a DB server, we will not use sharding
return ShardingStrategyNone::NAME;
}
// before 3.4, there were only hard-coded sharding strategies
// no sharding strategy found in collection meta data
#ifdef USE_ENTERPRISE
@ -166,6 +185,7 @@ std::string ShardingFeature::getDefaultShardingStrategyForNewCollection(
VPackSlice const& properties) const {
TRI_ASSERT(ServerState::instance()->isRunningInCluster());
// from 3.4 onwards, the default sharding strategy for new collections is "hash"
#ifdef USE_ENTERPRISE
bool isSmart =
VelocyPackHelper::getBooleanValue(properties, "isSmart", false);

View File

@ -48,10 +48,14 @@ class ShardingFeature : public application_features::ApplicationFeature {
std::unique_ptr<ShardingStrategy> create(std::string const& name, ShardingInfo* sharding);
/// @brief returns the name of the default sharding strategy for new
/// collections
std::string getDefaultShardingStrategyForNewCollection(
VPackSlice const& properties) const;
private:
/// @brief returns the name of the default sharding strategy for existing
/// collections without a sharding strategy assigned
std::string getDefaultShardingStrategy(ShardingInfo const* sharding) const;
std::unordered_map<std::string, ShardingStrategy::FactoryFunction> _factories;

View File

@ -42,6 +42,12 @@ namespace {
enum class Part : uint8_t { ALL, FRONT, BACK };
void preventUseOnSmartEdgeCollection(LogicalCollection const* collection, std::string const& strategyName) {
if (collection->isSmart() && collection->type() == TRI_COL_TYPE_EDGE) {
THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, std::string("sharding strategy ") + strategyName + " cannot be used for smart edge collections");
}
}
inline void parseAttributeAndPart(std::string const& attr,
std::string& realAttr, Part& part) {
if (attr.size() > 0 && attr.back() == ':') {
@ -56,6 +62,7 @@ inline void parseAttributeAndPart(std::string const& attr,
}
}
template<bool returnNullSlice>
VPackSlice buildTemporarySlice(VPackSlice const& sub, Part const& part,
VPackBuilder& temporaryBuilder,
bool splitSlash) {
@ -106,15 +113,69 @@ VPackSlice buildTemporarySlice(VPackSlice const& sub, Part const& part,
}
}
}
if (returnNullSlice) {
return VPackSlice::nullSlice();
}
return sub;
}
template<bool returnNullSlice>
uint64_t hashByAttributesImpl(
VPackSlice slice, std::vector<std::string> const& attributes,
bool docComplete, int& error, std::string const& key) {
uint64_t hash = TRI_FnvHashBlockInitial();
error = TRI_ERROR_NO_ERROR;
slice = slice.resolveExternal();
if (slice.isObject()) {
std::string realAttr;
::Part part;
for (auto const& attr : attributes) {
::parseAttributeAndPart(attr, realAttr, part);
VPackSlice sub = slice.get(realAttr).resolveExternal();
VPackBuilder temporaryBuilder;
if (sub.isNone()) {
if (realAttr == StaticStrings::KeyString && !key.empty()) {
temporaryBuilder.add(VPackValue(key));
sub = temporaryBuilder.slice();
} else {
if (!docComplete) {
error = TRI_ERROR_CLUSTER_NOT_ALL_SHARDING_ATTRIBUTES_GIVEN;
}
// Null is equal to None/not present
sub = VPackSlice::nullSlice();
}
}
sub = ::buildTemporarySlice<returnNullSlice>(sub, part, temporaryBuilder, false);
hash = sub.normalizedHash(hash);
}
} else if (slice.isString() && attributes.size() == 1) {
std::string realAttr;
::Part part;
::parseAttributeAndPart(attributes[0], realAttr, part);
if (realAttr == StaticStrings::KeyString && key.empty()) {
// We always need the _key part. Everything else should be ignored
// beforehand.
VPackBuilder temporaryBuilder;
VPackSlice sub =
::buildTemporarySlice<returnNullSlice>(slice, part, temporaryBuilder, true);
hash = sub.normalizedHash(hash);
}
}
return hash;
}
} // namespace
std::string const ShardingStrategyNone::NAME("none");
std::string const ShardingStrategyCommunityCompat::NAME("community-compat");
std::string const ShardingStrategyEnterpriseCompat::NAME("enterprise-compat");
std::string const ShardingStrategyHash::NAME("hash");
/// @brief a sharding class used for single server and the DB servers
/// calling getResponsibleShard on this class will always throw an exception
ShardingStrategyNone::ShardingStrategyNone()
: ShardingStrategy() {
@ -123,6 +184,7 @@ ShardingStrategyNone::ShardingStrategyNone()
}
}
/// calling getResponsibleShard on this class will always throw an exception
int ShardingStrategyNone::getResponsibleShard(arangodb::velocypack::Slice slice,
bool docComplete, ShardID& shardID,
bool& usesDefaultShardKeys,
@ -130,6 +192,25 @@ int ShardingStrategyNone::getResponsibleShard(arangodb::velocypack::Slice slice,
THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_INTERNAL, "unexpected invocation of ShardingStrategyNone");
}
/// @brief a sharding class used to indicate that the selected sharding strategy
/// is only available in the enterprise edition of ArangoDB
/// calling getResponsibleShard on this class will always throw an exception
/// with an appropriate error message
ShardingStrategyOnlyInEnterprise::ShardingStrategyOnlyInEnterprise(std::string const& name)
: ShardingStrategy(),
_name(name) {}
/// @brief will always throw an exception telling the user the selected sharding is only
/// available in the enterprise edition
int ShardingStrategyOnlyInEnterprise::getResponsibleShard(arangodb::velocypack::Slice slice,
bool docComplete, ShardID& shardID,
bool& usesDefaultShardKeys,
std::string const& key) {
THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_ONLY_ENTERPRISE, std::string("sharding strategy '") + _name + "' is only available in the enterprise edition of ArangoDB");
}
/// @brief base class for hash-based sharding
ShardingStrategyHashBase::ShardingStrategyHashBase(ShardingInfo* sharding)
: ShardingStrategy(),
@ -204,48 +285,10 @@ uint64_t ShardingStrategyHashBase::hashByAttributes(
VPackSlice slice, std::vector<std::string> const& attributes,
bool docComplete, int& error, std::string const& key) {
uint64_t hash = TRI_FnvHashBlockInitial();
error = TRI_ERROR_NO_ERROR;
slice = slice.resolveExternal();
if (slice.isObject()) {
std::string realAttr;
::Part part;
for (auto const& attr : attributes) {
::parseAttributeAndPart(attr, realAttr, part);
VPackSlice sub = slice.get(realAttr).resolveExternal();
VPackBuilder temporaryBuilder;
if (sub.isNone()) {
if (realAttr == StaticStrings::KeyString && !key.empty()) {
temporaryBuilder.add(VPackValue(key));
sub = temporaryBuilder.slice();
} else {
if (!docComplete) {
error = TRI_ERROR_CLUSTER_NOT_ALL_SHARDING_ATTRIBUTES_GIVEN;
}
// Null is equal to None/not present
sub = VPackSlice::nullSlice();
}
}
sub = ::buildTemporarySlice(sub, part, temporaryBuilder, false);
hash = sub.normalizedHash(hash);
}
} else if (slice.isString() && attributes.size() == 1) {
std::string realAttr;
::Part part;
::parseAttributeAndPart(attributes[0], realAttr, part);
if (realAttr == StaticStrings::KeyString && key.empty()) {
// We always need the _key part. Everything else should be ignored
// beforehand.
VPackBuilder temporaryBuilder;
VPackSlice sub =
::buildTemporarySlice(slice, part, temporaryBuilder, true);
hash = sub.normalizedHash(hash);
}
}
return hash;
return ::hashByAttributesImpl<false>(slice, attributes, docComplete, error, key);
}
/// @brief old version of the sharding used in the community edition
/// this is DEPRECATED and should not be used for new collections
ShardingStrategyCommunityCompat::ShardingStrategyCommunityCompat(ShardingInfo* sharding)
@ -258,19 +301,65 @@ ShardingStrategyCommunityCompat::ShardingStrategyCommunityCompat(ShardingInfo* s
_usesDefaultShardKeys = true;
}
if (_sharding->collection()->isSmart() &&
_sharding->collection()->type() == TRI_COL_TYPE_EDGE) {
THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, std::string("sharding strategy ") + NAME + " cannot be used for smart edge collection");
::preventUseOnSmartEdgeCollection(_sharding->collection(), NAME);
}
/// @brief old version of the sharding used in the enterprise edition
/// this is DEPRECATED and should not be used for new collections
ShardingStrategyEnterpriseBase::ShardingStrategyEnterpriseBase(
ShardingInfo* sharding)
: ShardingStrategyHashBase(sharding) {
// whether or not the collection uses the default shard attributes (["_key"])
// this setting is initialized to false, and we may change it now
TRI_ASSERT(!_usesDefaultShardKeys);
auto shardKeys = _sharding->shardKeys();
TRI_ASSERT(!shardKeys.empty());
if (shardKeys.size() == 1) {
_usesDefaultShardKeys =
(shardKeys[0] == StaticStrings::KeyString ||
(shardKeys[0][0] == ':' &&
shardKeys[0].compare(1, shardKeys[0].size() - 1,
StaticStrings::KeyString) == 0) ||
(shardKeys[0].back() == ':' &&
shardKeys[0].compare(0, shardKeys[0].size() - 1,
StaticStrings::KeyString) == 0));
}
}
/// @brief this implementation of "hashByAttributes" is slightly different
/// than the implementation in the Community version
/// we leave the differences in place, because making any changes here
/// will affect the data distribution, which we want to avoid
uint64_t ShardingStrategyEnterpriseBase::hashByAttributes(
VPackSlice slice, std::vector<std::string> const& attributes,
bool docComplete, int& error, std::string const& key) {
return ::hashByAttributesImpl<true>(slice, attributes, docComplete, error, key);
}
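The `hashByAttributesImpl<true>` / `hashByAttributesImpl<false>` split follows a common pattern: one templated implementation, with a compile-time `bool` selecting the single point where two historical behaviors differ. The sketch below is a simplified analogue under stated assumptions, not the ArangoDB code: it works on plain strings instead of VelocyPack slices, the names `frontPart`, `hashFrontImpl`, `hashCommunityStyle` and `hashEnterpriseStyle` are hypothetical, and a textbook 64-bit FNV-1a loop stands in for the real hash functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the part of `value` before the first ':'.
// If there is no ':', the compile-time flag decides the fallback:
//   returnEmptyOnMiss == true  -> empty string (analogue of a null slice)
//   returnEmptyOnMiss == false -> the unchanged input
template <bool returnEmptyOnMiss>
std::string frontPart(std::string const& value) {
  std::size_t pos = value.find(':');
  if (pos != std::string::npos) {
    return value.substr(0, pos);
  }
  if (returnEmptyOnMiss) {
    return std::string();
  }
  return value;
}

// Shared implementation: extract the part, then hash it with 64-bit FNV-1a.
template <bool returnEmptyOnMiss>
std::uint64_t hashFrontImpl(std::string const& value) {
  std::string const part = frontPart<returnEmptyOnMiss>(value);
  std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a offset basis
  for (unsigned char c : part) {
    hash ^= c;
    hash *= 1099511628211ULL;  // FNV-1a prime
  }
  return hash;
}

// Thin wrappers, one per historical behavior.
std::uint64_t hashCommunityStyle(std::string const& value) {
  return hashFrontImpl<false>(value);
}

std::uint64_t hashEnterpriseStyle(std::string const& value) {
  return hashFrontImpl<true>(value);
}
```

Both wrappers agree whenever the separator is present and diverge only on the fallback path. This is why the two variants cannot simply be merged: changing either fallback would move existing documents to different shards.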
/// @brief old version of the sharding used in the enterprise edition
/// this is DEPRECATED and should not be used for new collections
ShardingStrategyEnterpriseCompat::ShardingStrategyEnterpriseCompat(
ShardingInfo* sharding)
: ShardingStrategyEnterpriseBase(sharding) {
::preventUseOnSmartEdgeCollection(_sharding->collection(), NAME);
}
/// @brief default hash-based sharding strategy
/// used for new collections from 3.4 onwards
ShardingStrategyHash::ShardingStrategyHash(ShardingInfo* sharding)
: ShardingStrategyHashBase(sharding) {
// whether or not the collection uses the default shard attributes (["_key"])
// this setting is initialized to false, and we may change it now
TRI_ASSERT(!_usesDefaultShardKeys);
auto shardKeys = _sharding->shardKeys();
TRI_ASSERT(!shardKeys.empty());
if (shardKeys.size() == 1) {
_usesDefaultShardKeys =
(shardKeys[0] == StaticStrings::KeyString ||
@ -282,8 +371,5 @@ ShardingStrategyHash::ShardingStrategyHash(ShardingInfo* sharding)
StaticStrings::KeyString) == 0));
}
if (_sharding->collection()->isSmart() &&
_sharding->collection()->type() == TRI_COL_TYPE_EDGE) {
THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, std::string("sharding strategy ") + NAME + " cannot be used for smart edge collection");
}
::preventUseOnSmartEdgeCollection(_sharding->collection(), NAME);
}
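The check that sets `_usesDefaultShardKeys` in the constructors above treats `_key`, `:_key` and `_key:` as the default shard key. A minimal standalone sketch of that check follows. It is an illustration under assumptions: `usesDefaultShardKeys` as a free function is hypothetical, the literal `"_key"` replaces `StaticStrings::KeyString`, and the explicit empty-string guard is an addition that does not appear in the diff.

```cpp
#include <string>
#include <vector>

// Hypothetical free function; in the diff this logic lives inside the
// strategy constructors and stores its result in _usesDefaultShardKeys.
bool usesDefaultShardKeys(std::vector<std::string> const& shardKeys) {
  static std::string const key("_key");  // assumption: literal for KeyString
  if (shardKeys.size() != 1) {
    return false;  // more than one shard key is never the default
  }
  std::string const& attr = shardKeys[0];
  if (attr.empty()) {
    return false;  // added guard: back() on an empty string is undefined
  }
  if (attr == key) {
    return true;  // plain "_key"
  }
  if (attr.front() == ':' && attr.compare(1, attr.size() - 1, key) == 0) {
    return true;  // ":_key"
  }
  if (attr.back() == ':' && attr.compare(0, attr.size() - 1, key) == 0) {
    return true;  // "_key:"
  }
  return false;
}
```

For example, `usesDefaultShardKeys({"_key:"})` yields `true`, while `{"value"}` and `{"_key", "value"}` both yield `false`.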

View File

@ -52,6 +52,31 @@ class ShardingStrategyNone final : public ShardingStrategy {
std::string const& key = "") override;
};
/// @brief a sharding class used to indicate that the selected sharding strategy
/// is only available in the enterprise edition of ArangoDB
/// calling getResponsibleShard on this class will always throw an exception
/// with an appropriate error message
class ShardingStrategyOnlyInEnterprise final : public ShardingStrategy {
public:
explicit ShardingStrategyOnlyInEnterprise(std::string const& name);
std::string const& name() const override { return _name; }
/// @brief does not really matter here
bool usesDefaultShardKeys() override { return true; }
/// @brief will always throw an exception telling the user the selected sharding is only
/// available in the enterprise edition
int getResponsibleShard(arangodb::velocypack::Slice,
bool docComplete, ShardID& shardID,
bool& usesDefaultShardKeys,
std::string const& key = "") override;
private:
/// @brief name of the sharding strategy we are replacing
std::string const _name;
};
/// @brief base class for hash-based sharding
class ShardingStrategyHashBase : public ShardingStrategy {
public:
@ -91,8 +116,37 @@ class ShardingStrategyCommunityCompat final : public ShardingStrategyHashBase {
static std::string const NAME;
};
/// @brief old version of the sharding used in the community edition
/// @brief old version of the sharding used in the enterprise edition
/// this is DEPRECATED and should not be used for new collections
class ShardingStrategyEnterpriseBase : public ShardingStrategyHashBase {
public:
explicit ShardingStrategyEnterpriseBase(ShardingInfo* sharding);
protected:
/// @brief this implementation of "hashByAttributes" is slightly different
/// than the implementation in the Community version
/// we leave the differences in place, because making any changes here
/// will affect the data distribution, which we want to avoid
uint64_t hashByAttributes(arangodb::velocypack::Slice slice,
std::vector<std::string> const& attributes,
bool docComplete, int& error,
std::string const& key) override final;
};
/// @brief old version of the sharding used in the enterprise edition
/// this is DEPRECATED and should not be used for new collections
class ShardingStrategyEnterpriseCompat : public ShardingStrategyEnterpriseBase {
public:
explicit ShardingStrategyEnterpriseCompat(ShardingInfo* sharding);
std::string const& name() const override { return NAME; }
static std::string const NAME;
};
/// @brief default hash-based sharding strategy
/// used for new collections from 3.4 onwards
class ShardingStrategyHash final : public ShardingStrategyHashBase {
public:
explicit ShardingStrategyHash(ShardingInfo* sharding);

View File

@ -68,13 +68,38 @@ function DocumentShardingSuite() {
},
testCreateWithEnterpriseSharding : function () {
if (!isEnterprise) {
try {
db._create(name1, { shardingStrategy: enterpriseCompat, numberOfShards: 5 });
} catch (err) {
assertEqual(ERRORS.ERROR_BAD_PARAMETER.code, err.errorNum);
}
let c = db._create(name1, { shardingStrategy: enterpriseCompat, numberOfShards: 5 });
assertEqual(enterpriseCompat, c.properties()["shardingStrategy"]);
assertEqual(5, c.properties()["numberOfShards"]);
assertEqual(["_key"], c.properties()["shardKeys"]);
for (let i = 0; i < 1000; ++i) {
c.insert({ _key: "test" + i, value: i });
}
assertEqual([ 188, 192, 198, 204, 218 ], Object.values(c.count(true)).sort());
for (let i = 0; i < 1000; ++i) {
assertEqual(i, c.document("test" + i).value);
}
},
testCreateWithEnterpriseShardingNonDefaultKeysNumericValues : function () {
let c = db._create(name1, { shardingStrategy: enterpriseCompat, shardKeys: ["value"], numberOfShards: 5 });
assertEqual(enterpriseCompat, c.properties()["shardingStrategy"]);
assertEqual(5, c.properties()["numberOfShards"]);
assertEqual(["value"], c.properties()["shardKeys"]);
let keys = [];
for (let i = 0; i < 1000; ++i) {
keys.push(c.insert({ value: i })._key);
}
assertEqual([ 0, 0, 0, 0, 1000 ], Object.values(c.count(true)).sort());
keys.forEach(function(k, i) {
assertEqual(i, c.document(k).value);
});
},
testCreateWithCommunitySharding : function () {