diff --git a/Documentation/Books/Manual/Administration/Cluster/README.md b/Documentation/Books/Manual/Administration/Cluster/README.md index c531d615a8..edbf1157c1 100644 --- a/Documentation/Books/Manual/Administration/Cluster/README.md +++ b/Documentation/Books/Manual/Administration/Cluster/README.md @@ -112,6 +112,36 @@ included in the list of attribute paths for the index: | a, b, c | a, b | not allowed | | a, b, c | a, b, c | allowed | +Sharding strategy +----------------- + +The *shardingStrategy* attribute specifies the name of the sharding strategy to use for the collection. Since ArangoDB 3.4 there are +different sharding strategies to select from when creating a new +collection. The selected *shardingStrategy* value will remain +fixed for the collection and cannot be changed afterwards. This is +important to make the collection keep its sharding settings and +always find documents already distributed to shards using the same +initial sharding algorithm. + +The available sharding strategies are: +- `community-compat`: default sharding used by ArangoDB community + versions before ArangoDB 3.4 +- `enterprise-compat`: default sharding used by ArangoDB enterprise + versions before ArangoDB 3.4 +- `enterprise-smart-edge-compat`: default sharding used by smart edge + collections in ArangoDB enterprise versions before ArangoDB 3.4 +- `hash`: default sharding used by ArangoDB 3.4 for new collections + (excluding smart edge collections) +- `enterprise-hash-smart-edge`: default sharding used by ArangoDB 3.4 + for new smart edge collections + +If no sharding strategy is specified, the default will be `hash` for +all collections, and `enterprise-hash-smart-edge` for all smart edge +collections (requires the *Enterprise Edition* of ArangoDB). +Manually overriding the sharding strategy does not yet provide a +benefit, but it may later in case other sharding strategies are added. 
+ + Moving/Rebalancing _shards_ --------------------------- diff --git a/Documentation/Books/Manual/Administration/Configuration/Compaction.md b/Documentation/Books/Manual/Administration/Configuration/Compaction.md index 7a9fc5db12..b7796714de 100644 --- a/Documentation/Books/Manual/Administration/Configuration/Compaction.md +++ b/Documentation/Books/Manual/Administration/Configuration/Compaction.md @@ -20,7 +20,7 @@ these documents are marked as 'dead' with a deletion marker. Over time the number of dead documents may rise, and we don't want to use the previously mentioned resources, plus the disk space should be given back to the system. -Thus several journal files can be combined to one, ommitting the dead documents. +Thus several journal files can be combined to one, omitting the dead documents. Combining several of these data files into one is called compaction. The compaction process reads the alive documents from the original data files, and writes them into new data file. diff --git a/Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md b/Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md index c1d8cbc5d4..2083ed5f85 100644 --- a/Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md +++ b/Documentation/Books/Manual/DataModeling/Collections/DatabaseMethods.md @@ -165,11 +165,39 @@ to the [naming conventions](../NamingConventions/README.md). performance on these collections. - *distributeShardsLike* distribute the shards of this collection - cloning the shard distribution of another. If this value is set - it will copy *replicationFactor* and *numberOfShards* from the - other collection, the attributes in this collection will be - ignored and can be omitted. + cloning the shard distribution of another. If this value is set, + it will copy the attributes *replicationFactor*, *numberOfShards* and + *shardingStrategy* from the other collection. 
+- *shardingStrategy* (optional): specifies the name of the sharding + strategy to use for the collection. Since ArangoDB 3.4 there are + different sharding strategies to select from when creating a new + collection. The selected *shardingStrategy* value will remain + fixed for the collection and cannot be changed afterwards. This is + important to make the collection keep its sharding settings and + always find documents already distributed to shards using the same + initial sharding algorithm. + + The available sharding strategies are: + - `community-compat`: default sharding used by ArangoDB community + versions before ArangoDB 3.4 + - `enterprise-compat`: default sharding used by ArangoDB enterprise + versions before ArangoDB 3.4 + - `enterprise-smart-edge-compat`: default sharding used by smart edge + collections in ArangoDB enterprise versions before ArangoDB 3.4 + - `hash`: default sharding used by ArangoDB 3.4 for new collections + (excluding smart edge collections) + - `enterprise-hash-smart-edge`: default sharding used by ArangoDB 3.4 + for new smart edge collections + + If no sharding strategy is specified, the default will be `hash` for + all collections, and `enterprise-hash-smart-edge` for all smart edge + collections (requires the *Enterprise Edition* of ArangoDB). + Manually overriding the sharding strategy does not yet provide a + benefit, but it may later in case other sharding strategies are added. + + In single-server mode, the *shardingStrategy* attribute is meaningless + and will be ignored. 
`db._create(collection-name, properties, type)` diff --git a/Documentation/Books/Manual/Graphs/Pregel/README.md b/Documentation/Books/Manual/Graphs/Pregel/README.md index 12c803d622..b315063e2c 100644 --- a/Documentation/Books/Manual/Graphs/Pregel/README.md +++ b/Documentation/Books/Manual/Graphs/Pregel/README.md @@ -39,22 +39,22 @@ For example you might create your collections like this: ```javascript // Create main vertex collection: db._create("vertices", { - shardKeys:['_key'], - numberOfShards: 8 - }); + shardKeys: ['_key'], + numberOfShards: 8 +}); // Optionally create arbitrary additional vertex collections db._create("additonal", { - distributeShardsLike:"vertices", - numberOfShards:8 - }); + distributeShardsLike: "vertices", + numberOfShards: 8 +}); // Create (one or more) edge-collections: db._createEdgeCollection("edges", { - shardKeys:['vertex'], - distributeShardsLike:"vertices", - numberOfShards:8 - }); + shardKeys: ['vertex'], + distributeShardsLike: "vertices", + numberOfShards: 8 +}); ``` You will need to ensure that edge documents contain the proper values in their sharding attribute. @@ -62,9 +62,9 @@ For a vertex document with the following content ```{_key:"A", value:0}``` the corresponding edge documents would have look like this: ``` - {_from:"vertices/A", _to: "vertices/B", vertex:"A"} - {_from:"vertices/A", _to: "vertices/C", vertex:"A"} - {_from:"vertices/A", _to: "vertices/D", vertex:"A"} + {"_from":"vertices/A", "_to": "vertices/B", vertex:"A"} + {"_from":"vertices/A", "_to": "vertices/C", vertex:"A"} + {"_from":"vertices/A", "_to": "vertices/D", vertex:"A"} ... 
``` diff --git a/Documentation/Books/Manual/Indexing/IndexBasics.md b/Documentation/Books/Manual/Indexing/IndexBasics.md index e02c4c2e6e..d1ae08cec9 100644 --- a/Documentation/Books/Manual/Indexing/IndexBasics.md +++ b/Documentation/Books/Manual/Indexing/IndexBasics.md @@ -179,7 +179,7 @@ the `SORT` clause of the query in the same order as they appear in the index def Skiplist indexes are always created in ascending order, but they can be used to access the indexed elements in both ascending or descending order. However, for a combined index (an index on multiple attributes) this requires that the sort orders in a single query -as specified in the `SORT` clause must be either all ascending (optionally ommitted +as specified in the `SORT` clause must be either all ascending (optionally omitted as ascending is the default) or all descending. For example, if the skiplist index is created on attributes `value1` and `value2` diff --git a/Documentation/Books/Manual/ReleaseNotes/UpgradingChanges34.md b/Documentation/Books/Manual/ReleaseNotes/UpgradingChanges34.md index 06c47fc49b..78bc09f67d 100644 --- a/Documentation/Books/Manual/ReleaseNotes/UpgradingChanges34.md +++ b/Documentation/Books/Manual/ReleaseNotes/UpgradingChanges34.md @@ -175,8 +175,16 @@ HTTP REST API The following incompatible changes were made in context of ArangoDB's HTTP REST APIs: +- The following, partly undocumented REST APIs have been removed in ArangoDB 3.4: + + - `GET /_admin/test` + - `GET /_admin/clusterCheckPort` + - `GET /_admin/cluster-test` + - `GET /_admin/statistics/short` + - `GET /_admin/statistics/long` + - `GET /_api/index` will now return type `geo` for geo indexes, not type `geo1` - or `geo2`. + or `geo2` as previous versions did. For geo indexes, the index API will not return the attributes `constraint` and `ignoreNull` anymore. These attributes were initially deprecated in ArangoDB 2.5 @@ -203,12 +211,6 @@ APIs: AQL user functions on the top level of the response. 
Each AQL user function description now also contains the 'isDeterministic' attribute. -- `GET /_admin/status` now returns the attribute `operationMode` in addition to - `mode`. The attribute `writeOpsEnabled` is now also represented by the new an - attribute `readOnly`, which is has an inverted value compared to the original - attribute. In future releases the old attributes will be deprecated in favor - of the new ones. - - if authentication is turned on, requests to databases by users with insufficient access rights will be answered with HTTP 401 (forbidden) instead of HTTP 404 (not found). @@ -246,6 +248,44 @@ The following APIs have been added or augmented: } ``` +- `GET /_admin/status` now returns the attribute `operationMode` in addition to + `mode`. The attribute `writeOpsEnabled` is now also represented by the new + attribute `readOnly`, which has an inverted value compared to the original + attribute. In future releases the old attributes will be deprecated in favor + of the new ones. + +- `POST /_api/collection` now will process the optional `shardingStrategy` + attribute in the request body in cluster mode. + + This attribute specifies the name of the sharding strategy to use for the + collection. Since ArangoDB 3.4 there are different sharding strategies to + select from when creating a new collection. The selected *shardingStrategy* + value will remain fixed for the collection and cannot be changed afterwards. + This is important to make the collection keep its sharding settings and + always find documents already distributed to shards using the same initial + sharding algorithm. 
+ + The available sharding strategies are: + - `community-compat`: default sharding used by ArangoDB community + versions before ArangoDB 3.4 + - `enterprise-compat`: default sharding used by ArangoDB enterprise + versions before ArangoDB 3.4 + - `enterprise-smart-edge-compat`: default sharding used by smart edge + collections in ArangoDB enterprise versions before ArangoDB 3.4 + - `hash`: default sharding used by ArangoDB 3.4 for new collections + (excluding smart edge collections) + - `enterprise-hash-smart-edge`: default sharding used by ArangoDB 3.4 + for new smart edge collections + + If no sharding strategy is specified, the default will be `hash` for + all collections, and `enterprise-hash-smart-edge` for all smart edge + collections (requires the *Enterprise Edition* of ArangoDB). + Manually overriding the sharding strategy does not yet provide a + benefit, but it may later in case other sharding strategies are added. + + In single-server mode, the *shardingStrategy* attribute is meaningless and + will be ignored. + - APIs for view management have been added at endpoint `/_api/view`. - The REST APIs for modifying graphs at endpoint `/_api/gharial` now support returning @@ -262,14 +302,6 @@ The following APIs have been added or augmented: The exception from this is that the HTTP DELETE verb for these APIs does not support `returnOld` because that would make the existing API incompatible. 
-The following, partly undocumented REST APIs have been removed in ArangoDB 3.4: - -- `GET /_admin/test` -- `GET /_admin/clusterCheckPort` -- `GET /_admin/cluster-test` -- `GET /_admin/statistics/short` -- `GET /_admin/statistics/long` - AQL --- diff --git a/Documentation/DocuBlocks/Rest/Collections/get_api_collection_properties.md b/Documentation/DocuBlocks/Rest/Collections/get_api_collection_properties.md index 08b5413136..2ccfd5ec50 100644 --- a/Documentation/DocuBlocks/Rest/Collections/get_api_collection_properties.md +++ b/Documentation/DocuBlocks/Rest/Collections/get_api_collection_properties.md @@ -48,6 +48,8 @@ In a cluster setup, the result will also contain the following attributes: - *replicationFactor*: contains how many copies of each shard are kept on different DBServers. +- *shardingStrategy*: the sharding strategy selected for the collection. + @RESTRETURNCODES @RESTRETURNCODE{400} diff --git a/Documentation/DocuBlocks/Rest/Collections/post_api_collection.md b/Documentation/DocuBlocks/Rest/Collections/post_api_collection.md index 124021dfbe..46620990ff 100644 --- a/Documentation/DocuBlocks/Rest/Collections/post_api_collection.md +++ b/Documentation/DocuBlocks/Rest/Collections/post_api_collection.md @@ -83,7 +83,7 @@ Not used for other key generator types. The following values for *type* are valid: - *2*: document collection -- *3*: edges collection +- *3*: edge collection @RESTBODYPARAM{indexBuckets,integer,optional,int64} The number of buckets into which indexes using a hash @@ -127,14 +127,41 @@ copies take over, usually without an error being reported. @RESTBODYPARAM{distributeShardsLike,string,optional,string} (The default is *""*): in an Enterprise Edition cluster, this attribute binds -the specifics of sharding for the newly created collection to follow that of a +the specifics of sharding for the newly created collection to follow that of a specified existing collection. 
**Note**: Using this parameter has consequences for the prototype -collection. It can no longer be dropped, before sharding imitating +collection. It can no longer be dropped, before the sharding-imitating collections are dropped. Equally, backups and restores of imitating -collections alone will generate warnings, which can be overridden, +collections alone will generate warnings (which can be overridden) about missing sharding prototype. +@RESTBODYPARAM{shardingStrategy,string,optional,string} +This attribute specifies the name of the sharding strategy to use for +the collection. Since ArangoDB 3.4 there are different sharding strategies +to select from when creating a new collection. The selected *shardingStrategy* +value will remain fixed for the collection and cannot be changed afterwards. +This is important to make the collection keep its sharding settings and +always find documents already distributed to shards using the same +initial sharding algorithm. + +The available sharding strategies are: +- *community-compat*: default sharding used by ArangoDB community + versions before ArangoDB 3.4 +- *enterprise-compat*: default sharding used by ArangoDB enterprise + versions before ArangoDB 3.4 +- *enterprise-smart-edge-compat*: default sharding used by smart edge + collections in ArangoDB enterprise versions before ArangoDB 3.4 +- *hash*: default sharding used by ArangoDB 3.4 for new collections + (excluding smart edge collections) +- *enterprise-hash-smart-edge*: default sharding used by ArangoDB 3.4 + for new smart edge collections + +If no sharding strategy is specified, the default will be *hash* for +all collections, and *enterprise-hash-smart-edge* for all smart edge +collections (requires the *Enterprise Edition* of ArangoDB). +Manually overriding the sharding strategy does not yet provide a +benefit, but it may later in case other sharding strategies are added. 
+ @RESTQUERYPARAMETERS @RESTQUERYPARAM{waitForSyncReplication,integer,optional} diff --git a/Documentation/DocuBlocks/collectionProperties.md b/Documentation/DocuBlocks/collectionProperties.md index 1593d6d80f..cc2a9fb312 100644 --- a/Documentation/DocuBlocks/collectionProperties.md +++ b/Documentation/DocuBlocks/collectionProperties.md @@ -54,6 +54,10 @@ In a cluster setup, the result will also contain the following attributes: * *replicationFactor*: determines how many copies of each shard are kept on different DBServers. +* *shardingStrategy*: the sharding strategy selected for the collection. + This attribute will only be populated in cluster mode and is not populated + in single-server mode. + `collection.properties(properties)` Changes the collection properties. *properties* must be an object with @@ -73,13 +77,9 @@ one or more of the following attribute(s): different DBServers, valid values are integer numbers in the range of 1-10 *(Cluster only)* -*Note*: it is not possible to change the journal size after the journal or -datafile has been created. Changing this parameter will only effect newly -created journals. Also note that you cannot lower the journal size to less -then size of the largest document already stored in the collection. - **Note**: some other collection properties, such as *type*, *isVolatile*, -or *keyOptions* cannot be changed once the collection is created. +*keyOptions*, *numberOfShards* or *shardingStrategy* cannot be changed once +the collection is created. 
@EXAMPLES diff --git a/arangod/Sharding/ShardingFeature.cpp b/arangod/Sharding/ShardingFeature.cpp index 29676d399c..ee5acd5093 100644 --- a/arangod/Sharding/ShardingFeature.cpp +++ b/arangod/Sharding/ShardingFeature.cpp @@ -54,14 +54,19 @@ void ShardingFeature::prepare() { ShardingStrategyCommunityCompat::NAME, [](ShardingInfo* sharding) { return std::make_unique(sharding); }); - registerFactory(ShardingStrategyHash::NAME, [](ShardingInfo* sharding) { - return std::make_unique(sharding); - }); -#ifdef USE_ENTERPRISE + // note: enterprise-compat is always there so users can downgrade from + // enterprise edition to community edition registerFactory( ShardingStrategyEnterpriseCompat::NAME, [](ShardingInfo* sharding) { return std::make_unique(sharding); }); + registerFactory( + ShardingStrategyHash::NAME, [](ShardingInfo* sharding) { + return std::make_unique(sharding); + }); +#ifdef USE_ENTERPRISE + // the following sharding strategies are only available in the enterprise + // edition registerFactory( ShardingStrategyEnterpriseSmartEdgeCompat::NAME, [](ShardingInfo* sharding) { @@ -74,6 +79,18 @@ void ShardingFeature::prepare() { return std::make_unique( sharding); }); +#else + // in the community-version register some stand-ins for the sharding + // strategies only available in the enterprise edition + // note: these standins will actually not do any sharding, but always + // throw an exception telling the user that the selected sharding + // strategy is only available in the enterprise edition + for (auto const& name : std::vector{"enterprise-smart-edge-compat", "enterprise-hash-smart-edge"}) { + registerFactory(name, + [name](ShardingInfo* sharding) { + return std::make_unique(name); + }); + } #endif } @@ -147,6 +164,8 @@ std::string ShardingFeature::getDefaultShardingStrategy( // on a DB server, we will not use sharding return ShardingStrategyNone::NAME; } + + // before 3.4, there were only hard-coded sharding strategies // no sharding strategy found in 
collection meta data #ifdef USE_ENTERPRISE @@ -166,6 +185,7 @@ std::string ShardingFeature::getDefaultShardingStrategyForNewCollection( VPackSlice const& properties) const { TRI_ASSERT(ServerState::instance()->isRunningInCluster()); + // from 3.4 onwards, the default sharding strategy for new collections is "hash" #ifdef USE_ENTERPRISE bool isSmart = VelocyPackHelper::getBooleanValue(properties, "isSmart", false); diff --git a/arangod/Sharding/ShardingFeature.h b/arangod/Sharding/ShardingFeature.h index 7a4bfc43d9..baddc2137b 100644 --- a/arangod/Sharding/ShardingFeature.h +++ b/arangod/Sharding/ShardingFeature.h @@ -48,10 +48,14 @@ class ShardingFeature : public application_features::ApplicationFeature { std::unique_ptr create(std::string const& name, ShardingInfo* sharding); + /// @brief returns the name of the default sharding strategy for new + /// collections std::string getDefaultShardingStrategyForNewCollection( VPackSlice const& properties) const; private: + /// @brief returns the name of the default sharding strategy for existing + /// collections without a sharding strategy assigned std::string getDefaultShardingStrategy(ShardingInfo const* sharding) const; std::unordered_map _factories; diff --git a/arangod/Sharding/ShardingStrategyDefault.cpp b/arangod/Sharding/ShardingStrategyDefault.cpp index c9127f4ea1..f871e75286 100644 --- a/arangod/Sharding/ShardingStrategyDefault.cpp +++ b/arangod/Sharding/ShardingStrategyDefault.cpp @@ -42,6 +42,12 @@ namespace { enum class Part : uint8_t { ALL, FRONT, BACK }; +void preventUseOnSmartEdgeCollection(LogicalCollection const* collection, std::string const& strategyName) { + if (collection->isSmart() && collection->type() == TRI_COL_TYPE_EDGE) { + THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, std::string("sharding strategy ") + strategyName + " cannot be used for smart edge collections"); + } +} + inline void parseAttributeAndPart(std::string const& attr, std::string& realAttr, Part& part) { if (attr.size() 
> 0 && attr.back() == ':') { @@ -56,6 +62,7 @@ inline void parseAttributeAndPart(std::string const& attr, } } +template<bool returnNullSlice> VPackSlice buildTemporarySlice(VPackSlice const& sub, Part const& part, VPackBuilder& temporaryBuilder, bool splitSlash) { @@ -106,15 +113,69 @@ VPackSlice buildTemporarySlice(VPackSlice const& sub, Part const& part, } } } + + if (returnNullSlice) { + return VPackSlice::nullSlice(); + } return sub; } + +template<bool returnNullSlice> +uint64_t hashByAttributesImpl( + VPackSlice slice, std::vector<std::string> const& attributes, + bool docComplete, int& error, std::string const& key) { + uint64_t hash = TRI_FnvHashBlockInitial(); + error = TRI_ERROR_NO_ERROR; + slice = slice.resolveExternal(); + if (slice.isObject()) { + std::string realAttr; + ::Part part; + for (auto const& attr : attributes) { + ::parseAttributeAndPart(attr, realAttr, part); + VPackSlice sub = slice.get(realAttr).resolveExternal(); + VPackBuilder temporaryBuilder; + if (sub.isNone()) { + if (realAttr == StaticStrings::KeyString && !key.empty()) { + temporaryBuilder.add(VPackValue(key)); + sub = temporaryBuilder.slice(); + } else { + if (!docComplete) { + error = TRI_ERROR_CLUSTER_NOT_ALL_SHARDING_ATTRIBUTES_GIVEN; + } + // Null is equal to None/not present + sub = VPackSlice::nullSlice(); + } + } + sub = ::buildTemporarySlice<returnNullSlice>(sub, part, temporaryBuilder, false); + hash = sub.normalizedHash(hash); + } + } else if (slice.isString() && attributes.size() == 1) { + std::string realAttr; + ::Part part; + ::parseAttributeAndPart(attributes[0], realAttr, part); + if (realAttr == StaticStrings::KeyString && key.empty()) { + // We always need the _key part. Everything else should be ignored + // beforehand. 
+ VPackBuilder temporaryBuilder; + VPackSlice sub = + ::buildTemporarySlice(slice, part, temporaryBuilder, true); + hash = sub.normalizedHash(hash); + } + } + return hash; } +} // namespace + std::string const ShardingStrategyNone::NAME("none"); std::string const ShardingStrategyCommunityCompat::NAME("community-compat"); +std::string const ShardingStrategyEnterpriseCompat::NAME("enterprise-compat"); std::string const ShardingStrategyHash::NAME("hash"); + +/// @brief a sharding class used for single server and the DB servers +/// calling getResponsibleShard on this class will always throw an exception ShardingStrategyNone::ShardingStrategyNone() : ShardingStrategy() { @@ -123,6 +184,7 @@ ShardingStrategyNone::ShardingStrategyNone() } } +/// calling getResponsibleShard on this class will always throw an exception int ShardingStrategyNone::getResponsibleShard(arangodb::velocypack::Slice slice, bool docComplete, ShardID& shardID, bool& usesDefaultShardKeys, @@ -130,6 +192,25 @@ int ShardingStrategyNone::getResponsibleShard(arangodb::velocypack::Slice slice, THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_INTERNAL, "unexpected invocation of ShardingStrategyNone"); } + +/// @brief a sharding class used to indicate that the selected sharding strategy +/// is only available in the enterprise edition of ArangoDB +/// calling getResponsibleShard on this class will always throw an exception +/// with an appropriate error message +ShardingStrategyOnlyInEnterprise::ShardingStrategyOnlyInEnterprise(std::string const& name) + : ShardingStrategy(), + _name(name) {} + +/// @brief will always throw an exception telling the user the selected sharding is only +/// available in the enterprise edition +int ShardingStrategyOnlyInEnterprise::getResponsibleShard(arangodb::velocypack::Slice slice, + bool docComplete, ShardID& shardID, + bool& usesDefaultShardKeys, + std::string const& key) { + THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_ONLY_ENTERPRISE, std::string("sharding strategy '") + _name + "' 
is only available in the enterprise edition of ArangoDB"); +} + + /// @brief base class for hash-based sharding ShardingStrategyHashBase::ShardingStrategyHashBase(ShardingInfo* sharding) : ShardingStrategy(), @@ -204,48 +285,10 @@ uint64_t ShardingStrategyHashBase::hashByAttributes( VPackSlice slice, std::vector const& attributes, bool docComplete, int& error, std::string const& key) { - uint64_t hash = TRI_FnvHashBlockInitial(); - error = TRI_ERROR_NO_ERROR; - slice = slice.resolveExternal(); - if (slice.isObject()) { - std::string realAttr; - ::Part part; - for (auto const& attr : attributes) { - ::parseAttributeAndPart(attr, realAttr, part); - VPackSlice sub = slice.get(realAttr).resolveExternal(); - VPackBuilder temporaryBuilder; - if (sub.isNone()) { - if (realAttr == StaticStrings::KeyString && !key.empty()) { - temporaryBuilder.add(VPackValue(key)); - sub = temporaryBuilder.slice(); - } else { - if (!docComplete) { - error = TRI_ERROR_CLUSTER_NOT_ALL_SHARDING_ATTRIBUTES_GIVEN; - } - // Null is equal to None/not present - sub = VPackSlice::nullSlice(); - } - } - sub = ::buildTemporarySlice(sub, part, temporaryBuilder, false); - hash = sub.normalizedHash(hash); - } - } else if (slice.isString() && attributes.size() == 1) { - std::string realAttr; - ::Part part; - ::parseAttributeAndPart(attributes[0], realAttr, part); - if (realAttr == StaticStrings::KeyString && key.empty()) { - // We always need the _key part. Everything else should be ignored - // beforehand. 
- VPackBuilder temporaryBuilder; - VPackSlice sub = - ::buildTemporarySlice(slice, part, temporaryBuilder, true); - hash = sub.normalizedHash(hash); - } - } - - return hash; + return ::hashByAttributesImpl(slice, attributes, docComplete, error, key); } + /// @brief old version of the sharding used in the community edition /// this is DEPRECATED and should not be used for new collections ShardingStrategyCommunityCompat::ShardingStrategyCommunityCompat(ShardingInfo* sharding) @@ -258,19 +301,65 @@ ShardingStrategyCommunityCompat::ShardingStrategyCommunityCompat(ShardingInfo* s _usesDefaultShardKeys = true; } - if (_sharding->collection()->isSmart() && - _sharding->collection()->type() == TRI_COL_TYPE_EDGE) { - THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, std::string("sharding strategy ") + NAME + " cannot be used for smart edge collection"); + ::preventUseOnSmartEdgeCollection(_sharding->collection(), NAME); +} + + +/// @brief old version of the sharding used in the enterprise edition +/// this is DEPRECATED and should not be used for new collections +ShardingStrategyEnterpriseBase::ShardingStrategyEnterpriseBase( + ShardingInfo* sharding) + : ShardingStrategyHashBase(sharding) { + // whether or not the collection uses the default shard attributes (["_key"]) + // this setting is initialized to false, and we may change it now + TRI_ASSERT(!_usesDefaultShardKeys); + auto shardKeys = _sharding->shardKeys(); + TRI_ASSERT(!shardKeys.empty()); + + if (shardKeys.size() == 1) { + _usesDefaultShardKeys = + (shardKeys[0] == StaticStrings::KeyString || + (shardKeys[0][0] == ':' && + shardKeys[0].compare(1, shardKeys[0].size() - 1, + StaticStrings::KeyString) == 0) || + (shardKeys[0].back() == ':' && + shardKeys[0].compare(0, shardKeys[0].size() - 1, + StaticStrings::KeyString) == 0)); } } +/// @brief this implementation of "hashByAttributes" is slightly different +/// than the implementation in the Community version +/// we leave the differences in place, because 
making any changes here +/// will affect the data distribution, which we want to avoid +uint64_t ShardingStrategyEnterpriseBase::hashByAttributes( + VPackSlice slice, std::vector const& attributes, + bool docComplete, int& error, std::string const& key) { + + return ::hashByAttributesImpl(slice, attributes, docComplete, error, key); +} + + +/// @brief old version of the sharding used in the enterprise edition +/// this is DEPRECATED and should not be used for new collections +ShardingStrategyEnterpriseCompat::ShardingStrategyEnterpriseCompat( + ShardingInfo* sharding) + : ShardingStrategyEnterpriseBase(sharding) { + + ::preventUseOnSmartEdgeCollection(_sharding->collection(), NAME); +} + + /// @brief default hash-based sharding strategy +/// used for new collections from 3.4 onwards ShardingStrategyHash::ShardingStrategyHash(ShardingInfo* sharding) : ShardingStrategyHashBase(sharding) { // whether or not the collection uses the default shard attributes (["_key"]) // this setting is initialized to false, and we may change it now TRI_ASSERT(!_usesDefaultShardKeys); auto shardKeys = _sharding->shardKeys(); + TRI_ASSERT(!shardKeys.empty()); + if (shardKeys.size() == 1) { _usesDefaultShardKeys = (shardKeys[0] == StaticStrings::KeyString || @@ -282,8 +371,5 @@ ShardingStrategyHash::ShardingStrategyHash(ShardingInfo* sharding) StaticStrings::KeyString) == 0)); } - if (_sharding->collection()->isSmart() && - _sharding->collection()->type() == TRI_COL_TYPE_EDGE) { - THROW_ARANGO_EXCEPTION_MESSAGE(TRI_ERROR_BAD_PARAMETER, std::string("sharding strategy ") + NAME + " cannot be used for smart edge collection"); - } + ::preventUseOnSmartEdgeCollection(_sharding->collection(), NAME); } diff --git a/arangod/Sharding/ShardingStrategyDefault.h b/arangod/Sharding/ShardingStrategyDefault.h index 6ef0de04db..3276cc197a 100644 --- a/arangod/Sharding/ShardingStrategyDefault.h +++ b/arangod/Sharding/ShardingStrategyDefault.h @@ -52,6 +52,31 @@ class ShardingStrategyNone final : public 
ShardingStrategy { std::string const& key = "") override; }; + +/// @brief a sharding class used to indicate that the selected sharding strategy +/// is only available in the enterprise edition of ArangoDB +/// calling getResponsibleShard on this class will always throw an exception +/// with an appropriate error message +class ShardingStrategyOnlyInEnterprise final : public ShardingStrategy { + public: + explicit ShardingStrategyOnlyInEnterprise(std::string const& name); + + std::string const& name() const override { return _name; } + + /// @brief does not really matter here + bool usesDefaultShardKeys() override { return true; } + + /// @brief will always throw an exception telling the user the selected sharding is only + /// available in the enterprise edition + int getResponsibleShard(arangodb::velocypack::Slice, + bool docComplete, ShardID& shardID, + bool& usesDefaultShardKeys, + std::string const& key = "") override; + private: + /// @brief name of the sharding strategy we are replacing + std::string const _name; +}; + /// @brief base class for hash-based sharding class ShardingStrategyHashBase : public ShardingStrategy { public: @@ -91,8 +116,37 @@ class ShardingStrategyCommunityCompat final : public ShardingStrategyHashBase { static std::string const NAME; }; -/// @brief old version of the sharding used in the community edition +/// @brief old version of the sharding used in the enterprise edition /// this is DEPRECATED and should not be used for new collections +class ShardingStrategyEnterpriseBase : public ShardingStrategyHashBase { + public: + explicit ShardingStrategyEnterpriseBase(ShardingInfo* sharding); + + protected: + /// @brief this implementation of "hashByAttributes" is slightly different + /// than the implementation in the Community version + /// we leave the differences in place, because making any changes here + /// will affect the data distribution, which we want to avoid + uint64_t hashByAttributes(arangodb::velocypack::Slice slice, + 
std::vector const& attributes, + bool docComplete, int& error, + std::string const& key) override final; +}; + +/// @brief old version of the sharding used in the enterprise edition +/// this is DEPRECATED and should not be used for new collections +class ShardingStrategyEnterpriseCompat : public ShardingStrategyEnterpriseBase { + public: + explicit ShardingStrategyEnterpriseCompat(ShardingInfo* sharding); + + std::string const& name() const override { return NAME; } + + static std::string const NAME; +}; + + +/// @brief default hash-based sharding strategy +/// used for new collections from 3.4 onwards class ShardingStrategyHash final : public ShardingStrategyHashBase { public: explicit ShardingStrategyHash(ShardingInfo* sharding); diff --git a/js/common/tests/shell/shell-community-sharding-compat-cluster.js b/js/common/tests/shell/shell-community-sharding-compat-cluster.js index 4420343846..77d56a3d0f 100644 --- a/js/common/tests/shell/shell-community-sharding-compat-cluster.js +++ b/js/common/tests/shell/shell-community-sharding-compat-cluster.js @@ -68,13 +68,38 @@ function DocumentShardingSuite() { }, testCreateWithEnterpriseSharding : function () { - if (!isEnterprise) { - try { - db._create(name1, { shardingStrategy: enterpriseCompat, numberOfShards: 5 }); - } catch (err) { - assertEqual(ERRORS.ERROR_BAD_PARAMETER.code, err.errorNum); - } + let c = db._create(name1, { shardingStrategy: enterpriseCompat, numberOfShards: 5 }); + assertEqual(enterpriseCompat, c.properties()["shardingStrategy"]); + assertEqual(5, c.properties()["numberOfShards"]); + assertEqual(["_key"], c.properties()["shardKeys"]); + + for (let i = 0; i < 1000; ++i) { + c.insert({ _key: "test" + i, value: i }); } + + assertEqual([ 188, 192, 198, 204, 218 ], Object.values(c.count(true)).sort()); + + for (let i = 0; i < 1000; ++i) { + assertEqual(i, c.document("test" + i).value); + } + }, + + testCreateWithEnterpriseShardingNonDefaultKeysNumericValues : function () { + let c = db._create(name1, 
{ shardingStrategy: enterpriseCompat, shardKeys: ["value"], numberOfShards: 5 }); + assertEqual(enterpriseCompat, c.properties()["shardingStrategy"]); + assertEqual(5, c.properties()["numberOfShards"]); + assertEqual(["value"], c.properties()["shardKeys"]); + + let keys = []; + for (let i = 0; i < 1000; ++i) { + keys.push(c.insert({ value: i })._key); + } + + assertEqual([ 0, 0, 0, 0, 1000 ], Object.values(c.count(true)).sort()); + + keys.forEach(function(k, i) { + assertEqual(i, c.document(k).value); + }); }, testCreateWithCommunitySharding : function () {