
Feature/maskings (#8006)

Frank Celler 2019-01-22 22:23:25 +01:00 committed by Frank Celler
parent ca9587a422
commit 8f0776049b
32 changed files with 2868 additions and 94 deletions


@@ -0,0 +1,541 @@
Arangodump Data Maskings
========================
*--maskings path-of-config*
It is possible to mask certain fields for a dump. A JSON configuration file is
used to define which fields should be masked and how.
The general structure of the config file looks like this:
```json
{
"collection-name": {
"type": MASKING_TYPE,
"maskings" : [
MASKING1,
MASKING2,
...
]
},
...
}
```
Using `"*"` as collection name defines a default behavior for collections not
listed explicitly.
Masking Types
-------------
`type` is a string describing how to mask the given collection.
Possible values are:
- `"exclude"`: the collection is ignored completely; not even its structure
is dumped.
- `"structure"`: only the collection structure is dumped, but no data at all.
- `"masked"`: the collection structure and all data are dumped. However, the data
is subject to the obfuscation defined in the attribute `maskings`.
- `"full"`: the collection structure and all data are dumped. No masking is
applied to this collection at all.
**Example**
```json
{
"private": {
"type": "exclude"
},
"log": {
"type": "structure"
},
"person": {
"type": "masked",
"maskings": [
{
"path": "name",
"type": "xifyFront",
"unmaskedLength": 2
},
{
"path": ".security_id",
"type": "xifyFront",
"unmaskedLength": 2
}
]
}
}
```
In the example the collection _private_ is completely ignored. Only the
structure of the collection _log_ is dumped, but not the data itself.
The collection _person_ is dumped completely but with the _name_ field masked
if it occurs on the top-level. It also masks fields with the name "security_id"
anywhere in the document. See below for a complete description of the parameters
of [type "xifyFront"](#xify-front).
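The effect of the four types can be sketched in a few lines of Python. This is only an illustration, not the arangodump implementation: the function names are hypothetical, and falling back to `"full"` when neither the collection nor `"*"` is configured is an assumption.

```python
def resolve_type(config, collection):
    """Return the masking type for a collection, using "*" as the default."""
    entry = config.get(collection, config.get("*"))
    # Assumption: without any matching entry the collection is dumped in full.
    return entry["type"] if entry else "full"


def should_dump_structure(config, collection):
    # Only "exclude" suppresses the collection structure.
    return resolve_type(config, collection) != "exclude"


def should_dump_data(config, collection):
    # "structure" keeps the structure but drops the documents.
    return resolve_type(config, collection) in ("masked", "full")
```

With the example configuration above, `should_dump_structure(config, "private")` is false, while `log` keeps its structure but `should_dump_data(config, "log")` is false.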
### Masking vs. dump-data option
*arangodump* also supports a very coarse masking with the option
`--dump-data false`. This basically removes all data from the dump.
You can either use `--maskings` or `--dump-data false`, but not both.
### Masking vs. include-collection option
*arangodump* also supports a very coarse masking with the option
`--include-collection`. This will restrict the collections that are
dumped to the ones explicitly listed.
It is possible to combine `--maskings` and `--include-collection`.
This will take the intersection of exportable collections.
Path
----
If the path starts with a `.` then it matches any path ending in the
given name. For example, `.name` will match the attribute name
`name` of all leaf attributes in the document. Leaf attributes are
attributes whose value is `null` or of data type `string`, `number`,
`bool` and `array` (see below). `name` will only match leaf attributes
at top level. `person.name` will match the leaf attribute `name`
in the top-level object `person`.
If you have an attribute name that contains a dot, you need to quote the
name with either a tick or a backtick. For example:
"path": "´name.with.dots´"
or
"path": "`name.with.dots`"
If the attribute value is an array the masking is applied to all the
array elements individually.
**Example**
The following configuration will replace the value of the "name"
attribute with an "XXXX"-masked string:
```json
{
"type": "xifyFront",
"path": ".name",
"unmaskedLength": 2
}
```
The document:
```json
{
"name": "top-level-name",
"age": 42,
"nicknames" : [ { "name": "hugo" }, "egon" ],
"other": {
"name": [ "emil", { "secret": "superman" } ]
}
}
```
… will be changed as follows:
```json
{
"name": "xxxxxxxxxxxxme",
"age": 42,
"nicknames" : [ { "name": "xxgo" }, "egon" ],
"other": {
"name": [ "xxil", { "secret": "superman" } ]
}
}
```
The values `"egon"` and `"superman"` are not replaced, because they
are not contained in the value of an attribute named `name`.
### Nested objects and arrays
If you specify a path and the attribute value is an array then the
masking decision is applied to each element of the array as if this
was the value of the attribute.
If the attribute value is an object, then the attribute is not masked.
Instead the nested object is checked further for leaf attributes.
**Example**
Masking `email` will convert:
```json
{
"email" : "email address"
}
```
… into:
```json
{
"email" : "xxxil xxxxxss"
}
```
because `email` is a leaf attribute. The document:
```json
{
"email" : [
"address one",
"address two"
]
}
```
… will be converted into:
```json
{
"email" : [
"xxxxxss xne",
"xxxxxss xwo"
]
}
```
… because the array is "unfolded". The document:
```json
{
"email" : {
"address" : "email address"
}
}
```
… will not be changed because `email` is not a leaf attribute.
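The traversal rules of this section (descend into objects, unfold arrays, mask only matching leaves) can be sketched as follows. `path_matches` and `mask_value` are hypothetical helper names, not part of arangodump, and quoted attribute names containing dots are not handled in this sketch.

```python
def path_matches(rule_path, doc_path):
    """doc_path is the list of attribute names from the top level to the leaf."""
    if rule_path.startswith("."):
        # ".name" matches any path that ends in "name".
        suffix = rule_path[1:].split(".")
        return doc_path[-len(suffix):] == suffix
    # Without a leading dot the full path has to match.
    return doc_path == rule_path.split(".")


def mask_value(value, path, rule_path, mask):
    if isinstance(value, dict):
        # Objects are never masked themselves; look for leaves inside them.
        return {k: mask_value(v, path + [k], rule_path, mask)
                for k, v in value.items()}
    if isinstance(value, list):
        # Arrays are unfolded: each element is treated as the attribute value.
        return [mask_value(v, path, rule_path, mask) for v in value]
    # Leaf value: mask it only if the path of its attribute matches.
    return mask(value) if path and path_matches(rule_path, path) else value
```

Calling `mask_value(document, [], ".name", mask)` with any function `mask` applies that function to exactly those leaf values that the rules above select.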
Masking Functions
-----------------
{% hint 'info' %}
The following masking functions are only available in the
[**Enterprise Edition**](https://www.arangodb.com/why-arangodb/arangodb-enterprise/)
{% endhint %}
- xify front
- zip
- datetime
- integral number
- decimal number
- credit card number
- phone number
- email address
The function:
- random string
… is available in both the Community Edition and the Enterprise Edition.
### Random string
```json
{
"path": ".name",
"type": "randomString"
}
```
This masking type will replace all values of attributes with key
`name` with an anonymized string. It is not guaranteed that the string
will be of the same length.
A hash of the original string is computed. If the original string is
shorter than the hash, the hash is used as is, which results in a longer
replacement string. If the original string is longer than the hash, the
characters of the hash are repeated as many times as needed to reach the
full original string length.
**Example**
Masking name as above, the document:
```json
{
"_key" : "38937",
"_id" : "examplecollection/38937",
"_rev" : "_YFaGG1u--_",
"name" : [
"My Name",
{
"other" : "Hallo Name"
},
[
"Name One",
"Name Two"
],
true,
false,
null,
1.0,
1234,
"This is a very long name"
]
}
```
… will be converted into
```json
{
"_key": "38937",
"_id": "examplecollection/38937",
"_rev": "_YFaGG1u--_",
"name": [
"+y5OQiYmp/o=",
{
"other": "Hallo Name"
},
[
"ihCTrlsKKdk=",
"yo/55hfla0U="
],
true,
false,
null,
1.0,
1234,
"hwjAfNe5BGw=hwjAfNe5BGw="
]
}
```
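The length rule for the replacement string can be sketched as below. The text does not name the hash function; BLAKE2b with an 8-byte digest, Base64-encoded, is an assumption chosen only because it yields 12-character strings like those in the example. Cutting the repeated hash to the exact original length is also an assumption.

```python
import base64
import hashlib


def random_string(value):
    # As in the example above, non-string leaves are left unchanged.
    if not isinstance(value, str):
        return value
    # Assumed hash: 8 bytes of BLAKE2b, Base64-encoded to 12 characters.
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    hashed = base64.b64encode(digest).decode("ascii")
    if len(value) <= len(hashed):
        # Short originals are replaced by the (longer) hash.
        return hashed
    # Long originals: repeat the hash until the original length is reached.
    repeats = -(-len(value) // len(hashed))  # ceiling division
    return (hashed * repeats)[:len(value)]
```

A 24-character input such as `"This is a very long name"` therefore yields the 12-character hash written twice, matching the shape of the last array element in the example.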
### Xify front
This masking type replaces the front characters with `x` and
blanks. Alphanumeric characters, `_` and `-` are replaced by `x`,
everything else is replaced by a blank.
```json
{
"path": ".name",
"type": "xifyFront",
"unmaskedLength": 2
}
```
This will mask all alphanumeric characters of a word except the last
two characters. Words of length 1 and 2 are unmasked. If the
attribute value is not a string the result will be `xxxx`.
"This is a test!Do you agree?"
… will become
"xxis is a xxst Do xou xxxee "
There is a catch. If you have an index on the attribute the masking
might distort the index efficiency or even cause errors in case of a
unique index.
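The word-wise rule described above can be sketched with two regular expressions. The function name is hypothetical, and treating only ASCII letters and digits as alphanumeric is an assumption of this sketch.

```python
import re

WORD = r"[A-Za-z0-9_-]+"       # characters that belong to a word
NON_WORD = r"[^A-Za-z0-9_-]"   # everything else becomes a blank


def xify_front(value, unmasked_length=2):
    if not isinstance(value, str):
        # Non-string values are replaced by a fixed string.
        return "xxxx"

    def mask_word(match):
        word = match.group(0)
        if len(word) <= unmasked_length:
            return word  # short words stay unmasked
        masked = len(word) - unmasked_length
        return "x" * masked + word[masked:]

    return re.sub(NON_WORD, " ", re.sub(WORD, mask_word, value))


print(xify_front("This is a test!Do you agree?"))
# prints "xxis is a xxst Do xou xxxee "
```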
```json
{
"type": "xifyFront",
"path": ".name",
"unmaskedLength": 2,
"hash": true
}
```
This will add a hash at the end of the string.
"This is a test!Do you agree?"
… will become
"xxis is a xxst Do xou xxxee NAATm8c9hVQ="
Note that the hash is based on a random secret that is different for
each run. This prevents dictionary attacks, in which values are guessed
from hashes pre-computed over a dictionary.
If you need reproducible results, i.e. hashes that do not change between
different runs of *arangodump*, you need to specify a secret as seed,
a number which must not be `0`.
```json
{
"type": "xifyFront",
"path": ".name",
"unmaskedLength": 2,
"hash": true,
"seed": 246781478647
}
```
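The difference between a per-run random secret and a fixed seed can be sketched with a keyed hash. Everything here is an assumption for illustration: the helper name, the use of keyed BLAKE2b, and the convention that `0` stands for "no seed given".

```python
import base64
import hashlib
import os


def make_hasher(seed=0):
    # Assumption: seed 0 means "no seed", so a fresh random secret is drawn
    # and the hashes differ from run to run.
    key = os.urandom(8) if seed == 0 else seed.to_bytes(8, "little")

    def hash_suffix(text):
        digest = hashlib.blake2b(text.encode("utf-8"),
                                 digest_size=8, key=key).digest()
        return base64.b64encode(digest).decode("ascii")

    return hash_suffix
```

Two hashers created with the same non-zero seed return identical suffixes for the same input, which is what makes the dump reproducible.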
### Zip
This masking type replaces a zip code with a random one. If the
attribute value is not a string then the default value of `"12345"` is
used as no zip is known. You can change the default value, see below.
```json
{
"path": ".code",
"type": "zip"
}
```
This will replace a real zip code with a random one. It uses the following
rule: If a character of the original zip code is a digit it will be replaced
by a random digit. If a character of the original zip code is a letter it
will be replaced by a random letter keeping the case.
```json
{
"path": ".code",
"type": "zip",
"default": "abcdef"
}
```
**Example**
If the original zip code is:
50674
… it will be replaced by e.g.:
98146
If the original zip code is:
SA34-EA
… it will be replaced by e.g.:
OW91-JI
Note that this will generate random zip codes. Therefore there is a
chance of generating the same zip code value multiple times, which can
cause unique constraint violations if a unique index is or will be
used on the zip code attribute.
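The character-wise replacement rule can be sketched as follows. The function name and the `rng` parameter are hypothetical; leaving characters that are neither digits nor letters untouched is inferred from the `SA34-EA` example.

```python
import random
import string


def mask_zip(value, default="12345", rng=random):
    if not isinstance(value, str):
        # No zip code is known, so the default is used.
        return default
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            # Keep the case of the original letter.
            letters = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(letters))
        else:
            out.append(ch)  # e.g. the "-" in "SA34-EA"
    return "".join(out)
```

Because every character is drawn independently, two different inputs can map to the same output, which is the source of the unique-index caveat above.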
### Datetime
This masking type replaces the value of the attribute with a random
date.
```json
{
"type": "datetime",
"begin" : "2019-01-01",
"end": "2019-12-31",
"output": "%yyyy-%mm-%dd"
}
```
`begin` and `end` are in ISO8601 format.
The `output` format is described in
[DATE_FORMAT](../../../AQL/Functions/Date.html#dateformat).
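A minimal sketch of picking a random date between `begin` and `end` is shown below. It is illustrative only: it works at day granularity and supports just the three placeholders used in the example, both of which are assumptions; the function name is hypothetical.

```python
import random
from datetime import date, timedelta


def mask_datetime(begin, end, output="%yyyy-%mm-%dd", rng=random):
    start, stop = date.fromisoformat(begin), date.fromisoformat(end)
    # Pick a random day in the closed interval [begin, end].
    day = start + timedelta(days=rng.randint(0, (stop - start).days))
    # Only the placeholders of the example are handled in this sketch.
    return (output.replace("%yyyy", f"{day.year:04d}")
                  .replace("%mm", f"{day.month:02d}")
                  .replace("%dd", f"{day.day:02d}"))
```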
### Integral number
This masking type replaces the value of the attribute with a random
integral number. It will replace the value even if it is a string
or a boolean.
```json
{
"type": "integer",
"lower" : -100,
"upper": 100
}
```
### Decimal number
This masking type replaces the value of the attribute with a random
decimal. It will replace the value even if it is a string
or a boolean.
```json
{
"type": "float",
"lower" : -0.3,
"upper": 0.3
}
```
By default, the decimal has a scale of 2. I.e. it has at most 2
decimal digits. The definition:
```json
{
"type": "float",
"lower" : -0.3,
"upper": 0.3,
"scale": 3
}
```
… will generate numbers with at most 3 decimal digits.
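The `lower`, `upper` and `scale` parameters of the two numeric maskings can be sketched as below. The function names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
import random


def mask_integer(lower, upper, rng=random):
    # Random integral number in [lower, upper].
    return rng.randint(lower, upper)


def mask_float(lower, upper, scale=2, rng=random):
    # Random decimal, rounded to at most `scale` decimal digits.
    return round(rng.uniform(lower, upper), scale)
```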
### Credit card number
This masking type replaces the value of the attribute with a random
credit card number.
```json
{
"type": "creditCard"
}
```
See [Luhn](https://en.wikipedia.org/wiki/Luhn_algorithm) for details.
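As background for the Luhn reference, the sketch below builds a random digit string whose last digit is a valid Luhn check digit. It illustrates the checksum only and is not the generator used by arangodump; the function names and the fixed length of 16 digits without an issuer prefix are assumptions.

```python
import random


def luhn_sum(digits, double_first):
    """Luhn sum over the digits read from right to left.

    double_first says whether the rightmost digit is doubled.
    """
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if (i % 2 == 0) == double_first:
            d = d * 2 - 9 if d * 2 > 9 else d * 2
        total += d
    return total


def luhn_valid(number):
    # In a complete number the check digit (rightmost) is not doubled.
    return luhn_sum(number, double_first=False) % 10 == 0


def random_card_number(length=16, rng=random):
    # In the payload (number without check digit) the rightmost digit is doubled.
    payload = "".join(rng.choice("0123456789") for _ in range(length - 1))
    check = (10 - luhn_sum(payload, double_first=True) % 10) % 10
    return payload + str(check)
```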
### Phone number
This masking type replaces a phone number with a random one. If the
attribute value is not a string it is replaced by the string
`"+1234567890"`.
```json
{
"type": "phone",
"default": "+4912345123456789"
}
```
This will replace an existing phone number with a random one. It uses
the following rule: If a character of the original number is a digit
it will be replaced by a random digit. If it is a letter it is replaced
by a random letter. All other characters are unchanged.
```json
{ "type": "phone",
"default": "+4912345123456789"
}
```
If the attribute value is not a string, the value of `default` is used
instead, here `"+4912345123456789"`.
### Email address
This masking type takes an email address, computes a hash value and
splits it into three equal parts `AAAA`, `BBBB`, and `CCCC`. The
resulting email address is `AAAA.BBBB@CCCC.invalid`.
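The construction of the replacement address can be sketched as follows. The hash function is not specified in the text; SHA-256 in hexadecimal, cut to 12 characters so that it splits into three parts of four, is an assumption, and the function name is hypothetical.

```python
import hashlib


def mask_email(value):
    # Assumed hash: first 12 hex characters of SHA-256.
    hashed = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    # Split into three equal parts AAAA, BBBB and CCCC.
    a, b, c = hashed[0:4], hashed[4:8], hashed[8:12]
    return f"{a}.{b}@{c}.invalid"
```

The `.invalid` top-level domain is reserved, so the masked addresses cannot be delivered to by accident.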


@@ -29,6 +29,15 @@
 * License Name: Boost Software License 1.0
 * License Id: BSL-1.0
+### CreditCardGenerator 2016
+* Name: CreditCardGenerator
+* Version: 1.8.1
+* Project Home: https://github.com/stormdark/CreditCardGenerator
+* License: https://raw.githubusercontent.com/stormdark/CreditCardGenerator/master/LICENSE
+* License Name: MIT License
+* License Id: MIT
 ### Curl 7.50.3
 * Name: Curl
* Name: Curl * Name: Curl


@@ -964,7 +964,7 @@ AqlValue addOrSubtractIsoDurationFromTimestamp(Query* query, tp_sys_clock_ms con
   year_month_day ymd{floor<days>(tp)};
   auto day_time = make_time(tp - sys_days(ymd));
   std::smatch duration_parts;
-  if (!basics::regex_isoDuration(duration, duration_parts)) {
+  if (!basics::regexIsoDuration(duration, duration_parts)) {
     if (isSubtract) {
       ::registerWarning(query, "DATE_SUBTRACT", TRI_ERROR_QUERY_INVALID_DATE_VALUE);
     } else {
@@ -1037,7 +1037,7 @@ bool parameterToTimePoint(Query* query, transaction::Methods* trx,
     tp = tp_sys_clock_ms(milliseconds(value.toInt64(trx)));
   } else {
     std::string const dateVal = value.slice().copyString();
-    if (!basics::parse_dateTime(dateVal, tp)) {
+    if (!basics::parseDateTime(dateVal, tp)) {
       ::registerWarning(query, AFN, TRI_ERROR_QUERY_INVALID_DATE_VALUE);
       return false;
     }
@@ -3500,7 +3500,7 @@ AqlValue Functions::IsDatestring(arangodb::aql::Query*, transaction::Methods*,
   if (value.isString()) {
     tp_sys_clock_ms tp;  // unused
-    isValid = basics::parse_dateTime(value.slice().copyString(), tp);
+    isValid = basics::parseDateTime(value.slice().copyString(), tp);
   }
   return AqlValue(AqlValueHintBool(isValid));


@@ -41,6 +41,7 @@
 #include "Basics/StaticStrings.h"
 #include "Basics/StringUtils.h"
 #include "Basics/VelocyPackHelper.h"
+#include "Maskings/Maskings.h"
 #include "ProgramOptions/ProgramOptions.h"
 #include "Random/RandomGenerator.h"
 #include "Shell/ClientFeature.h"
@@ -223,6 +224,29 @@ bool isIgnoredHiddenEnterpriseCollection(arangodb::DumpFeature::Options const& o
   return false;
 }
+arangodb::Result dumpJsonObjects(arangodb::DumpFeature::JobData& jobData,
+                                 arangodb::ManagedDirectory::File& file,
+                                 arangodb::basics::StringBuffer const& body) {
+  arangodb::basics::StringBuffer masked(1, false);
+  arangodb::basics::StringBuffer const* result = &body;
+  if (jobData.maskings != nullptr) {
+    jobData.maskings->mask(jobData.name, body, masked);
+    result = &masked;
+  }
+  file.write(result->c_str(), result->length());
+  if (file.status().fail()) {
+    return {TRI_ERROR_CANNOT_WRITE_FILE, std::string("cannot write file '") + file.path() +
+                                             "': " + file.status().errorMessage()};
+  }
+  jobData.stats.totalWritten += static_cast<uint64_t>(result->length());
+  return {TRI_ERROR_NO_ERROR};
+}
 /// @brief dump the actual data from an individual collection
 arangodb::Result dumpCollection(arangodb::httpclient::SimpleHttpClient& client,
                                 arangodb::DumpFeature::JobData& jobData,
@@ -298,13 +322,10 @@ arangodb::Result dumpCollection(arangodb::httpclient::SimpleHttpClient& client,
     // now actually write retrieved data to dump file
     arangodb::basics::StringBuffer const& body = response->getBody();
-    file.write(body.c_str(), body.length());
-    if (file.status().fail()) {
-      return {TRI_ERROR_CANNOT_WRITE_FILE,
-              std::string("cannot write file '") + file.path() +
-                  "': " + file.status().errorMessage()};
-    } else {
-      jobData.stats.totalWritten += (uint64_t)body.length();
+    arangodb::Result result = dumpJsonObjects(jobData, file, body);
+    if (result.fail()) {
+      return result;
     }
     if (!checkMore || fromTick == 0) {
@@ -393,6 +414,21 @@ arangodb::Result processJob(arangodb::httpclient::SimpleHttpClient& client,
   arangodb::Result result{TRI_ERROR_NO_ERROR};
+  bool dumpStructure = true;
+  if (dumpStructure && jobData.maskings != nullptr) {
+    dumpStructure = jobData.maskings->shouldDumpStructure(jobData.name);
+  }
+  if (!dumpStructure) {
+    if (jobData.options.progress) {
+      LOG_TOPIC(INFO, arangodb::Logger::DUMP)
+          << "# Dumping collection '" << jobData.name << "'...";
+    }
+    return result;
+  }
   // prep hex string of collection name
   std::string const hexString(arangodb::rest::SslInterface::sslMD5(jobData.name));
@@ -435,8 +471,14 @@ arangodb::Result processJob(arangodb::httpclient::SimpleHttpClient& client,
     }
   }
-  if (result.ok() && jobData.options.dumpData) {
-    // save the actual data
+  if (result.ok()) {
+    bool dumpData = jobData.options.dumpData;
+    if (dumpData && jobData.maskings != nullptr) {
+      dumpData = jobData.maskings->shouldDumpData(jobData.name);
+    }
+    // always create the file so that arangorestore does not complain
     auto file = jobData.directory.writableFile(jobData.name + "_" + hexString +
                                                    ".data.json",
                                                true);
@@ -444,10 +486,13 @@ arangodb::Result processJob(arangodb::httpclient::SimpleHttpClient& client,
       return ::fileError(file.get(), true);
     }
-    if (jobData.options.clusterMode) {
-      result = ::handleCollectionCluster(client, jobData, *file);
-    } else {
-      result = ::handleCollection(client, jobData, *file);
+    if (dumpData) {
+      // save the actual data
+      if (jobData.options.clusterMode) {
+        result = ::handleCollectionCluster(client, jobData, *file);
+      } else {
+        result = ::handleCollection(client, jobData, *file);
+      }
     }
   }
@@ -467,10 +512,11 @@ void handleJobResult(std::unique_ptr<arangodb::DumpFeature::JobData>&& jobData,
 namespace arangodb {
 DumpFeature::JobData::JobData(ManagedDirectory& dir, DumpFeature& feat,
-                              Options const& opts, Stats& stat, VPackSlice const& info,
+                              Options const& opts, maskings::Maskings* maskings,
+                              Stats& stat, VPackSlice const& info,
                               uint64_t const batch, std::string const& c,
                               std::string const& n, std::string const& t)
-    : directory{dir}, feature{feat}, options{opts}, stats{stat}, collectionInfo{info}, batchId{batch}, cid{c}, name{n}, type{t} {}
+    : directory{dir}, feature{feat}, options{opts}, maskings{maskings}, stats{stat}, collectionInfo{info}, batchId{batch}, cid{c}, name{n}, type{t} {}
 DumpFeature::DumpFeature(application_features::ApplicationServer& server, int& exitCode)
     : ApplicationFeature(server, DumpFeature::featureName()),
@@ -543,6 +589,9 @@ void DumpFeature::collectOptions(std::shared_ptr<options::ProgramOptions> option
   options->addOption("--tick-end", "last tick to be included in data dump",
                      new UInt64Parameter(&_options.tickEnd));
+  options->addOption("--maskings", "file with maskings definition",
+                     new StringParameter(&_options.maskingsFile));
 }
 void DumpFeature::validateOptions(std::shared_ptr<options::ProgramOptions> options) {
@@ -697,8 +746,9 @@ Result DumpFeature::runDump(httpclient::SimpleHttpClient& client, std::string co
     // queue job to actually dump collection
     auto jobData =
-        std::make_unique<JobData>(*_directory, *this, _options, _stats, collection,
-                                  batchId, std::to_string(cid), name, collectionType);
+        std::make_unique<JobData>(*_directory, *this, _options, _maskings.get(),
+                                  _stats, collection, batchId,
+                                  std::to_string(cid), name, collectionType);
     _clientTaskQueue.queueJob(std::move(jobData));
   }
@@ -833,7 +883,8 @@ Result DumpFeature::runClusterDump(httpclient::SimpleHttpClient& client,
     }
     // queue job to actually dump collection
-    auto jobData = std::make_unique<JobData>(*_directory, *this, _options, _stats, collection,
+    auto jobData = std::make_unique<JobData>(*_directory, *this, _options,
+                                             _maskings.get(), _stats, collection,
                                              0 /* batchId */, std::to_string(cid),
                                              name, "" /* collectionType */);
     _clientTaskQueue.queueJob(std::move(jobData));
@@ -936,8 +987,18 @@ void DumpFeature::reportError(Result const& error) {
   }
 }
-/// @brief main method to run dump
 void DumpFeature::start() {
+  if (!_options.maskingsFile.empty()) {
+    maskings::MaskingsResult m = maskings::Maskings::fromFile(_options.maskingsFile);
+    if (m.status != maskings::MaskingsResult::VALID) {
+      LOG_TOPIC(FATAL, Logger::CONFIG) << m.message;
+      FATAL_ERROR_EXIT();
+    }
+    _maskings = std::move(m.maskings);
+  }
   _exitCode = EXIT_SUCCESS;
   // generate a fake client id that we sent to the server


@@ -25,16 +25,20 @@
 #define ARANGODB_DUMP_DUMP_FEATURE_H 1
 #include "ApplicationFeatures/ApplicationFeature.h"
 #include "Basics/Mutex.h"
 #include "Utils/ClientManager.h"
 #include "Utils/ClientTaskQueue.h"
 namespace arangodb {
 namespace httpclient {
 class SimpleHttpResult;
 }
+namespace maskings {
+class Maskings;
+}
 class ManagedDirectory;
 class DumpFeature : public application_features::ApplicationFeature {
@@ -62,6 +66,7 @@ class DumpFeature : public application_features::ApplicationFeature {
   struct Options {
     std::vector<std::string> collections{};
     std::string outputPath{};
+    std::string maskingsFile{};
     uint64_t initialChunkSize{1024 * 1024 * 8};
     uint64_t maxChunkSize{1024 * 1024 * 64};
     uint32_t threadCount{2};
@@ -85,13 +90,14 @@ class DumpFeature : public application_features::ApplicationFeature {
   /// @brief Stores all necessary data to dump a single collection or shard
   struct JobData {
-    JobData(ManagedDirectory&, DumpFeature&, Options const&, Stats&,
-            VPackSlice const&, uint64_t const, std::string const&,
-            std::string const&, std::string const&);
+    JobData(ManagedDirectory&, DumpFeature&, Options const&,
+            maskings::Maskings* maskings, Stats&, VPackSlice const&, uint64_t const,
+            std::string const&, std::string const&, std::string const&);
     ManagedDirectory& directory;
     DumpFeature& feature;
     Options const& options;
+    maskings::Maskings* maskings;
     Stats& stats;
     VPackSlice const collectionInfo;
@@ -110,6 +116,7 @@ class DumpFeature : public application_features::ApplicationFeature {
   Stats _stats;
   Mutex _workerErrorLock;
   std::queue<Result> _workerErrors;
+  std::unique_ptr<maskings::Maskings> _maskings;
   Result runDump(httpclient::SimpleHttpClient& client, std::string const& dbName);
   Result runClusterDump(httpclient::SimpleHttpClient& client, std::string const& dbName);


@@ -34,6 +34,7 @@
 #include "Dump/DumpFeature.h"
 #include "Logger/Logger.h"
 #include "Logger/LoggerFeature.h"
+#include "Maskings/AttributeMasking.h"
 #include "ProgramOptions/ProgramOptions.h"
 #include "Random/RandomFeature.h"
 #include "Shell/ClientFeature.h"
@@ -41,6 +42,7 @@
 #ifdef USE_ENTERPRISE
 #include "Enterprise/Encryption/EncryptionFeature.h"
+#include "Enterprise/Maskings/AttributeMaskingEE.h"
 #endif
 using namespace arangodb;
@@ -52,6 +54,12 @@ int main(int argc, char* argv[]) {
   ArangoGlobalContext context(argc, argv, BIN_DIRECTORY);
   context.installHup();
+  maskings::InstallMaskings();
+#ifdef USE_ENTERPRISE
+  maskings::InstallMaskingsEE();
+#endif
   std::shared_ptr<options::ProgramOptions> options(
       new options::ProgramOptions(argv[0], "Usage: arangodump [<options>]",
                                   "For more information use:", BIN_DIRECTORY));


@@ -107,6 +107,12 @@ class ConfigBuilder {
       this.config['create-database'] = 'false';
     }
   }
+  setMaskings(dir) {
+    if (this.type !== 'dump') {
+      throw '"maskings" is not supported for binary: ' + this.type;
+    }
+    this.config['maskings'] = fs.join(TOP_DIR, "tests/js/common/test-data/maskings", dir);
+  }
   activateEncryption() { this.config['encription.keyfile'] = fs.join(this.rootDir, 'secret-key'); }
   setRootDir(dir) { this.rootDir = dir; }
   restrictToCollection(collection) {


@@ -2,34 +2,36 @@
 /* global print */
 'use strict';
-// //////////////////////////////////////////////////////////////////////////////
-// / DISCLAIMER
-// /
-// / Copyright 2016 ArangoDB GmbH, Cologne, Germany
-// / Copyright 2014 triagens GmbH, Cologne, Germany
-// /
-// / Licensed under the Apache License, Version 2.0 (the "License")
-// / you may not use this file except in compliance with the License.
-// / You may obtain a copy of the License at
-// /
-// / http://www.apache.org/licenses/LICENSE-2.0
-// /
-// / Unless required by applicable law or agreed to in writing, software
-// / distributed under the License is distributed on an "AS IS" BASIS,
-// / WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// / See the License for the specific language governing permissions and
-// / limitations under the License.
-// /
-// / Copyright holder is ArangoDB GmbH, Cologne, Germany
-// /
-// / @author Max Neunhoeffer
+// /////////////////////////////////////////////////////////////////////////////
+// DISCLAIMER
+//
+// Copyright 2016-2019 ArangoDB GmbH, Cologne, Germany
+// Copyright 2014 triagens GmbH, Cologne, Germany
+//
+// Licensed under the Apache License, Version 2.0 (the "License")
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+//
+// Copyright holder is ArangoDB GmbH, Cologne, Germany
+//
+// @author Max Neunhoeffer
 // //////////////////////////////////////////////////////////////////////////////
 const functionsDocumentation = {
   'dump': 'dump tests',
+  'dump_authentication': 'dump tests with authentication',
   'dump_encrypted': 'encrypted dump tests',
-  'dump_authentication': 'dump tests with authentication'
+  'dump_maskings': 'masked dump tests'
 };
 const optionsDocumentation = [
   ' - `skipEncrypted` : if set to true the encryption tests are skipped'
 ];
@@ -48,16 +50,18 @@ const RESET = require('internal').COLORS.COLOR_RESET;
 const testPaths = {
   'dump': [tu.pathForTesting('server/dump')],
+  'dump_authentication': [tu.pathForTesting('server/dump')],
   'dump_encrypted': [tu.pathForTesting('server/dump')],
-  'dump_authentication': [tu.pathForTesting('server/dump')]
+  'dump_maskings': [tu.pathForTesting('server/dump')]
 };
 class DumpRestoreHelper {
-  constructor(instanceInfo, options, clientAuth, dumpOptions, which, afterServerStart) {
+  constructor(instanceInfo, options, clientAuth, dumpOptions, restoreOptions, which, afterServerStart) {
     this.instanceInfo = instanceInfo;
     this.options = options;
     this.clientAuth = clientAuth;
     this.dumpOptions = dumpOptions;
+    this.restoreOptions = restoreOptions;
     this.which = which;
     this.fn = afterServerStart(instanceInfo);
     this.results = {failed: 1};
@@ -66,11 +70,15 @@ class DumpRestoreHelper {
     this.dumpConfig.setOutputDirectory('dump');
     this.dumpConfig.setIncludeSystem(true);
-    this.restoreConfig = pu.createBaseConfig('restore', this.dumpOptions, this.instanceInfo);
+    if (dumpOptions.hasOwnProperty("maskings")) {
+      this.dumpConfig.setMaskings(dumpOptions.maskings);
+    }
+    this.restoreConfig = pu.createBaseConfig('restore', this.restoreOptions, this.instanceInfo);
     this.restoreConfig.setInputDirectory('dump', true);
     this.restoreConfig.setIncludeSystem(true);
-    this.restoreOldConfig = pu.createBaseConfig('restore', this.dumpOptions, this.instanceInfo);
+    this.restoreOldConfig = pu.createBaseConfig('restore', this.restoreOptions, this.instanceInfo);
     this.restoreOldConfig.setInputDirectory('dump', true);
     this.restoreOldConfig.setIncludeSystem(true);
     this.restoreOldConfig.setDatabase('_system');
@@ -81,8 +89,8 @@ class DumpRestoreHelper {
       this.restoreOldConfig.activateEncryption();
     }
-    this.arangorestore = pu.run.arangoDumpRestoreWithConfig.bind(this, this.restoreConfig, this.dumpOptions, this.instanceInfo.rootDir);
-    this.arangorestoreOld = pu.run.arangoDumpRestoreWithConfig.bind(this, this.restoreOldConfig, this.dumpOptions, this.instanceInfo.rootDir);
+    this.arangorestore = pu.run.arangoDumpRestoreWithConfig.bind(this, this.restoreConfig, this.restoreOptions, this.instanceInfo.rootDir);
+    this.arangorestoreOld = pu.run.arangoDumpRestoreWithConfig.bind(this, this.restoreOldConfig, this.restoreOptions, this.instanceInfo.rootDir);
     this.arangodump = pu.run.arangoDumpRestoreWithConfig.bind(this, this.dumpConfig, this.dumpOptions, this.instanceInfo.rootDir);
   }
@@ -225,10 +233,7 @@ function getClusterStrings(options)
   }
 }
-// //////////////////////////////////////////////////////////////////////////////
-// / @brief TEST: dump
-// //////////////////////////////////////////////////////////////////////////////
-function dump_backend (options, serverAuthInfo, clientAuth, dumpOptions, which, tstFiles, afterServerStart) {
+function dump_backend (options, serverAuthInfo, clientAuth, dumpOptions, restoreOptions, which, tstFiles, afterServerStart) {
   print(CYAN + which + ' tests...' + RESET);
   let instanceInfo = pu.startInstance('tcp', options, serverAuthInfo, which);
@@ -243,7 +248,7 @@ function dump_backend (options, serverAuthInfo, clientAuth, dumpOptions, which,
     };
     return rc;
   }
-  const helper = new DumpRestoreHelper(instanceInfo, options, clientAuth, dumpOptions, which, afterServerStart);
+  const helper = new DumpRestoreHelper(instanceInfo, options, clientAuth, dumpOptions, restoreOptions, which, afterServerStart);
   const setupFile = tu.makePathUnix(fs.join(testPaths[which][0], tstFiles.dumpSetup));
   const testFile = tu.makePathUnix(fs.join(testPaths[which][0], tstFiles.dumpAgain));
@ -267,21 +272,21 @@ function dump_backend (options, serverAuthInfo, clientAuth, dumpOptions, which,
} }
} }
const foxxTestFile = tu.makePathUnix(fs.join(testPaths[which][0], tstFiles.foxxTest)); if (tstFiles.hasOwnProperty("foxxTest")) {
if (!helper.restoreFoxxComplete('UnitTestsDumpFoxxComplete') || const foxxTestFile = tu.makePathUnix(fs.join(testPaths[which][0], tstFiles.foxxTest));
!helper.testFoxxComplete(foxxTestFile, 'UnitTestsDumpFoxxComplete') || if (!helper.restoreFoxxComplete('UnitTestsDumpFoxxComplete') ||
!helper.restoreFoxxAppsBundle('UnitTestsDumpFoxxAppsBundle') || !helper.testFoxxComplete(foxxTestFile, 'UnitTestsDumpFoxxComplete') ||
!helper.testFoxxAppsBundle(foxxTestFile, 'UnitTestsDumpFoxxAppsBundle') || !helper.restoreFoxxAppsBundle('UnitTestsDumpFoxxAppsBundle') ||
!helper.restoreFoxxAppsBundle('UnitTestsDumpFoxxBundleApps') || !helper.testFoxxAppsBundle(foxxTestFile, 'UnitTestsDumpFoxxAppsBundle') ||
!helper.testFoxxAppsBundle(foxxTestFile, 'UnitTestsDumpFoxxBundleApps')) { !helper.restoreFoxxAppsBundle('UnitTestsDumpFoxxBundleApps') ||
return helper.extractResults(); !helper.testFoxxAppsBundle(foxxTestFile, 'UnitTestsDumpFoxxBundleApps')) {
return helper.extractResults();
}
} }
return helper.extractResults(); return helper.extractResults();
} }
// /////////////////////////////////////////////////////////////////////////////
function dump (options) { function dump (options) {
let c = getClusterStrings(options); let c = getClusterStrings(options);
let tstFiles = { let tstFiles = {
@ -292,7 +297,7 @@ function dump (options) {
foxxTest: 'check-foxx.js' foxxTest: 'check-foxx.js'
}; };
return dump_backend(options, {}, {}, options, 'dump', tstFiles, function(){}); return dump_backend(options, {}, {}, options, options, 'dump', tstFiles, function(){});
} }
function dumpAuthentication (options) { function dumpAuthentication (options) {
@ -332,7 +337,7 @@ function dumpAuthentication (options) {
foxxTest: 'check-foxx.js' foxxTest: 'check-foxx.js'
}; };
return dump_backend(options, serverAuthInfo, clientAuth, dumpAuthOpts, 'dump_authentication', tstFiles, function(){}); return dump_backend(options, serverAuthInfo, clientAuth, dumpAuthOpts, dumpAuthOpts, 'dump_authentication', tstFiles, function(){});
} }
function dumpEncrypted (options) { function dumpEncrypted (options) {
@ -373,20 +378,57 @@ function dumpEncrypted (options) {
foxxTest: 'check-foxx.js' foxxTest: 'check-foxx.js'
}; };
return dump_backend(options, {}, {}, dumpOptions, 'dump_encrypted', tstFiles, afterServerStart); return dump_backend(options, {}, {}, dumpOptions, dumpOptions, 'dump_encrypted', tstFiles, afterServerStart);
}
function dumpMaskings (options) {
// test is only meaningful in the enterprise version
let skip = true;
if (global.ARANGODB_CLIENT_VERSION) {
let version = global.ARANGODB_CLIENT_VERSION(true);
if (version.hasOwnProperty('enterprise-version')) {
skip = false;
}
}
if (skip) {
print('skipping dump_maskings test');
return {
dump_maskings: {
status: true,
skipped: true
}
};
}
let tstFiles = {
dumpSetup: 'dump-maskings-setup.js',
dumpAgain: 'dump-maskings.js',
dumpTearDown: 'dump-teardown.js'
};
let dumpMaskingsOpts = {
maskings: 'maskings1.json'
};
_.defaults(dumpMaskingsOpts, options);
return dump_backend(options, {}, {}, dumpMaskingsOpts, options, 'dump_maskings', tstFiles, function(){});
} }
// /////////////////////////////////////////////////////////////////////////////
exports.setup = function (testFns, defaultFns, opts, fnDocs, optionsDoc, allTestPaths) { exports.setup = function (testFns, defaultFns, opts, fnDocs, optionsDoc, allTestPaths) {
Object.assign(allTestPaths, testPaths); Object.assign(allTestPaths, testPaths);
testFns['dump'] = dump; testFns['dump'] = dump;
defaultFns.push('dump'); defaultFns.push('dump');
testFns['dump_authentication'] = dumpAuthentication;
defaultFns.push('dump_authentication');
testFns['dump_encrypted'] = dumpEncrypted; testFns['dump_encrypted'] = dumpEncrypted;
defaultFns.push('dump_encrypted'); defaultFns.push('dump_encrypted');
testFns['dump_authentication'] = dumpAuthentication; testFns['dump_maskings'] = dumpMaskings;
defaultFns.push('dump_authentication'); defaultFns.push('dump_maskings');
for (var attrname in functionsDocumentation) { fnDocs[attrname] = functionsDocumentation[attrname]; } for (var attrname in functionsDocumentation) { fnDocs[attrname] = functionsDocumentation[attrname]; }
for (var i = 0; i < optionsDocumentation.length; i++) { optionsDoc.push(optionsDocumentation[i]); } for (var i = 0; i < optionsDocumentation.length; i++) { optionsDoc.push(optionsDocumentation[i]); }


@ -1,7 +1,7 @@
//////////////////////////////////////////////////////////////////////////////// ////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER /// DISCLAIMER
/// ///
/// Copyright 2014-2016 ArangoDB GmbH, Cologne, Germany /// Copyright 2014-2019 ArangoDB GmbH, Cologne, Germany
/// Copyright 2004-2014 triAGENS GmbH, Cologne, Germany /// Copyright 2004-2014 triAGENS GmbH, Cologne, Germany
/// ///
/// Licensed under the Apache License, Version 2.0 (the "License"); /// Licensed under the Apache License, Version 2.0 (the "License");
@ -25,9 +25,10 @@
#ifndef ARANGODB_BASICS_UTF8HELPER_H #ifndef ARANGODB_BASICS_UTF8HELPER_H
#define ARANGODB_BASICS_UTF8HELPER_H 1 #define ARANGODB_BASICS_UTF8HELPER_H 1
#include <velocypack/StringRef.h>
#include "Basics/Common.h" #include "Basics/Common.h"
#include <velocypack/StringRef.h>
#include <unicode/coll.h> #include <unicode/coll.h>
#include <unicode/regex.h> #include <unicode/regex.h>
#include <unicode/ustring.h> #include <unicode/ustring.h>
@ -40,10 +41,6 @@ class Utf8Helper {
Utf8Helper& operator=(Utf8Helper const&) = delete; Utf8Helper& operator=(Utf8Helper const&) = delete;
public: public:
//////////////////////////////////////////////////////////////////////////////
/// @brief a default helper
//////////////////////////////////////////////////////////////////////////////
static Utf8Helper DefaultUtf8Helper; static Utf8Helper DefaultUtf8Helper;
public: public:
@ -153,6 +150,26 @@ class Utf8Helper {
char const* replacement, size_t replacementLength, char const* replacement, size_t replacementLength,
bool partial, bool& error); bool partial, bool& error);
// append a UTF-8 encoded character to a string. This will append 1 to 4 bytes.
static void appendUtf8Character(std::string& result, uint32_t ch) {
if (ch <= 0x7f) {
result.push_back((uint8_t)ch);
} else {
if (ch <= 0x7ff) {
result.push_back((uint8_t)((ch >> 6) | 0xc0));
} else {
if (ch <= 0xffff) {
result.push_back((uint8_t)((ch >> 12) | 0xe0));
} else {
result.push_back((uint8_t)((ch >> 18) | 0xf0));
result.push_back((uint8_t)(((ch >> 12) & 0x3f) | 0x80));
}
result.push_back((uint8_t)(((ch >> 6) & 0x3f) | 0x80));
}
result.push_back((uint8_t)((ch & 0x3f) | 0x80));
}
}
private: private:
Collator* _coll; Collator* _coll;
}; };


@ -21,16 +21,503 @@
//////////////////////////////////////////////////////////////////////////////// ////////////////////////////////////////////////////////////////////////////////
#include "Basics/datetime.h" #include "Basics/datetime.h"
#include <date/date.h>
#include "Basics/NumberUtils.h" #include "Basics/NumberUtils.h"
#include "Logger/Logger.h" #include "Logger/Logger.h"
#include <boost/algorithm/string.hpp> #include <boost/algorithm/string.hpp>
#include <date/date.h>
#include <date/iso_week.h>
#include <chrono>
#include <regex> #include <regex>
#include <vector> #include <vector>
namespace { namespace {
using namespace date;
using namespace std::chrono;
std::string tail(std::string const& source, size_t const length) {
if (length >= source.size()) {
return source;
}
return source.substr(source.size() - length);
} // tail
typedef void (*format_func_t)(std::string& wrk, arangodb::tp_sys_clock_ms const&);
std::unordered_map<std::string, format_func_t> dateMap;
auto const unixEpoch = date::sys_seconds{std::chrono::seconds{0}};
std::vector<std::string> const monthNames = {"January", "February", "March",
"April", "May", "June",
"July", "August", "September",
"October", "November", "December"};
std::vector<std::string> const monthNamesShort = {"Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jul", "Aug",
"Sep", "Oct", "Nov", "Dec"};
std::vector<std::string> const weekDayNames = {"Sunday", "Monday",
"Tuesday", "Wednesday",
"Thursday", "Friday",
"Saturday"};
std::vector<std::string> const weekDayNamesShort = {"Sun", "Mon", "Tue", "Wed",
"Thu", "Fri", "Sat"};
std::
vector<std::pair<std::string, format_func_t>> const sortedDateMap = {{"%&",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
}}, // Allow for literal "m" after "%m" ("%mm" ->
// %m%&m)
// zero-pad 4 digit years to length of 6 and add "+"
// prefix, keep negative as-is
{"%yyy"
"yyy",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto yearnum = static_cast<int>(
ymd.year());
if (yearnum < 0) {
if (yearnum > -10) {
wrk.append(
"-00000");
} else if (yearnum > -100) {
wrk.append(
"-0000");
} else if (yearnum > -1000) {
wrk.append(
"-000");
} else if (yearnum > -10000) {
wrk.append(
"-00");
} else if (yearnum > -100000) {
wrk.append(
"-0");
} else {
wrk.append(
"-");
}
wrk.append(std::to_string(
abs(yearnum)));
return;
}
TRI_ASSERT(yearnum >= 0);
if (yearnum > 99999) {
// intentionally nothing
} else if (yearnum > 9999) {
wrk.append(
"+0");
} else if (yearnum > 999) {
wrk.append(
"+00");
} else if (yearnum > 99) {
wrk.append(
"+000");
} else if (yearnum > 9) {
wrk.append(
"+0000");
} else {
wrk.append(
"+00000");
}
wrk.append(std::to_string(yearnum));
}},
{"%mmm"
"m",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
wrk.append(
::monthNames[static_cast<unsigned>(ymd.month()) - 1]);
}},
{"%yyy"
"y",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto yearnum = static_cast<int>(
ymd.year());
if (yearnum < 0) {
if (yearnum > -10) {
wrk.append(
"-000");
} else if (yearnum > -100) {
wrk.append(
"-00");
} else if (yearnum > -1000) {
wrk.append(
"-0");
} else {
wrk.append(
"-");
}
wrk.append(std::to_string(
abs(yearnum)));
} else {
TRI_ASSERT(yearnum >= 0);
if (yearnum < 10) {
wrk.append(
"000");
wrk.append(std::to_string(yearnum));
} else if (yearnum < 100) {
wrk.append(
"00");
wrk.append(std::to_string(yearnum));
} else if (yearnum < 1000) {
wrk.append(
"0");
wrk.append(std::to_string(yearnum));
} else {
std::string yearstr(
std::to_string(yearnum));
wrk.append(tail(yearstr, 4));
}
}
}},
{"%www"
"w",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
weekday wd{floor<date::days>(tp)};
wrk.append(
::weekDayNames[static_cast<unsigned>(wd)]);
}},
{"%mm"
"m",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
wrk.append(
::monthNamesShort[static_cast<unsigned>(ymd.month()) - 1]);
}},
{"%ww"
"w",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
weekday wd{floor<date::days>(tp)};
wrk.append(
weekDayNamesShort[static_cast<unsigned>(wd)]);
}},
{"%ff"
"f",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t millis =
day_time
.subseconds()
.count();
if (millis < 10) {
wrk.append(
"00");
} else if (millis < 100) {
wrk.append(
"0");
}
wrk.append(std::to_string(millis));
}},
{"%xx"
"x",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto yyyy =
year{ymd.year()};
// we construct day 0 of January (the day before January 1st), so
// the difference below is the 1-based day of the year:
auto firstDayInYear =
yyyy / jan /
day{0};
uint64_t daysSinceFirst =
duration_cast<date::days>(
tp - sys_days(firstDayInYear))
.count();
if (daysSinceFirst < 10) {
wrk.append(
"00");
} else if (daysSinceFirst < 100) {
wrk.append(
"0");
}
wrk.append(std::to_string(daysSinceFirst));
}},
// there's no really sensible way to handle negative
// years, but better not drop the sign
{"%yy",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto yearnum = static_cast<int>(
ymd.year());
if (yearnum < 10 &&
yearnum > -10) {
wrk.append(
"0");
wrk.append(std::to_string(
abs(yearnum)));
} else {
std::string yearstr(std::to_string(
abs(yearnum)));
wrk.append(tail(yearstr, 2));
}
}},
{"%mm",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto month = static_cast<unsigned>(
ymd.month());
if (month < 10) {
wrk.append(
"0");
}
wrk.append(std::to_string(month));
}},
{"%dd",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto day = static_cast<unsigned>(
ymd.day());
if (day < 10) {
wrk.append(
"0");
}
wrk.append(std::to_string(day));
}},
{"%hh",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t hours =
day_time
.hours()
.count();
if (hours < 10) {
wrk.append(
"0");
}
wrk.append(std::to_string(hours));
}},
{"%ii",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t minutes =
day_time
.minutes()
.count();
if (minutes < 10) {
wrk.append(
"0");
}
wrk.append(std::to_string(minutes));
}},
{"%ss",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t seconds =
day_time
.seconds()
.count();
if (seconds < 10) {
wrk.append(
"0");
}
wrk.append(std::to_string(seconds));
}},
{"%kk",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
iso_week::year_weeknum_weekday yww{
floor<date::days>(tp)};
uint64_t isoWeek =
static_cast<unsigned>(
yww.weeknum());
if (isoWeek < 10) {
wrk.append(
"0");
}
wrk.append(std::to_string(isoWeek));
}},
{"%t",
[](std::string& wrk, arangodb::tp_sys_clock_ms const& tp) {
auto diffDuration =
tp - unixEpoch;
auto diff =
duration_cast<duration<double, std::milli>>(
diffDuration)
.count();
wrk.append(std::to_string(static_cast<int64_t>(
std::round(diff))));
}},
{"%z",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
std::string formatted = format(
"%FT%TZ",
floor<milliseconds>(tp));
wrk.append(formatted);
}},
{"%w",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
weekday wd{floor<date::days>(tp)};
wrk.append(std::to_string(
static_cast<unsigned>(wd)));
}},
{"%y",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
wrk.append(std::to_string(static_cast<int>(
ymd.year())));
}},
{"%m",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
wrk.append(std::to_string(static_cast<unsigned>(
ymd.month())));
}},
{"%d",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
wrk.append(std::to_string(static_cast<unsigned>(
ymd.day())));
}},
{"%h",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t hours =
day_time
.hours()
.count();
wrk.append(std::to_string(hours));
}},
{"%i",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t minutes =
day_time
.minutes()
.count();
wrk.append(std::to_string(minutes));
}},
{"%s",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t seconds =
day_time
.seconds()
.count();
wrk.append(std::to_string(seconds));
}},
{"%f",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto day_time = make_time(
tp - floor<date::days>(tp));
uint64_t millis =
day_time
.subseconds()
.count();
wrk.append(std::to_string(millis));
}},
{"%x",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day(
floor<date::days>(tp));
auto yyyy =
year{ymd.year()};
// We construct day 0 of January (the day before January 1st), so
// the difference below is the 1-based day of the year:
auto firstDayInYear =
yyyy / jan /
day{0};
uint64_t daysSinceFirst =
duration_cast<date::days>(
tp - sys_days(firstDayInYear))
.count();
wrk.append(std::to_string(daysSinceFirst));
}},
{"%k",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
iso_week::year_weeknum_weekday yww{
floor<date::days>(tp)};
uint64_t isoWeek =
static_cast<unsigned>(
yww.weeknum());
wrk.append(std::to_string(isoWeek));
}},
{"%l",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
year_month_day ymd{
floor<date::days>(tp)};
if (ymd.year()
.is_leap()) {
wrk.append(
"1");
} else {
wrk.append(
"0");
}
}},
{"%q",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
year_month_day ymd{
floor<date::days>(tp)};
month m = ymd.month();
uint64_t part = static_cast<uint64_t>(
ceil(unsigned(m) / 3.0f));
TRI_ASSERT(part <= 4);
wrk.append(std::to_string(part));
}},
{"%a",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
auto ymd = year_month_day{
floor<date::days>(tp)};
auto lastMonthDay =
ymd.year() /
ymd.month() / last;
wrk.append(std::to_string(static_cast<unsigned>(
lastMonthDay
.day())));
}},
{"%%",
[](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
wrk.append(
"%");
}},
{"%", [](std::string& wrk,
arangodb::tp_sys_clock_ms const& tp) {
}}};
// will be populated by DateRegexInitializer
std::regex dateFormatRegex;
std::regex const iso8601Regex( std::regex const iso8601Regex(
"(\\+|\\-)?\\d+(\\-\\d{1,2}(\\-\\d{1,2})?)?" // YY[YY]-MM-DD "(\\+|\\-)?\\d+(\\-\\d{1,2}(\\-\\d{1,2})?)?" // YY[YY]-MM-DD
"(" "("
@ -92,9 +579,64 @@ std::regex const durationRegex(
"P((\\d+)Y)?((\\d+)M)?((\\d+)W)?((\\d+)D)?(T((\\d+)H)?((\\d+)M)?((\\d+)(\\." "P((\\d+)Y)?((\\d+)M)?((\\d+)W)?((\\d+)D)?(T((\\d+)H)?((\\d+)M)?((\\d+)(\\."
"(\\d{1,3}))?S)?)?"); "(\\d{1,3}))?S)?)?");
struct DateRegexInitializer {
DateRegexInitializer() {
std::string myregex;
dateMap.reserve(sortedDateMap.size());
std::for_each(sortedDateMap.begin(), sortedDateMap.end(),
[&myregex](std::pair<std::string const&, format_func_t> const& p) {
(myregex.length() > 0) ? myregex += "|" + p.first : myregex = p.first;
dateMap.insert(std::make_pair(p.first, p.second));
});
dateFormatRegex = std::regex(myregex);
}
};
std::string executeDateFormatRegex(std::string const& search,
arangodb::tp_sys_clock_ms const& tp) {
std::string s;
auto first = search.begin();
auto last = search.end();
typename std::smatch::difference_type positionOfLastMatch = 0;
auto endOfLastMatch = first;
auto callback = [&tp, &endOfLastMatch, &positionOfLastMatch, &s](std::smatch const& match) {
auto positionOfThisMatch = match.position(0);
auto diff = positionOfThisMatch - positionOfLastMatch;
auto startOfThisMatch = endOfLastMatch;
std::advance(startOfThisMatch, diff);
s.append(endOfLastMatch, startOfThisMatch);
auto got = dateMap.find(match.str(0));
if (got != dateMap.end()) {
got->second(s, tp);
}
auto lengthOfMatch = match.length(0);
positionOfLastMatch = positionOfThisMatch + lengthOfMatch;
endOfLastMatch = startOfThisMatch;
std::advance(endOfLastMatch, lengthOfMatch);
};
std::regex_iterator<std::string::const_iterator> end;
std::regex_iterator<std::string::const_iterator> begin(first, last, dateFormatRegex);
std::for_each(begin, end, callback);
s.append(endOfLastMatch, last);
return s;
}
// populates dateFormatRegex
static DateRegexInitializer const initializer;
} // namespace } // namespace
bool arangodb::basics::parse_dateTime(std::string const& dateTimeIn, tp_sys_clock_ms& date_tp) { bool arangodb::basics::parseDateTime(std::string const& dateTimeIn, arangodb::tp_sys_clock_ms& date_tp) {
using namespace date; using namespace date;
using namespace std::chrono; using namespace std::chrono;
@ -233,8 +775,8 @@ bool arangodb::basics::parse_dateTime(std::string const& dateTimeIn, tp_sys_cloc
return true; return true;
} }
bool arangodb::basics::regex_isoDuration(std::string const& isoDuration, bool arangodb::basics::regexIsoDuration(std::string const& isoDuration,
std::smatch& durationParts) { std::smatch& durationParts) {
if (isoDuration.length() <= 1) { if (isoDuration.length() <= 1) {
return false; return false;
} }
@ -245,3 +787,9 @@ bool arangodb::basics::regex_isoDuration(std::string const& isoDuration,
return true; return true;
} }
std::string arangodb::basics::formatDate(std::string const& formatString,
arangodb::tp_sys_clock_ms const& dateValue) {
return ::executeDateFormatRegex(formatString, dateValue);
}
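The placeholder substitution in `executeDateFormatRegex` depends on the order of `sortedDateMap`: the regex is an alternation of all placeholders, and the first alternative that matches wins, so longer placeholders must be listed before their prefixes. The following sketch shows that idea in isolation. `SimpleDate`, `appendPadded` and `formatSimpleDate` are hypothetical names, only a handful of placeholders are covered, and the regex is rebuilt on every call for brevity, whereas the commit builds it once in `DateRegexInitializer`.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for tp_sys_clock_ms so the sketch needs no date library.
struct SimpleDate {
  int year;
  unsigned month;
  unsigned day;
};

using FormatFunc = void (*)(std::string&, SimpleDate const&);

// Appends `value`, left-padded with zeros to at least `width` digits.
void appendPadded(std::string& out, unsigned value, std::size_t width) {
  std::string const digits = std::to_string(value);
  if (digits.size() < width) {
    out.append(width - digits.size(), '0');
  }
  out.append(digits);
}

std::string formatSimpleDate(std::string const& fmt, SimpleDate const& date) {
  // Longer placeholders come first: the alternation takes the first
  // alternative that matches, so "%mm" has to precede "%m".
  static std::vector<std::pair<std::string, FormatFunc>> const sorted = {
      {"%yyyy",
       [](std::string& out, SimpleDate const& d) {
         assert(d.year >= 0);  // assumption: the sketch skips negative years
         appendPadded(out, static_cast<unsigned>(d.year), 4);
       }},
      {"%mm", [](std::string& out, SimpleDate const& d) { appendPadded(out, d.month, 2); }},
      {"%dd", [](std::string& out, SimpleDate const& d) { appendPadded(out, d.day, 2); }},
      {"%m", [](std::string& out, SimpleDate const& d) { out.append(std::to_string(d.month)); }},
      {"%d", [](std::string& out, SimpleDate const& d) { out.append(std::to_string(d.day)); }},
      {"%%", [](std::string& out, SimpleDate const&) { out.push_back('%'); }}};

  // Build the alternation and the lookup table from the same ordered list.
  std::string pattern;
  std::unordered_map<std::string, FormatFunc> lookup;
  for (auto const& entry : sorted) {
    if (!pattern.empty()) {
      pattern += "|";
    }
    pattern += entry.first;
    lookup.emplace(entry.first, entry.second);
  }
  std::regex const re(pattern);

  std::string out;
  auto copiedUpTo = fmt.cbegin();
  for (std::sregex_iterator it(fmt.cbegin(), fmt.cend(), re), end; it != end; ++it) {
    out.append(copiedUpTo, (*it)[0].first);  // literal text before the match
    auto found = lookup.find(it->str(0));
    if (found != lookup.end()) {
      found->second(out, date);
    }
    copiedUpTo = (*it)[0].second;
  }
  out.append(copiedUpTo, fmt.cend());  // trailing literal text
  return out;
}
```

If `"%m"` were listed before `"%mm"`, the input `"%mm"` would be read as `%m` followed by a literal `m`, which is why the table is kept sorted rather than iterated from the unordered map.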


@ -23,6 +23,8 @@
#ifndef ARANGODB_BASICS_DATETIME_H #ifndef ARANGODB_BASICS_DATETIME_H
#define ARANGODB_BASICS_DATETIME_H 1 #define ARANGODB_BASICS_DATETIME_H 1
#include "Basics/Common.h"
#include <chrono> #include <chrono>
#include <regex> #include <regex>
@ -32,9 +34,15 @@ using tp_sys_clock_ms =
std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds>; std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds>;
namespace basics { namespace basics {
bool parse_dateTime(std::string const& dateTime, tp_sys_clock_ms& date_tp); bool parseDateTime(std::string const& dateTime,
tp_sys_clock_ms& date_tp);
bool regex_isoDuration(std::string const& isoDuration, std::smatch& durationParts); bool regexIsoDuration(std::string const& isoDuration,
std::smatch& durationParts);
/// @brief formats a date(time) value according to formatString
std::string formatDate(std::string const& formatString,
tp_sys_clock_ms const& dateValue);
} // namespace basics } // namespace basics
} // namespace arangodb } // namespace arangodb


@ -231,6 +231,11 @@ add_library(${LIB_ARANGO} STATIC
Logger/LoggerBufferFeature.cpp Logger/LoggerBufferFeature.cpp
Logger/LoggerFeature.cpp Logger/LoggerFeature.cpp
Logger/LoggerStream.cpp Logger/LoggerStream.cpp
Maskings/AttributeMasking.cpp
Maskings/Collection.cpp
Maskings/Maskings.cpp
Maskings/Path.cpp
Maskings/RandomStringMask.cpp
ProgramOptions/Option.cpp ProgramOptions/Option.cpp
ProgramOptions/ProgramOptions.cpp ProgramOptions/ProgramOptions.cpp
ProgramOptions/Section.cpp ProgramOptions/Section.cpp


@ -0,0 +1,94 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#include "AttributeMasking.h"
#include "Basics/StringUtils.h"
#include "Logger/Logger.h"
#include "Maskings/RandomStringMask.h"
using namespace arangodb;
using namespace arangodb::maskings;
void arangodb::maskings::InstallMaskings() {
AttributeMasking::installMasking("randomString", RandomStringMask::create);
}
std::unordered_map<std::string, ParseResult<AttributeMasking> (*)(Path, Maskings*, VPackSlice const&)> AttributeMasking::_maskings;
ParseResult<AttributeMasking> AttributeMasking::parse(Maskings* maskings,
VPackSlice const& def) {
if (!def.isObject()) {
return ParseResult<AttributeMasking>(
ParseResult<AttributeMasking>::PARSE_FAILED,
"expecting an object for attribute masking definition");
}
std::string path = "";
std::string type = "";
for (auto const& entry : VPackObjectIterator(def, false)) {
std::string key = entry.key.copyString();
if (key == "type") {
if (!entry.value.isString()) {
return ParseResult<AttributeMasking>(ParseResult<AttributeMasking>::ILLEGAL_PARAMETER,
"type must be a string");
}
type = entry.value.copyString();
} else if (key == "path") {
if (!entry.value.isString()) {
return ParseResult<AttributeMasking>(ParseResult<AttributeMasking>::ILLEGAL_PARAMETER,
"path must be a string");
}
path = entry.value.copyString();
}
}
if (path.empty()) {
return ParseResult<AttributeMasking>(ParseResult<AttributeMasking>::ILLEGAL_PARAMETER,
"path must not be empty");
}
ParseResult<Path> ap = Path::parse(path);
if (ap.status != ParseResult<Path>::VALID) {
return ParseResult<AttributeMasking>(
(ParseResult<AttributeMasking>::StatusCode)(int)ap.status, ap.message);
}
auto const& it = _maskings.find(type);
if (it == _maskings.end()) {
return ParseResult<AttributeMasking>(
ParseResult<AttributeMasking>::UNKNOWN_TYPE,
"unknown attribute masking type '" + type + "'");
}
return it->second(ap.result, maskings, def);
}
bool AttributeMasking::match(std::vector<std::string> const& path) const {
return _path.match(path);
}
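`AttributeMasking` resolves the `type` string of a masking through a static table of factory function pointers that `InstallMaskings()` fills at startup. The sketch below shows that registry pattern with simplified, hypothetical types (`Mask`, `StarMask`, `installMask`, `createMask`). The real factories take a `Path`, a `Maskings*` and a `VPackSlice` and return a `ParseResult<AttributeMasking>`, which is not reproduced here.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical simplified interface, not the actual MaskingFunction class.
struct Mask {
  virtual ~Mask() = default;
  virtual std::string apply(std::string const& value) const = 0;
};

// Example mask that replaces every character with '*'.
struct StarMask : Mask {
  std::string apply(std::string const& value) const override {
    return std::string(value.size(), '*');
  }
  static std::shared_ptr<Mask> create() { return std::make_shared<StarMask>(); }
};

using MaskFactory = std::shared_ptr<Mask> (*)();

// Function-local static avoids static initialization order problems.
std::unordered_map<std::string, MaskFactory>& registry() {
  static std::unordered_map<std::string, MaskFactory> instance;
  return instance;
}

// Registers (or overwrites) the factory for a type name.
void installMask(std::string const& name, MaskFactory factory) {
  assert(factory != nullptr);
  registry()[name] = factory;
}

// Returns nullptr for an unknown type; the parser in this commit reports
// UNKNOWN_TYPE through a ParseResult instead.
std::shared_ptr<Mask> createMask(std::string const& type) {
  auto it = registry().find(type);
  if (it == registry().end()) {
    return nullptr;
  }
  return it->second();
}
```

With this layout, adding a new masking type only needs one more `installMask` call; the lookup code stays unchanged.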


@ -0,0 +1,70 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_ATTRIBUTE_MASKING_H
#define ARANGODB_MASKINGS_ATTRIBUTE_MASKING_H 1
#include "Basics/Common.h"
#include <velocypack/Builder.h>
#include <velocypack/Iterator.h>
#include <velocypack/Parser.h>
#include <velocypack/Slice.h>
#include <velocypack/velocypack-aliases.h>
#include "Maskings/MaskingFunction.h"
#include "Maskings/ParseResult.h"
#include "Maskings/Path.h"
namespace arangodb {
namespace maskings {
void InstallMaskings();
class AttributeMasking {
public:
static ParseResult<AttributeMasking> parse(Maskings*, VPackSlice const&);
static void installMasking(std::string const& name, ParseResult<AttributeMasking> (* func)(Path, Maskings*, VPackSlice const&)) {
_maskings[name] = func;
}
public:
AttributeMasking() = default;
AttributeMasking(Path const& path, MaskingFunction* func) : _path(path) {
_func.reset(func);
}
bool match(std::vector<std::string> const&) const;
MaskingFunction* func() const { return _func.get(); }
private:
static std::unordered_map<std::string, ParseResult<AttributeMasking> (*)(Path, Maskings*, VPackSlice const&)> _maskings;
private:
Path _path;
std::shared_ptr<MaskingFunction> _func;
};
} // namespace maskings
} // namespace arangodb
#endif


@ -0,0 +1,99 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#include "Collection.h"
#include "Logger/Logger.h"
using namespace arangodb;
using namespace arangodb::maskings;
ParseResult<Collection> Collection::parse(Maskings* maskings, VPackSlice const& def) {
if (!def.isObject()) {
return ParseResult<Collection>(
ParseResult<Collection>::PARSE_FAILED,
"expecting an object for collection definition");
}
std::string type = "";
std::vector<AttributeMasking> attributes;
for (auto const& entry : VPackObjectIterator(def, false)) {
std::string key = entry.key.copyString();
if (key == "type") {
if (!entry.value.isString()) {
return ParseResult<Collection>(
ParseResult<Collection>::ILLEGAL_PARAMETER,
"expecting a string for collection type");
}
type = entry.value.copyString();
} else if (key == "maskings") {
if (!entry.value.isArray()) {
return ParseResult<Collection>(
ParseResult<Collection>::ILLEGAL_PARAMETER,
"expecting an array for collection maskings");
}
for (auto const& mask : VPackArrayIterator(entry.value)) {
ParseResult<AttributeMasking> am = AttributeMasking::parse(maskings, mask);
if (am.status != ParseResult<AttributeMasking>::VALID) {
return ParseResult<Collection>((ParseResult<Collection>::StatusCode)(
int)am.status,
am.message);
}
attributes.push_back(am.result);
}
}
}
CollectionSelection selection = CollectionSelection::FULL;
if (type == "full") {
selection = CollectionSelection::FULL;
} else if (type == "exclude") {
selection = CollectionSelection::EXCLUDE;
} else if (type == "masked") {
selection = CollectionSelection::MASKED;
} else if (type == "structure") {
selection = CollectionSelection::STRUCTURE;
} else {
return ParseResult<Collection>(ParseResult<Collection>::UNKNOWN_TYPE,
"found unknown collection type '" + type +
"'");
}
return ParseResult<Collection>(Collection(selection, attributes));
}
MaskingFunction* Collection::masking(std::vector<std::string> const& path) {
for (auto const& m : _maskings) {
if (m.match(path)) {
return m.func();
}
}
return nullptr;
}
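`Collection::masking` returns the function of the first masking whose path matches, so the order of the entries in the `maskings` array matters when several rules could apply to the same attribute. A minimal sketch of that first-match-wins lookup follows. `Rule` and `firstMatch` are hypothetical names, and the exact path comparison used here is a simplification: the real matching is done by `Path::match`, which is not shown in this part of the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical simplified rule: an attribute path plus a label for the mask.
struct Rule {
  std::vector<std::string> path;
  std::string maskName;
};

// Returns the first rule whose path equals `path`, or nullptr if none does,
// mirroring the first-match-wins loop in Collection::masking.
Rule const* firstMatch(std::vector<Rule> const& rules,
                       std::vector<std::string> const& path) {
  for (auto const& rule : rules) {
    if (rule.path == path) {
      return &rule;
    }
  }
  return nullptr;
}
```

A later rule for the same path is never reached, which mirrors how a second masking for an already matched attribute would be ignored.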

lib/Maskings/Collection.h Normal file

@ -0,0 +1,63 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_COLLECTION_H
#define ARANGODB_MASKINGS_COLLECTION_H 1
#include "Basics/Common.h"
#include <velocypack/Builder.h>
#include <velocypack/Iterator.h>
#include <velocypack/Parser.h>
#include <velocypack/Slice.h>
#include <velocypack/velocypack-aliases.h>
#include "Maskings/AttributeMasking.h"
#include "Maskings/CollectionFilter.h"
#include "Maskings/CollectionSelection.h"
#include "Maskings/ParseResult.h"
namespace arangodb {
namespace maskings {
class Collection {
public:
static ParseResult<Collection> parse(Maskings* maskings, VPackSlice const&);
public:
Collection() {}
Collection(CollectionSelection selection, std::vector<AttributeMasking> const& maskings)
: _selection(selection), _maskings(maskings) {}
CollectionSelection selection() const noexcept { return _selection; }
MaskingFunction* masking(std::vector<std::string> const& path);
private:
CollectionSelection _selection;
// LATER: CollectionFilter _filter;
std::vector<AttributeMasking> _maskings;
};
} // namespace maskings
} // namespace arangodb
#endif


@ -0,0 +1,34 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_COLLECTION_FILTER_H
#define ARANGODB_MASKINGS_COLLECTION_FILTER_H 1
#include "Basics/Common.h"
namespace arangodb {
namespace maskings {
class CollectionFilter {};
} // namespace maskings
} // namespace arangodb
#endif


@ -0,0 +1,34 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_COLLECTION_SELECTION_H
#define ARANGODB_MASKINGS_COLLECTION_SELECTION_H 1
#include "Basics/Common.h"
namespace arangodb {
namespace maskings {
enum class CollectionSelection { FULL, MASKED, EXCLUDE, STRUCTURE };
} // namespace maskings
} // namespace arangodb
#endif


@ -0,0 +1,62 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_MASKING_FUNCTION_H
#define ARANGODB_MASKINGS_MASKING_FUNCTION_H 1
#include "Basics/Common.h"
#include "Basics/Utf8Helper.h"
#include <velocypack/Builder.h>
#include <velocypack/Iterator.h>
#include <velocypack/Parser.h>
#include <velocypack/Slice.h>
#include <velocypack/velocypack-aliases.h>
namespace arangodb {
namespace maskings {
class Maskings;
class MaskingFunction {
public:
static bool isNameChar(UChar32 ch) {
return u_isalpha(ch) || u_isdigit(ch) || ch == U'_' || ch == U'-';
}
public:
explicit MaskingFunction(Maskings* maskings) : _maskings(maskings) {}
virtual ~MaskingFunction() {}
public:
virtual VPackValue mask(bool, std::string& buffer) const = 0;
virtual VPackValue mask(std::string const&, std::string& buffer) const = 0;
virtual VPackValue mask(int64_t, std::string& buffer) const = 0;
virtual VPackValue mask(double, std::string& buffer) const = 0;
protected:
Maskings* _maskings;
};
} // namespace maskings
} // namespace arangodb
#endif

lib/Maskings/Maskings.cpp Normal file

@ -0,0 +1,358 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#include "Maskings.h"
#include <iostream>
#include "Basics/FileUtils.h"
#include "Logger/Logger.h"
#include "Random/RandomGenerator.h"
#include <velocypack/Iterator.h>
#include <velocypack/Parser.h>
#include <velocypack/velocypack-aliases.h>
using namespace arangodb;
using namespace arangodb::maskings;
MaskingsResult Maskings::fromFile(std::string const& filename) {
std::string definition;
try {
definition = basics::FileUtils::slurp(filename);
} catch (std::exception const& e) {
std::string msg = "cannot read maskings file '" + filename + "': " + e.what();
LOG_TOPIC(DEBUG, Logger::CONFIG) << msg;
return MaskingsResult(MaskingsResult::CANNOT_READ_FILE, msg);
}
LOG_TOPIC(DEBUG, Logger::CONFIG) << "found maskings file '" << filename << "'";
if (definition.empty()) {
std::string msg = "maskings file '" + filename + "' is empty";
LOG_TOPIC(DEBUG, Logger::CONFIG) << msg;
return MaskingsResult(MaskingsResult::CANNOT_READ_FILE, msg);
}
std::unique_ptr<Maskings> maskings(new Maskings{});
maskings->_randomSeed = RandomGenerator::interval(UINT64_MAX);
try {
std::shared_ptr<VPackBuilder> parsed = velocypack::Parser::fromJson(definition);
ParseResult<Maskings> res = maskings->parse(parsed->slice());
if (res.status != ParseResult<Maskings>::VALID) {
return MaskingsResult(MaskingsResult::ILLEGAL_DEFINITION, res.message);
}
return MaskingsResult(std::move(maskings));
} catch (velocypack::Exception const& e) {
std::string msg = "cannot parse maskings file '" + filename + "': " + e.what();
LOG_TOPIC(DEBUG, Logger::CONFIG) << msg << ". file content: " << definition;
return MaskingsResult(MaskingsResult::CANNOT_PARSE_FILE, msg);
}
}
ParseResult<Maskings> Maskings::parse(VPackSlice const& def) {
if (!def.isObject()) {
return ParseResult<Maskings>(ParseResult<Maskings>::PARSE_FAILED,
"expecting an object for masking definition");
}
for (auto const& entry : VPackObjectIterator(def, false)) {
std::string key = entry.key.copyString();
if (key == "*") {
LOG_TOPIC(TRACE, Logger::CONFIG) << "default masking";
if (_hasDefaultCollection) {
return ParseResult<Maskings>(ParseResult<Maskings>::DUPLICATE_COLLECTION,
"duplicate default entry");
}
} else {
LOG_TOPIC(TRACE, Logger::CONFIG) << "masking collection '" << key << "'";
if (_collections.find(key) != _collections.end()) {
return ParseResult<Maskings>(ParseResult<Maskings>::DUPLICATE_COLLECTION,
"duplicate collection entry '" + key + "'");
}
}
ParseResult<Collection> c = Collection::parse(this, entry.value);
if (c.status != ParseResult<Collection>::VALID) {
return ParseResult<Maskings>((ParseResult<Maskings>::StatusCode)(int)c.status,
c.message);
}
if (key == "*") {
_hasDefaultCollection = true;
_defaultCollection = c.result;
} else {
_collections[key] = c.result;
}
}
return ParseResult<Maskings>(ParseResult<Maskings>::VALID);
}
bool Maskings::shouldDumpStructure(std::string const& name) {
CollectionSelection select = CollectionSelection::EXCLUDE;
auto const itr = _collections.find(name);
if (itr == _collections.end()) {
if (_hasDefaultCollection) {
select = _defaultCollection.selection();
}
} else {
select = itr->second.selection();
}
switch (select) {
case CollectionSelection::FULL:
return true;
case CollectionSelection::MASKED:
return true;
case CollectionSelection::EXCLUDE:
return false;
case CollectionSelection::STRUCTURE:
return true;
}
// should not get here. however, compiler warns about it
TRI_ASSERT(false);
return false;
}
bool Maskings::shouldDumpData(std::string const& name) {
CollectionSelection select = CollectionSelection::EXCLUDE;
auto const itr = _collections.find(name);
if (itr == _collections.end()) {
if (_hasDefaultCollection) {
select = _defaultCollection.selection();
}
} else {
select = itr->second.selection();
}
switch (select) {
case CollectionSelection::FULL:
return true;
case CollectionSelection::MASKED:
return true;
case CollectionSelection::EXCLUDE:
return false;
case CollectionSelection::STRUCTURE:
return false;
}
// should not get here. however, compiler warns about it
TRI_ASSERT(false);
return false;
}
VPackValue Maskings::maskedItem(Collection& collection, std::vector<std::string>& path,
std::string& buffer, VPackSlice const& data) {
static std::string xxxx("xxxx");
if (path.size() == 1) {
if (path[0] == "_key" || path[0] == "_id" || path[0] == "_rev" ||
path[0] == "_from" || path[0] == "_to") {
if (data.isString()) {
velocypack::ValueLength length;
char const* c = data.getString(length);
buffer = std::string(c, length);
return VPackValue(buffer);
} else if (data.isInteger()) {
return VPackValue(data.getInt());
}
}
}
MaskingFunction* func = collection.masking(path);
if (func == nullptr) {
if (data.isBool()) {
return VPackValue(data.getBool());
} else if (data.isString()) {
velocypack::ValueLength length;
char const* c = data.getString(length);
buffer = std::string(c, length);
return VPackValue(buffer);
} else if (data.isInteger()) {
return VPackValue(data.getInt());
} else if (data.isDouble()) {
return VPackValue(data.getDouble());
} else {
return VPackValue(VPackValueType::Null);
}
} else {
if (data.isBool()) {
return func->mask(data.getBool(), buffer);
} else if (data.isString()) {
velocypack::ValueLength length;
char const* c = data.getString(length);
return func->mask(std::string(c, length), buffer);
} else if (data.isInteger()) {
return func->mask(data.getInt(), buffer);
} else if (data.isDouble()) {
return func->mask(data.getDouble(), buffer);
} else {
return VPackValue(VPackValueType::Null);
}
}
return VPackValue(xxxx);
}
void Maskings::addMaskedArray(Collection& collection, VPackBuilder& builder,
std::vector<std::string>& path, VPackSlice const& data) {
for (auto const& entry : VPackArrayIterator(data)) {
if (entry.isObject()) {
VPackObjectBuilder ob(&builder);
addMaskedObject(collection, builder, path, entry);
} else if (entry.isArray()) {
VPackArrayBuilder ap(&builder);
addMaskedArray(collection, builder, path, entry);
} else {
std::string buffer;
builder.add(maskedItem(collection, path, buffer, entry));
}
}
}
void Maskings::addMaskedObject(Collection& collection, VPackBuilder& builder,
std::vector<std::string>& path, VPackSlice const& data) {
for (auto const& entry : VPackObjectIterator(data, false)) {
std::string key = entry.key.copyString();
VPackSlice const& value = entry.value;
path.push_back(key);
if (value.isObject()) {
VPackObjectBuilder ob(&builder, key);
addMaskedObject(collection, builder, path, value);
} else if (value.isArray()) {
VPackArrayBuilder ap(&builder, key);
addMaskedArray(collection, builder, path, value);
} else {
std::string buffer;
builder.add(key, maskedItem(collection, path, buffer, value));
}
path.pop_back();
}
}
void Maskings::addMasked(Collection& collection, VPackBuilder& builder,
VPackSlice const& data) {
if (!data.isObject()) {
return;
}
std::vector<std::string> path;
std::string dataStr("data");
VPackObjectBuilder ob(&builder, dataStr);
addMaskedObject(collection, builder, path, data);
}
void Maskings::addMasked(Collection& collection, basics::StringBuffer& data,
VPackSlice const& slice) {
if (!slice.isObject()) {
return;
}
velocypack::StringRef dataStrRef("data");
VPackBuilder builder;
{
VPackObjectBuilder ob(&builder);
for (auto const& entry : VPackObjectIterator(slice, false)) {
velocypack::StringRef key = entry.key.stringRef();
if (key.equals(dataStrRef)) {
addMasked(collection, builder, entry.value);
} else {
builder.add(key, entry.value);
}
}
}
std::string masked = builder.toJson();
data.appendText(masked);
data.appendText("\n");
}
void Maskings::mask(std::string const& name, basics::StringBuffer const& data,
basics::StringBuffer& result) {
result.clear();
Collection* collection;
auto const itr = _collections.find(name);
if (itr == _collections.end()) {
if (_hasDefaultCollection) {
collection = &_defaultCollection;
} else {
result.copy(data);
return;
}
} else {
collection = &(itr->second);
}
if (collection->selection() == CollectionSelection::FULL) {
result.copy(data);
return;
}
result.reserve(data.length());
char const* p = data.c_str();
char const* e = p + data.length();
char const* q = p;
while (p < e) {
while (p < e && (*p != '\n' && *p != '\r')) {
++p;
}
std::shared_ptr<VPackBuilder> builder = VPackParser::fromJson(q, p - q);
addMasked(*collection, result, builder->slice());
while (p < e && (*p == '\n' || *p == '\r')) {
++p;
}
q = p;
}
}
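
The line-splitting loop in `Maskings::mask` can be looked at on its own. The following is a minimal sketch, not code from this commit: `splitLines` is a hypothetical helper, it works on a `std::string` instead of a `basics::StringBuffer`, and it collects the lines instead of parsing and masking each one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the commit): splits a dump buffer into
// lines the same way the loop in Maskings::mask walks it.
std::vector<std::string> splitLines(std::string const& data) {
  std::vector<std::string> lines;
  std::size_t p = 0;
  std::size_t const e = data.size();

  while (p < e) {
    std::size_t const q = p;  // start of the current line

    // advance to the next line break
    while (p < e && data[p] != '\n' && data[p] != '\r') {
      ++p;
    }
    lines.emplace_back(data, q, p - q);

    // skip any run of '\n' and '\r' before the next line
    while (p < e && (data[p] == '\n' || data[p] == '\r')) {
      ++p;
    }
  }

  return lines;
}
```

Under these assumptions, `splitLines("a\nb\r\nc")` yields the three entries `a`, `b` and `c`; a run of line-break characters between two documents is skipped as a whole.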

lib/Maskings/Maskings.h Normal file

@ -0,0 +1,91 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_MASKINGS_H
#define ARANGODB_MASKINGS_MASKINGS_H 1
#include "Basics/Common.h"
#include <velocypack/Builder.h>
#include <velocypack/Slice.h>
#include <velocypack/velocypack-aliases.h>
#include "Basics/StringBuffer.h"
#include "Maskings/Collection.h"
#include "Maskings/ParseResult.h"
namespace arangodb {
namespace maskings {
class Maskings;
struct MaskingsResult {
enum StatusCode : int {
VALID,
CANNOT_PARSE_FILE,
CANNOT_READ_FILE,
ILLEGAL_DEFINITION
};
MaskingsResult(StatusCode s, std::string const& m)
: status(s), message(m), maskings(nullptr) {}
explicit MaskingsResult(std::unique_ptr<Maskings>&& m)
: status(StatusCode::VALID), maskings(std::move(m)) {}
StatusCode status;
std::string message;
std::unique_ptr<Maskings> maskings;
};
class Maskings {
public:
static MaskingsResult fromFile(std::string const&);
public:
bool shouldDumpStructure(std::string const& name);
bool shouldDumpData(std::string const& name);
void mask(std::string const& name, basics::StringBuffer const& data,
basics::StringBuffer& result);
uint64_t randomSeed() const noexcept { return _randomSeed; }
private:
ParseResult<Maskings> parse(VPackSlice const&);
VPackValue maskedItem(Collection& collection, std::vector<std::string>& path,
std::string& buffer, VPackSlice const& data);
void addMaskedArray(Collection& collection, VPackBuilder& builder,
std::vector<std::string>& path, VPackSlice const& data);
void addMaskedObject(Collection& collection, VPackBuilder& builder,
std::vector<std::string>& path, VPackSlice const& data);
void addMasked(Collection& collection, VPackBuilder& builder, VPackSlice const& data);
void addMasked(Collection& collection, basics::StringBuffer&, VPackSlice const& data);
private:
std::map<std::string, Collection> _collections;
bool _hasDefaultCollection = false;
Collection _defaultCollection;
uint64_t _randomSeed = 0;
};
} // namespace maskings
} // namespace arangodb
#endif


@ -0,0 +1,51 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_PARSE_RESULT_H
#define ARANGODB_MASKINGS_PARSE_RESULT_H
#include "Basics/Common.h"
template <typename T>
struct ParseResult {
enum StatusCode : int {
VALID,
PARSE_FAILED,
DUPLICATE_COLLECTION,
UNKNOWN_TYPE,
ILLEGAL_PARAMETER
};
ParseResult(StatusCode status) : status(status) {}
ParseResult(StatusCode status, std::string message)
: status(status), message(message), result(T()) {}
ParseResult(T&& result)
: status(StatusCode::VALID), result(std::move(result)) {}
StatusCode status;
std::string message;
T result;
};
#endif

lib/Maskings/Path.cpp Normal file

@ -0,0 +1,149 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#include "Collection.h"
#include "Basics/StringUtils.h"
#include "Basics/Utf8Helper.h"
#include "Logger/Logger.h"
using namespace arangodb;
using namespace arangodb::maskings;
ParseResult<Path> Path::parse(std::string const& def) {
if (def.empty()) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path must not be empty");
}
bool wildcard = false;
if (def[0] == '.') {
wildcard = true;
}
uint8_t const* p = reinterpret_cast<uint8_t const*>(def.c_str());
int32_t off = 0;
int32_t len = static_cast<int32_t>(def.size());
UChar32 ch;
if (wildcard) {
U8_NEXT(p, off, len, ch);
}
std::vector<std::string> components;
std::string buffer;
while (off < len) {
U8_NEXT(p, off, len, ch);
if (ch < 0) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def + "' contains illegal UTF-8");
} else if (ch == 46) { // U'.' separates path components
if (buffer.size() == 0) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def +
"' contains an empty component");
}
components.push_back(buffer);
buffer.clear();
} else if (ch == 96 || ch == 180) { // windows does not like U'`' and U'´'
UChar32 quote = ch;
U8_NEXT(p, off, len, ch);
if (ch < 0) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def + "' contains illegal UTF-8");
}
while (off < len && ch != quote) {
basics::Utf8Helper::appendUtf8Character(buffer, ch);
U8_NEXT(p, off, len, ch);
if (ch < 0) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def + "' contains illegal UTF-8");
}
}
if (ch != quote) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def +
"' contains an unbalanced quote");
}
U8_NEXT(p, off, len, ch);
if (ch < 0) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def + "' contains illegal UTF-8");
}
} else {
basics::Utf8Helper::appendUtf8Character(buffer, ch);
}
}
if (buffer.size() == 0) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def + "' contains an empty component");
}
components.push_back(buffer);
if (components.empty()) {
return ParseResult<Path>(ParseResult<Path>::ILLEGAL_PARAMETER,
"path '" + def + "' contains no component");
}
return ParseResult<Path>(Path(wildcard, components));
}
bool Path::match(std::vector<std::string> const& path) const {
size_t cs = _components.size();
size_t ps = path.size();
if (!_wildcard) {
if (ps != cs) {
return false;
}
}
if (ps < cs) {
return false;
}
size_t pi = ps;
size_t ci = cs;
while (0 < ci) {
if (path[pi - 1] != _components[ci - 1]) {
return false;
}
--pi;
--ci;
}
return true;
}
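
The matching rule in `Path::match` can be illustrated in isolation. The following is a minimal sketch, not code from this commit: `matchesPath` is a hypothetical free function that takes the wildcard flag and the components as plain arguments instead of reading them from a `Path` object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-alone version of the comparison in Path::match.
bool matchesPath(bool wildcard, std::vector<std::string> const& components,
                 std::vector<std::string> const& path) {
  std::size_t const cs = components.size();
  std::size_t const ps = path.size();

  // Without a wildcard the lengths must agree; with a wildcard the
  // document path only has to be at least as long as the pattern.
  if (wildcard ? ps < cs : ps != cs) {
    return false;
  }

  // Compare from the leaf attribute upwards.
  for (std::size_t i = 1; i <= cs; ++i) {
    if (path[ps - i] != components[cs - i]) {
      return false;
    }
  }

  return true;
}
```

Under these assumptions, a pattern parsed from `"name"` matches only the top-level attribute, while one parsed from `".name"` also matches `blub.name`, because only the trailing components are compared.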

lib/Maskings/Path.h Normal file

@ -0,0 +1,51 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_PATH_H
#define ARANGODB_MASKINGS_PATH_H 1
#include "Basics/Common.h"
#include "Maskings/ParseResult.h"
namespace arangodb {
namespace maskings {
class Path {
public:
static ParseResult<Path> parse(std::string const&);
public:
Path() : _wildcard(false) {}
Path(bool wildcard, std::vector<std::string> const& components)
: _wildcard(wildcard), _components(components) {}
bool match(std::vector<std::string> const& path) const;
private:
bool _wildcard;
std::vector<std::string> _components;
};
} // namespace maskings
} // namespace arangodb
#endif


@ -0,0 +1,73 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#include "RandomStringMask.h"
#include "Basics/StringUtils.h"
#include "Basics/fasthash.h"
#include "Maskings/Maskings.h"
static std::string const xxxx("xxxx");
using namespace arangodb;
using namespace arangodb::maskings;
ParseResult<AttributeMasking> RandomStringMask::create(Path path, Maskings* maskings,
VPackSlice const&) {
return ParseResult<AttributeMasking>(AttributeMasking(path, new RandomStringMask(maskings)));
}
VPackValue RandomStringMask::mask(bool value, std::string&) const {
return VPackValue(value);
}
VPackValue RandomStringMask::mask(std::string const& data, std::string& buffer) const {
uint64_t len = data.size();
uint64_t hash;
hash = fasthash64(data.c_str(), data.size(), _maskings->randomSeed());
std::string hash64 = basics::StringUtils::encodeBase64(
std::string((char const*)&hash, sizeof(decltype(hash))));
buffer.clear();
buffer.reserve(len);
buffer.append(hash64);
if (buffer.size() < len) {
while (buffer.size() < len) {
buffer.append(hash64);
}
buffer.resize(len);
}
return VPackValue(buffer);
}
VPackValue RandomStringMask::mask(int64_t value, std::string&) const {
return VPackValue(value);
}
VPackValue RandomStringMask::mask(double value, std::string&) const {
return VPackValue(value);
}
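
The way `RandomStringMask::mask` fits the encoded hash to the length of the original string can be sketched separately. This is an illustration under assumptions, not code from this commit: `fitToLength` is a hypothetical helper, and `token` stands in for the Base64-encoded hash value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the commit): repeats `token` until it
// covers `len` characters, then cuts off the excess. A token that is
// already at least `len` characters long is returned unchanged, which
// mirrors the string overload above.
std::string fitToLength(std::string const& token, std::size_t len) {
  std::string out = token;

  // the empty-token guard is an addition of this sketch to avoid looping
  // forever; the encoded hash in the original is never empty
  if (!token.empty() && out.size() < len) {
    while (out.size() < len) {
      out.append(token);
    }
    out.resize(len);
  }

  return out;
}
```

One consequence visible in this sketch is that a short input does not produce an equally short output: a one-character value is replaced by the full token.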


@ -0,0 +1,48 @@
////////////////////////////////////////////////////////////////////////////////
/// DISCLAIMER
///
/// Copyright 2018 ArangoDB GmbH, Cologne, Germany
///
/// Licensed under the Apache License, Version 2.0 (the "License");
/// you may not use this file except in compliance with the License.
/// You may obtain a copy of the License at
///
/// http://www.apache.org/licenses/LICENSE-2.0
///
/// Unless required by applicable law or agreed to in writing, software
/// distributed under the License is distributed on an "AS IS" BASIS,
/// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
/// See the License for the specific language governing permissions and
/// limitations under the License.
///
/// Copyright holder is ArangoDB GmbH, Cologne, Germany
///
/// @author Frank Celler
////////////////////////////////////////////////////////////////////////////////
#ifndef ARANGODB_MASKINGS_ATTRIBUTE_RANDOM_STRING_MASK_H
#define ARANGODB_MASKINGS_ATTRIBUTE_RANDOM_STRING_MASK_H 1
#include "Maskings/AttributeMasking.h"
#include "Maskings/MaskingFunction.h"
#include "Maskings/ParseResult.h"
namespace arangodb {
namespace maskings {
class RandomStringMask : public MaskingFunction {
public:
static ParseResult<AttributeMasking> create(Path, Maskings*, VPackSlice const& def);
public:
VPackValue mask(bool, std::string& buffer) const override;
VPackValue mask(std::string const& data, std::string& buffer) const override;
VPackValue mask(int64_t, std::string& buffer) const override;
VPackValue mask(double, std::string& buffer) const override;
private:
explicit RandomStringMask(Maskings* maskings) : MaskingFunction(maskings) {}
};
} // namespace maskings
} // namespace arangodb
#endif


@ -38,9 +38,15 @@ UniformCharacter::UniformCharacter(std::string const& characters)
 UniformCharacter::UniformCharacter(size_t length, std::string const& characters)
 : _length(length), _characters(characters) {}
-std::string UniformCharacter::random() { return random(_length); }
-std::string UniformCharacter::random(size_t length) {
+char UniformCharacter::randomChar() const {
+size_t r = RandomGenerator::interval((uint32_t)(_characters.size() - 1));
+return _characters[r];
+}
+std::string UniformCharacter::random() const { return random(_length); }
+std::string UniformCharacter::random(size_t length) const {
 std::string buffer;
 buffer.reserve(length);


@ -38,11 +38,12 @@ class UniformCharacter {
 UniformCharacter(size_t length, std::string const& characters);
 public:
-std::string random();
-std::string random(size_t length);
+std::string random() const;
+std::string random(size_t length) const;
+char randomChar() const;
 private:
-size_t _length;
+size_t const _length;
 std::string const _characters;
 };
 } // namespace arangodb


@ -38,7 +38,7 @@ SCENARIO("testing", "[datetime]") {
 for (auto const& dateTime : datesToTest) {
 GIVEN(dateTime) {
-bool ret = parse_dateTime(dateTime, tp);
+bool ret = parseDateTime(dateTime, tp);
 THEN(dateTime) { REQUIRE(ret == true); }
 }
@ -46,7 +46,7 @@ SCENARIO("testing", "[datetime]") {
 for (auto const& dateTime : datesToFail) {
 GIVEN(dateTime) {
-bool ret = parse_dateTime(dateTime, tp);
+bool ret = parseDateTime(dateTime, tp);
 THEN(dateTime) { REQUIRE(ret == false); }
 }


@ -0,0 +1,66 @@
{ "maskings1": {
"type": "masked",
"maskings": [
{
"path": "´name´",
"type": "xifyFront",
"unmaskedLength": 1
},
{
"path": ".`name`",
"type": "xifyFront",
"unmaskedLength": 2
},
{
"path": "email",
"type": "xifyFront",
"unmaskedLength": 3
}
]
},
"maskings2": {
"type": "masked",
"maskings": [
{
"path": "random",
"type": "randomString"
},
{
"path": "zip",
"type": "zip"
},
{
"path": "date",
"type": "date",
"begin": "1900-01-01",
"end": "2017-12-31",
"format": "%yyyy %mm %dd"
},
{
"path": "integer",
"type": "integer",
"lower": -10,
"upper": 10
},
{
"path": "decimal",
"type": "decimal",
"lower": -10,
"upper": 10,
"scale": 2
},
{
"path": "ccard",
"type": "creditCard"
},
{
"path": "phone",
"type": "phone"
},
{
"path": "email",
"type": "email"
}
]
}
}


@ -0,0 +1,88 @@
/*jshint globalstrict:false, strict:false, maxlen:4000, unused:false */
/*global arango */
// /////////////////////////////////////////////////////////////////////////////
// @brief tests for dump/reload
//
// @file
//
// DISCLAIMER
//
// Copyright 2019 ArangoDB GmbH, Cologne, Germany
// Copyright 2010-2012 triagens GmbH, Cologne, Germany
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Copyright holder is ArangoDB GmbH, Cologne, Germany
//
// @author Frank Celler
// /////////////////////////////////////////////////////////////////////////////
(function () {
'use strict';
var db = require("@arangodb").db;
var i, c;
try {
db._dropDatabase("UnitTestsDumpSrc");
} catch (err1) {
}
db._createDatabase("UnitTestsDumpSrc");
db._useDatabase("UnitTestsDumpSrc");
db._create("maskings1");
db.maskings1.save({
_key: "1",
name: "Hallo World! This is a t0st a top-level",
blub: {
name: "Hallo World! This is a t0st in a sub-object",
},
email: [
"testing arrays",
"this is another one",
{ something: "something else" },
{ email: "within a subject" },
{ name: [ "emails within a subject", "as list" ] }
],
sub: {
name: "this is a name leaf attribute",
email: [ "in this case as list", "with more than one entry" ]
}
});
db._create("maskings2");
db.maskings2.save({
_key: "2",
random: "a",
zip: "12345",
date: "2018-01-01",
integer: 100,
decimal: 100.12,
ccard: "1234 1234 1234 1234",
phone: "abcd 1234",
email: "me@you.here"
});
})();
return {
status: true
};

@ -0,0 +1,84 @@
/*jshint globalstrict:false, strict:false, maxlen:4000 */
/*global assertEqual, assertTrue, assertFalse, assertNotNull */
// /////////////////////////////////////////////////////////////////////////////
// @brief tests for dump/reload
//
// @file
//
// DISCLAIMER
//
// Copyright 2019 ArangoDB GmbH, Cologne, Germany
// Copyright 2010-2012 triagens GmbH, Cologne, Germany
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Copyright holder is ArangoDB GmbH, Cologne, Germany
//
// @author Frank Celler
// /////////////////////////////////////////////////////////////////////////////
var internal = require("internal");
var jsunity = require("jsunity");
let users = require("@arangodb/users");
function dumpMaskingSuite () {
'use strict';
var db = internal.db;
return {
setUp : function () {
},
tearDown : function () {
},
testGeneral : function () {
var c = db._collection("maskings1");
var d = c.document("1");
assertNotNull(d, "document '1' was restored");
assertEqual(d.name, "xxxxo xxxxd xxxs xs a xxxt a xxxxxxxxl");
assertEqual(d.blub.name, "xxxlo xxxld xxis is a xxst in a xxxxxxxxct");
assertEqual(d.email.length, 5);
assertEqual(d.email[0], "xxxxing xxxays");
assertEqual(d.email[1], "xhis is xxxxher one");
assertEqual(d.email[2].something, "something else");
assertEqual(d.email[3].email, "within a subject");
assertEqual(d.email[4].name.length, 2);
assertEqual(d.email[4].name[0], "xxxxls xxxxin a xxxxxct");
assertEqual(d.email[4].name[1], "as xxst");
assertEqual(d.sub.name, "xxis is a xxme xxaf xxxxxxxte");
assertEqual(d.sub.email.length, 2);
assertEqual(d.sub.email[0], "in this case as list");
assertEqual(d.sub.email[1], "with more than one entry");
},
testRandomString : function () {
var c = db._collection("maskings2");
var d = c.document("2");
// masked values are generated, so only check that they differ from the originals
assertFalse(d.random === "a");
assertFalse(d.zip === "12345");
assertFalse(d.date === "2018-01-01");
assertFalse(d.integer === 100);
assertFalse(d.decimal === 100.12);
assertFalse(d.ccard === "1234 1234 1234 1234");
assertFalse(d.phone === "abcd 1234");
assertFalse(d.email === "me@you.here");
}
};
}
jsunity.run(dumpMaskingSuite);
return jsunity.done();