From 11a8b1841b1c6a5a9cc1ed3e64935fa83623b16c Mon Sep 17 00:00:00 2001
From: Willi Goesgens
Date: Thu, 3 Sep 2015 14:44:45 +0200
Subject: [PATCH] Restructure Documentation to reduce the number of top list items

- combine HTTP-API Bulk interface descriptions and the tools that use them in a new top-level point 'Bulk Import/Export'
- add information about the arangosh db-object implementation
- add information about the arangod db-object implementation (fixes #1223)
---
 Documentation/Books/Makefile | 2 +-
 .../DatafileDebugger.mdpp} | 0
 .../Books/Users/Advanced/README.mdpp | 1 +
 .../Books/Users/Advanced/ServerInternals.mdpp | 68 +++++++++++++++++++
 .../WriteAheadLog.mdpp} | 0
 .../Books/Users/FirstSteps/Arangosh.mdpp | 15 +++-
 .../Arangodump.mdpp} | 0
 .../Arangoimp.mdpp} | 0
 .../Arangorestore.mdpp} | 0
 Documentation/Books/Users/SUMMARY.md | 20 +++---
 10 files changed, 95 insertions(+), 11 deletions(-)
 rename Documentation/Books/Users/{DatafileDebugger/README.mdpp => Advanced/DatafileDebugger.mdpp} (100%)
 create mode 100644 Documentation/Books/Users/Advanced/README.mdpp
 create mode 100644 Documentation/Books/Users/Advanced/ServerInternals.mdpp
 rename Documentation/Books/Users/{WriteAheadLog/README.mdpp => Advanced/WriteAheadLog.mdpp} (100%)
 rename Documentation/Books/Users/{Arangodump/README.mdpp => HttpBulkImports/Arangodump.mdpp} (100%)
 rename Documentation/Books/Users/{Arangoimp/README.mdpp => HttpBulkImports/Arangoimp.mdpp} (100%)
 rename Documentation/Books/Users/{Arangorestore/README.mdpp => HttpBulkImports/Arangorestore.mdpp} (100%)

diff --git a/Documentation/Books/Makefile b/Documentation/Books/Makefile
index d3e2f48106..6f63dfcea9 100644
--- a/Documentation/Books/Makefile
+++ b/Documentation/Books/Makefile
@@ -8,7 +8,7 @@ newVersionNumber = $(shell cat ../../VERSION)
 # per book targets
 check-summary:
 	@find ppbooks/$(NAME) -name \*.md |sed -e "s;ppbooks/$(NAME)/;;" |grep -vf SummaryBlacklist.txt |sort > /tmp/is_md.txt
-	@cat $(NAME)/SUMMARY.md |sed -e "s;.*(;;" -e "s;).*;;" |sort |grep -v '# Summary' > /tmp/is_summary.txt
+	@cat $(NAME)/SUMMARY.md |grep '(' |sed -e "s;.*(;;" -e "s;).*;;" |sort |grep -v '# Summary' > /tmp/is_summary.txt
 	@if test "`comm -3 /tmp/is_md.txt /tmp/is_summary.txt|wc -l`" -ne 0; then \
 		echo "not all files are mapped to the summary!"; \
 		echo " files found | files in summary"; \
diff --git a/Documentation/Books/Users/DatafileDebugger/README.mdpp b/Documentation/Books/Users/Advanced/DatafileDebugger.mdpp
similarity index 100%
rename from Documentation/Books/Users/DatafileDebugger/README.mdpp
rename to Documentation/Books/Users/Advanced/DatafileDebugger.mdpp
diff --git a/Documentation/Books/Users/Advanced/README.mdpp b/Documentation/Books/Users/Advanced/README.mdpp
new file mode 100644
index 0000000000..2b506b7b15
--- /dev/null
+++ b/Documentation/Books/Users/Advanced/README.mdpp
@@ -0,0 +1 @@
+!SECTION More advanced ArangoDB topics
\ No newline at end of file
diff --git a/Documentation/Books/Users/Advanced/ServerInternals.mdpp b/Documentation/Books/Users/Advanced/ServerInternals.mdpp
new file mode 100644
index 0000000000..7946865c35
--- /dev/null
+++ b/Documentation/Books/Users/Advanced/ServerInternals.mdpp
@@ -0,0 +1,68 @@
+!SECTION Server-side db-Object implementation
+We [already talked about the arangosh db object implementation](Users/FirstSteps/Arangosh.html). Now a little more about the server version; the following examples won't work properly in arangosh.
+
+Server-side methods of the *db object* will return an `[object ShapedJson]`. This datatype is a very lightweight JavaScript object that contains an internal pointer to where the document data are actually stored in memory or on disk. In particular, it is not a full-blown copy of the document's complete data.
+
+When such an object's property is accessed, this will invoke an accessor function. For example, accessing `doc.name` of such an object will call a C++ function behind the scenes to fetch the actual value for the property `name`. When a property of such an object is written to, this will also trigger an accessor function. This accessor function will first copy all property values into the object's own properties, and then discard the pointer to the data in memory. From this point on, the accessor functions will not do anything special, so the object will behave like a normal JavaScript object.
+
+All of this is done for performance reasons. It often allows omitting the creation of big JavaScript objects that contain lots of data. For example, if all that's needed from a document is a single property, fully constructing the document as a JavaScript object has a high overhead (CPU time for processing, memory, plus later V8 garbage collection).
+
+Here's an example:
+```js
+var ShapedJson = require("org/arangodb").ShapedJson;
+// fetches document from collection into a JavaScript object
+var doc = db.test.any();
+
+// check if the document object is a shaped object
+// returns true for shaped objects, false otherwise
+doc instanceof ShapedJson;
+
+// invokes the property read accessor. returns the property value by value
+doc.name;
+
+// invokes the property write accessor. will copy document data
+// into the JavaScript object once
+// and store the value in the property as requested
+doc.name = "test";
+
+// doc will now behave like a regular object
+doc.foo = "bar";
+```
+
+There is one gotcha though with such objects: the accessor functions are only invoked when accessing top level properties. When accessing nested properties, the accessor will only be called for the top level property, which will then return the requested property's value. However, the returned value does not have a reference to the original object, so accessing one of its subproperties bypasses the accessors.
+
+Here's an example:
+
+```js
+// create an object with a nested property
+db.test.save({ name: { first: "Jan" } });
+doc;
+{
+  "_id" : "test/2056013422404",
+  "_rev" : "2056013422404",
+  "_key" : "2056013422404",
+  "name" : {
+    "first" : "Jan"
+  }
+}
+
+// now modify the nested property
+doc.name.first = "test";
+doc;
+{
+  "_id" : "test/2056013422404",
+  "_rev" : "2056013422404",
+  "_key" : "2056013422404",
+  "name" : {
+    "first" : "Jan" /* oops */
+  }
+}
+```
+
+So what happened here? The statement `doc.name.first = "test"` calls the read accessor for property `name` first. This produces an object `{"first":"Jan"}` whose property `first` is modified directly afterwards. But the object `{"first":"Jan"}` is a temporary object and not the same (`===`) object as `doc.name`. This is why updating the nested property effectively failed.
+
+Unfortunately, there is no way to detect this in a read accessor. It does not have any idea about what will be done with the returned object. So this case cannot be tracked/trapped in the accessor.
+
+A workaround for this problem would be to clone the object in the user code if the document is going to be modified. This will make all modifications safe. The cloning can also be made conditional for cases when the object is an instance of `ShapedJson` or when nested properties are to be accessed. Cloning is not required when the object is not a `ShapedJson` or when only top level properties are accessed.
+
+Only those documents that are stored in a collection's datafile will be returned as `ShapedJson`. Documents that are still in the write-ahead log will always be returned as regular JavaScript objects, as they are not yet shaped and the optimization would not work. However, an application cannot safely predict when a document is transferred from the write-ahead log to the collection's datafile, so the same document can be returned in one way or the other. The only safe way is to check if the object is an instance of `ShapedJson` as above.
diff --git a/Documentation/Books/Users/WriteAheadLog/README.mdpp b/Documentation/Books/Users/Advanced/WriteAheadLog.mdpp
similarity index 100%
rename from Documentation/Books/Users/WriteAheadLog/README.mdpp
rename to Documentation/Books/Users/Advanced/WriteAheadLog.mdpp
diff --git a/Documentation/Books/Users/FirstSteps/Arangosh.mdpp b/Documentation/Books/Users/FirstSteps/Arangosh.mdpp
index 68f01e1672..93477942da 100644
--- a/Documentation/Books/Users/FirstSteps/Arangosh.mdpp
+++ b/Documentation/Books/Users/FirstSteps/Arangosh.mdpp
@@ -51,4 +51,17 @@ CLIENT options:
 --server.password password to use when connecting (leave empty for prompt)
 --server.request-timeout request timeout in seconds (default: 300)
 --server.username username to use when connecting (default: "root")
-```
\ No newline at end of file
+```
+
+!SECTION Database Wrappers
+The *db* object is available in *arangosh* as well as in *arangod*, e.g. if you're using [Foxx](/Foxx). While its interface is consistent between the *arangosh* and the *arangod* implementations, its underpinnings are not. The *arangod* implementation consists of JavaScript wrappers around ArangoDB's native C++ implementation, whereas the *arangosh* implementation wraps HTTP accesses to [ArangoDB's RESTful API](/HttpApi/index.html).
+
+So while this code may produce similar results when executed in *arangosh* and *arangod*, the CPU usage and time required will differ considerably:
+
+```js
+for (var i = 0; i < 100000; i++) {
+  db.test.save({ name: { first: "Jan" }, count: i });
+}
+```
+
+This is because the *arangosh* version will issue around 100k HTTP requests, whereas the *arangod* version [will directly write to the database](/Advanced/ServerInternals.mdpp).
diff --git a/Documentation/Books/Users/Arangodump/README.mdpp b/Documentation/Books/Users/HttpBulkImports/Arangodump.mdpp
similarity index 100%
rename from Documentation/Books/Users/Arangodump/README.mdpp
rename to Documentation/Books/Users/HttpBulkImports/Arangodump.mdpp
diff --git a/Documentation/Books/Users/Arangoimp/README.mdpp b/Documentation/Books/Users/HttpBulkImports/Arangoimp.mdpp
similarity index 100%
rename from Documentation/Books/Users/Arangoimp/README.mdpp
rename to Documentation/Books/Users/HttpBulkImports/Arangoimp.mdpp
diff --git a/Documentation/Books/Users/Arangorestore/README.mdpp b/Documentation/Books/Users/HttpBulkImports/Arangorestore.mdpp
similarity index 100%
rename from Documentation/Books/Users/Arangorestore/README.mdpp
rename to Documentation/Books/Users/HttpBulkImports/Arangorestore.mdpp
diff --git a/Documentation/Books/Users/SUMMARY.md b/Documentation/Books/Users/SUMMARY.md
index 2607119e76..e425ba5ca7 100644
--- a/Documentation/Books/Users/SUMMARY.md
+++ b/Documentation/Books/Users/SUMMARY.md
@@ -55,7 +55,6 @@
   * [Locking and isolation](Transactions/LockingAndIsolation.md)
   * [Durability](Transactions/Durability.md)
   * [Limitations](Transactions/Limitations.md)
-* [Write-ahead log](WriteAheadLog/README.md)
 * [AQL](Aql/README.md)
   * [How to invoke AQL](Aql/Invoke.md)
   * [Data modification queries](Aql/DataModification.md)
@@ -151,9 +150,6 @@
   * [Communication options](ConfigureArango/Communication.md)
   * [Authentication](ConfigureArango/Authentication.md)
   * [Emergency Console](ConfigureArango/EmergencyConsole.md)
-* [Arangoimp](Arangoimp/README.md)
-* [Arangodump](Arangodump/README.md)
-* [Arangorestore](Arangorestore/README.md)
 * [HTTP API](HttpApi/README.md)
   * [Databases](HttpDatabase/README.md)
     * [To-Endpoint](HttpDatabase/DatabaseEndpoint.md)
@@ -198,10 +194,6 @@
     * [Replication Logger](HttpReplications/ReplicationLogger.md)
     * [Replication Applier](HttpReplications/ReplicationApplier.md)
     * [Other Replication Commands](HttpReplications/OtherReplication.md)
-  * [Bulk Imports](HttpBulkImports/README.md)
-    * [JSON Documents](HttpBulkImports/ImportingSelfContained.md)
-    * [Headers and Values](HttpBulkImports/ImportingHeadersAndValues.md)
-  * [Batch Requests](HttpBatchRequest/README.md)
   * [Tasks](HttpTasks/README.md)
   * [Monitoring](HttpAdministrationAndMonitoring/README.md)
   * [User Management](HttpUserManagement/README.md)
@@ -210,6 +202,13 @@
   * [Sharding](HttpShardingInterface/README.md)
   * [Miscellaneous functions](HttpMiscellaneousFunctions/README.md)
   * [General Handling](GeneralHttp/README.md)
+* [Bulk Import / Export](HttpBulkImports/README.md)
+  * [JSON Documents HTTP-API](HttpBulkImports/ImportingSelfContained.md)
+  * [Headers & Values HTTP-API](HttpBulkImports/ImportingHeadersAndValues.md)
+  * [Batch Requests HTTP-API](HttpBatchRequest/README.md)
+  * [Arangoimp](HttpBulkImports/Arangoimp.md)
+  * [Arangodump](HttpBulkImports/Arangodump.md)
+  * [Arangorestore](HttpBulkImports/Arangorestore.md)
 * [JavaScript Modules](ModuleJavaScript/README.md)
   * ["console"](ModuleConsole/README.md)
   * ["fs"](ModuleFs/README.md)
@@ -236,7 +235,10 @@
   * [Fulltext Indexes](IndexHandling/Fulltext.md)
   * [Geo Indexes](IndexHandling/Geo.md)
   * [Cap Constraint](IndexHandling/Cap.md)
-* [Datafile Debugger](DatafileDebugger/README.md)
+* [Advanced](Advanced/README.md)
+  * [Write-ahead log](Advanced/WriteAheadLog.md)
+  * [Server Internals](Advanced/ServerInternals.md)
+  * [Datafile Debugger](Advanced/DatafileDebugger.md)
 * [Naming Conventions](NamingConventions/README.md)
   * [Database Names](NamingConventions/DatabaseNames.md)
   * [Collection Names](NamingConventions/CollectionNames.md)
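
The nested-property gotcha and the cloning workaround described in `ServerInternals.mdpp` can be tried outside of ArangoDB. The following is a minimal sketch in plain JavaScript, assuming a stand-in object whose read accessors hand out a fresh temporary value on every access. It is not ArangoDB's `ShapedJson` implementation, and `makeShapedLike` and `cloneDocument` are hypothetical helper names introduced only for this illustration.

```js
// Hypothetical stand-in for a shaped document; NOT ArangoDB's implementation.
// Each top-level property gets a read accessor that returns a fresh copy of
// the stored value, mimicking an accessor that produces temporary objects.
function makeShapedLike(stored) {
  var doc = {};
  Object.keys(stored).forEach(function (key) {
    Object.defineProperty(doc, key, {
      enumerable: true,
      get: function () {
        return JSON.parse(JSON.stringify(stored[key]));
      }
    });
  });
  return doc;
}

// Deep copy into a regular object (assumes JSON-compatible document data).
function cloneDocument(doc) {
  return JSON.parse(JSON.stringify(doc));
}

var shaped = makeShapedLike({ name: { first: "Jan" } });

// The write lands on a temporary object returned by the accessor and is lost.
shaped.name.first = "test";
var lostWrite = shaped.name.first;

// Cloning first yields a regular object, so the nested write sticks.
var plain = cloneDocument(shaped);
plain.name.first = "test";
var keptWrite = plain.name.first;
```

In server-side code the clone could be made conditional on `doc instanceof ShapedJson`, as the patch text suggests, so that regular objects are not copied needlessly.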