Goal: Demonstrate Converting "upgraded" Buckets to Collections.
-
This function ConvertBucketToCollections demonstrates a simple technique to reorganize into buckets.
-
Requires Eventing Storage (or metadata collection) and a "source" collection listing to the sample "beer-sample".
-
The sample "beer-sample" is equivalent to an upgraded bucket where all data resides in
beer-sample
._default
._default
. -
Needs Bindings of type "Constant alias" (as documented in the Scriptlet).
-
Needs Bindings of type "bucket alias" (as documented in the Scriptlet).
-
If "Constant alias" DO_COPY is true
-
move all type` = "brewery" to COLLECTION
bulk
.data
.brewery
-
move all type` = "beer" to COLLECTION
bulk
.data
.beer
-
-
If "Constant alias" DO_DELETE is true
-
remove all type` = "brewery" from upgrade COLLECTION
beer-sample
._default
._default
-
remove all type` = "beer" from upgrade COLLECTION
beer-sample
._default
._default
-
-
The function should have higher throughput as you add more workers up to the # of vCPUs.
-
Example of performance using this technique:
-
Test cluster is a symmetric 3 x AWS r5.2xlarge (64 GiB of memory, 8 vCPUs, 64-bit platform).
-
Will process 93K ops/sec in a steady state.
-
250M small documents: takes 44 minutes to reorganize a bucket with 80 types into a new bucket with 80 collections.
-
1B small documents: takes 3 hours to reorganize a bucket with 80 types into a new bucket with 80 collections.
-
-
ConvertBucketToCollections
-
Input Data/Mutation
-
Output Data/Logged
// To run configure the settings for this Function, ConvertBucketToCollections, as follows:
//
// Version 7.0+
// "Listen to Location"
// beer-sample.data.source
// "Eventing Storage"
// rr100.eventing.metadata
// Binding(s)
// 1. "binding type", "alias name...", "bucket.scope.collection", "Access"
// "bucket alias", "src_col", "beer-sample._default._default", "read and write"
// "bucket alias", "brewery_col", "bulk.data.brewery", "read and write"
// "bucket alias", "beer_col", "bulk.data.beer", "read and write"
//
// 2. "binding type", "alias name...", "value",
// "Constant alias", "DO_COPY", true
// "Constant alias", "DO_DELETE", true
//
// Version 6.6.2 (not applicable)
// Upgrades `beer-sample` from a bucket paradigm to a collection/keyspace paradigm.
function OnUpdate(doc, meta) {
if (doc.type === 'beer') {
if (DO_COPY) beer_col[meta.id] = doc;
if (DO_DELETE) {
if (!beer_col[meta.id]) { // safety check
log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
} else {
delete src_col[meta.id];
}
}
}
if (doc.type === 'brewery') {
if (DO_COPY) brewery_col[meta.id] = doc;
if (DO_DELETE) {
if (!brewery_col[meta.id]) { // safety check
log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
} else {
delete src_col[meta.id];
}
}
}
}
LOAD THE SAMPLE `beer-sample` from the UI Settings page
CREATE two empty keyspaces `bulk`.`data`.`brewery` and `bulk`.`data`.`beer`
THE `beer-sample`.`_default`.`_default` keyspace is empty (0 documents)
The keyspace `bulk`.`data`.`brewery` has 1,412 documents of type == "brewery"
The keyspace `bulk`.`data`.`beer` has 5,891 documents of type == "beer"