Sub-Document Operations
Sub-Document operations can be used to efficiently access and change parts of documents.
Sub-Document operations may be quicker and more network-efficient than full-document operations such as upsert, replace and get because they only transmit the accessed sections of the document over the network.
Sub-Document operations are also atomic, in that if one Sub-Document mutation fails then all will, allowing safe modifications to documents with built-in concurrency control.
Sub-Documents
While full-document retrievals retrieve the entire document and full document updates require sending the entire document, Sub-Document retrievals only retrieve relevant parts of a document and Sub-Document updates only require sending the updated portions of a document.
You should use Sub-Document operations when you are modifying only portions of a document, and full-document operations when the contents of a document are to change significantly.
The Sub-Document operations described on this page are for Key-Value requests only: they are not related to Sub-Document SQL++ (formerly N1QL) queries. Sub-Document SQL++ queries are explained in the section Querying with SQL++.
In order to use Sub-Document operations you need to specify a path indicating the location of the Sub-Document. The path follows Path syntax. Considering the document:
{
"title": "Ayr (Scotland)",
"name": "Enterkine House Hotel",
"address": "by Annbank. Ayrshire",
"directions": "5 miles off A77, follow B742 to Mossblown then Annbank",
"phone": "+44 1292 520580",
"tollfree": null,
"email": null,
"fax": null,
"url": "http://www.enterkine.com",
"checkin": "2.00pm",
"checkout": "11am",
"price": "from £100",
"geo": {
"lat": 55.48034590743372,
"lon": -4.51612114906311,
"accuracy": "ROOFTOP"
},
"type": "hotel",
"id": 1368,
"country": "United Kingdom",
"city": "South Ayrshire",
"state": null,
"reviews": [],
"public_likes": ["Georgette Rutherford V", "Ms. Devante Bruen", "Anderson Schmidt", "Mr. Kareem Harvey", "Tessie Shields", "Floyd Bradtke III", "Maurice McDermott", "Michel Franecki", "Laila Ernser"],
"vacancy": true,
"description": "four star country house hotel situated in 350 acres of woodland estate yet only 10 mins from Prestwick ,Ayr and Troon. Award winning food by Paul Moffat and team",
"alias": null,
"pets_ok": false,
"free_breakfast": true,
"free_internet": false,
"free_parking": false
}
The paths name, geo.lat and public_likes[0] are all valid paths.
Retrieving
The lookupIn operations query the document for certain path(s); these path(s) are then returned. You have a choice of actually retrieving the document path using the get Sub-Document operation, or simply querying the existence of the path using the exists Sub-Document operation. The latter saves even more bandwidth by not retrieving the contents of the path if it is not needed.
The examples use the following imports:
import static com.couchbase.client.java.kv.LookupInSpec.exists;
import static com.couchbase.client.java.kv.LookupInSpec.get;
import static com.couchbase.client.java.kv.MutateInOptions.mutateInOptions;
import static com.couchbase.client.java.kv.MutateInSpec.arrayAddUnique;
import static com.couchbase.client.java.kv.MutateInSpec.arrayAppend;
import static com.couchbase.client.java.kv.MutateInSpec.arrayInsert;
import static com.couchbase.client.java.kv.MutateInSpec.arrayPrepend;
import static com.couchbase.client.java.kv.MutateInSpec.decrement;
import static com.couchbase.client.java.kv.MutateInSpec.increment;
import static com.couchbase.client.java.kv.MutateInSpec.insert;
import static com.couchbase.client.java.kv.MutateInSpec.remove;
import static com.couchbase.client.java.kv.MutateInSpec.upsert;
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.stream.Stream;
import com.couchbase.client.core.error.CasMismatchException;
import com.couchbase.client.core.error.DocumentUnretrievableException;
import com.couchbase.client.core.error.DurabilityImpossibleException;
import com.couchbase.client.core.error.subdoc.PathExistsException;
import com.couchbase.client.core.error.subdoc.PathNotFoundException;
import com.couchbase.client.core.msg.kv.DurabilityLevel;
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.Scope;
import com.couchbase.client.java.json.JsonArray;
import com.couchbase.client.java.json.JsonObject;
import com.couchbase.client.java.kv.GetResult;
import com.couchbase.client.java.kv.LookupInReplicaResult;
import com.couchbase.client.java.kv.LookupInResult;
import com.couchbase.client.java.kv.LookupInSpec;
import com.couchbase.client.java.kv.MutateInResult;
import com.couchbase.client.java.kv.MutateInSpec;
import com.couchbase.client.java.kv.MutationResult;
import com.couchbase.client.java.kv.PersistTo;
import com.couchbase.client.java.kv.ReplicateTo;
import reactor.core.publisher.Mono;
LookupInResult result = collection.lookupIn("hotel_1368",
List.of(get("geo.lat")));
try {
String str = result.contentAs(0, String.class);
System.out.println("getFunc: Latitude = " + str);
} catch (PathNotFoundException e) {
e.printStackTrace();
}
The operation used here is LookupInSpec.get, but we import this static method directly for readability.
LookupInResult result = collection.lookupIn("hotel_1368",
List.of(exists("address.does_not_exist")));
boolean pathExists = result.exists(0);
System.out.println("Non-existent path exists? " + pathExists);
Multiple operations can be combined:
LookupInResult result = collection.lookupIn("hotel_1368",
List.of(
get("geo.lat"), // index 0
exists("address.does_not_exist") // index 1
)
);
String lat = result.contentAs(0, String.class);
boolean otherExists = result.exists(1);
System.out.println("Latitude: " + lat);
System.out.println("Non-existent path exists? " + otherExists);
Choosing an API
The Java SDK provides three APIs for all operations. There’s the simple blocking one you’ve already seen, then this asynchronous variant that returns a Java CompletableFuture:
CompletableFuture<LookupInResult> future = collection.async().lookupIn("hotel_1368",
List.of(get("geo.lat")));
try {
LookupInResult result = future.get();
System.out.println("future: Latitude: " + result.contentAs(0, Number.class));
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
And a third that uses reactive programming primitives from Project Reactor:
Mono<LookupInResult> mono = collection.reactive().lookupIn("hotel_1368",
List.of(get("geo.lat")));
// Just for example, block on the result - this is not best practice
LookupInResult result = mono.block();
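With the asynchronous variant, the result can also be transformed without blocking on future.get(). The following is a minimal sketch using only the JDK: the fetchLatitude method is a hypothetical stand-in for a call such as collection.async().lookupIn(...), so that the example runs without a cluster.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncComposition {
    // Hypothetical stand-in for an asynchronous lookup; it completes on
    // another thread with a latitude value.
    static CompletableFuture<Double> fetchLatitude() {
        return CompletableFuture.supplyAsync(() -> 55.5);
    }

    // Transform the value when it arrives, and map any failure to a default
    // message instead of throwing.
    static CompletableFuture<String> describeLatitude(CompletableFuture<Double> latitude) {
        return latitude
                .thenApply(lat -> "Latitude: " + lat)
                .exceptionally(err -> "Latitude unavailable");
    }

    public static void main(String[] args) {
        // join() is used only so that the demo prints before the JVM exits.
        System.out.println(describeLatitude(fetchLatitude()).join());
    }
}
```

The same thenApply and exceptionally calls apply to any CompletableFuture, including one returned by the SDK's asynchronous API.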
Mutating
Mutation operations modify one or more paths in the document. The simplest of these operations is upsert, which, similar to the fulldoc-level upsert, will either modify the value of an existing path or create it if it does not exist:
collection.mutateIn("hotel_1368", List.of(upsert("email", "hotel96@hotmail.com")));
Likewise, the insert operation will only add the new value to the path if it does not exist:
try {
collection.mutateIn("hotel_1368", List.of(insert("alt_email", "alt_hotel96@hotmail.com")));
} catch (PathExistsException err) {
System.out.println("insertFunc: exception caught, path already exists");
}
Dictionary values can also be replaced or removed, and you may combine multiple mutation operations (up to 16) within the same general mutateIn API. Here’s an example which removes one path and inserts another.
collection.mutateIn("hotel_1368", List.of(remove("tz"), insert("alt_email", "hotel84@hotmail.com")));
mutateIn is an atomic operation. If any single operation fails, then the entire document is left unchanged.
Array Append and Prepend
The arrayPrepend and arrayAppend operations are true array prepend and append operations. Unlike fulldoc append/prepend operations (which simply concatenate bytes to the existing value), arrayAppend and arrayPrepend are JSON-aware:
MutationResult result = collection.mutateIn("hotel_1368",
List.of(arrayAppend("public_likes", List.of("Mike Rutherford"))));
/*
public_likes is now:
["Georgette Rutherford V", "Ms. Devante Bruen", "Anderson Schmidt", "Mr. Kareem Harvey", "Tessie Shields",
"Floyd Bradtke III", "Maurice McDermott", "Michel Franecki", "Laila Ernser", "Mike Rutherford"]
*/
MutationResult result = collection.mutateIn("hotel_1368",
List.of(arrayPrepend("public_likes", List.of("John Smith"))));
/*
public_likes is now:
["John Smith", "Georgette Rutherford V", "Ms. Devante Bruen", "Anderson Schmidt", "Mr. Kareem Harvey", "Tessie Shields",
"Floyd Bradtke III", "Maurice McDermott", "Michel Franecki", "Laila Ernser", "Mike Rutherford"]
*/
If your document only needs to contain an array, you do not have to create a top-level object wrapper to contain it. Simply initialize the document with an empty array and then use the empty path for subsequent Sub-Document array operations:
collection.upsert("my_array", JsonArray.create());
collection.mutateIn("my_array",
List.of(arrayAppend("", List.of("some element"))));
// the document my_array is now ["some element"]
If you wish to create an array if it does not exist and also push elements to it within the same operation you may use the createPath option:
MutateInResult result = collection.mutateIn("hotel_14225",
List.of(arrayAppend("some.array", List.of("hello world")).createPath()));
Arrays as Unique Sets
Limited support also exists for treating arrays like unique sets, using the arrayAddUnique command. This will do a check to determine if the given value exists or not before actually adding the item to the array:
collection.mutateIn("hotel_14226", List.of(arrayAddUnique("unique", 95)));
try {
collection.mutateIn("hotel_14226", List.of(arrayAddUnique("unique", 95)));
throw new RuntimeException("should have thrown PathExistsException");
} catch (PathExistsException err) {
System.out.println("arrayUnique: caught exception, path already exists");
}
Note that currently the arrayAddUnique will fail with a PathMismatchException if the array contains JSON floats, objects, or arrays. The arrayAddUnique operation will also fail with CannotInsertValueException if the value to be added is one of those types as well.
Note that the actual position of the new element is undefined, and that the array is not ordered.
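The behaviour described above can be modelled in plain Java. This is only an illustrative sketch, not SDK or server code: the UniqueArrayModel class and its methods are hypothetical, the set of accepted types (strings, integers, booleans and null) is inferred from the restrictions listed above, and the model returns false where the SDK would throw PathExistsException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class UniqueArrayModel {
    // Assumption: anything other than a float, object or array is accepted.
    static boolean isSupported(Object v) {
        return v == null || v instanceof String || v instanceof Boolean
                || v instanceof Integer || v instanceof Long;
    }

    // Treat Integer and Long as the same JSON integer type.
    private static Object normalize(Object v) {
        return v instanceof Integer ? Long.valueOf(((Integer) v).longValue()) : v;
    }

    /** Returns true if the value was added, false if it was already present. */
    static boolean addUnique(List<Object> array, Object value) {
        if (!isSupported(value)) {
            throw new IllegalArgumentException("value type cannot be added");
        }
        for (Object existing : array) {
            if (!isSupported(existing)) {
                throw new IllegalStateException("array holds an unsupported element");
            }
        }
        Object candidate = normalize(value);
        for (Object existing : array) {
            if (Objects.equals(normalize(existing), candidate)) {
                return false; // already in the set, array left unchanged
            }
        }
        // The real position is undefined; appending is an arbitrary choice here.
        array.add(candidate);
        return true;
    }

    public static void main(String[] args) {
        List<Object> unique = new ArrayList<>();
        System.out.println(addUnique(unique, 95)); // true
        System.out.println(addUnique(unique, 95)); // false
    }
}
```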
Array Insertion
New elements can also be inserted into an array.
While append will place a new item at the end of an array and prepend will place it at the beginning, insert allows an element to be inserted at a specific position.
The position is indicated by the last path component, which should be an array index.
For example, to insert "cruel" as the second element in the array ["Hello", "world"], the code would look like:
MutateInResult result = collection.mutateIn("hotel_1501",
List.of(arrayInsert("foo[1]", List.of("cruel"))));
Note that the array must already exist and that the index must be valid (i.e. it must not point to an element which is out of bounds).
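To make the effect of the index concrete, here is a small plain-Java illustration of the same insertion on an in-memory list. It is a sketch of the semantics only: ArrayInsertModel is a hypothetical name, and java.util.List.add has its own bounds rules, which are not necessarily the same as the server's.

```java
import java.util.ArrayList;
import java.util.List;

final class ArrayInsertModel {
    // Returns a copy of the array with the value inserted at the given index;
    // the element previously at that index, and all later ones, shift right.
    static List<String> insertAt(List<String> array, int index, String value) {
        List<String> copy = new ArrayList<>(array);
        copy.add(index, value);
        return copy;
    }

    public static void main(String[] args) {
        // Index 1 makes "cruel" the second element.
        System.out.println(insertAt(List.of("Hello", "world"), 1, "cruel"));
        // prints [Hello, cruel, world]
    }
}
```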
Counters and Numeric Fields
Counter operations allow the manipulation of a numeric value inside a document. These operations are logically similar to the increment and decrement full-document operations:
MutateInResult result = collection.mutateIn("hotel_1368", List.of(increment("logins", 1)));
// Counter operations return the updated count
Long count = result.contentAs(0, Long.class);
The increment and decrement operations perform simple arithmetic against a numeric value. The updated value is returned.
MutateInResult result = collection.mutateIn("hotel_1368", List.of(decrement("logouts", 150)));
// Counter operations return the updated count
Long count = result.contentAs(0, Long.class);
The existing value for counter operations must be within the range of a 64-bit signed integer. If the value does not exist, the operation will create it (and its parents, if createPath is enabled).
Note that there are several differences as compared to the full-document counter operations:
- Sub-Document counters have a range of -9223372036854775807 to 9223372036854775807, whereas full-document counters have a range of 0 to 18446744073709551615
- Sub-Document counter operations protect against overflow and underflow, returning an error if the operation would exceed the range. Full-document counters will use normal C semantics for overflow (in which the overflow value is carried over above 0), and will silently fail on underflow, setting the value to 0 instead.
- Sub-Document counter operations can operate on any numeric value within a document, while full-document counter operations require a specially formatted counter document with only the counter value.
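The overflow and underflow differences listed above can be illustrated with plain Java arithmetic. This is a sketch of the described semantics rather than of the server implementation: CounterSemantics and its methods are hypothetical names, the Sub-Document range is approximated by Java's long (which extends one step further at the lower bound than the range quoted above), and full-document counters are modelled as unsigned 64-bit values stored in a long.

```java
final class CounterSemantics {
    // Sub-Document style: signed 64-bit, and an error instead of wrapping.
    static long subdocAdd(long current, long delta) {
        return Math.addExact(current, delta); // throws ArithmeticException on overflow
    }

    static boolean subdocWouldFail(long current, long delta) {
        try {
            subdocAdd(current, delta);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Full-document style increment: unsigned 64-bit addition that wraps
    // around past the maximum (two's complement addition does this for us).
    static long fullDocIncrement(long currentUnsigned, long deltaUnsigned) {
        return currentUnsigned + deltaUnsigned;
    }

    // Full-document style decrement: silently clamps at 0 on underflow.
    static long fullDocDecrement(long currentUnsigned, long deltaUnsigned) {
        return Long.compareUnsigned(currentUnsigned, deltaUnsigned) < 0
                ? 0L
                : currentUnsigned - deltaUnsigned;
    }

    public static void main(String[] args) {
        long max = Long.parseUnsignedLong("18446744073709551615");
        System.out.println(subdocWouldFail(Long.MAX_VALUE, 1));                 // true
        System.out.println(Long.toUnsignedString(fullDocIncrement(max, 1)));    // 0
        System.out.println(fullDocDecrement(5, 10));                            // 0
    }
}
```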
Executing Multiple Operations
Multiple Sub-Document operations can be executed at once on the same document, allowing you to retrieve or modify several Sub-Documents at once. When multiple operations are submitted within the context of a single lookupIn or mutateIn command, the server will execute all the operations with the same version of the document.
Unlike batched operations, which are simply a way of sending multiple individual operations efficiently on the network, multiple Sub-Document operations are formed into a single command packet, which is then executed atomically on the server. You can submit up to 16 operations at a time.
When submitting multiple mutation operations within a single mutateIn command, those operations are considered to be part of a single transaction: if any of the mutation operations fail, the server will logically roll back any other mutation operations performed within the mutateIn, even if those commands would have been successful had another command not failed.
When submitting multiple retrieval operations within a single lookupIn command, the status of each command does not affect any other command. This means that it is possible for some retrieval operations to succeed and others to fail. While their statuses are independent of each other, you should note that operations submitted within a single lookupIn are all executed against the same version of the document.
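The all-or-nothing behaviour of a multi-operation mutateIn can be pictured with an in-memory model. The sketch below is an assumption-laden illustration, not the server's mechanism: AtomicMutateModel and its helper methods are hypothetical, the document is a flat map, and failure is reported by a boolean rather than by an SDK exception. Operations are applied to a working copy, which replaces the original only if every operation succeeds.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class AtomicMutateModel {
    static final int MAX_OPS = 16; // limit stated in the text above

    static Consumer<Map<String, Object>> upsert(String key, Object value) {
        return doc -> doc.put(key, value);
    }

    static Consumer<Map<String, Object>> insert(String key, Object value) {
        return doc -> {
            if (doc.containsKey(key)) throw new IllegalStateException("path exists: " + key);
            doc.put(key, value);
        };
    }

    static Consumer<Map<String, Object>> remove(String key) {
        return doc -> {
            if (!doc.containsKey(key)) throw new IllegalStateException("path not found: " + key);
            doc.remove(key);
        };
    }

    /** Applies all operations or none; returns whether the document changed. */
    static boolean mutateIn(Map<String, Object> doc, List<Consumer<Map<String, Object>>> ops) {
        if (ops.size() > MAX_OPS) {
            throw new IllegalArgumentException("at most " + MAX_OPS + " operations");
        }
        Map<String, Object> working = new HashMap<>(doc); // same starting version for every op
        try {
            for (Consumer<Map<String, Object>> op : ops) {
                op.accept(working);
            }
        } catch (IllegalStateException failed) {
            return false; // discard the working copy: nothing is applied
        }
        doc.clear();
        doc.putAll(working);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("email", null);
        // remove("tz") fails, so the upsert of "phone" is rolled back too.
        System.out.println(mutateIn(doc, List.of(upsert("phone", "+44 1292 520580"), remove("tz")))); // false
        System.out.println(doc.containsKey("phone")); // false
    }
}
```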
Creating Paths
Sub-Document mutation operations such as upsert or insert will fail if the immediate parent is not present in the document. Consider:
{
"level_0": {
"level_1": {
"level_2": {
"level_3": {
"some_field": "some_value"
}
}
}
}
}
Looking at the some_field field (which is really level_0.level_1.level_2.level_3.some_field), its immediate parent is level_3.
If we were to attempt to insert another field, level_0.level_1.level_2.level_3.another_field, it would succeed because the immediate parent is present.
However, if we were to attempt to insert to level_0.level_1.level_2.foo.bar it would fail, because level_0.level_1.level_2.foo (which would be the immediate parent) does not exist.
Attempting to perform such an operation would result in a Path Not Found error.
By default the automatic creation of parents is disabled, as a simple typo in application code can result in a rather confusing document structure. Sometimes, however, it is necessary to have the server create the hierarchy. In this case, the createPath option may be used.
MutateInResult result = collection.mutateIn("hotel_1368",
List.of(
upsert("level_0.level_1.foo.bar.phone", JsonObject.create().put("num", "311-555-0101").put("ext", 16))
.createPath()));
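The effect of createPath on missing parents can be sketched with nested maps. This is a simplified model under stated assumptions, not SDK code: CreatePathModel is a hypothetical class, paths are limited to dot-separated dictionary keys, and a missing parent is signalled with NoSuchElementException where the SDK would raise PathNotFoundException.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class CreatePathModel {
    @SuppressWarnings("unchecked")
    static void upsert(Map<String, Object> doc, String path, Object value, boolean createPath) {
        String[] parts = path.split("\\.");
        Map<String, Object> current = doc;
        // Walk every component except the last, which names the field itself.
        for (int i = 0; i < parts.length - 1; i++) {
            Object next = current.get(parts[i]);
            if (next == null) {
                if (!createPath) {
                    throw new NoSuchElementException("parent not found: " + parts[i]);
                }
                Map<String, Object> created = new LinkedHashMap<>();
                current.put(parts[i], created); // create the missing parent
                current = created;
            } else if (next instanceof Map) {
                current = (Map<String, Object>) next;
            } else {
                throw new IllegalStateException("not a dictionary: " + parts[i]);
            }
        }
        current.put(parts[parts.length - 1], value);
    }

    static boolean tryUpsert(Map<String, Object> doc, String path, Object value, boolean createPath) {
        try {
            upsert(doc, path, value, createPath);
            return true;
        } catch (NoSuchElementException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        upsert(doc, "level_0.level_1.level_2.level_3.some_field", "some_value", true);
        System.out.println(tryUpsert(doc, "level_0.level_1.foo.bar.phone", "311-555-0101", false)); // false
        System.out.println(tryUpsert(doc, "level_0.level_1.foo.bar.phone", "311-555-0101", true));  // true
    }
}
```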
Reading Sub-Documents From Replicas
Couchbase Server 7.6 and later support Sub-Doc lookup from replicas.
The collection.lookupInAnyReplica() method returns the first response, whether from the active or a replica:
try {
LookupInResult result = collection.lookupInAnyReplica(
"hotel_1368",
List.of(LookupInSpec.get("geo.lat"))
);
String str = result.contentAs(0, String.class);
System.out.println("getFunc: Latitude = " + str);
} catch (PathNotFoundException e) {
System.out.println("The version of the document" +
" on the server node that responded quickest" +
" did not have the requested field.");
} catch (DocumentUnretrievableException e) {
System.out.println("Document was not present" +
" on any server node.");
}
The collection.lookupInAllReplicas() method fetches all available replicas (and the active copy), and returns all responses.
Stream<LookupInReplicaResult> results = collection.lookupInAllReplicas(
"hotel_1368",
List.of(LookupInSpec.get("geo.lat"))
);
results.forEach(it -> {
try {
String str = it.contentAs(0, String.class);
System.out.println("getFunc: Latitude = " + str);
} catch (PathNotFoundException e) {
System.out.println("The version of the document" +
" on one of the server nodes" +
" did not have the requested field.");
}
});
You may want to use lookupInAllReplicas to build a consensus, but it’s more likely that you’ll make use of lookupInAnyReplica as a fallback to a lookupIn, when the active node times out.
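One way to structure that fallback is a small generic helper. The sketch below uses only the JDK so that it can run without a cluster: ReplicaFallback and readWithFallback are hypothetical names, and the suppliers stand in for the actual lookup calls.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

final class ReplicaFallback {
    // Try the active read first; if it fails with an error that the predicate
    // accepts (for example a timeout), try the replica read instead. Any
    // other error is rethrown unchanged.
    static <T> T readWithFallback(Supplier<T> active,
                                  Supplier<T> anyReplica,
                                  Predicate<RuntimeException> shouldFallBack) {
        try {
            return active.get();
        } catch (RuntimeException e) {
            if (!shouldFallBack.test(e)) {
                throw e;
            }
            return anyReplica.get();
        }
    }

    public static void main(String[] args) {
        // Simulated timeout on the active node, answered by a replica.
        String source = ReplicaFallback.<String>readWithFallback(
                () -> { throw new IllegalStateException("simulated timeout"); },
                () -> "replica",
                e -> e instanceof IllegalStateException);
        System.out.println(source); // replica
    }
}
```

In application code the two suppliers would wrap the lookupIn and lookupInAnyReplica calls, and the predicate would test for whichever timeout exception type the SDK reports.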
Concurrent Modifications
Concurrent Sub-Document operations on different parts of a document will not conflict. For example the following two blocks can execute concurrently without any risk of conflict:
Thread thread1 = new Thread() {
public void run() {
collection.mutateIn("hotel_1501",
List.of(arrayAppend("foo", List.of(99))));
}
};
Thread thread2 = new Thread() {
public void run() {
collection.mutateIn("hotel_1501",
List.of(arrayAppend("foo", List.of(101))));
}
};
thread1.start();
thread2.start();
Even when modifying the same part of the document, operations will not necessarily conflict. For example, two concurrent arrayAppend operations to the same array will both succeed, never overwriting the other.
So in some cases the application will not need to supply a CAS value to protect against concurrent modifications.
If CAS is required then it can be provided like this:
GetResult doc = collection.get("hotel_1368");
MutationResult result = collection.mutateIn("hotel_1368", List.of(decrement("logouts", 150)),
mutateInOptions().cas(doc.cas()));
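When a CAS value is supplied and the document has changed in the meantime, the usual response is to re-read and retry. The following is a minimal in-memory sketch of that loop, assuming a hypothetical Doc class in place of a real document: a stale CAS makes replace return false here, whereas the SDK signals the same condition with CasMismatchException.

```java
import java.util.function.LongUnaryOperator;

final class CasRetryModel {
    /** In-memory stand-in for a document holding one number and a CAS value. */
    static final class Doc {
        private long cas = 1;
        private long value;

        // Returns {cas, value} as one consistent snapshot.
        synchronized long[] read() {
            return new long[] {cas, value};
        }

        // Succeeds only if nobody has modified the document since expectedCas.
        synchronized boolean replace(long expectedCas, long newValue) {
            if (cas != expectedCas) {
                return false;
            }
            value = newValue;
            cas++;
            return true;
        }
    }

    /** Read, compute, attempt; repeat on conflict. Returns the attempts used. */
    static int updateWithRetry(Doc doc, LongUnaryOperator change, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long[] snapshot = doc.read();
            if (doc.replace(snapshot[0], change.applyAsLong(snapshot[1]))) {
                return attempt;
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        Doc logouts = new Doc();
        int attempts = updateWithRetry(logouts, v -> v - 150, 3);
        System.out.println(attempts + " attempt(s), value = " + logouts.read()[1]);
    }
}
```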
Durability
Couchbase’s traditional 'client verified' durability, using PersistTo and ReplicateTo, is still available, particularly for talking to Couchbase Server 6.0 and earlier:
MutationResult result = collection.mutateIn("hotel_1368",
List.of(MutateInSpec.upsert("foo", "bar")),
mutateInOptions().durability(PersistTo.ACTIVE, ReplicateTo.ONE));
In Couchbase Server 6.5 and up, this is built upon with Durable Writes, which uses the concept of majority to indicate the number of configured Data Service nodes to which commitment is required:
MutationResult result = collection.mutateIn("hotel_1368",
List.of(MutateInSpec.upsert("foo", "bar")),
mutateInOptions().durability(DurabilityLevel.MAJORITY));
Error Handling
Sub-Document operations have their own set of errors. When programming with Sub-Document, be prepared for any of the full-document errors (such as DocumentNotFoundException) as well as special Sub-Document errors which are received when certain constraints are not satisfied. Some of the errors include:
- PathNotFoundException: When retrieving a path, this means the path does not exist in the document. When inserting or upserting a path, this means the immediate parent does not exist.
- PathExistsException: In the context of an insert, it means the given path already exists. In the context of arrayAddUnique, it means the given value already exists.
- PathMismatchException: This means the path may exist in the document, but that there is a type conflict between the path in the document and the path in the command. Consider the document:
{ "tags": ["reno", "nevada", "west", "sierra"] }
The path tags.sierra is a mismatch, since tags is actually an array, while the path assumes it is a JSON object (dictionary).
- DocumentNotJsonException: This means you are attempting to modify a binary document using Sub-Document operations.
- PathInvalidException: This means the path is invalid for the command. Certain commands such as arrayInsert expect array elements as their final component, while others such as upsert and insert expect dictionary (object) keys.
If a Sub-Document command fails, a top-level error is reported (MultiMutationException), rather than an individual error code (e.g. PathNotFoundException). When receiving a top-level error code, you should traverse the results of the command to see which individual code failed.
Path Syntax
Path syntax largely follows SQL++ conventions: A path is divided into components, with each component referencing a specific level in a document hierarchy.
Components are separated by dots (.) in the case where the element left of the dot is a dictionary, or by brackets ([n]) where the element left of the bracket is an array and n is the index within the array.
As a special extension, you can indicate the last element of an array by using an index of -1. For example, to get the last element of the array in the document
{"some":{"array":[1,2,3,4,5,6,7,8,9,0]}}
use some.array[-1] as the path, which will return the element 0.
Each path component must be valid as a JSON string, as if it were surrounded by quotes, and any character in the path which may invalidate it as a JSON string must be escaped by a backslash (\).
In other words, the path component must match exactly the path inside the document itself.
For example:
{"literal\"quote": {"array": []}}
must be referenced as literal\"quote.array.
If the path also has special path characters (i.e. a dot or brackets) it may be escaped using SQL++ escapes. Considering the document
{"literal[]bracket": {"literal.dot": true}}
a path such as `literal[]bracket`.`literal.dot` can be used. You can use double-backticks (``) to reference a literal backtick.
If you need to combine both JSON and path-syntax literals you can do so by escaping the component from any JSON string characters (e.g. a quote or backslash) and then encapsulating it in backticks (`path`).
Currently, paths cannot exceed 1024 characters, and cannot be more than 32 levels deep. JSON documents with more than 32 nested layers cannot be parsed; attempting to do so will result in a DocumentTooDeepException.
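The rules in this section can be summarised by a small tokenizer. The sketch below is a simplified model for illustration, not the server's parser: PathSketch is a hypothetical class, backslash escapes are copied through untouched rather than validated, and array indexes are returned as Integer values alongside String component names.

```java
import java.util.ArrayList;
import java.util.List;

final class PathSketch {
    static final int MAX_LENGTH = 1024; // limits quoted in the text above
    static final int MAX_DEPTH = 32;

    static List<Object> parse(String path) {
        if (path.length() > MAX_LENGTH) {
            throw new IllegalArgumentException("path longer than " + MAX_LENGTH + " characters");
        }
        List<Object> components = new ArrayList<>();
        StringBuilder name = new StringBuilder();
        boolean inName = false;
        int i = 0;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c == '`') {
                // Backtick-quoted component: dots and brackets are literal.
                i++;
                while (true) {
                    if (i >= path.length()) throw new IllegalArgumentException("unterminated backtick");
                    char q = path.charAt(i++);
                    if (q != '`') { name.append(q); continue; }
                    // A doubled backtick stands for one literal backtick.
                    if (i < path.length() && path.charAt(i) == '`') { name.append('`'); i++; continue; }
                    break;
                }
                inName = true;
            } else if (c == '\\' && i + 1 < path.length()) {
                // JSON escape: keep the backslash and the escaped character.
                name.append(c).append(path.charAt(i + 1));
                i += 2;
                inName = true;
            } else if (c == '.' || c == '[') {
                if (inName) { components.add(name.toString()); name.setLength(0); inName = false; }
                i++;
                if (c == '[') {
                    int close = path.indexOf(']', i);
                    if (close < 0) throw new IllegalArgumentException("missing ]");
                    components.add(Integer.parseInt(path.substring(i, close))); // -1 means last element
                    i = close + 1;
                }
            } else {
                name.append(c);
                inName = true;
                i++;
            }
        }
        if (inName) components.add(name.toString());
        if (components.size() > MAX_DEPTH) {
            throw new IllegalArgumentException("path deeper than " + MAX_DEPTH + " levels");
        }
        return components;
    }

    public static void main(String[] args) {
        System.out.println(parse("some.array[-1]"));                   // [some, array, -1]
        System.out.println(parse("`literal[]bracket`.`literal.dot`")); // [literal[]bracket, literal.dot]
    }
}
```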
Extended Attributes
Extended Attributes (also known as XATTRs), built upon the Sub-Document API, allow developers to define application-specific metadata that will only be visible to those applications that request it or attempt to modify it. This might be, for example, metadata specific to a programming framework that should be hidden by default from other frameworks or libraries, or possibly from other versions of the same framework. They are not intended for use in general applications, and data stored there cannot be accessed easily by some Couchbase services, such as Search.
JsonObject docContent = JsonObject.create().put("body", "value");
collection.mutateIn("hotel_14006",
List.of(MutateInSpec.upsert("foo", "bar").xattr().createPath(), MutateInSpec.replace("", docContent)));
The full document can be replaced using the Sub-Doc API.
In the above snippet, the full document is replaced, whilst xattrs are updated with the same command.
The empty "" in MutateInSpec.replace("", docContent) represents the full document.