CREATE INDEX
- reference
The CREATE INDEX
statement allows you to create a secondary index. Secondary indexes contain a filtered or a full set of keys in a given keyspace.
Secondary indexes are optional but increase query efficiency on a keyspace.
In Couchbase Server 7.0 and later, CREATE INDEX
allows you to make multiple concurrent index creation requests.
The command starts a task to create the index definition in the background.
If there is an index creation task already running, the Index Service queues the incoming index creation request.
CREATE INDEX
returns as soon as the index creation phase is complete.
By default, when the index creation phase is complete, the Index Service triggers the index build phase.
If you lose connectivity, the index build operation continues in the background.
You can also defer the index build phase using the defer_build
clause.
In deferred build mode, CREATE INDEX
creates the index definition, but does not trigger the index build phase.
You can then build the index using the BUILD INDEX command.
Index metadata provides a state field.
The index state may be scheduled for creation
, deferred
, building
, pending
, online
, offline
, or abridged
.
This state field and other index metadata can be queried using system:indexes.
You can also monitor the index state using the Couchbase Web Console.
If you kick off multiple index creation operations concurrently, you may sometimes see transient errors similar to the following. If this error occurs, the Index Service tries to run the failed operation again in the background until it succeeds, up to a maximum of 1000 retries.
If the Index Service still cannot create the index after the maximum number of retries, the index state is marked as |
You can create multiple identical secondary indexes on a keyspace and place them on separate nodes for better index availability.
In Couchbase Server Enterprise Edition, the recommended way to do this is using the num_replicas
option.
In Couchbase Server Community Edition, you need to create multiple identical indexes and place them using the nodes
option.
Refer to WITH Clause below for more details.
Prerequisites
RBAC Privileges
User executing the CREATE INDEX statement must have the Query Manage Index privilege granted on the keyspace. For more details about user roles, see Authorization.
Syntax
create-index ::= CREATE INDEX index-name ON keyspace-ref '(' index-key [ index-order ] [ ',' index-key [ index-order ] ]* ')' [ where-clause ] [ index-using ] [ index-with ]
- index-name
-
[Required] A unique name that identifies the index.
Valid GSI index names can contain any of the following characters:
A-Z
a-z
0-9
#
_
, and must start with a letter, [A-Z
a-z
]. The minimum length of an index name is 1 character and there is no maximum length set for an index name. When querying, if the index name contains a#
or_
character, you must enclose the index name within backticks.
We recommend that you do not create (or drop) secondary indexes when any node with a secondary index role is down, as this may result in duplicate index names. |
Keyspace Reference
keyspace-ref ::= keyspace-path | keyspace-partial
Specifies the keyspace for which the index needs to be created.
If there is a hyphen (-) inside any part of the keyspace reference, you must wrap that part of the keyspace reference in backticks (` `). Refer to the examples below. |
Keyspace Path
keyspace-path ::= [ namespace ':' ] bucket [ '.' scope '.' collection ]
If the keyspace is a named collection, or the default collection in the default scope within a bucket, the keyspace reference may be a keyspace path. In this case, the query context should not be set.
- namespace
-
(Optional) An identifier that refers to the namespace of the keyspace. Currently, only the
default
namespace is available. If the namespace name is omitted, the default namespace in the current session is used. - bucket
-
(Required) An identifier that refers to the bucket name of the keyspace.
- scope
-
(Optional) An identifier that refers to the scope name of the keyspace. If omitted, the bucket’s default scope is used.
- collection
-
(Optional) An identifier that refers to the collection name of the keyspace. If omitted, the default collection in the bucket’s default scope is used.
For example, default:`travel-sample`
indicates the default collection in the default scope in the travel-sample
bucket in the default
namespace.
Similarly, default:`travel-sample`.inventory.airline
indicates the airline
collection in the inventory
scope in the travel-sample
bucket in the default
namespace.
Keyspace Partial
keyspace-partial ::= collection
Alternatively, if the keyspace is a named collection, the keyspace reference may be just the collection name with no path. In this case, you must set the query context to indicate the required namespace, bucket, and scope.
- collection
-
(Required) An identifier that refers to the collection name of the keyspace.
For example, airline
indicates the airline
collection, assuming the query context is set.
Index Key
index-key ::= expr | array-expr
Refers to an attribute name or a scalar function or an ARRAY expression on the attribute. This constitutes an index-key for the index.
- expr
-
A N1QL expression over any fields in the document. This cannot use constant expressions, aggregate functions, or sub-queries.
- array-expr
-
An array expression. For details about array expressions, see Array Indexing.
Index Order
index-order ::= ASC | DESC
Specifies the sort order of the index key.
ASC
-
The index key is sorted in ascending order.
DESC
-
The index key is sorted in descending order.
This clause is optional; if omitted, the default is ASC
.
WHERE Clause
where-clause ::= WHERE cond
- cond
-
Specifies WHERE clause predicates to qualify the subset of documents to include in the index.
USING Clause
index-using ::= USING GSI
In Couchbase Server 6.5 and later, the index type for a secondary index must be Global Secondary Index (GSI).
The USING GSI
keywords are optional and may be omitted.
WITH Clause
index-with ::= WITH expr
Use the WITH clause to specify additional options.
- expr
-
An object with the following properties:
- nodes
-
[Optional] An array of strings, each of which represents a node name.
A node name passed to the nodes
property must include the cluster administration port, by default 8091. For exampleWITH {"nodes": ["192.0.2.0:8091"]}
instead ofWITH {"nodes": ["192.0.2.0"]}
. - defer_build
-
[Optional] Boolean.
- true
-
When set to
true
, theCREATE INDEX
operation queues the task for building the index but immediately pauses the building of the index of type GSI. Index building requires an expensive scan operation. Deferring building of the index with multiple indexes can optimize the expensive scan operation. Admins can defer building multiple indexes and, using theBUILD INDEX
statement, multiple indexes to be built efficiently with one efficient scan of bucket data. - false
-
When set to
false
, theCREATE INDEX
operation queues the task for building the index and immediately kicks off the building of the index of type GSI.
- num_replica
Usage
Array Indexing
Array indexing enables you to create global indexes on array elements and optimize the execution of queries involving array elements. For details, refer to Array Indexing.
Index Partitioning
Index partitioning helps increase the query performance by dividing and spreading a large index of documents across multiple nodes, horizontally scaling out an index as needed. For details, refer to Index Partitioning.
Using the meta().id
Function
For details, refer to Indexing Meta Info.
Using Indexes for Aggregates
If there is an index on the expression of an aggregate, that index may be used to satisfy the query.
For example, given the index alt_idx
created using the following statement:
CREATE INDEX alt_idx ON `travel-sample`.inventory.airport(geo.alt);
The query engine will use the index alt_idx
for the following query:
SELECT MIN(geo.alt), MAX(geo.alt) FROM `travel-sample`.inventory.airport;
For details, refer to Operator Pushdowns.
Index Replicas
In the Indexes screen in the Couchbase Web Console, index replicas are marked with their replica ID.
If you select view by server node
from the drop-down menu, you can see the server node where each index and index replica is placed.
You can also query the system:indexes catalog to find the ID of an index replica and see which node it is placed on.
By default, index replicas are used to serve index scans. The system automatically load-balances an index scan across the index and all its replicas. Adding index replicas enables you to scale scan throughput, in addition to providing high availability. |
Examples
Default Collection
The following example creates an index on the default collection in the default scope within the travel-sample
bucket.
First make sure the query context is not set.
-
Query Workbench
-
CBQ Shell
Use the query context drop-down menu to select query context
.
\UNSET -query_context;
Create a secondary index that contains airports with an alt
value greater than 1000 on the node 127.0.0.1
.
CREATE INDEX idx_default_over1000
ON `travel-sample`(geo.alt)
WHERE geo.alt > 1000
USING GSI
WITH {"nodes": ["127.0.0.1:8091"]};
Query Context
The following example is similar to Example 1, but creates an index on the airport
collection.
First set the query context to `travel-sample`.inventory
.
-
Query Workbench
-
CBQ Shell
Use the query context drop-down menu to select travel-sample.inventory
.
\SET -query_context travel-sample.inventory;
Create a secondary index that contains airports with an alt
value greater than 1000 on the node 127.0.0.1
.
CREATE INDEX idx_airport_over1000
ON airport(geo.alt)
WHERE geo.alt > 1000
USING GSI
WITH {"nodes": ["127.0.0.1:8091"]};
Named Collection
In each of the examples that follow, the path to the required keyspace is specified by the query, so you do not need to set the query context.
Create a secondary index with the defer_build
option.
CREATE INDEX idx_landmark_country
ON `travel-sample`.inventory.landmark(country)
USING GSI
WITH {"defer_build":true};
Query system:indexes
for the status of the index.
SELECT * FROM system:indexes WHERE name="idx_landmark_country";
[
{
"indexes": {
"bucket_id": "travel-sample",
"datastore_id": "http://127.0.0.1:8091",
"id": "d079aec40eb0c6cc",
"index_key": [
"`country`"
],
"keyspace_id": "landmark",
"name": "idx_landmark_country",
"namespace_id": "default",
"scope_id": "inventory",
"state": "deferred", (1)
"using": "gsi"
}
}
]
1 | The index is in the deferred state. |
Kick off a deferred build using the index name.
BUILD INDEX ON `travel-sample`.inventory.landmark(idx_landmark_country) USING GSI;
Query system:indexes
for the status of the index.
SELECT * FROM system:indexes WHERE name="idx_landmark_country";
[
{
"indexes": {
"bucket_id": "travel-sample",
"datastore_id": "http://127.0.0.1:8091",
"id": "d079aec40eb0c6cc",
"index_key": [
"`country`"
],
"keyspace_id": "landmark",
"name": "idx_landmark_country",
"namespace_id": "default",
"scope_id": "inventory",
"state": "online", (1)
"using": "gsi"
}
}
]
1 | The index has now been created. |