Indexes can speedup both selection queries and join queries if they are applied properly. The following two sections describe scenarios that indexes are explored inside the system for query processing purpose.
Secondary indexes do not store NULL values. Queries that select objects based on the existence of a NULL field are not accelerated by an index. |
Selection Queries
The query optimizer chooses to use a secondary index for query execution if both of the following conditions are met:
-
The query contains a conjunctive equality or range predicate over one or more fields, or a join predicate — see the next section. The conjunctive predicate has a form:
QualifiedName Operator Literal ( AND field Operator Literal )+
where
Operator
is=
,>
,>=
,<
,<=
, orBETWEEN
; -
There is an index with a key, such that the corresponding fields in the predicate form a prefix of that key. For example, suppose that there is an index on collection
foo
with key fieldsc_s
andc_d
.CREATE INDEX idx_s_d ON foo(c_s:STRING, c_d:DOUBLE);
The following query uses the index because it has an equality predicate (=) on a field (c_s) that is a prefix of the indexed key (c_s, c_d):
SELECT f.c_x as res FROM foo f WHERE f.c_s = 'world';
If you would like an available index to not be used for a particular query predicate (e.g., because there will be many, many matching objects), a skip-index hint can be used:
SELECT f.c_x as res FROM foo f WHERE f.c_s /*+ skip-index */ = 'world';
If multiple indexes are eligible access paths, there can be two cases:
-
Two or more indexes are sharing the same prefix and the predicate is on that prefix. For example:
indexA
is on (c1, c2),indexB
is on (c1, c3) and the predicate is on c1 (c1 = 100). In this case, the query optimizer picks the first index in the order of their names, e.g.,indexA
. -
No indexes share the same prefix and the predicate refers to fields from each individual index. For example:
indexA
in on (c1) andindexB
is on (c2), the predicate is on c1 and c2 (c1 = 100 and c2 = 200). In this case, both indexes will be used to retrieve matched primary keys and then the key sets will be intersected to further filter retrieved primary keys.
Join Queries
N1QL for Analytics supports joins from standard SQL in the following forms:
-
Inner join:
SELECT * FROM ds_outer, ds_inner WHERE <predicate>; SELECT * FROM ds_outer JOIN ds_inner ON <predicate>; SELECT * FROM ds_outer INNER JOIN ds_inner ON <predicate>;
-
Left outer join:
SELECT * FROM ds_outer LEFT JOIN ds_inner ON <predicate>; SELECT * FROM ds_outer LEFT OUTER JOIN ds_inner ON <predicate>;
-
Right outer join:
SELECT * FROM ds_outer RIGHT JOIN ds_inner ON <predicate>; SELECT * FROM ds_outer RIGHT OUTER JOIN ds_inner ON <predicate>;
ds_outer is the outer
branch and ds_inner is the inner
branch
(in the order in which they appear in the FROM clause).
The query optimizer picks an index for join evaluation if the following conditions are met:
-
The join predicate is an equality or range predicate that refers to fields from both branches of the join, in the form of:
fn(expr_outer) /*+ indexnl */ op field_inner
where
op
is<
,<=
,=
,>=
,>
, orBETWEEN
; -
field_inner
is a field from the inner collection on which the index is defined; -
The index join hint,
/*+ indexnl */
, is provided for the join predicate; -
fn()
is a function that returns the same data type as the type specified in the index for thefield_inner
. Usually,fn()
is one of the following: to_string(), to_double(), or to_bigint().
Note that if the type of expr_outer
is not known at compile time therefore the index join cannot be
selected if the join predicate is just expr_outer /*+ indexnl */ op field_inner
.
For example, suppose there are two shadow collections, foo1 and foo2, with an index on foo2:
CREATE index idx_f2 ON foo2(c_s2:string);
Then the following query would use this index for join evaluation:
SELECT f1.c_x1 as c1, f2.c_x2 as c2 FROM foo1 AS f1, foo2 AS f2 WHERE to_string(f1.c_s1) /*+ indexnl */ = f2.c_s2