This section describes the data types which Couchbase Analytics can operate on.
An instance of Couchbase Analytics data model can be a primitive type (BOOLEAN
, STRING
, BIGINT
, or
DOUBLE
), a special type (NULL
or MISSING
), or a composite type (OBJECT
, ARRAY
, or MULTISET
).
The type names are case-insensitive; for example, both BIGINT
and bigint
are acceptable.
Primitive Types
Boolean
The boolean
data type has only two values: true
and false
.
Examples:
true false
An expression that compares two values, such as weight > limit , might return a boolean value, or it might return one of the special values null or missing .
For details, see Comparison Operators.
|
String
A string
represents a sequence of characters.
The total length of the sequence can be up to 2,147,483,648.
A literal string can be enclosed in single or double quotes.
Examples:
"This is a string." 'Have you read "War and Peace"?' "I don't think so."
For more details on string literals, see Literals. |
Bigint
A bigint
is a 64-bit integer.
The range of bigint
is from -9223372036854775808 to 9223372036854775807.
Examples:
45 -27900
In general, arithmetic operations on two
|
Double (double precision)
The double
type represents approximate numeric data values using 8 bytes.
The range of a double
value can be from (2^(-1022))
to (2-2^(-52))·2^(1023)
for both positive and negative numbers.
Expressions that compute values outside these ranges will return INF
or -INF
.
The special value NaN
(not a number) is also supported.
A numeric literal is treated as a double
if it contains a decimal point or an exponent.
Examples:
45. -2.79E4 2.79E-4
All numbers in the incoming JSON which are not integers are parsed as double values.
|
Temporal Data
Temporal data (dates, times, and timestamps) are important for many applications.
Since JSON does not provide data types for temporal data, SQL++ for Analytics uses the string
and bigint
types for this purpose.
SQL++ for Analytics provides many builtin functions, called temporal functions, that operate on string
or bigint
values, interpreting them as dates or times.
For a list of these functions, see Temporal Functions.
Using bigint
for Temporal Data
Some temporal functions take bigint
values and interpret them as representing a number of milliseconds since the beginning of the "Unix Epoch" at 00:00:00 UTC on January 1, 1970.
Here are two of these temporal functions and some examples of their use.
-
now_millis( )
returns the current point in time as abigint
representing Unix Epoch milliseconds. -
date_diff_millis(t1, t2, unit)
returns the difference between two points in time (t1
andt2
), in terms of the designated units (years, months, weeks, days, hours, seconds, or milliseconds).
Examples:
-
now_millis( )
returns1655230396621
at the start of an experiment. -
now_millis( )
returns1655231924715
at the end of the experiment. -
date_diff_millis(1655231924715, 1655230396621, "minute")
returns25
, showing that the experiment lasted for 25 minutes. -
date_diff_millis(1655231924715, 1655230396621, "second")
returns1528
, providing a finer measurement: the experiment lasted for 1528 seconds.
Using string
for Temporal Data
A string in ISO-8601 format can represent a date, time, or timestamp. The ISO format is illustrated by the following example, which represents the 30.55 seconds after 15:10 (3:10pm) on June 14, 2022 in a timezone that (like San Francisco) is eight hours behind Coordinated Universal Time (UTC).
"2022-06-14T15:10:30.55-8:00"
When using a string to represent temporal data, various parts of the ISO format can be omitted. For example:
-
"2022-06-14"
represents the day June 14, 2022. -
"15:10:30"
represents a time 30 seconds past 15:10 (3:10pm).
In general, two temporal values can be compared only if they have the same components, for example two dates or two times.
Violations of this rule usually give null
results.
These rules are illustrated by the following examples, which use the date_diff_str
function to compare two temporal values and return the difference in a designated unit:
Examples:
-
date_diff_str("2022-06-22", "2022-06-15", "day")
returns7
. -
date_diff_str("09:20:00", "08:00:00", "hour")
returns1
. -
date_diff_str("09:20:00", "08:00:00", "minute")
returns80
. -
date_diff_str("2022-06-22", "10:00:00", "hour")
returnsnull
.
For more information about the ISO-8601 format for temporal data, see Date Formats.
Incomplete Information Types
Null
null
is a special value that is often used to represent an unknown value.
For example, a user might not be able to know the value of a field and let it be null
.
-
Example:
{ "field": null };
-
The expected result is:
{ "field": null }
Missing
missing
indicates that a name-value pair is missing from an object.
If a missing name-value pair is accessed, an empty result value is returned by the query.
As neither the data model nor the system enforces homogeneity for datasets or collections,
items in a dataset or collection can be of heterogeneous types and
so a field can be present in one object and missing
in another.
-
Example:
{ "field": missing };
-
The expected result is:
{ }
Since a field with value missing
means the field is absent, we get an empty object.
A missing value is converted to null when converted to JSON in the query results.
|
Composite Types
Object
An object
(a.k.a., JSON object) contains a set of fields, where each field is described by
its name.
Syntactically, object constructors are surrounded by curly braces "{…}". For example:
{ "name": "Joe Blow", "rank": "Sergeant", "serialno": 1234567 } { "rank": "Private", "serialno": 9876543 } { "name": "Sally Forth", "rank": "Major", "serialno": 2345678, "gender": "F" }
Array
An array
is a container that holds a fixed number of values.
An array
is an ordered collection of items.
Array constructors are denoted by brackets: "[…]".
An example would be:
["alice", 123, "bob", null]
An array can appear in the incoming JSON and can be constructed by the query.
Each SELECT statement with an ORDER BY clause returns an array .
|
Multiset
A multiset
is an unordered collection of items.
A multiset
allows multiple instances of its elements.
A multiset cannot appear in the incoming JSON.
A multiset is converted into a JSON array with an undefined order of elements in the query results.
The order of the items in the result might change from one query execution to another.
Each SELECT statement without an ORDER BY clause returns a multiset .
|