Query
- how-to
You can query for documents in Couchbase using the SQL++ query language, a language based on SQL, but designed for structured and flexible JSON documents. Querying can solve typical programming tasks such as finding a user profile by email address, facebook login, or user ID.
Our query service uses SQL++ (formerly N1QL), which will be fairly familiar to anyone who’s used any dialect of SQL. Further resources for learning about SQL++ are listed at the bottom of the page. Before you get started you may wish to checkout the SQL++ intro page, or just dive in with a query against our travel sample data set. In this case, the one thing that you need to know is that in order to make a Bucket queryable, it must have at least one index defined. You can define a primary index on a bucket. When a primary index is defined you can issue non-covered queries on the bucket as well.
Use cbq, our interactive Query shell. Open it, and enter the following:
CREATE PRIMARY INDEX ON `travel-sample`
or replace travel-sample with a different Bucket name to build an index on a different dataset.
The default installation places cbq in /opt/couchbase/bin/ on Linux, /Applications/Couchbase Server.app/Contents/Resources/couchbase-core/bin/cbq on OS X, and C:\Program Files\Couchbase\Server\bin\cbq.exe on Microsoft Windows.
|
Note that building indexes is covered in more detail on the Query concept page — and in the API Reference.
A Simple Query
Here’s the basics of how to run a simple query to fetch 10 random rows from travel-sample and print the results:
options = Cluster::QueryOptions.new
options.metrics = true
result = cluster.query('SELECT * FROM `travel-sample` LIMIT 10', options)
result.rows.each do |row|
puts row
end
#=>
# {"travel-sample"=>{"callsign"=>"MILE-AIR", "country"=>"United States", "iata"=>"Q5", "icao"=>"MLA", "id"=>10, "name"=>"40-Mile Air", "type"=>"airline"}}
# {"travel-sample"=>{"callsign"=>"TXW", "country"=>"United States", "iata"=>"TQ", "icao"=>"TXW", "id"=>10123, "name"=>"Texas Wings", "type"=>"airline"}}
# ...
puts "Reported execution time: #{result.meta_data.metrics.execution_time}"
#=> Reported execution time: 11.377766ms
A query is always performed at the Cluster
level, using the query()
method.
It takes the statement as a required argument and then allows additional options if needed.
A complete list of QueryOptions
can be found in the API docs.
Here we pass readonly
to explicitly mark a query as being read only, and not mutating any documents on the server side.
options = Cluster::QueryOptions.new
options.readonly = true
cluster.query(
'SELECT COUNT(*) AS airport_count FROM `travel-sample` WHERE type = "airport"',
options)
Queries & Placeholders
Placeholders allow you to specify variable constraints for an otherwise constant query. There are two variants of placeholders: postional and named parameters. Positional parameters use an ordinal placeholder for substitution and named parameters use variables. A named or positional parameter is a placeholder for a value in the WHERE, LIMIT or OFFSET clause of a query. Note that both parameters and options are optional.
options = Cluster::QueryOptions.new
options.positional_parameters(["France"])
result = cluster.query(
'SELECT COUNT(*) AS airport_count FROM `travel-sample` WHERE type = "airport" AND country = ?',
options)
puts "Airports in France: #{result.rows.first["airport_count"]}"
#=> Airports in France: 221
options = Cluster::QueryOptions.new
options.named_parameters({"country" => "France"})
result = cluster.query(
'SELECT COUNT(*) AS airport_count FROM `travel-sample` WHERE type = "airport" AND country = $country',
options)
puts "Airports in France: #{result.rows.first["airport_count"]}"
#=> Airports in France: 221
The Query Result
When performing a query, the response you receive is a QueryResult
.
If no exception gets raised the request succeeded and provides access to both the rows returned and also associated QueryMetaData
.
options = Cluster::QueryOptions.new
options.client_context_id = "user-44-#{rand}"
result = cluster.query(
'SELECT COUNT(*) AS airport_count FROM `travel-sample` WHERE type = "airport"',
options)
puts "client_context_id: #{result.meta_data.client_context_id}"
#=> client_context_id: user-44-0.9899233780544747
The QueryMetaData
provides insight into some basic profiling/timing information as well as information like the clientContextId
.
Name | Description |
---|---|
|
Returns the request identifer string of the query request. |
|
Returns the context ID either generated by the SDK or supplied by the user. |
|
Returns raw query execution status as returned by the query engine. |
|
Metrics as returned by the query engine, if enabled. |
|
Returns the signature as returned by the query engine which is then decoded as JSON object. |
|
Non-fatal errors are available to consume as warnings on this method. |
|
If enabled returns additional profiling information of the query. |
For example, here is how you can print the executionTime
of a query:
options = Cluster::QueryOptions.new
options.metrics = true
result = cluster.query(
'SELECT COUNT(*) AS airport_count FROM `travel-sample` WHERE type = "airport"',
options)
puts "Reported execution time: #{result.meta_data.metrics.execution_time}"
#=> Reported execution time: 2.516245ms
Scan Consistency
Setting a staleness parameter for queries, with scan_consistency
, enables a tradeoff between latency and (eventual) consistency.
-
A SQL++ query using the default Not Bounded Scan Consistency will not wait for any indexes to finish updating before running the query and returning results, meaning that results are returned quickly, but the query will not return any documents that are yet to be indexed.
-
With Scan Consistency set to request_plus, all document changes and index updates are processed before the query is run. Select this when consistency is always more important than performance.
-
For a middle ground, AtPlus is a "read your own write" (RYOW) option, which means it just waits for the new documents that you specify to be indexed, rather than an entire index of multiple documents.
options = Cluster::QueryOptions.new
options.scan_consistency = :request_plus
result = cluster.query(
'SELECT COUNT(*) AS airport_count FROM `travel-sample` WHERE type = "airport"',
options)
puts "Airports in the database: #{result.rows.first["airport_count"]}"
#=> Airports in the database: 1968
Querying at Scope Level
It is possible to query off the Scope
level with the latest version of Couchbase Server release, 7.0,
using the query()
method.
It takes the statement as a required argument, and then allows additional options if needed.
The code snippet below shows how to run a simple query to fetch 10 random rows from travel-sample and print the results,
the assumption is that the airline
collection exists within a scope us
.
bucket = cluster.bucket("travel-sample")
myscope = bucket.scope("us")
mycollection = "airline"
options = Couchbase::Cluster::QueryOptions.new
options.metrics = true
result = myscope.query("SELECT * FROM #{mycollection} LIMIT 10", options)
result.rows.each do |row|
puts row
end
puts "Reported execution time: #{result.meta_data.metrics.execution_time}"
#=>
#{"airline"=>{"callsign"=>"TXW", "iata"=>"TQ", "icao"=>"TXW", "name"=>"Texas Wings"}}
#{"airline"=>{"callsign"=>"SASQUATCH", "iata"=>"K5", "icao"=>"SQH", "name"=>"SeaPort Airlines"}}
#Reported execution time: 3.620224ms
A complete list of QueryOptions
can be found in the API docs.
Additional Resources
SQL++ is not the only query option in Couchbase. Be sure to check that your use case fits your selection of query service. |
For a deeper dive into SQL++ from the SDK, refer to our SQL++ SDK concept doc.
The Server doc SQL++ intro introduces up a complete guide to the SQL++ language, including all of the latest additions.
The SQL++ interactive tutorial is a good introduction to the basics of SQL++ use.