Querying with SQL++
You can query for documents in Couchbase using the SQL++ query language, a language based on SQL, but designed for structured and flexible JSON documents.
On this page we dive straight into using the Query Service API from the Python Columnar SDK. For a deeper look at the concepts behind the Query Service and the SQL++ language, see the links in the Further Information section at the end of this page.
Here we show queries against the Travel Sample collection, at cluster and scope level, and give links to information on adding other collections to your data.
Before You Start
This page assumes that you have installed the Python Columnar SDK, added your IP address to the allowlist, and created a Columnar cluster.
Create a collection to work on by importing the travel-sample dataset into your cluster.
Querying Your Dataset
Most queries return more than one row, so you will usually iterate over the results:
Scope Level Queries
Sync API

scope = cluster.database('travel-sample').scope('inventory')
query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM route
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = scope.execute_query(query)
print('Rows:')
# Iterate over the result rows one at a time
for row in res.rows():
    print(row)
print(f'\nMetadata: {res.metadata()}')

Async API

# The await and async for statements below must run inside a coroutine
scope = cluster.database('travel-sample').scope('inventory')
query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM route
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = await scope.execute_query(query)
print('Rows:')
async for row in res.rows():
    print(row)
print(f'\nMetadata: {res.metadata()}')

Cluster Level Queries
Sync API

query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM `travel-sample`.inventory.route
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = cluster.execute_query(query)

Async API

query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM `travel-sample`.inventory.route
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = await cluster.execute_query(query)

Positional and Named Parameters
Supplying parameters as individual arguments to the query allows the query engine to optimize the parsing and planning of the query. You can either supply these parameters by name or by position.
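As a minimal sketch of the two placeholder styles, the snippet below scans a statement for `$`-prefixed placeholders and separates positional ones (`$1`, `$2`) from named ones (`$source_airport`). The helper `find_placeholders` is hypothetical and not part of the SDK. Its regular expression is deliberately naive and does not skip string literals or comments, so treat it only as an illustration of how placeholders line up with the values you supply.

```python
import re

# Hypothetical helper for illustration only; not part of the Columnar SDK.
# The pattern is deliberately naive: it does not skip string literals
# or comments inside the statement.
_PLACEHOLDER = re.compile(r"\$(\w+)")


def find_placeholders(statement):
    """Return (positional, named) placeholders found in a SQL++ statement."""
    positional, named = [], []
    for token in _PLACEHOLDER.findall(statement):
        if token.isdigit():
            number = int(token)
            if number not in positional:
                positional.append(number)
        elif token not in named:
            named.append(token)
    return sorted(positional), named


positional_query = "SELECT airline FROM route WHERE sourceairport=$1 AND distance>=$2"
named_query = (
    "SELECT airline FROM route "
    "WHERE sourceairport=$source_airport AND distance>=$min_distance"
)

print(find_placeholders(positional_query))
print(find_placeholders(named_query))
```

A list passed as `positional_parameters` is matched to `$1`, `$2` in order, while a `named_parameters` dictionary is matched by key, as the examples in the next two sections show.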
Positional Parameters
Execute a query with positional arguments:
Sync API

from couchbase_columnar.options import QueryOptions
query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM route
        WHERE sourceairport=$1 AND distance>=$2
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = scope.execute_query(query, QueryOptions(positional_parameters=['SFO', 1000]))

Async API

from acouchbase_columnar.options import QueryOptions
query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM route
        WHERE sourceairport=$1 AND distance>=$2
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = await scope.execute_query(query, QueryOptions(positional_parameters=['SFO', 1000]))

Named Parameters
Execute a query with named arguments:
Sync API

from couchbase_columnar.options import QueryOptions
query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM route
        WHERE sourceairport=$source_airport AND distance>=$min_distance
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = scope.execute_query(query, QueryOptions(named_parameters={'source_airport': 'SFO', 'min_distance': 1000}))

Async API

from acouchbase_columnar.options import QueryOptions
query = """
        SELECT airline, COUNT(*) AS route_count, AVG(route.distance) AS avg_route_distance
        FROM route
        WHERE sourceairport=$source_airport AND distance>=$min_distance
        GROUP BY airline
        ORDER BY route_count DESC
        """
res = await scope.execute_query(query, QueryOptions(named_parameters={'source_airport': 'SFO', 'min_distance': 1000}))

Using the Query Result
Results from the Couchbase Columnar SDK can be used directly with several common Python data analytics libraries, including Pandas and PyArrow.
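What makes this possible is the shape of the rows. As a minimal, standard-library-only sketch of that shape, the snippet below assumes each row is a plain dictionary keyed by the names in the SELECT clause. The hard-coded `rows` list is a stand-in for the rows a query would yield (its values are copied from the sample output on this page), and `rows_to_csv` is a hypothetical helper, not an SDK function.

```python
import csv
import io

# Stand-in for the rows a query would yield; assumed to be plain dicts.
rows = [
    {"airline": "AA", "route_count": 2354, "avg_route_distance": 2314.884359},
    {"airline": "UA", "route_count": 2180, "avg_route_distance": 2350.365407},
]


def rows_to_csv(rows):
    """Serialize an iterable of dict rows to CSV text, header first."""
    buffer = io.StringIO()
    writer = None
    for row in rows:
        if writer is None:
            # Take the column order from the first row.
            writer = csv.DictWriter(buffer, fieldnames=list(row), lineterminator="\n")
            writer.writeheader()
        writer.writerow(row)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

Because the helper consumes rows one at a time, it would work equally well on an iterator of rows as on a list. The Pandas and PyArrow snippets that follow build on the same row shape.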
import pandas as pd
res = scope.execute_query(query)
df = pd.DataFrame.from_records(res.rows(), index='airline')
print(df.head())
#          route_count  avg_route_distance
# airline
# AA              2354         2314.884359
# UA              2180         2350.365407
# DL              1981         2350.494112
# US              1960         2101.417609
# WN              1146         1397.736500

import pyarrow as pa
res = scope.execute_query(query)
table = pa.Table.from_pylist(res.get_all_rows())
print(table.to_string())
# pyarrow.Table
# route_count: int64
# avg_route_distance: double
# airline: string

Further Information
The SQL++ for Analytics Reference offers a complete guide to the SQL++ language for both of our analytics services, including all of the latest additions.