Access Data
- Capella Columnar
- tutorial
You can set up different data sources to work with sample data in Capella Columnar.
This Capella Columnar guide includes example statements and queries that refer to sample datasets, including a Commerce
example dataset and the Couchbase travel-sample
and beer-sample
datasets.
Use the procedures on this page to create Capella Columnar collections for these datasets. Each section gives you practice setting up database objects in Capella Columnar so that you can work with data from different sources. For more information about Columnar database objects, see Access and Organize Data in Capella Columnar Services.
-
Set up standalone collections and populate them by inserting the
Commerce
dataset. -
Directly import the
travel-sample
data into a new database object in your Columnar cluster. -
Set up a remote data source—a link and a set of remote collections—to shadow
beer-sample
data in a Capella collection.The beer-sample
example requires you to set up a Capella operational cluster as a remote data source for Capella Columnar. TheCommerce
dataset andtravel-sample
do not require a Capella operational cluster or other remote data source to set up.
Prerequisites
-
If you’re just getting started with Capella Columnar, make sure that you created a Couchbase Capella account.
-
Create a project and Columnar cluster. For more information about how to set up your project and cluster in Columnar, see Create a Cluster.
-
If you want to work with the beer-sample dataset, you must have a Capella operational cluster deployed. This cluster is the remote data source for the sample data in the example. For more information about how to create a Capella operational cluster, see Create A Paid Cluster.
-
To follow along with the examples on this page, you must have the
Organization Owner
role in your organization, or one of the following project roles for the project that contains your cluster:
Install the Commerce Dataset in Standalone Collections
The Commerce dataset consists of two collections:
-
customers
, with the primary keycustid
which has string values -
orders
, with the primary keyorderno
which has integer values
To work with this dataset in Capella Columnar you create a standalone collection for each one. Then, you use INSERT INTO statements to populate them with data.
Create a Standalone Collection
-
In the Capella UI, select the Columnar tab and then click a cluster name. The workbench opens.
-
In the explorer, go to
. -
In the Database Name field, enter
sampleAnalytics
. -
Click Add Scope to New Database to create a scope named
Commerce
. -
Click Create.
-
In the explorer, point to the Commerce scope and go to
. -
In the Collection Name field, enter
customers
. -
For the Collection Primary Key, in the Field Name, enter
custid
. -
In the Field Type list, select string.
-
Click Create.
Populate a Standalone Collection
-
Use the query editor’s Query Context lists to select the
sampleAnalytics
database andCommerce
scope. -
In the query editor, begin an INSERT INTO statement as follows:
INSERT INTO customers (
If you type in the statement instead of copying and pasting this example, the query editor automatically supplies the closing parenthesis )
. -
Open the customers data, select the contents of the page and copy it.
-
To complete the statement, return to the query editor and paste the JSON document in between the parentheses. You’ll need to add the closing parenthesis
)
if you used copy and paste to supply theINSERT INTO
statement. -
Run the query to populate the
customers
collection. -
To verify that the collection contains data now, run the following query:
SELECT * FROM customers LIMIT 1;
Create another standalone collection for orders
, which uses the Field Name orderno
and a Field Type of int as a primary key.
Use the orders data to select and copy the data for this collection, then populate it using another INSERT INTO
statement.
You can also populate standalone collections by importing data from a file. After you create a standalone collection, point to the collection name, and go to | to upload a CSV, TSV, JSON, or JSONL file. You can configure import filters or create a new collection for your import.
Import the travel-sample
Collections
The Couchbase travel-sample
dataset is available to import from directly inside the Columnar workbench.
The travel-sample
consists of 5 collections of JSON documents: airline
, airport
, landmark
, hotel
, and route
.
To import the travel-sample
into your Columnar cluster:
-
In the explorer, click Import.
-
Click Sample.
-
Click Import.
Columnar creates a new database, travel-sample
, with the inventory
scope and all 5 collections.
After the import finishes, you can choose to import sample queries to work with the travel-sample
dataset.
Create Remote Collections for beer-sample
You can import the Couchbase beer-sample
dataset into a Capella operational cluster or self-managed Couchbase Server cluster.
This dataset consists of a single collection, which contains data on beers and breweries.
Set up remote collections to hold shadow copies of the beer-sample
data in Capella Columnar.
Use WHERE clauses to create multiple collections on Columnar, instead of creating only a single collection to match what’s in your remote data source in your Capella operational cluster or Couchbase Server.
Any Capella Columnar collections that use a WHERE clause apply that clause on an ongoing basis to continuously filter your incoming data event stream. Only documents that meet the WHERE clause criteria are upserted into your Capella Columnar collection.
Prepare to Ingest Data from the Remote System
Make sure the Capella operational cluster you want to use as a remote data source belongs to the same organization as your Capella Columnar cluster. |
To prepare your Capella operational cluster for creating a remote collection in Capella Columnar:
After you have imported the sample data, you can set up your Capella operational cluster as a remote data source in Columnar.
Create a Data Source for Remote Data
-
In the Capella UI, click the Couchbase Capella icon and from the Columnar tab select your cluster to open the workbench.
-
In the explorer, go to
. -
In the Database Name field, enter
remoteCapella
. -
Click Add Scope to New Cluster to create a scope named
remoteBeer
. -
Click Create.
-
In the explorer, go to
. -
Click Couchbase Capella.
-
Click Continue.
-
In the Link Name field, enter
capellaLink
. -
Select the cluster where you imported the
beer-sample
dataset and click Save & Continue.Capella creates the link between your Columnar cluster and your operational cluster.
-
Click Create Linked Collection.
-
In the lists, select your remoteCapella cluster and remoteBeer scope.
-
In the Collection Name field, enter the name for the first collection,
brewBelgium
. -
In the Configure Data Details section, select the beer-sample bucket, and _default scope and collection.
-
In the provided field, enter a WHERE clause for the documents in the collection to shadow:
country = "Belgium"
-
Click Create Collection.
-
(Optional) Click Create Another Collection and use the following WHERE clause to shadow data for French beers and breweries:
country = "France"
-
To start shadowing data from your Capella operational cluster in Capella Columnar, click Connect Link.
You’ll incur charges for usage after you connect this link. -
Click Yes, Continue.
-
Close the link creation pane.
-
Verify that your
brewBelgium
collection now contains a shadow copy of the data sourced from Capella by running the following query, with your query context set to remoteCapella and remoteBeer:SELECT * FROM brewBelgium LIMIT 1;
Next Steps
You can create more collections using the same remote link to your Capella operational cluster.
In the explorer, expand Links and go to brewGermany
for country = "Germany"
, brewUS
for country = "United States"
and so on.
You can also use the example DDL statements for creating remote Couchbase collections as guidelines for using SQL++ queries instead of the user interface controls. See the examples for Create a Remote Couchbase Collection.