Querying Across Relationships
- how-to
How to join data sources for a SQL++ selection query.
This guide is for Couchbase Server.
Introduction
You can use a join to read objects from one data source, combine them with corresponding objects from another data source, and return the joined objects. The first data source is said to be on the left-hand side of the join and the second is said to be on the right-hand side of the join.
If you want to try out the examples in this section, follow the instructions given in Do a Quick Install to install Couchbase Server, configure a cluster, and load a sample dataset. Read the following for further information about the tools available for editing and executing queries:
Creating a Join
SQL++ offers several types of join syntax. This guide focuses on ANSI join, which is the recommended join syntax. It enables you to join one data source to another using arbitrary fields.
To create a join:
-
Use the FROM clause to specify the data source on the left-hand side of the join. This may be a keyspace identifier, a subquery, or a generic expression.
-
Use the JOIN clause to specify the data source on the right-hand side of the join. For ANSI joins, this may be a keyspace identifier, a subquery, or a generic expression.
-
Use the ON keyword to specify the join predicate. This is a condition that must be met to join an object on the right-hand side to an object on the left-hand side.
To use a document key in the join predicate, use the META() function to return the id field from the document metadata.
|
For example, the following query selects a route and the associated airline.
Set the query context to the inventory
scope in the travel sample dataset.
For more information, see Setting the Query Context.
SELECT *
FROM route r (1)
JOIN airline a (2)
ON r.airlineid = META(a).id (3)
LIMIT 1;
1 | The route keyspace is the left-hand side of the join. |
2 | The airline keyspace is the right-hand side of the join. |
3 | The airlineid field on the left-hand side must be equal to the document key on the right-hand side. |
Before running a query containing a join, make sure all the required indexes exist. To check which indexes may be required, use the Index Advisor. |
For more information and examples, refer to ANSI JOIN Clause.
Join Types
ANSI joins support three types of join: inner joins, left outer joins, and right outer joins.
Inner Joins
The default join is an inner join. An inner join returns joined objects only where a source object from the left-hand side of the join matches a source object from the right-hand side of the join.
To create an inner join, omit the join type, or optionally include the INNER
keyword before the JOIN clause.
For example, the following query lists all the source airports and airlines that fly into San Francisco.
Set the query context to the inventory
scope in the travel sample dataset.
For more information, see Setting the Query Context.
SELECT route.airlineid, airline.name, route.sourceairport, route.destinationairport
FROM route
INNER JOIN airline
ON route.airlineid = META(airline).id
WHERE route.destinationairport = "SFO"
ORDER BY route.sourceairport;
[
{
"airlineid": "airline_5209",
"destinationairport": "SFO",
"name": "United Airlines",
"sourceairport": "ABQ"
},
{
"airlineid": "airline_5209",
"destinationairport": "SFO",
"name": "United Airlines",
"sourceairport": "ACV"
},
{
"airlineid": "airline_5209",
"destinationairport": "SFO",
"name": "United Airlines",
"sourceairport": "AKL"
},
// ...
Left Outer Joins
A left outer join returns joined objects using all the source objects from the left-hand side of the join, but only including source objects from the right-hand side of the join if they match.
To create a left outer join, include the LEFT
or LEFT OUTER
keywords before the JOIN clause.
For example, the following query lists all airports in the United States, and for each airport gives the first listed landmark in the same city, if any.
Set the query context to the inventory
scope in the travel sample dataset.
For more information, see Setting the Query Context.
SELECT DISTINCT MIN(aport.airportname) AS Airport__Name,
MIN(aport.tz) AS Airport__Time,
MIN(lmark.name) AS Landmark_Name
FROM airport aport (1)
LEFT JOIN landmark lmark (2)
ON aport.city = lmark.city
AND lmark.country = "United States"
GROUP BY aport.airportname
ORDER BY aport.airportname
LIMIT 4;
1 | The airport keyspace is on the left-hand side of the join. |
2 | The landmark keyspace is on the right-hand side of the join. |
[
{
"Airport__Name": "Abbeville",
"Airport__Time": "Europe/Paris",
"Landmark_Name": null (1)
},
{
"Airport__Name": "Aberdeen Regional Airport",
"Airport__Time": "America/Chicago",
"Landmark_Name": null
},
{
"Airport__Name": "Abilene Rgnl",
"Airport__Time": "America/Chicago",
"Landmark_Name": null
},
{
"Airport__Name": "Abraham Lincoln Capital",
"Airport__Time": "America/Chicago",
"Landmark_Name": null
}
]
1 | If there is no corresponding data object on the right-hand side of the join, fields from the right-hand side are missing or null. |
Right Outer Joins
A right outer join returns joined objects using all the source objects from the right-hand side of the join, but only including source objects from the left-hand side of the join if they match.
To create a right outer join, include the RIGHT
or RIGHT OUTER
keywords before the JOIN clause.
For example, the following query lists all landmarks in the United States, and for each landmark gives the first listed airport in the same city, if any.
Set the query context to the inventory
scope in the travel sample dataset.
For more information, see Setting the Query Context.
SELECT DISTINCT MIN(aport.airportname) AS Airport__Name,
MIN(aport.tz) AS Airport__Time,
MIN(lmark.name) AS Landmark_Name,
FROM airport aport (1)
RIGHT JOIN landmark lmark (2)
ON aport.city = lmark.city
AND aport.country = "United States"
GROUP BY lmark.name
ORDER BY lmark.name
LIMIT 4;
1 | The airport keyspace is on the left-hand side of the join. |
2 | The landmark keyspace is on the right-hand side of the join. |
[
{
"Airport__Name": "San Francisco Intl",
"Airport__Time": "America/Los_Angeles",
"Landmark_Name": ""Hippie Temptation" house"
},
{
"Airport__Name": null, (1)
"Airport__Time": null,
"Landmark_Name": "'The Argyll Arms Hotel"
},
{
"Airport__Name": null,
"Airport__Time": null,
"Landmark_Name": "'Visit the Hut of the Shadows and other End of the Road sculptures"
},
{
"Airport__Name": "London-Corbin Airport-MaGee Field",
"Airport__Time": "America/New_York",
"Landmark_Name": "02 Shepherd's Bush Empire"
}
]
1 | If there is no corresponding data object on the left-hand side of the join, fields from the left-hand side are missing or null. |
Chaining Joins
To chain joins, nests, and unnests, use the right-hand side of one JOIN, NEST, or UNNEST clause as the left-hand side of the next.
For example, the following query joins routes to airports by destination airport, and then nests landmarks in the same city as each airport.
Set the query context to the inventory
scope in the travel sample dataset.
For more information, see Setting the Query Context.
SELECT *
FROM route AS rte (1)
JOIN airport AS apt (2)
ON rte.destinationairport = apt.faa
NEST landmark AS lmk (3)
ON apt.city = lmk.city
LIMIT 5;
1 | The route keyspace is on the left-hand side of the join. |
2 | The airport keyspace is on the right-hand side of the join and the left-hand side of the nest. |
3 | The landmark keyspace is on the right-hand side of the nest. |
For more information and examples, refer to Left-Hand Side for the JOIN clause.