In this Appendix, we’ll look at how variables are bound and how names are resolved.
Names can appear in every clause of a query.
Sometimes a name consists of just a single identifier, e.g., region or revenue.
More often a name will consist of two identifiers separated by a dot, e.g., customer.address.
Occasionally a name may have more than two identifiers, e.g., policy.owner.address.zipcode.
Resolving a name means determining exactly what the (possibly multi-part) name refers to.
It is necessary to have well-defined rules for how to resolve a name in cases of ambiguity.
(In the absence of schemas, such cases arise more commonly, and also differently, than they do in SQL.)
The basic job of each clause in a query block is to bind variables. Each clause sees the variables bound by previous clauses and may bind additional variables. Names are always resolved with respect to the variables that are bound ("in scope") at the place where the name use in question occurs. It is possible that the name resolution process will fail, which may lead to an empty result or an error message.
One important bit of background: When the system is reading a query and resolving its names, it has a list of all the available dataverses and datasets.
As a result, it knows whether a.b is a valid name for dataset b in dataverse a.
However, the system does not in general have knowledge of the schemas of the data inside the datasets; remember that this is a much more open world.
As a result, in general the system cannot know whether any object in a particular dataset will have a field named c.
These assumptions affect how errors are handled.
If you try to access dataset a.b and no dataset by that name exists, you will get an error and your query will not run.
However, if you try to access a field c in a collection of objects, your query will run and return missing for each object that doesn’t have a field named c - this is because it’s possible that some object (someday) could have such a field.
Binding Variables
Variables can be bound in the following ways:
- 
WITHandLETclauses bind a variable to the result of an expression in a straightforward wayExamples: WITH cheap_parts AS (SELECT partno FROM parts WHERE price < 100)binds the variablecheap_partsto the result of the subquery.LET pay = salary + bonusbinds the variablepayto the result of evaluating the expressionsalary + bonus.
- 
FROM,GROUP BY, andSELECTclauses have optionalASsubclauses that contain an expression and a name (called an iteration variable in aFROMclause, or an alias inGROUP BYorSELECT).Examples: FROM customer AS c, order AS oGROUP BY salary + bonus AS total_paySELECT MAX(price) AS highest_priceAn ASsubclause always binds the name (as a variable) to the result of the expression (or, in the case of aFROMclause, to the individual members of the collection identified by the expression).It’s always a good practice to use the keyword ASwhen defining an alias or iteration variable. However, as in SQL, the syntax allows the keywordASto be omitted. For example, theFROMclause above could have been written like this:FROM customer c, order oOmitting the keyword ASdoes not affect the binding of variables. The FROM clause in this example binds variables c and o whether the keyword AS is used or not.In certain cases, a variable is automatically bound even if no alias or variable-name is specified. Whenever an expression could have been followed by an AS subclause, if the expression consists of a simple name or a path expression, that expression binds a variable whose name is the same as the simple name or the last step in the path expression. Here are some examples: FROM customer, orderbinds iteration variables namedcustomerandorderGROUP BY address.zipcodebinds a variable namedzipcodeSELECT item[0].pricebinds a variable namedpriceNote that a FROMclause iterates over a collection (usually a dataset), binding a variable to each member of the collection in turn. The name of the collection remains in scope, but it is not a variable. For example, consider thisFROMclause used in a self-join:FROM customer AS c1, customer AS c2This FROMclause joins the customer dataset to itself, binding the iteration variablesc1andc2to objects in the left-hand-side and right-hand-side of the join, respectively. After theFROMclause,c1andc2are in scope as variables, and customer remains accessible as a dataset name but not as a variable.
- 
Special rules for GROUP BY:- 
(3A): If a GROUP BYclause specifies an expression that has no explicit alias, it binds a pseudo-variable that is lexicographically identical to the expression itself. For example:GROUP BY salary + bonusbinds a pseudo-variable namedsalary + bonus.This rule allows subsequent clauses to refer to the grouping expression (salary + bonus) even though its constituent variables (salary and bonus) are no longer in scope. For example, the following query is valid: FROM employee GROUP BY salary + bonus HAVING salary + bonus > 1000 SELECT salary + bonus, COUNT(*) AS how_many While it might have been more elegant to explicitly require an alias in cases like this, the pseudo-variable rule is retained for SQL compatibility. Note that the expression salary + bonusis not actually evaluated in theHAVINGandSELECTclauses (and could not be sincesalaryandbonusare no longer individually in scope). Instead, the expressionsalary + bonusis treated as a reference to the pseudo-variable defined in theGROUP BYclause.
- 
(3B): The GROUP BYclause may be followed by aGROUP ASclause that binds a variable to the group. The purpose of this variable is to make the individual objects inside the group visible to subqueries that may need to iterate over them.The GROUP ASvariable is bound to a multiset of objects. Each object represents one of the members of the group. Since the group may have been formed from a join, each of the member-objects contains a nested object for each variable bound by the nearestFROMclause (and itsLETsubclause, if any). These nested objects, in turn, contain the actual fields of the group-member. To understand this process, consider the following query fragment:FROM parts AS p, suppliers AS s WHERE p.suppno = s.suppno GROUP BY p.color GROUP AS g Suppose that the objects in partshave fieldspartno,color, andsuppno. Suppose that the objects in suppliers have fieldssuppnoandlocation.Then, for each group formed by the GROUP BY, the variable g will be bound to a multiset with the following structure:[ { "p": { "partno": "p1", "color": "red", "suppno": "s1" }, "s": { "suppno": "s1", "location": "Denver" } }, { "p": { "partno": "p2", "color": "red", "suppno": "s2" }, "s": { "suppno": "s2", "location": "Atlanta" } }, ... ]
 
- 
Scoping
In general, the variables that are in scope at a particular position are those variables that were bound earlier in the current query block, in outer (enclosing) query blocks, or in a WITH clause at the beginning of the query.
More specific rules follow.
The clauses in a query block are conceptually processed in the following order:
- 
FROM(followed by LET subclause, if any)
- 
WHERE
- 
GROUP BY(followed by LET subclause, if any)
- 
HAVING
- 
SELECTorSELECT VALUE
- 
ORDER BY
- 
OFFSET
- 
LIMIT
During processing of each clause, the variables that are in scope are those variables that are bound in the following places:
- 
In earlier clauses of the same query block (as defined by the ordering given above). Example: FROM orders AS o SELECT o.dateThe variableoin theSELECTclause is bound, in turn, to each object in the datasetorders.
- 
In outer query blocks in which the current query block is nested. In case of duplication, the innermost binding wins. 
- 
In the WITHclause (if any) at the beginning of the query.
However, in a query block where a GROUP BY clause is present:
- 
In clauses processed before GROUP BY, scoping rules are the same as though no GROUP BY were present.
- 
In clauses processed after GROUP BY, the variables bound in the nearestFROM-clause (and itsLETsubclause, if any) are removed from scope and replaced by the variables bound in theGROUP BYclause (and itsLETsubclause, if any). However, this replacement does not apply inside the arguments of the five SQL special aggregating functions (MIN,MAX,AVG,SUM, andCOUNT). These functions still need to see the individual data items over which they are computing an aggregation. For example, afterFROM employee AS e GROUP BY deptno, it would not be valid to referencee.salary, butAVG(e.salary)would be valid.
Special case: In an expression inside a FROM clause, a variable is in scope if it was bound in an earlier expression in the same FROM clause.
Example:
FROM orders AS o, o.items AS i
The reason for this special case is to support iteration over nested collections.
Note that, since the SELECT clause comes after the WHERE and GROUP BY clauses in conceptual processing order, any variables defined in SELECT are not visible in WHERE or GROUP BY.
Therefore the following query will not return what might be the expected result (since in the WHERE clause, pay will be interpreted as a field in the emp object rather than as the computed value salary + bonus):
SELECT name, salary + bonus AS pay FROM emp WHERE pay > 1000 ORDER BY pay
The likely intent of the query above can be accomplished as follows:
FROM emp AS e LET pay = e.salary + e.bonus WHERE pay > 1000 SELECT e.name, pay ORDER BY pay
Note that in the phrase expr1 JOIN expr2 ON expr3, variables defined in expr1 are visible in expr3 but not in expr2. Here’s an example that will not work:
FROM orders AS o JOIN o.items AS i ON 1 = 1
The variable o, defined in the phrase before JOIN, cannot be used in the phrase immediately following JOIN. The probable intent of this example could be accomplished in either of the following ways:
FROM orders AS o UNNEST o.items AS i FROM orders AS o, o.items AS i
To summarize this rule: You may not use left-correlation in an explicit JOIN clause.
Resolving Names
The process of name resolution begins with the leftmost identifier in the name. The rules for resolving the leftmost identifier are:
- 
In a FROMclause: Names in aFROMclause identify the collections over which the query block will iterate. These collections may be stored datasets, views, synonyms, or may be the results of nested query blocks. A stored dataset may be in a named dataverse or in the default dataverse. Thus, if the two-part namea.bis in aFROMclause, a might represent a dataverse andbmight represent a dataset in that dataverse. Another example of a two-part name in aFROMclause isFROM orders AS o, o.items AS i. Ino.items,orepresents an order object bound earlier in theFROMclause, and items represents the items object inside that order.The rules for resolving the leftmost identifier in a FROMclause (including aJOINsubclause), or in the expression followingINin a quantified predicate, are as follows:- 
(1A): If the identifier matches a variable-name that is in scope, it resolves to the binding of that variable. (Note that in the case of a subquery, an in-scope variable might have been bound in an outer query block; this is called a correlated subquery). 
- 
(1B): Otherwise, if the identifier is the first part of a two-part name like a.b, the name is treated asdataverse.dataset. If the identifier stands alone as a one-part name, it is treated as the name of a dataset in the default dataverse. If the designated dataset exists then the identifier is resolved to that dataset, othwerise if a view with given name exists then the identifier is resolved to that view, otherwise if a synonym with given name exists then the identifier is resolved to the target dataset or the target view of that synonym (potentially recursively if this synonym points to another synonym). An error will result if the designated dataset, view, or a synonym with this name does not exist.Datasets and views take precedence over synonyms, so if both a dataset (or a view) and a synonym have the same name then the resolution is to the dataset. Note that there cannot be a dataset and a view with the same name. 
 
- 
- 
Elsewhere in a query block: In clauses other than FROM, a name typically identifies a field of some object. For example, if the expressiona.bis in aSELECTorWHEREclause, it’s likely thatarepresents an object andbrepresents a field in that object.The rules for resolving the leftmost identifier in clauses other than the ones listed in Rule 1 are: - 
(2A): If the identifier matches a variable-name that is in scope, it resolves to the binding of that variable. (In the case of a correlated subquery, the in-scope variable might have been bound in an outer query block). 
- 
(2B): (The "Single Variable Rule"): Otherwise, if the FROMclause in the current query block binds exactly one variable, the identifier is treated as a field access on the object bound to that variable. For example, in the queryFROM customer SELECT address, the identifier address is treated as a field in the object bound to the variablecustomer. At runtime, if the object bound tocustomerhas noaddressfield, theaddressexpression will returnmissing. If theFROMclause in the current query block binds multiple variables, name resolution fails with an "ambiguous name" error. If there’s noFROMclause in the current query block, name resolution fails with an "undefined identifier" error. Note that the Single Variable Rule searches for bound variables only in the current query block, not in outer (containing) blocks. The purpose of this rule is to permit the compiler to resolve field-references unambiguously without relying on any schema information. Also note that variables defined byLETclauses do not participate in the resolution process performed by this rule.Exception: In a query that has a GROUP BYclause, the Single Variable Rule does not apply in any clauses that occur after theGROUP BYbecause, in these clauses, the variables bound by theFROMclause are no longer in scope. In clauses afterGROUP BY, only Rule (2A) applies.
 
- 
- 
In an ORDER BYclause following aUNION ALLexpression:The leftmost identifier is treated as a field-access on the objects that are generated by the UNION ALL. For example:query-block-1 UNION ALL query-block-2 ORDER BY salary In the result of this query, objects that have a foo field will be ordered by the value of this field; objects that have no foo field will appear at at the beginning of the query result (in ascending order) or at the end (in descending order.) 
- 
In a standalone expression: If a query consists of a standalone expression then identifiers inside that expression are resolved according to Rule 1. For example, if the whole query is ARRAY_COUNT(a.b)thena.bwill be treated as datasetbcontained in dataversea. Note that this rule only applies to identifiers which are located directly inside a standalone expression. Identifiers insideSELECTstatements in a standalone expression are still resolved according to Rules 1-3. For example, if the whole query isARRAY_SUM( (FROM employee AS e SELECT VALUE salary) )thensalaryis resolved ase.salaryfollowing the "Single Variable Rule" (Rule (2B)).
- 
Once the leftmost identifier has been resolved, the following dots and identifiers in the name (if any) are treated as a path expression that navigates to a field nested inside that object. The name resolves to the field at the end of the path. If this field does not exist, the value missingis returned.
Resolving Database Entities
This section explains how Couchbase Analytics resolves database entity references in the FROM clause — that is, when the first identifier is not a variable reference. This explanation also applies to resolving database entity references in DDL statements.
A database entity reference (or qualified name) may consist of an optional Analytics scope name, and the database entity’s local name, separated by a dot.
- 
The Analytics scope name may consist of one or two identifiers, separated by a dot. 
- 
The local name consists of a single identifier. 
Analytics uses the following three resources to resolve database entity references:
- 
(Optional) The preceding USE statement. 
- 
(Optional) The query_contextparameter.
- 
The database entity reference provided in the query. 
The USE Statement
The USE statement enables you to specify the Analytics scope for the statement immediately following.
For more details, refer to Use Statements.
The Query Context Parameter
The query_context parameter enables you to specify the Analytics scope for a statement.
You can set the query_context parameter using the cbq shell
or the Analytics Service REST API.
The value of the query_context parameter must start with default:,
followed by an Analytics scope name.
The default: prefix is a dummy and is ignored when resolving a collection name or synonym name.
The default value of the query_context parameter is default:Default, meaning the Default Analytics scope.
For more details, refer to Analytics Query Parameters.
| You can also set the query context in the Analytics workbench using the query context drop-down menu. | 
The Database Entity Reference
The database entity reference is resolved according to the following rules:
- 
If the database entity reference contains multiple identifiers, all identifiers except the last are assumed to refer to the Analytics scope, and the last identifier is assumed to be the database entity’s local name. Analytics attempts to find the database entity using the specified scope and local name. If the database entity is not found, an error is generated. 
- 
If the database entity reference only contains one identifier, this is assumed to be the database entity’s local name. - 
If the USE statement has been used to specify an Analytics scope, Analytics attempts to find the database entity using the scope provided by the USE statement and the specified local name. 
- 
Otherwise, if the query_contextparameter has been set, Analytics attempts to find the database entity using the scope provided by the query context and the specified local name.
- 
Otherwise, Analytics attempts to find the database entity using the DefaultAnalytics scope and the specified local name.
 If the database entity is not found, an error is generated. 
- 
If the database entity is an Analytics synonym, the synonym is then resolved to find the Analytics collection.
Default Scopes and Collections
In the Data service, each bucket may contain a default scope called _default,
which in turn may contain a default collection also called _default.
The Query service enables you to refer to the default scope within the default collection using just the bucket name.
However, Analytics does not support this type of reference.
To simulate this behavior in Analytics,
you can create an Analytics synonym in the Default scope whose name represents the bucket name.
So for example, you could create an Analytics synonym travel-sample in the Default Analytics scope,
which points to the _default Analytics collection in the `travel-sample`._default Analytics scope.
This is the approach taken by the ALTER COLLECTION statement.
Unlike the Query service,  Analytics does not enable you to resolve individual missing elements
in the middle of a database entity reference.
For example, you cannot use analytics..customers to refer to the the customers Analytics collection in the
`analytics`._default Analytics scope.