Query without Indexes

  • Capella Operational
  • concept
    +
    Sequential scans enable you to query a keyspace, even if the keyspace has no indexes.
    Examples on this Page

    The examples in this topic use the travel-sample dataset which is supplied with Couchbase Capella. For instructions on how to install the sample bucket, see Import Sample Data.

    Sequential Scans

    If a keyspace does not have any suitable primary or secondary indexes for a query, the Query service can fall back on a sequential scan of the data to retrieve the document keys. A sequential scan uses an underlying mechanism known as a K/V range scan.

    Sequential scans are intended for simple, ready access to data, and are not intended as a high performance solution.

    Sequential scans are best suited to small collections where key order is unimportant, or where the overhead of maintaining an index can’t be justified. For larger collections and greater performance, define the appropriate indexes to speed up your queries. For ordered document key operations, a primary index provides the same functionality, and will outperform a sequential scan.

    Sequential scans are unavailable on Memory Only buckets.

    Use Sequential Scans

    To query an index using sequential scan, run the query as usual.

    If there is a primary index or any suitable secondary index available for the keyspace, the Query service uses that in preference to sequential scan.

    Prerequisites

    RBAC Privileges

    Users must have the Query Use Sequential Scans role on the keyspace to be able to execute a request with a sequential scan. For more details about user roles, see Authorization.

    Examples

    To try the examples in this section, set the query context to the tenant_agent_01 scope in the travel sample dataset. For more information, see Query Context.

    Example 1. Check that a collection has no indexes
    Query
    SELECT * FROM system:indexes
    WHERE scope_id = "tenant_agent_01"
      AND keyspace_id = "users";
    Result
    {
      "results": []
    }

    You should see that collection has no primary or secondary indexes.

    Example 2. Run a query without indexes
    Query
    SELECT name FROM users;
    Result
    [
      {
        "name": "Haley Rohan"
      },
      {
        "name": "Johnnie Lind"
      },
      {
        "name": "Verdie Jaskolski"
      },
      {
        "name": "Marc Mills"
      },
      {
        "name": "Valentine Funk"
      },
      {
        "name": "Jocelyn Wuckert"
      },
      {
        "name": "Gretchen Auer"
      },
      {
        "name": "Meghan Homenick"
      },
      {
        "name": "Kraig Hilll"
      },
      {
        "name": "Destini Turcotte"
      },
      {
        "name": "Sienna Cummerata"
      }
    ]

    The query returns 11 documents. Notice that the query takes one or more seconds.

    Example 3. Check explain plan for query without index
    Query
    EXPLAIN SELECT name FROM users;
    Result
    [
      {
        "plan": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "PrimaryScan3",
              "bucket": "travel-sample",
              "index": "#sequentialscan",
              "index_projection": {
                "primary_key": true
              },
              "keyspace": "users",
              "namespace": "default",
              "scope": "tenant_agent_01",
              "using": "sequentialscan"
            },
            {
              "#operator": "Fetch",
              "bucket": "travel-sample",
              "early_projection": [
                "name"
              ],
              "keyspace": "users",
              "namespace": "default",
              "scope": "tenant_agent_01"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "InitialProject",
                    "discard_original": true,
                    "preserve_order": true,
                    "result_terms": [
                      {
                        "expr": "(`users`.`name`)"
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        "text": "SELECT name FROM users;"
      }
    ]

    The explain plan includes a primary scan operator, using sequentialscan rather than gsi.

    The explain plan reports that the primary scan operator uses an index called #sequentialscan. This name is a placeholder — in reality there is no index.

    Monitor Sequential Scans

    You can monitor sequential scans using the system:completed_requests catalog.

    • Completed requests which used sequential scan include a primaryScan.Seq property within the request’s phaseCounts, phaseOperators, and phaseTimes, in addition to the primaryScan property.

    • In contrast, queries which used a primary index include a primaryScan.GSI property within the request’s phaseCounts, phaseOperators, and phaseTimes, in addition to the primaryScan property.

    The system:completed_requests catalog also includes a ~qualifier field, which indicates the reason why any request was captured. A completed requests qualifier automatically captures any requests where more than 10000 keys have been returned by sequential scans. In most cases, this indicates that you should create an index to support the request.

    Statistics on sequential scan usage are also available in request profiling information.

    For more details, see Manage and Monitor Queries.

    Examples

    Example 4. Get completed requests which used sequential scan
    Query
    SELECT * FROM system:completed_requests
    WHERE phaseCounts.`primaryScan.Seq` IS NOT MISSING;

    You must wrap the property name primaryScan.Seq in backquotes, because the property name contains a period. The period after phaseCounts is a separator between nested property names, whereas the period within primaryScan.Seq is actually part of the property name.

    Result
    [
      {
        "completed_requests": {
          "clientContextID": "4eb44ea6-170a-4700-ae79-e22f57100e43",
          "cpuTime": "820.464µs",
          "elapsedTime": "4.728840089s",
          "errorCount": 0,
          "errors": [],
          "n1qlFeatCtrl": 76,
          "node": "127.0.0.1:8091",
          "phaseCounts": {
            "fetch": 11,
            "primaryScan": 11,
            "primaryScan.Seq": 11
          },
          "phaseOperators": {
            "authorize": 1,
            "fetch": 1,
            "primaryScan": 1,
            "primaryScan.Seq": 1,
            "project": 1,
            "stream": 1
          },
          "phaseTimes": {
            "authorize": "8.471µs",
            "fetch": "107.915507ms",
            "instantiate": "19.769µs",
            "parse": "870.813µs",
            "plan": "293.479µs",
            "plan.index.metadata": "17.998µs",
            "plan.keyspace.metadata": "7.601µs",
            "primaryScan": "4.72730895s",
            "primaryScan.Seq": "4.72730895s",
            "project": "161.687µs",
            "run": "4.727550611s",
            "stream": "234.174µs"
          },
          "queryContext": "default:travel-sample.tenant_agent_01",
          "remoteAddr": "127.0.0.1:43164",
          "requestId": "d80b0d08-1794-4932-8188-af7e7e57b7b3",
          "requestTime": "2024-02-09T15:05:09.343Z",
          "resultCount": 11,
          "resultSize": 435,
          "scanConsistency": "unbounded",
          "serviceTime": "4.728754078s",
          "state": "completed",
          "statement": "SELECT name FROM users;",
          "statementType": "SELECT",
          "useCBO": true,
          "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:122.0) Gecko/20100101 Firefox/122.0",
          "users": "builtin:Administrator",
          "~qualifier": "threshold"
        }
      },
      // ...
    ]

    The query returns details of completed requests using a sequential scan, if any are logged.

    Manage Sequential Scans

    For a free operational cluster, sequential scan is available in the Query tab, and via cluster access credentials in the SDKs or Couchbase Shell.

    For paid service plans, sequential scan is not available by default. To enable sequential scan, please contact support.