Autonomous Operator Troubleshooting

      If you run into issues with the Autonomous Operator, you can troubleshoot by examining the logs and events that it generates.

      The Autonomous Operator generates logs that can be used for auditing and troubleshooting purposes. This page describes logging that is specific to the Autonomous Operator itself. For information about Couchbase cluster logging, refer to Manage Couchbase Server Logging.

      Overview

      The Autonomous Operator generates logs that include information about itself and the various other Kubernetes components that make up the Operator deployment. These logs are distinct from the logs that are generated by the Couchbase Server application.

      This page provides information about how to collect and scrutinize logging information that is produced by the Autonomous Operator. When troubleshooting the Autonomous Operator, it is important to first rule out Kubernetes itself as the root cause of the problem. The Kubernetes Troubleshooting Guide contains a great deal of helpful information about debugging applications within a Kubernetes cluster.

      Familiarity with the Operator’s configuration settings can be helpful when troubleshooting the Autonomous Operator.

      Collecting Autonomous Operator Logs

      Using kubectl or oc, you can print the Autonomous Operator logs to standard output.

      • Kubernetes

      • OpenShift

      On Kubernetes, start by getting the name of the Autonomous Operator pod.

      $ kubectl get po -lapp=couchbase-operator
      NAME                                  READY     STATUS    RESTARTS   AGE
      couchbase-operator-1917615544-h20bm   1/1       Running   0          20h

      Use the pod name to get the logs.

      $ kubectl logs couchbase-operator-1917615544-h20bm
      time="2018-01-23T22:56:34Z" level=info msg="couchbase-operator v1.1.0 (release)" module=main
      time="2018-01-23T22:56:34Z" level=info msg="Obtaining resource lock" module=main
      time="2018-01-23T22:56:34Z" level=info msg="Starting event recorder" module=main
      time="2018-01-23T22:56:34Z" level=info msg="Attempting to be elected the couchbase-operator leader" module=main
      time="2018-01-23T22:56:51Z" level=info msg="I'm the leader, attempt to start the operator" module=main
      time="2018-01-23T22:56:51Z" level=info msg="Creating the couchbase-operator controller" module=main

      Alternatively, you can specify the Autonomous Operator deployment to get the logs.

      $ kubectl logs deployment/couchbase-operator

      Because the deployment contains only one instance of the Autonomous Operator, the command automatically selects the correct pod and prints its logs.
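
When the log is long, it can help to narrow what is printed. The following is a minimal sketch, not part of the Operator tooling: the function names are hypothetical, and it assumes the app=couchbase-operator label and the couchbase-operator deployment name shown above. It relies only on standard kubectl flags (-o jsonpath, --since, --timestamps, --previous).

```shell
# Hypothetical helper: print the name of the first pod matching the
# app=couchbase-operator label used earlier on this page.
operator_pod_name() {
  kubectl get po -l app=couchbase-operator \
    -o jsonpath='{.items[0].metadata.name}'
}

# Hypothetical helper: print Operator logs from a recent time window.
# $1: duration accepted by kubectl's --since flag (defaults to 1h).
operator_logs_since() {
  kubectl logs deployment/couchbase-operator --since="${1:-1h}" --timestamps
}

# Hypothetical helper: print logs from the previous container instance,
# which helps when the Operator pod has restarted.
# $1: name of the Operator pod.
operator_logs_previous() {
  kubectl logs "$1" --previous
}

# Example usage (requires access to the cluster):
#   operator_logs_since 30m
#   operator_logs_previous "$(operator_pod_name)"
```

Wrapping the commands in functions keeps the pod lookup and the log query separate, so either can be adjusted for a different label or deployment name.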

      On OpenShift, start by getting the name of the Autonomous Operator pod.

      $ oc get po -lapp=couchbase-operator
      NAME                                  READY     STATUS    RESTARTS   AGE
      couchbase-operator-1917615544-h20bm   1/1       Running   0          20h

      Use the pod name to get the logs.

      $ oc logs couchbase-operator-1917615544-h20bm
      time="2018-01-23T22:56:34Z" level=info msg="couchbase-operator v1.1.0 (release)" module=main
      time="2018-01-23T22:56:34Z" level=info msg="Obtaining resource lock" module=main
      time="2018-01-23T22:56:34Z" level=info msg="Starting event recorder" module=main
      time="2018-01-23T22:56:34Z" level=info msg="Attempting to be elected the couchbase-operator leader" module=main
      time="2018-01-23T22:56:51Z" level=info msg="I'm the leader, attempt to start the operator" module=main
      time="2018-01-23T22:56:51Z" level=info msg="Creating the couchbase-operator controller" module=main

      Alternatively, you can specify the Autonomous Operator deployment to get the logs.

      $ oc logs deployment/couchbase-operator

      Because the deployment contains only one instance of the Autonomous Operator, the command automatically selects the correct pod and prints its logs.

      When troubleshooting the Autonomous Operator, watch for the following messages, which indicate that the Operator is unable to reconcile a Couchbase cluster into its desired state:

      • Logs with level=error

      • Operator is unable to get cluster state after N retries
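
One way to pick these messages out of a long log is to filter it with grep. This is a rough sketch under stated assumptions: the function name is hypothetical, and the second pattern is only a guess at the exact wording, based on the message listed above.

```shell
# Hypothetical helper: keep only log lines that suggest failed reconciliation.
# Reads log text on standard input. The second pattern is an assumption
# derived from the message wording listed above, not a confirmed log string.
filter_operator_problems() {
  grep -E 'level=error|unable to get cluster state'
}

# Example usage (requires access to the cluster):
#   kubectl logs deployment/couchbase-operator | filter_operator_problems
```

Note that grep exits with a non-zero status when nothing matches, which in this case means no such lines were found.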

      Profiling the Autonomous Operator

      For more advanced troubleshooting, the Autonomous Operator supports the Go language pprof feature and serves profiling data on its default listen address localhost:8080. You can access this endpoint by running a remote shell or forwarding the port to your local system.

      • Kubernetes

      • OpenShift

      On Kubernetes, to access goroutine stack traces using a shell inside the Operator pod:

      $ kubectl exec -it couchbase-operator-599bcf47f-8wswh -- sh
      $ wget -O- 'http://localhost:8080/debug/pprof/goroutine?debug=1' | less

      To access Go memory usage using a port forward (kubectl port-forward runs in the foreground, so run go tool pprof from a second terminal):

      $ kubectl port-forward couchbase-operator-599bcf47f-8wswh 8080:8080
      $ go tool pprof localhost:8080/debug/pprof/heap
      (pprof) traces

      On OpenShift, to access goroutine stack traces using a shell inside the Operator pod:

      $ oc exec -it couchbase-operator-599bcf47f-8wswh -- sh
      $ wget -O- 'http://localhost:8080/debug/pprof/goroutine?debug=1' | less

      To access Go memory usage using a port forward (oc port-forward runs in the foreground, so run go tool pprof from a second terminal):

      $ oc port-forward couchbase-operator-599bcf47f-8wswh 8080:8080
      $ go tool pprof localhost:8080/debug/pprof/heap
      (pprof) traces
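
If you want to keep a goroutine dump for later analysis, you can save it to a file over the port forward. The sketch below assumes that a port forward to localhost:8080 is already running, that curl is installed locally, and that the Operator exposes the standard Go net/http/pprof handlers (where debug=2 requests full stack traces for every goroutine). The function name is hypothetical.

```shell
# Hypothetical helper: save a full goroutine dump to a file.
# Assumes a port forward to the Operator pod is already running on
# localhost:8080 and that curl is installed locally.
# $1: output file (defaults to goroutines.txt).
save_goroutine_dump() {
  curl -sS -o "${1:-goroutines.txt}" \
    'http://localhost:8080/debug/pprof/goroutine?debug=2'
}

# Example usage (with the port forward running in another terminal):
#   save_goroutine_dump operator-goroutines.txt
```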

      Kubernetes Events

      Kubernetes Events provide insights into what is happening inside a Kubernetes cluster. They record significant occurrences and changes in the state of resources, such as the creation, deletion, or failure of pods, nodes, services, and other Kubernetes objects.

      They can be used to monitor changes that have occurred in the cluster, and can be helpful when troubleshooting issues with the Autonomous Operator. However, they expire after a certain period of time, typically one hour. You can use the Kubernetes Event Collector tool to collect and store events for longer periods of time.
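
For events that have not yet expired, kubectl can list them directly. The following sketch uses standard kubectl get events flags (--field-selector and --sort-by); the function name and the default namespace are assumptions for illustration.

```shell
# Hypothetical helper: list Warning events in a namespace, oldest first.
# $1: namespace to query (defaults to "default").
warning_events() {
  kubectl get events --namespace "${1:-default}" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Example usage (requires access to the cluster):
#   warning_events my-namespace
```

Filtering on type=Warning hides routine Normal events, which usually makes failures easier to spot.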

      The Kubernetes Event Collector (KEL) watches for Kubernetes events within a namespace and stores them in a buffer, which can be stashed. It can be deployed and configured using Helm:

      $ helm install event-collector charts/event-collector

      For more details about the tool and how to use it, refer to the repo README: https://github.com/couchbase/couchbase-k8s-event-collector