Failover
Failover is the process whereby a node is taken out of a Couchbase cluster quickly, rather than by means of an orderly, phased removal.
Failover Types
There are two basic types of failover: graceful and hard.
- Graceful: The ability to remove a Data Service node from the cluster proactively, in an orderly and controlled fashion. This involves no downtime, and allows continued application-access to data. The process promotes replica vBuckets on the remaining cluster-nodes to active status, and marks the active vBuckets on the affected node as dead. Throughout the process, the cluster maintains all 1024 active vBuckets for each bucket.

  Graceful failover can be used only on nodes that run the Data Service. If controlled removal of a non-Data-Service node is required, Removal should be used.

- Hard: The ability to drop a node from the cluster reactively, because the node has become unavailable. If the lost node was running the Data Service, active vBuckets have been lost: the hard failover process therefore promotes replica vBuckets on the remaining cluster-nodes to active status, until 1024 active vBuckets again exist for each bucket.

  Hard failover should not be used on a responsive node, since this may disrupt ongoing operations (such as the writes and replications that occur on a Data Service node). Instead, available nodes should be taken out of the cluster by means of either graceful failover (for Data Service nodes) or removal (for nodes of any kind).
Graceful failover must be manually initiated. Hard failover can be initiated either manually, or automatically by Couchbase Server: the latter is known as automatic failover. In automatic failover, the Cluster Manager detects the unavailability of a node, and duly initiates a hard failover, without administrator intervention.
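Both forms of manually initiated failover can be triggered through the REST API (as well as through the UI and the CLI). The following is a minimal Python sketch, using the `requests` library; the cluster address, credentials, and `otpNode` name are placeholders to be replaced with real values.

```python
import requests

CLUSTER = "http://10.0.0.1:8091"      # any cluster node (placeholder address)
AUTH = ("Administrator", "password")  # administrator credentials (placeholder)
NODE = "ns_1@10.0.0.4"                # otpNode name of the node to fail over

# Graceful failover: orderly and controlled, Data Service nodes only.
r = requests.post(f"{CLUSTER}/controller/startGracefulFailover",
                  auth=AUTH, data={"otpNode": NODE})
r.raise_for_status()

# Hard failover: immediate, for a node that has become unavailable.
# r = requests.post(f"{CLUSTER}/controller/failOver",
#                   auth=AUTH, data={"otpNode": NODE})
```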
Note that when a node is failed over (as opposed to removed), some replica vBuckets are lost from the surviving nodes: some are promoted to active status, and are not replaced with new replica copies. By contrast, removal creates new copies of those replica vBuckets that would otherwise be lost. This maintains the cluster’s previous level of data-availability, but results in greater competition for memory resources across the surviving nodes.
Ideally, after any failover, rebalance should be performed. This is especially important when a Data Service node has been failed over, since the rebalance will ensure an optimal ratio of active to replica vBuckets across all the remaining Data Service nodes.
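A rebalance can be started, and its progress polled, through the REST API. The following sketch assumes the same placeholder cluster address and credentials as above; it is illustrative, not a production-ready script.

```python
import time
import requests

CLUSTER = "http://10.0.0.1:8091"      # placeholder address
AUTH = ("Administrator", "password")  # placeholder credentials

# knownNodes must list the otpNode names of all nodes currently in the
# cluster, including any that have been failed over.
known = [n["otpNode"] for n in
         requests.get(f"{CLUSTER}/pools/default", auth=AUTH).json()["nodes"]]

requests.post(f"{CLUSTER}/controller/rebalance", auth=AUTH,
              data={"knownNodes": ",".join(known),
                    "ejectedNodes": ""}).raise_for_status()

# Poll until the rebalance completes.
while True:
    progress = requests.get(f"{CLUSTER}/pools/default/rebalanceProgress",
                            auth=AUTH).json()
    if progress.get("status") == "none":
        break
    time.sleep(5)
```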
Detecting Node-Failure
Hard failover is performed after a node has failed. It can be initiated either by administrative intervention, or through automatic failover.
When automatic failover is used, the Cluster Manager handles both the detection of failure and the initiation of hard failover, without administrative intervention: however, the Cluster Manager does not identify the cause of the failure. Following failover, administrative intervention is required to identify and fix problems, and to initiate the rebalance whereby the cluster is returned to a healthy state.
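Automatic failover is governed by cluster-wide settings, which can be inspected and changed through the REST API. A minimal sketch, with the same placeholder address and credentials as before:

```python
import requests

CLUSTER = "http://10.0.0.1:8091"      # placeholder address
AUTH = ("Administrator", "password")  # placeholder credentials

# Inspect the current auto-failover settings.
print(requests.get(f"{CLUSTER}/settings/autoFailover", auth=AUTH).json())

# Enable auto-failover, with a 120-second timeout before an
# unresponsive node is automatically failed over.
requests.post(f"{CLUSTER}/settings/autoFailover", auth=AUTH,
              data={"enabled": "true", "timeout": 120}).raise_for_status()
```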
If manual failover is to be used, administrative intervention is required to detect that a failure has occurred. This can be achieved either by assigning an administrator to monitor the cluster; or by creating an externally based monitoring system that uses the Couchbase REST API to monitor the cluster, detect problems, and either provide notifications, or itself trigger failover. Such a system might be designed to take into account system or network components beyond the scope of Couchbase Server.
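In its simplest form, such a monitoring system might poll the cluster’s REST API and report any node whose status is not healthy. The following Python sketch illustrates the idea; the polling interval and the alerting action are placeholders for whatever the monitoring system actually requires.

```python
import time
import requests

CLUSTER = "http://10.0.0.1:8091"      # placeholder address
AUTH = ("Administrator", "password")  # placeholder credentials

def unhealthy_nodes():
    """Return the nodes that the Cluster Manager does not report as healthy."""
    nodes = requests.get(f"{CLUSTER}/pools/default", auth=AUTH,
                         timeout=10).json()["nodes"]
    return [n for n in nodes if n["status"] != "healthy"]

while True:
    for node in unhealthy_nodes():
        # Placeholder action: notify an administrator, or trigger a hard
        # failover via POST /controller/failOver, as shown earlier.
        print(f"ALERT: {node['hostname']} is {node['status']}")
    time.sleep(30)
```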
Failover and Replica Promotion
When failover has occurred, and active vBuckets have thereby been replaced through the promotion of replicas, the resulting imbalance in the ratio of active to replica vBuckets on the surviving nodes should be corrected by means of rebalance.
The variations in vBucket numbers and ratios that occur due to outage, failover, and rebalance are demonstrated by the following tables.
Table 1 shows the disposition of data across a cluster of four nodes. One bucket, of 31,591 items, has been configured with three replicas. The ratio of active to replica items is therefore 1:3. (Some numbers in the table are approximate, rounded to the nearest hundred.)
| Host | Active Items | Replica Items |
|---|---|---|
| Node 1 | 7,932 | 23,600 |
| Node 2 | 7,895 | 23,600 |
| Node 3 | 7,876 | 23,700 |
| Node 4 | 7,888 | 23,700 |
| Total | 31,591 | 94,600 |
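The figures can be checked against the configured replica count. A quick sanity check in Python, with the values copied from Table 1:

```python
active = [7932, 7895, 7876, 7888]
replica = [23600, 23600, 23700, 23700]

assert sum(active) == 31591
# With three replicas, each node holds roughly three replica items per
# active item, so the replica total is roughly 3x the active total.
print(sum(replica), 3 * sum(active))   # 94600 vs. 94773
```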
Table 2 shows the result of failing over node 4.
| Host | Active Items | Replica Items |
|---|---|---|
| Node 1 | 11,000 | 20,500 |
| Node 2 | 10,200 | 21,100 |
| Node 3 | 10,200 | 21,300 |
| Node 4 | 7,888* | 23,700* |
| Total | 39,288 | 86,600 |
Active and replica items for the failed-over node, 4, are still counted, but are not available (and so are marked here with asterisks). To compensate, replicas on nodes 1 to 3 have been promoted to active status, as is evident from their modified numbers and ratios. For example, on node 1, the number of active items has risen (from its former total of 7,932) to 11,000; while the number of replica items has fallen (from its former total of 23,600) to 20,500.
Subsequent rebalance corrects the ratios, as shown by Table 3, below.
| Host | Active Items | Replica Items |
|---|---|---|
| Node 1 | 10,500 | 21,000 |
| Node 2 | 10,500 | 21,000 |
| Node 3 | 10,500 | 21,000 |
| Node 4 | NA | NA |
| Total | 31,500 | 63,000 |
Ratios on nodes 1 to 3 are now 1:2, indicating that rebalance has reduced the number of replicas for the bucket from 3 to 2, in correspondence with the reduced node-count: since each copy of a vBucket must reside on a separate node, the three remaining nodes can host one active copy and at most two replicas.
For further examples of rebalance in the context of node removal, see Removal.
Node Removal
Node removal uses the rebalance process to remove a node from a cluster in a controlled fashion. It creates, on the remaining nodes, new copies of the replica vBuckets that would otherwise be lost when the selected node is taken offline. See Removal for a conceptual description of node-removal. For practical steps, see Remove a Node and Rebalance.
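In REST terms, removal is a rebalance in which the departing node is named in `ejectedNodes`. A minimal sketch, with the same placeholder address, credentials, and otpNode name as in the earlier examples:

```python
import requests

CLUSTER = "http://10.0.0.1:8091"      # placeholder address
AUTH = ("Administrator", "password")  # placeholder credentials
EJECT = "ns_1@10.0.0.4"               # otpNode name of the node to remove

known = [n["otpNode"] for n in
         requests.get(f"{CLUSTER}/pools/default", auth=AUTH).json()["nodes"]]

# Removing a node is a rebalance with that node listed in ejectedNodes:
# replica vBuckets are re-created elsewhere before the node leaves.
requests.post(f"{CLUSTER}/controller/rebalance", auth=AUTH,
              data={"knownNodes": ",".join(known),
                    "ejectedNodes": EJECT}).raise_for_status()
```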