Quorum checking is performed via the SPS for Linux communication paths. A node has quorum when it is able to communicate with the majority of the nodes in the cluster. This quorum mode is available on clusters with three or more nodes. When using a two-node configuration, a node dedicated to witness checking must be added.
Note: Due to requirements for node majority, it is recommended that clusters always be configured with an odd number of nodes (the count includes the quorum node).
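The majority rule and the reason for preferring an odd node count can be illustrated with a short Python sketch. This is a simplified model, not the SPS for Linux implementation; the function name has_quorum and its parameters are hypothetical, and the visible node count is assumed to include the local node.

```python
def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """Return True when the visible nodes (local node included) form a strict majority."""
    if not 0 <= visible_nodes <= cluster_size:
        raise ValueError("visible_nodes must be between 0 and cluster_size")
    # Strict majority: more than half of the cluster must be visible.
    return 2 * visible_nodes > cluster_size


# Three-node cluster: two visible nodes are a majority, one is not.
print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False

# Four-node cluster split evenly: neither half reaches a majority.
print(has_quorum(2, 4))  # False
```

In this model an even-sized cluster that splits into two equal halves leaves both halves without quorum, which is why an odd node count is the safer choice.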
Set QUORUM_MODE to majority in /etc/default/LifeKeeper. No other setting is required for this mode.
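As a hedged illustration, the following Python sketch checks the setting in text written in the style of /etc/default/LifeKeeper. It assumes the file uses shell-style KEY=value lines; the helper read_setting and the sample text are hypothetical and are not part of SPS for Linux.

```python
def read_setting(text, key):
    """Return the value assigned to `key` in KEY=value text, or None if absent."""
    value = None
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, raw = line.partition("=")
        if name.strip() == key:
            # The last assignment wins, as it would in a shell-style file.
            value = raw.strip().strip('"').strip("'")
    return value


# Illustrative excerpt only; a real file contains many other settings.
sample = """
# quorum settings
QUORUM_MODE=majority
"""

print(read_setting(sample, "QUORUM_MODE"))  # majority
```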
The following witness modes are available for majority mode. For details on each mode, please refer to "Available Witness Mode".
The scenarios listed below show the SPS for Linux behavior of a three-node cluster with Node A (resources are in-service), Node B (resources are on stand-by), and Node W (a witness-only node without protected resources).
The following three events may change the resource status on a node failure.
A communication path fails between Node A and Node B
In this case, the following will happen:
A communication path fails between Node A and Node W
Since all nodes can and will act as witness nodes when the quorum/witness package is installed, this scenario is the same as the previous one. In this case, Node A and Node W will determine that the other is still alive by consulting with Node B.
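The effect of a single failed communication path can be sketched in Python. This is a simplified model that counts only direct visibility and does not reproduce the witness consultation itself; the names NODES, visible_from, and has_quorum are hypothetical.

```python
NODES = ("A", "B", "W")


def visible_from(node, failed_paths):
    """Return the set of nodes that `node` can reach directly, counting itself."""
    seen = {node}
    for other in NODES:
        if other != node and frozenset((node, other)) not in failed_paths:
            seen.add(other)
    return seen


def has_quorum(node, failed_paths):
    """A node has quorum when it sees a strict majority of the cluster."""
    return 2 * len(visible_from(node, failed_paths)) > len(NODES)


# Only the communication path between Node A and Node W has failed.
failed = {frozenset(("A", "W"))}

for node in NODES:
    print(node, sorted(visible_from(node, failed)), has_quorum(node, failed))
```

In this model every node still sees at least two of the three nodes, so each one keeps quorum after a single path failure.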
Node A fails and stops
In this case, Node B will do the following:
With resources being in-service on Node B, Node A is powered on and establishes communications with the other nodes
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it has quorum and will not bring resources in service because they are currently in service on Node B. Next, a COMM_UP event will be processed between Node A and Node B and also between Node A and Node W (processed twice at Node A). Each node will determine that it has quorum during the COMM_UP events and will not bring resources in service because they are currently in service on Node B.
With resources being in-service on Node B, Node A is powered on and cannot establish communications to the other nodes
In this case, Node A will process an LCM_AVAIL event and Node B and Node W will do nothing since they can’t communicate with Node A. Node A will determine that it does not have quorum since it can only communicate with one of the three nodes (Node A itself). Because it does not have quorum, Node A will not bring resources in service.
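The two power-on scenarios above can be summarized in a small Python sketch. It is a simplified decision model under stated assumptions, not the SPS for Linux event handling; the function should_bring_in_service and its parameters are hypothetical, and witness checking is left out.

```python
def should_bring_in_service(visible_nodes, cluster_size, in_service_elsewhere):
    """Decide whether a node may bring resources in service.

    visible_nodes counts the local node. in_service_elsewhere is True when
    the node knows the resources are already running on another node.
    """
    has_quorum = 2 * visible_nodes > cluster_size
    return has_quorum and not in_service_elsewhere


# Node A powers on and reaches Node B and Node W: it has quorum, but the
# resources are already in service on Node B.
print(should_bring_in_service(3, 3, in_service_elsewhere=True))   # False

# Node A powers on isolated: it sees only itself (1 of 3), so it has no
# quorum, regardless of what it knows about the other nodes.
print(should_bring_in_service(1, 3, in_service_elsewhere=False))  # False
```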
A failure occurs with the network for Node A (Node A is running without communications to other nodes)
In this case, Node A will do the following:
Node B will do the following:
With resources being in-service at Node B, communication resumes for Node A
In this case, Node B will process a COMM_UP event, determine that it has quorum (all three of the nodes are visible) and that it has the resources in service. Node A will process a COMM_UP event, determine that it also has quorum and that the resources are in service on Node B. Node A will not bring resources in service at this time.