In this mode, each node regularly writes information about itself to a shared storage device and periodically reads the information written by the other nodes. A cluster is considered to have quorum consensus when a majority of nodes can access the shared storage device, update their own quorum objects, and see that the quorum objects for the other nodes are being updated. The node information stored on the shared storage device is called a quorum witness kit (QWK) object. A QWK object is required for every node configured in the cluster.
Quorum checking determines that a node has quorum when it has access to the shared storage device. Witness checking reads the QWK objects of the other nodes to determine each node’s current state: it verifies that updates to those QWK objects are still occurring on a regular basis. If a node’s QWK object has not been updated for a certain period of time, that node is considered to be in a failed state. While checking, the checking node also updates its own QWK object. Witness checking is performed whenever quorum checking is performed.
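The witness check described above can be sketched as a counter of consecutive checks in which a peer's QWK object has not changed. This is a minimal illustration only, not LifeKeeper's implementation: the `stamp` value (some indicator read from the peer's object that changes on every update), the class name, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerWitnessState:
    """What one node has observed about a single peer's QWK object."""
    last_seen_stamp: int = -1  # hypothetical update indicator read from the peer's object
    missed: int = 0            # consecutive checks with no new update


def witness_check(state: PeerWitnessState, current_stamp: int, num_hbeats: int) -> bool:
    """Record one check; return True if the peer should now be treated as failed."""
    if current_stamp != state.last_seen_stamp:
        # The peer's object changed since the last check, so the peer is alive.
        state.last_seen_stamp = current_stamp
        state.missed = 0
    else:
        # No update since the last check: one more missed heartbeat.
        state.missed += 1
    return state.missed >= num_hbeats


state = PeerWitnessState()
observed = [1, 2, 2, 2, 2, 2]  # the peer updates twice, then stops updating
results = [witness_check(state, stamp, num_hbeats=4) for stamp in observed]
print(results)
```

In this sketch the peer is reported as failed only on the fourth consecutive check without an update.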
When “storage” is selected for quorum mode, “storage” must be selected for witness mode.
This quorum setting is recommended for clusters with an even number of nodes. The shared storage used for storing the QWK objects for all nodes must be configured separately. If a node loses access to the shared storage, it loses quorum, which affects its ability to bring resources in service. Select a shared storage device that is always accessible from all nodes.
Available Shared Storage
The purpose of the quorum/witness function is to avoid a split brain scenario. Therefore, it is critical to configure the storage quorum mode so that all nodes in the cluster can see all the QWK objects. This is accomplished by placing all the QWK objects in the same shared storage location, either in the same SMB share or the same S3 bucket.
The available shared storage choices are shown below. Specify the type of shared storage being used via the QWK_STORAGE_TYPE setting in the %LKROOT%/etc/default/LifeKeeper configuration file.
file (Supports SMB) | When using SMB for shared storage, allocate one QWK object as follows: 1 QWK object = 1 file in the SMB file share. |
aws_s3 | When using Amazon Simple Storage Service (S3) for shared storage, allocate one QWK object as follows: 1 QWK object = 1 S3 object. Use S3 in a region different from the region where LifeKeeper is running. S3 replicates data across multiple Availability Zones (AZs) within a region, so if the connection between AZs is broken, S3 in the same region is affected at the same time and consistent access to it cannot be expected. |
The size of 1 QWK object is 4096 bytes.
During quorum witness checking, each node reads and writes its own QWK object and only reads the QWK objects of the other nodes. Set the access rights accordingly, so that every node has read access to all objects and write access to its own object.
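As a quick illustration of the access rule above, the following sketch builds the access matrix for a cluster and the total space its QWK objects occupy. The node names and the function name are hypothetical; only the 4096-byte object size and the read/write rule come from the text.

```python
QWK_OBJECT_SIZE = 4096  # bytes per QWK object, as stated above


def required_access(nodes):
    """Map each node to the access it needs on every node's QWK object."""
    return {
        node: {owner: ("read/write" if owner == node else "read") for owner in nodes}
        for node in nodes
    }


nodes = ["nodeA", "nodeB"]  # hypothetical node names
access = required_access(nodes)
total_bytes = QWK_OBJECT_SIZE * len(nodes)  # one object per node

for node, rights in access.items():
    print(node, rights)
print("total bytes:", total_bytes)
```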
Storage Mode Configuration
QUORUM_MODE and WITNESS_MODE should be configured as “storage” in the %LKROOT%/etc/default/LifeKeeper configuration file. The following configuration parameters are also available when using storage:
- QWK_STORAGE_TYPE – Specifies the type of shared storage being used.
[REQUIRED]
- QWK_STORAGE_HBEATTIME – Specifies the interval in seconds between reading and writing the QWK objects. This setting must be greater than or equal to the LCMHBEATTIME default setting.
[REQUIRED]
- QWK_STORAGE_NUMHBEATS – Specifies the number of consecutive heartbeat checks that when missed indicates the target node has failed. A missed heartbeat occurs when the QWK object has not been updated since the last check. This setting must be greater than or equal to the LCMNUMHBEATS default setting.
[REQUIRED]
Note: The heartbeat time and the number of heartbeats described above can be tuned based on a comparison of behavior with and without the added traffic. The defaults are 6 seconds for the heartbeat time (minimum of 5, maximum of 10) and 4 missed heartbeats (minimum of 3).
In the %LKROOT%/etc/default/LifeKeeper file, SIOS recommends editing the QWK_STORAGE_NUMHBEATS value, changing it to 9.
QWK_STORAGE_NUMHBEATS=9
- QWK_STORAGE_OBJECT_ – Specifies the path to the QWK object for each node in the cluster. Entries for all nodes in the cluster are required.
[REQUIRED]
- HTTP_PROXY, HTTPS_PROXY, NO_PROXY – Set these parameters when using an HTTP proxy to access the service endpoint. The values set here are passed to the AWS CLI.
[OPTIONAL]
See the “Quorum Parameter List” for more information.
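To make the timing parameters concrete, the sketch below checks a pair of values against the limits stated in the note above and estimates how long a peer's QWK object must go without updates before the peer is considered failed. This is an approximation under the assumption that the window is simply the interval multiplied by the number of missed heartbeats. The function names are hypothetical, and the LCM values are passed in by the caller; the `lcm_hbeattime=5` used in the example is a placeholder, not a documented value.

```python
def check_storage_timing(hbeattime, numhbeats, lcm_hbeattime, lcm_numhbeats):
    """Return a list of problems; an empty list means the values fit the stated rules."""
    problems = []
    if not 5 <= hbeattime <= 10:
        problems.append("QWK_STORAGE_HBEATTIME must be between 5 and 10 seconds")
    if numhbeats < 3:
        problems.append("QWK_STORAGE_NUMHBEATS must be at least 3")
    if hbeattime < lcm_hbeattime:
        problems.append("QWK_STORAGE_HBEATTIME must be >= LCMHBEATTIME")
    if numhbeats < lcm_numhbeats:
        problems.append("QWK_STORAGE_NUMHBEATS must be >= LCMNUMHBEATS")
    return problems


def detection_window(hbeattime, numhbeats):
    """Approximate seconds without updates before a peer is considered failed."""
    return hbeattime * numhbeats


default_window = detection_window(6, 4)      # documented defaults
recommended_window = detection_window(6, 9)  # with QWK_STORAGE_NUMHBEATS=9

# lcm_hbeattime=5 is an assumed placeholder; lcm_numhbeats=9 mirrors LCMNUMHBEATS=9.
ok = check_storage_timing(6, 9, lcm_hbeattime=5, lcm_numhbeats=9)
too_low = check_storage_timing(6, 4, lcm_hbeattime=5, lcm_numhbeats=9)
print(default_window, recommended_window, ok, too_low)
```

Under this approximation, raising QWK_STORAGE_NUMHBEATS from 4 to 9 lengthens the window from about 24 to about 54 seconds, trading slower failure detection for more tolerance of brief storage delays.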
How to use Storage Mode
Initialization is required in order to use the storage quorum mode. The initialization steps for all the nodes in the cluster are as follows.
- Set up all the nodes and make sure that they can communicate with each other.
- Create communication paths between all the nodes.
- Configure the quorum setting in the %LKROOT%/etc/default/LifeKeeper configuration file on all nodes.
Edit the %LKROOT%/etc/default/LifeKeeper file and change LCMNUMHBEATS to 9:
LCMNUMHBEATS=9
- Run the qwk_storage_init command on all nodes. This command will wait until the initialization of the QWK objects on all nodes is complete. Quorum/Witness functions will become available in the storage mode once the init completes on all nodes.
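Before running qwk_storage_init, the settings from the configuration step could be sanity-checked on each node with a small script along these lines. This is a sketch under assumptions: it treats the configuration file as plain `KEY=value` lines and skips lines starting with `#`, the helper names are hypothetical, and it does not check the per-node QWK object entries.

```python
def parse_settings(text):
    """Parse KEY=value lines, skipping blank lines and lines starting with '#'."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def check_storage_mode(settings):
    """Return a list of problems with the storage quorum settings."""
    problems = []
    for key in ("QUORUM_MODE", "WITNESS_MODE"):
        if settings.get(key) != "storage":
            problems.append(f"{key} should be storage")
    if settings.get("QWK_STORAGE_TYPE") not in ("file", "aws_s3"):
        problems.append("QWK_STORAGE_TYPE should be file or aws_s3")
    for key in ("QWK_STORAGE_HBEATTIME", "QWK_STORAGE_NUMHBEATS"):
        if key not in settings:
            problems.append(f"{key} is required")
    return problems


sample = """
# illustrative values only
QUORUM_MODE=storage
WITNESS_MODE=storage
QWK_STORAGE_TYPE=file
QWK_STORAGE_HBEATTIME=6
QWK_STORAGE_NUMHBEATS=9
LCMNUMHBEATS=9
"""

problems = check_storage_mode(parse_settings(sample))
incomplete = check_storage_mode(parse_settings("QUORUM_MODE=storage\n"))
print(problems)
print(incomplete)
```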
Expected Behaviors for Storage Mode (Assuming Default Modes)
The behavior of a two-node cluster, consisting of Node A (resources are in service) and Node B (resources are on standby), is shown below.
The following three events may change the resource status on a node failure:
- COMM_DOWN event
An event called when all the communication paths between the nodes are disconnected.
- COMM_UP event
An event called when the communication paths are recovered from a COMM_DOWN state.
- LCM_AVAIL event
An event called after LCM initialization is completed; it is called only once, when LifeKeeper starts. Once this state has been reached, heartbeat transmission to the other nodes in the cluster begins over the established communication paths, and the node is ready to receive heartbeat requests from the other nodes in the cluster. LCM_AVAIL is always processed before a COMM_UP event.
Scenario 1
The communication paths fail between Node A and Node B (Both Node A and Node B can access the shared storage)
In this case, the following will happen:
- Both Node A and Node B will begin processing a COMM_DOWN event, though not necessarily at exactly the same time.
- Both nodes will perform the quorum check and determine that they still have quorum (both A and B can access the shared storage).
- Each node will check the QWK object for the node with which it has lost communication to see if it is still being updated on a regular basis. Both nodes will find that the other’s QWK object is being updated on a regular basis, as both nodes are still running witness checks.
- It will be determined, via the witness checking on each node, that the other is still alive so no failover processing will take place. Resources will be left in service at Node A.
Scenario 2
Node A fails and stops
In this case, Node B will do the following:
- Begin processing a COMM_DOWN event from Node A.
- Determine that it can still access the shared storage and thus has quorum.
- Check to see that updates to the QWK object for Node A have stopped (witness checking).
- Verify via witness checking that Node A really appears to be lost and begin the usual failover activity. Node B will continue processing and bring the protected resources in service.
With resources being in-service on Node B, Node A is powered on and establishes communications with the other nodes and is able to access the QWK shared storage
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it has quorum and will not bring resources in service because they are currently in service on Node B. Next, a COMM_UP event will be processed between Node A and Node B.
Each node will determine that it has quorum during the COMM_UP events and Node A will not bring resources in service because they are currently in service on Node B.
With resources being in-service on Node B, Node A is powered on and cannot establish communications to the other nodes but is able to access the QWK shared storage
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it has quorum since it can access the shared storage for the QWK objects. It will then perform witness checks to determine the status of Node B, since communication with Node B is down. Because Node B is running and has been updating its QWK object, Node A detects this and does not bring resources in service. Node B will do nothing since it cannot communicate with Node A and already has the resources in service.
Scenario 3
A failure occurs with the network for Node A (Node A is running without communication paths to the other nodes and does not have access to the QWK objects on shared storage)
In this case, Node A will do the following:
- Begin processing a COMM_DOWN event from Node B.
- Determine that it cannot access the shared storage and thus does not have quorum.
- Perform the osu quorum loss operation (take the hierarchies out of service and enter the Quorum Quarantine state).
- After QUORUM_QUARANTINE_SECS, LifeKeeper will restart and attempt to establish communication.
Also, in this case, Node B will do the following:
- Begin processing a COMM_DOWN event from Node A.
- Determine that it can still access the shared storage and thus has quorum.
- Verify that updates to the QWK object for Node A have stopped (witness checking).
- Verify via witness checking that Node A really appears to be lost and begin the usual failover activity. Node B will now have the protected resources in service.
With resources being in-service on Node B, and after waiting QUORUM_QUARANTINE_SECS, Node A is able to access the QWK shared storage and is able to communicate with Node B.
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it has quorum and will not bring resources in service because they are currently in service on Node B. Next, a COMM_UP event will be processed between Node A and Node B.
Each node will determine that it has quorum during the COMM_UP events and Node A will not bring resources in service because they are currently in service on Node B.
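The COMM_DOWN handling in the three scenarios above follows the same two checks in the same order: quorum first, then witness. The sketch below condenses that decision into one function for a two-node cluster. It is a simplified model of the described behavior, not LifeKeeper code; the enum and function names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    QUORUM_LOSS = "no quorum: perform the quorum loss operation"
    NO_FAILOVER = "peer still updating its QWK object: no failover"
    FAILOVER = "peer appears lost: begin failover"


def on_comm_down(can_access_storage: bool, peer_object_updating: bool) -> Action:
    """Decide what a node does when all communication paths to its peer are down."""
    if not can_access_storage:
        # Quorum check failed: the node cannot reach the shared storage.
        return Action.QUORUM_LOSS
    if peer_object_updating:
        # Witness check shows the peer is alive.
        return Action.NO_FAILOVER
    # The node has quorum and the peer's QWK object has stopped updating.
    return Action.FAILOVER


outcomes = {
    "Scenario 1, Node A": on_comm_down(True, True),
    "Scenario 1, Node B": on_comm_down(True, True),
    "Scenario 2, Node B": on_comm_down(True, False),
    "Scenario 3, Node A": on_comm_down(False, False),
    "Scenario 3, Node B": on_comm_down(True, False),
}
for name, action in outcomes.items():
    print(f"{name}: {action.value}")
```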