In this mode, each node regularly writes information about itself to a shared storage device and periodically reads the information written by the other nodes. A cluster is considered to have quorum consensus when each node can access the shared storage device, update its own quorum object, and see that the quorum objects for all other nodes are being updated. The node information located on the shared storage device is called a quorum object, or QWK object for short. A QWK object is required for every node configured in the cluster.
Quorum checking determines that a node has quorum when it has access to the shared storage device. Witness checking reads the QWK objects of the other nodes to determine their current state: it verifies that those QWK objects are still being updated on a regular basis. If a node's QWK object has not been updated for a certain period of time, that node is considered to be in a failed state. During this time the checking node continues to update its own QWK object. Witness checking is performed whenever quorum checking is performed.
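The witness check described above can be pictured as counting consecutive checks in which a peer's QWK object has not changed. The following is a minimal sketch under stated assumptions, not LifeKeeper's actual implementation: the class name `WitnessTracker`, the `observe` method, and the idea of comparing raw object contents are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WitnessTracker:
    """Counts consecutive checks in which a peer's QWK object did not change.

    Hypothetical illustration only; LifeKeeper's internal logic may differ.
    """

    max_missed: int = 4                 # same value as the documented default for missed heartbeats
    last_seen: Optional[bytes] = None   # content read during the previous check
    missed: int = 0                     # consecutive checks without an update

    def observe(self, content: bytes) -> bool:
        """Record one check; return True once the peer should be treated as failed."""
        if self.last_seen is not None and content == self.last_seen:
            self.missed += 1            # no update since the previous check
        else:
            self.missed = 0             # fresh update, or the very first observation
        self.last_seen = content
        return self.missed >= self.max_missed
```

Each call to `observe` corresponds to one witness check against one peer; a peer whose object keeps changing never accumulates missed checks.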
When “storage” is selected for quorum mode, “storage” must be selected for witness mode.
This quorum mode setting can be used for a two-node, three-node, or four-node cluster. The shared storage used for storing the QWK objects for all the nodes must be configured separately. If a node loses access to the shared storage, it loses quorum, which affects its ability to bring resources in service. Select a shared storage device that is always accessible from all the nodes.
Available Shared Storage
The purpose of the quorum/witness function is to avoid a split brain scenario. Therefore, correctly configuring the storage quorum mode choice is critical to ensure all nodes in the cluster can see all the QWK objects. This is accomplished by placing all the QWK objects in the same type of shared storage: block devices, regular files, EFS or S3 objects.
The available shared storage choices are shown below. Specify the type of shared storage being used via the QWK_STORAGE_TYPE setting in the /etc/default/LifeKeeper configuration file.
block | When using physical storage, RDM (physical compatibility), or iSCSI (in-VM initiator) for shared storage, allocate one QWK object in one of the following ways: (a) 1 QWK object = 1 partition; (b) 1 QWK object = 1 LU. In case (a), because multiple hosts write to one LU, align each partition's offset within the LU to 4K (the sector size of the storage device). Also, do not mix in partitions used for other purposes. When using VMDK for shared storage, allocate one QWK object as follows: 1 QWK object = 1 VMDK. Do not create partitions, and no file system needs to be created. Use a thick (eager zeroed) disk. |
file (supports NFS and EFS) | When using either EFS or NFS for shared storage, allocate one QWK object as follows: 1 QWK object = 1 regular file in the NFS or EFS file system. Note: The following options are not relevant for EFS; however, the file system should still be placed in /etc/fstab. EFS has its own set of parameters. Set the export options on the NFS server (not applicable for EFS) as follows: rw,no_root_squash,sync,no_wdelay. Set the mount options for the NFS file system as follows: soft,timeo=20,retrans=1,noac. Configure /etc/fstab so that the file system is mounted automatically after the OS reboots. |
aws_s3 | When using Amazon Simple Storage Service (S3) or Amazon S3-compatible object storage for shared storage, allocate one QWK object as follows: 1 QWK object = 1 S3 object. Use S3 in a region different from the region where LifeKeeper is running. If S3 in the same region is used and the connection between AZs (Availability Zones) is broken, the S3 service, whose replicas are placed in multiple AZs, is affected at the same time, and consistent access to S3 cannot be expected. Be sure to grant the following permissions on all of the nodes configured in the cluster so that the root user can access the S3 object: s3:ListBucket, s3:GetObject, s3:PutObject, s3:GetBucketLocation. For more information, see the AWS CLI documentation and the object storage documentation for each platform. Note: If the path to the AWS CLI executable files is not already included in the “PATH” parameter in the LifeKeeper defaults file /etc/default/LifeKeeper, you must append it so that LifeKeeper functions correctly when using S3 objects. If you use Amazon S3-compatible object storage, see Available Amazon S3-compatible Object Storage. |
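As an illustration of the four S3 permissions listed for the aws_s3 storage type, the sketch below builds an IAM policy document in Python. It is an assumption-laden example rather than a required configuration: the function name `qwk_s3_policy` and the bucket name `example-qwk-bucket` are hypothetical, and S3-compatible object storage may use a different permission model. The split into two statements reflects that the list and location actions apply to the bucket itself while the get and put actions apply to objects inside it.

```python
import json


def qwk_s3_policy(bucket: str) -> dict:
    """Build an IAM policy document granting the four permissions listed above.

    The bucket name is supplied by the caller; nothing here is LifeKeeper-specific.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # bucket-level actions are granted on the bucket ARN
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {   # object-level actions are granted on the objects in the bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-qwk-bucket" is a placeholder, not a name taken from this document.
    print(json.dumps(qwk_s3_policy("example-qwk-bucket"), indent=2))
```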
The size of 1 QWK object is 4096 bytes.
Quorum/witness checking reads and writes a node's own QWK object and only reads the QWK objects of other nodes. Set the access rights appropriately (be careful of permission restrictions such as granting Persistent Reservation to the shared storage).
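For the block storage type with one QWK object per partition, the 4K alignment rule and the 4096-byte object size lend themselves to a quick sanity check. The sketch below is a hypothetical helper, not a LifeKeeper tool: the function name `check_partitions` and the input format (partition name mapped to start offset and size in bytes) are assumptions made for illustration, and the size check is an inferred sanity test rather than a documented requirement.

```python
from __future__ import annotations

QWK_OBJECT_SIZE = 4096  # bytes, as stated in this document
ALIGNMENT = 4096        # 4K alignment for partitions that share one LU


def check_partitions(partitions: dict[str, tuple[int, int]]) -> list[str]:
    """Return a list of problems; each value is (start_offset_bytes, size_bytes)."""
    problems = []
    for name, (start, size) in partitions.items():
        if start % ALIGNMENT != 0:
            problems.append(f"{name}: start offset {start} is not 4K-aligned")
        if size < QWK_OBJECT_SIZE:
            problems.append(f"{name}: size {size} is smaller than one QWK object")
    return problems
```

For example, a partition starting at byte offset 1048576 passes the alignment check, while one starting at offset 512 does not.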
Storage Mode Configuration
QUORUM_MODE and WITNESS_MODE should be configured as storage in the /etc/default/LifeKeeper configuration file. The following configuration parameters are also available when using storage:
- QWK_STORAGE_TYPE – Specifies the type of shared storage being used.
- QWK_STORAGE_HBEATTIME – Specifies the interval in seconds between reading and writing the QWK objects. This setting must be greater than or equal to the LCMHBEATTIME default setting.
- QWK_STORAGE_NUMHBEATS – Specifies the number of consecutive heartbeat checks that, when missed, indicate that the target node has failed. A missed heartbeat occurs when the QWK object has not been updated since the last check. This setting must be greater than or equal to the LCMNUMHBEATS default setting.
Note: The heartbeat interval and the number of heartbeats described above can be tuned based on comparisons between periods with added traffic and periods with no traffic. The defaults are 6 seconds for the heartbeat interval (minimum 5, maximum 10) and 4 missed heartbeats (minimum 3).
- QWK_STORAGE_OBJECT_ – Specifies the path to the QWK object for each node in the cluster. Entries for all nodes in the cluster are required.
- HTTP_PROXY, HTTPS_PROXY, NO_PROXY – Set these parameters when using an HTTP proxy to access the service endpoint. The values set here are passed to the AWS CLI.
See the “Quorum Parameter List” for more information.
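The constraints on these parameters can be checked mechanically before initialization. The sketch below is a hypothetical helper, not part of LifeKeeper: it assumes the configuration file uses simple KEY=value lines, the function names are invented for illustration, and the LCMHBEATTIME and LCMNUMHBEATS values are passed in by the caller rather than assumed. The last function estimates how long a peer must stop updating before it is treated as failed, under the assumption that this is roughly the interval multiplied by the number of missed checks (about 24 seconds with the documented defaults of 6 and 4).

```python
from __future__ import annotations


def parse_defaults(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def check_storage_quorum(settings: dict[str, str],
                         lcm_hbeattime: int, lcm_numhbeats: int) -> list[str]:
    """Return rule violations for storage mode, using the limits stated above."""
    errors = []
    if settings.get("QUORUM_MODE") != "storage":
        errors.append("QUORUM_MODE must be storage")
    if settings.get("WITNESS_MODE") != "storage":
        errors.append("WITNESS_MODE must be storage")
    if settings.get("QWK_STORAGE_TYPE") not in {"block", "file", "aws_s3"}:
        errors.append("QWK_STORAGE_TYPE must be block, file, or aws_s3")
    hbeat = int(settings.get("QWK_STORAGE_HBEATTIME", 6))  # documented default
    num = int(settings.get("QWK_STORAGE_NUMHBEATS", 4))    # documented default
    if not 5 <= hbeat <= 10:
        errors.append("QWK_STORAGE_HBEATTIME must be between 5 and 10")
    if num < 3:
        errors.append("QWK_STORAGE_NUMHBEATS must be at least 3")
    if hbeat < lcm_hbeattime:
        errors.append("QWK_STORAGE_HBEATTIME must be >= LCMHBEATTIME")
    if num < lcm_numhbeats:
        errors.append("QWK_STORAGE_NUMHBEATS must be >= LCMNUMHBEATS")
    return errors


def approx_detection_seconds(settings: dict[str, str]) -> int:
    """Rough estimate: heartbeat interval times the number of missed checks."""
    return (int(settings.get("QWK_STORAGE_HBEATTIME", 6))
            * int(settings.get("QWK_STORAGE_NUMHBEATS", 4)))
```

An empty result from `check_storage_quorum` means none of the rules sketched here were violated; it does not validate the per-node QWK object paths.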
How to use Storage Mode
Initialization is required in order to use the storage quorum mode. The initialization steps for all the nodes in the cluster are as follows.
- Set up all the nodes and make sure that they can communicate with each other.
- On all the nodes run the LifeKeeper for Linux setup and enable “Use Quorum/Witness Functions” to install the Quorum/Witness package.
- Create communication paths between all the nodes.
- Configure the quorum setting in the /etc/default/LifeKeeper configuration file on all nodes.
- Run the qwk_storage_init command on all nodes. This command will wait until the initialization of the QWK objects on all nodes is complete. Quorum/Witness functions will become available in the storage mode once the init completes on all nodes.
Reinitialization is necessary when cluster nodes are added or deleted after the initial configuration, or when quorum parameters are changed in the /etc/default/LifeKeeper configuration file. Reinitialize according to the following steps.
- Execute the qwk_storage_exit command on all nodes.
- Delete communication paths between the node that is being deleted and all the other nodes.
- Create communication paths between the node that is being added and all the other nodes.
- Modify the quorum parameters in the /etc/default/LifeKeeper configuration file on all nodes.
- Execute the qwk_storage_init command on all nodes.
Expected Behaviors for Storage Mode (Assuming Default Modes)
The behavior of a two-node cluster, with Node A (resources are in service) and Node B (resources are on standby), is shown below.
The following three events may change the resource status on a node failure:
- COMM_DOWN event
An event called when all the communication paths between the nodes are disconnected.
- COMM_UP event
An event called when the communication paths are recovered from a COMM_DOWN state.
- LCM_AVAIL event
An event called after LCM initialization is completed; it is called only once, when starting LifeKeeper. Once this state has been reached, heartbeat transmission to other nodes in the cluster begins over the established communication paths. The node is also ready to receive heartbeat requests from other nodes in the cluster. LCM_AVAIL will always be processed before a COMM_UP event.
Scenario 1
The communication paths fail between Node A and Node B (Both Node A and Node B can access the shared storage)
In this case, the following will happen:
- Both Node A and Node B will begin processing a COMM_DOWN event, though not necessarily at exactly the same time.
- Both nodes will perform the quorum check and determine that they still have quorum (both A and B can access the shared storage).
- Each node will check the QWK object for the node with which it has lost communication to see if it is still being updated on a regular basis. Both nodes will find that the other’s QWK object is being updated on a regular basis, as both nodes are still running witness checks.
- It will be determined, via the witness checking on each node, that the other is still alive so no failover processing will take place. Resources will be left in service at Node A.
Scenario 2
Node A fails and stops
In this case, Node B will do the following:
- Begin processing a COMM_DOWN event from Node A.
- Determine that it can still access the shared storage and thus has quorum.
- Check to see that updates to the QWK object for Node A have stopped (witness checking).
- Verify via witness checking that Node A really appears to be lost and begin the usual failover activity. Node B will continue processing and bring the protected resources in service.
With resources being in-service on Node B, Node A is powered on and establishes communications with the other nodes and is able to access the QWK shared storage
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it has quorum and will not bring resources in service because they are currently in service on Node B. Next, a COMM_UP event will be processed between Node A and Node B.
Each node will determine that it has quorum during the COMM_UP events and Node A will not bring resources in service because they are currently in service on Node B.
With resources being in-service on Node B, Node A is powered on and cannot establish communications to the other nodes but is able to access the QWK shared storage
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it has quorum since it can access the shared storage for the QWK objects. It will then perform witness checks to determine the status of Node B, since communication with Node B is down. Because Node B is running and has been updating its QWK object, Node A detects this and does not bring resources in service. Node B will do nothing since it cannot communicate with Node A and already has the resources in service.
Scenario 3
A failure occurs with the network for Node A (Node A is running without communication paths to the other nodes and does not have access to the QWK objects on shared storage)
In this case, Node A will do the following:
- Begin processing a COMM_DOWN event from Node B.
- Determine that it cannot access the shared storage and thus does not have quorum.
- Immediately force-quit (“fastkill”, default behavior of QUORUM_LOSS_ACTION).
Also, in this case, Node B will do the following:
- Begin processing a COMM_DOWN event from Node A.
- Determine that it can still access the shared storage and thus has quorum.
- Verify that the updating for the QWK objects for Node A has stopped (witness checking).
- Verify via witness checking that Node A really appears to be lost and begin the usual failover activity. Node B will now have the protected resources in service.
With resources being in-service on Node B, Node A is powered on and establishes communications with the other nodes and is able to access the QWK shared storage
Same as scenario 2.
With the resources being in-service on Node B, Node A is powered on but is not able to access the QWK shared storage
In this case, Node A will process an LCM_AVAIL event. Node A will determine that it does not have quorum and will not bring resources in service.
If the communication paths to Node B are available, then a COMM_UP event will be processed. However, because Node A does not have quorum, it will not bring resources in service.
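The COMM_DOWN handling in the three scenarios above reduces to two questions asked in order: can this node access the shared storage (quorum check), and is the peer's QWK object still being updated (witness check)? The sketch below summarizes that decision for a two-node cluster with default settings. It is an illustrative model only; the names `Action` and `on_comm_down` are hypothetical and do not correspond to LifeKeeper code.

```python
from enum import Enum


class Action(Enum):
    FASTKILL = "force-quit (default QUORUM_LOSS_ACTION)"
    STAY = "no failover; resources stay where they are"
    FAILOVER = "begin failover and bring protected resources in service"


def on_comm_down(can_access_storage: bool, peer_object_updating: bool) -> Action:
    """Model one node's reaction to a COMM_DOWN event under default settings."""
    if not can_access_storage:
        return Action.FASTKILL   # quorum check failed (Scenario 3, Node A)
    if peer_object_updating:
        return Action.STAY       # peer is alive (Scenario 1, both nodes)
    return Action.FAILOVER       # peer appears lost (Scenarios 2 and 3, Node B)
```

The model covers only the COMM_DOWN path; on an LCM_AVAIL event, a node without quorum simply does not bring resources in service, as described above.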