Symptom: Switching over a VMDK resource causes a system down on the original active node.
Possible Cause: When the quickCheck daemon PID file /tmp/LK-vmdk-* of a VMDK resource is deleted, the recovery process starts a new quickCheck daemon but does not stop the old one. Because only the daemon corresponding to the PID file is stopped when the resource is removed, the orphaned daemon later detects the VMDK detachment and brings the system down.

Action: Do not delete the PID file.
In general, files under /tmp that have not been updated for a certain period are deleted automatically, typically by a periodic tmpwatch cron job or by systemd-tmpfiles. Configure an exclusion appropriate to your environment so that the PID files are not deleted.

Example: for systemd-tmpfiles
echo "x /tmp/LK-vmdk-*" > /etc/tmpfiles.d/lifekeeper.conf
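If your distribution cleans /tmp with a periodic tmpwatch job instead of systemd-tmpfiles, tmpwatch's -X (--exclude-pattern) option can skip the PID files. The fragment below is illustrative only; the tmpwatch path, the 10d retention period, and the cron script it would live in are examples to adapt to your environment:

```shell
# Example fragment for a cron-driven tmpwatch cleanup (e.g. in
# /etc/cron.daily/tmpwatch): exclude the LifeKeeper VMDK PID files.
# -X skips any path matching the glob; 10d keeps other files 10 days.
/usr/sbin/tmpwatch -X '/tmp/LK-vmdk-*' 10d /tmp
```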

How to check if multiple quickCheck daemons are running and what to do if they are running
Run
for tag in `ins_list -f, -a scsi -r vmdk | cut -d, -f4`; do echo -n "$tag: "; pgrep -f "vmdk_quickCheck.ps1.*$tag$" | wc -l; done
and if the result is not “<vmdk tag name>: 1”, execute the following:
pkill -INT -f "^/opt/LifeKeeper/bin/pwsh /opt/LifeKeeper/lkadm/subsys/scsi/vmdk/bin/vmdk_quickCheck.ps1"
pkill lkcheck
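As a hypothetical, self-contained illustration of this check-and-kill pattern (it does not touch the real LifeKeeper daemons), the sketch below starts two identical background processes tagged with a unique marker, counts them with pgrep -f, and then stops them all with pkill -f, mirroring the commands above:

```shell
#!/bin/sh
# Stand-ins for duplicate quickCheck daemons: two identical background
# processes whose command line carries a unique marker string.
marker="demo-quickcheck-$$"
sh -c "sleep 15; : $marker" &
sh -c "sleep 15; : $marker" &
sleep 1   # give both processes time to start

# Count processes whose full command line matches the marker,
# analogous to: pgrep -f "vmdk_quickCheck.ps1.*$tag$" | wc -l
count=$(pgrep -f "$marker" | wc -l)
echo "running: $count"

# A count other than 1 means duplicates; stop every match by full
# command line, as pkill -f does for the real daemons.
pkill -f "$marker"
sleep 1
after=$(pgrep -f "$marker" | wc -l)
echo "after pkill: $after"
```

On a healthy node the per-tag count in the real check should always be 1; this demo deliberately creates two matches to show what the duplicate case looks like.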
Symptom: The mount point is not included in the selection when creating resources.
Possible Cause: One of the following:
  • PowerShell/PowerCLI is not installed
  • An ESXi host is not registered
  • The disk.enableUUID parameter is not set
  • The virtual hard disk is on a datastore that is not shared
  • SCSI controller sharing is configured as “virtual” or “physical”

Error details are recorded in /var/log/lifekeeper.log. Check the log and review the settings.
Symptom: It takes longer to bring the VMDK resource in service.
Cause: The processing performed while bringing the resource in service takes time in proportion to the number of virtual machines running on the ESXi host.

Action: A fundamental fix is under consideration for a future release. As an immediate workaround for this issue, please consider the following:
  • Reduce the number of VMDK resources.
    Since the process is performed for each VMDK resource, the time required for the process increases in proportion to the number of VMDK resources. If multiple partitions or file systems are required for a single resource hierarchy, create multiple partitions or file systems on a single VMDK resource rather than using multiple VMDK resources.
  • If ESXi hosts that are not related to the cluster node are registered with the VMDK resource, unregister them (see “How to manage ESXi host information”).
  • Reduce the number of running virtual machines.
