VMDK Troubleshooting

Symptom

Possible Cause

Switching over a VMDK resource causes the system on the original active node to go down.

Cause: When the quickCheck daemon PID file /tmp/LK-vmdk-* of a VMDK resource is deleted, the recovery process starts a new quickCheck daemon but does not stop the old one. When the resource is removed, only the quickCheck daemon that corresponds to the PID file is stopped, so the old daemon keeps running, detects the VMDK detachment, and stops the system.

Action: Do not delete the PID file.
Files under /tmp that have not been updated for a certain period of time are typically deleted automatically, for example by a periodically run tmpwatch script or by systemd-tmpfiles. Configure exclusions appropriate to your environment so that the PID files are not deleted.

Example for systemd-tmpfiles:

echo "x /tmp/LK-vmdk-*" > /etc/tmpfiles.d/lifekeeper.conf
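To confirm that such an exclusion is actually in place, a small check like the following could be used. This is a minimal sketch and not part of LifeKeeper: the function name has_lk_exclusion is hypothetical, and it assumes the exclusion is written as an x (or X) line in a *.conf file under /etc/tmpfiles.d, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not a LifeKeeper command).
# Succeeds if any *.conf file in the given directory contains an
# "x" or "X" (ignore during cleanup) line for /tmp/LK-vmdk-* paths.
has_lk_exclusion() {
    conf_dir=${1:-/etc/tmpfiles.d}
    # -q: no output, -s: stay quiet if no *.conf file exists
    grep -qsE '^[xX][[:space:]]+/tmp/LK-vmdk-' "$conf_dir"/*.conf
}

# Example call (commented out so the block has no side effects):
# has_lk_exclusion || echo "No tmpfiles exclusion found for /tmp/LK-vmdk-*"
```

The check only inspects the configuration files; it does not verify how or when cleanup runs on a given system.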

How to check whether multiple quickCheck daemons are running, and what to do if they are
Run the following command:

for tag in `ins_list -f, -a scsi -r vmdk | cut -d, -f4`; do echo -n "$tag: "; pgrep -f "vmdk_quickCheck.ps1.*$tag$" | wc -l; done

If the result for any resource is not “<vmdk tag name>: 1”, execute the following:

pkill -INT -f "^/opt/LifeKeeper/bin/pwsh /opt/LifeKeeper/lkadm/subsys/scsi/vmdk/bin/vmdk_quickCheck.ps1"
pkill lkcheck
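On a node with many VMDK resources, the output of the check loop can be filtered so that only the resources needing attention are listed. The following is a sketch under that assumption: flag_abnormal_tags is a hypothetical helper, not a LifeKeeper command, and it expects lines in the “<vmdk tag name>: <count>” format printed by the loop above.

```shell
#!/bin/sh
# Hypothetical helper (not a LifeKeeper command).
# Reads "<tag>: <count>" lines on stdin and prints "<tag> <count>"
# for every tag whose quickCheck daemon count is not exactly 1.
flag_abnormal_tags() {
    awk '$NF != 1 {
        tag = $1
        sub(/:$/, "", tag)   # drop the trailing colon from the tag
        print tag, $NF
    }'
}

# Usage: pipe the output of the check loop into the function, e.g.
#   <check loop from above> | flag_abnormal_tags
```

If the function prints nothing, every resource has exactly one daemon; any printed line identifies a resource for which the pkill commands above apply.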

Mount point is not included in the selection when creating resources.

Possible causes are as follows:

PowerShell/PowerCLI is not installed
An ESXi host is not registered
The disk.enableUUID parameter is not set
The virtual hard disk is on a datastore that is not shared
SCSI controller sharing is configured as “virtual” or “physical”

Error details are recorded in /var/log/lifekeeper.log. Check the log and review the settings.
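When reviewing the log, it may help to narrow it down to the most recent VMDK-related entries. The sketch below assumes that the relevant messages contain the string “vmdk” (in any letter case), which is not guaranteed; vmdk_log_tail is a hypothetical helper name, and the log path is the one given above.

```shell
#!/bin/sh
# Hypothetical helper (not a LifeKeeper command).
# Prints the last N log lines that mention "vmdk" (case-insensitive).
# Assumption: VMDK-related errors include the string "vmdk".
vmdk_log_tail() {
    log_file=${1:-/var/log/lifekeeper.log}
    max_lines=${2:-20}
    grep -i 'vmdk' "$log_file" | tail -n "$max_lines"
}

# Example call (commented out so the block has no side effects):
# vmdk_log_tail /var/log/lifekeeper.log 50
```

If this filter returns nothing, check the unfiltered log around the time the resource creation was attempted.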
