For each out-of-service (OSU) resource, lkcheck periodically calls the resource's OSUquickCheck script, which performs a quick health check. If the script determines that the resource cannot start successfully, it changes the state of the resource to OSF and notifies the administrator by email or SNMP event forwarding. This monitoring runs at the same interval as normal LifeKeeper resource monitoring, which is set by LKCHECKINTERVAL in /etc/default/LifeKeeper.

Monitored Resources

The following can be monitored with OSU Resource Monitoring:

Resource Name        Monitoring Details
IP Resource          Verify that the NIC link is up (disable with the /etc/default/LifeKeeper setting IP_NOLINKCHECK=1). Also verify network reachability (if a ping list is configured).
DMMP Disk Resource   Verify that the paths to the monitored disk are functional.
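
When an IP resource is reported as failed, it can help to confirm the NIC link state by hand. The sketch below is a manual approximation only and is not how OSUquickCheck itself is implemented. It assumes a Linux host that exposes link state in /sys/class/net/<interface>/carrier; the function name nic_link_state and the interface name eth0 are hypothetical.

```shell
# nic_link_state IFACE [SYSFS_ROOT]
# Prints "up", "down", or "unknown" for the given interface.
# Assumption: Linux sysfs layout, where the carrier file holds 1 when the
# link is up and 0 when it is down.
nic_link_state() {
    nls_iface=$1
    nls_root=${2:-/sys/class/net}
    nls_file="$nls_root/$nls_iface/carrier"

    # No carrier file: the interface does not exist under this sysfs root.
    if [ ! -e "$nls_file" ]; then
        echo "unknown"
        return 2
    fi

    # Reading carrier can fail when the interface is administratively down.
    if ! nls_value=$(cat "$nls_file" 2>/dev/null); then
        echo "down"
        return 1
    fi

    if [ "$nls_value" = "1" ]; then
        echo "up"
        return 0
    fi

    echo "down"
    return 1
}
```

For example, nic_link_state eth0 prints the link state of eth0, and the return code can be used in a script to decide whether to investigate cabling or the switch port before returning the resource to OSU.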

OSU Resource Monitoring Configuration

Set the SNHC_IPCHECK and SNHC_DISKCHECK settings in the /etc/default/LifeKeeper configuration file. You may also need to configure the following setting. See Standby Node Health Check Parameters List for details.

  • SNHC_IPCHECK_SLEEPTIME
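
Before relying on OSU resource monitoring, it may be useful to confirm which of these settings are actually present in the configuration file. The following is a minimal sketch that assumes the file uses plain KEY=VALUE lines (as in IP_NOLINKCHECK=1 above). The function names lk_setting and snhc_report are hypothetical helpers, not LifeKeeper commands, and the sketch makes no claim about which values are valid; see the parameters list for that.

```shell
# lk_setting KEY [FILE]
# Prints the value of the last uncommented KEY=VALUE line in FILE.
# Returns 1 if the key is not set. FILE defaults to /etc/default/LifeKeeper.
lk_setting() {
    lk_key=$1
    lk_file=${2:-/etc/default/LifeKeeper}

    # Commented lines start with '#', so they never match this pattern.
    lk_line=$(grep -E "^[[:space:]]*${lk_key}=" "$lk_file" 2>/dev/null | tail -n 1) || true

    if [ -z "$lk_line" ]; then
        return 1
    fi

    # Strip everything up to and including the first '='.
    printf '%s\n' "${lk_line#*=}"
}

# snhc_report [FILE]
# Lists the settings mentioned in this topic and whether each one is set.
snhc_report() {
    sr_file=${1:-/etc/default/LifeKeeper}
    for sr_key in SNHC_IPCHECK SNHC_DISKCHECK SNHC_IPCHECK_SLEEPTIME LKCHECKINTERVAL; do
        if sr_value=$(lk_setting "$sr_key" "$sr_file"); then
            echo "$sr_key=$sr_value"
        else
            echo "$sr_key is not set"
        fi
    done
}
```

Running snhc_report with no arguments reads /etc/default/LifeKeeper; passing a path lets you check a copy of the file before deploying it.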

Recovery from Failure

If an error is detected during OSU resource monitoring, the state of the corresponding resource is changed to OSF (out of service with failure). Once a resource is in the OSF state, OSU resource monitoring is no longer performed for it. Review the details of the reported failure and correct the cause, then change the resource state back to OSU so that monitoring resumes. The state can be changed from OSF to OSU using the following command:

/opt/LifeKeeper/lkadm/bin/retstate <resource tag>
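
If several resources were moved to OSF by the same underlying fault, the command above can be applied to each tag in turn. The sketch below wraps it in a loop with a dry-run default so the commands can be reviewed first. The function name return_to_osu, the RETSTATE and DRY_RUN variables, and the example tags are hypothetical; only the retstate path and its single tag argument come from this topic.

```shell
# Path to retstate as given in this topic; override for testing.
RETSTATE=${RETSTATE:-/opt/LifeKeeper/lkadm/bin/retstate}
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

# return_to_osu TAG...
# Runs retstate once per resource tag. Returns 1 if any call fails.
return_to_osu() {
    rto_rc=0
    for rto_tag in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: $RETSTATE $rto_tag"
        elif ! "$RETSTATE" "$rto_tag"; then
            echo "retstate failed for $rto_tag" >&2
            rto_rc=1
        fi
    done
    return $rto_rc
}
```

For example, return_to_osu ip-example dmmp-example prints the two commands it would run; repeating the call with DRY_RUN=0 executes them. Address the cause of the failure first, since a resource returned to OSU is monitored again and will go back to OSF if the fault persists.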
