If all resources on a node are out of service, LifeKeeper treats the node as a standby node and runs the node monitoring script, which checks CPU and memory utilization. If the script determines that a switchover to the node would not succeed because of high CPU or memory load, it notifies the administrator by email or SNMP event forwarding. This monitoring runs at the same interval as normal LifeKeeper resource monitoring, which is set by LKCHECKINTERVAL in /etc/default/LifeKeeper.
Monitored Resources
The following can be monitored with Node Monitoring:
CPU Utilization | Checked using the /proc/stat file |
Memory Utilization | Checked using the /proc/meminfo file |
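To illustrate how utilization figures can be derived from these two files, here is a minimal Python sketch. It is not the LifeKeeper node monitoring script, and the exact formulas that script uses are not stated here. The function names are hypothetical, and the sketch assumes that iowait counts as idle time and that memory in use is MemTotal minus MemAvailable.

```python
import time


def parse_cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal guest guest_nice
            # Assumption: iowait is treated as idle time.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Guest time is already included in user/nice, so sum the first eight only.
            total = sum(values[:8])
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_utilization(sample_before, sample_after):
    """Percent of CPU time spent busy between two /proc/stat samples."""
    busy0, total0 = parse_cpu_times(sample_before)
    busy1, total1 = parse_cpu_times(sample_after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed


def memory_utilization(meminfo_text):
    """Percent of memory in use, based on the contents of /proc/meminfo."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    total = info["MemTotal"]
    available = info.get("MemAvailable")
    if available is None:
        # Older kernels lack MemAvailable; approximate it instead.
        available = info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
    return 100.0 * (total - available) / total


if __name__ == "__main__":
    with open("/proc/stat") as f:
        first = f.read()
    time.sleep(1)  # CPU counters are cumulative, so two samples are needed
    with open("/proc/stat") as f:
        second = f.read()
    with open("/proc/meminfo") as f:
        meminfo = f.read()
    print(f"CPU utilization:    {cpu_utilization(first, second):.1f}%")
    print(f"Memory utilization: {memory_utilization(meminfo):.1f}%")
```

Because the counters in /proc/stat accumulate from boot, a single read only gives an average since startup. Sampling twice and taking the difference gives the utilization over the interval between the samples.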
Node Monitoring Configuration
Set the SNHC_CPUCHECK and SNHC_MEMCHECK settings in the /etc/default/LifeKeeper configuration file. You will also need to configure the following settings. See Standby Node Health Check Parameters List for details.
- SNHC_CPUCHECK_THRESHOLD
- SNHC_CPUCHECK_TIME
- SNHC_MEMCHECK_THRESHOLD
- SNHC_MEMCHECK_TIME
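Before relying on node monitoring, it can help to confirm that all six settings are present in the configuration file. The following Python sketch does that under the assumption that /etc/default/LifeKeeper uses plain KEY=VALUE lines. The helper names are hypothetical and are not part of LifeKeeper. The sketch only checks that a value exists; see Standby Node Health Check Parameters List for the valid values.

```python
REQUIRED_SETTINGS = (
    "SNHC_CPUCHECK",
    "SNHC_CPUCHECK_THRESHOLD",
    "SNHC_CPUCHECK_TIME",
    "SNHC_MEMCHECK",
    "SNHC_MEMCHECK_THRESHOLD",
    "SNHC_MEMCHECK_TIME",
)


def read_settings(config_text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def missing_snhc_settings(config_text):
    """Return the required settings that are absent or have an empty value."""
    settings = read_settings(config_text)
    return [name for name in REQUIRED_SETTINGS if not settings.get(name)]


if __name__ == "__main__":
    # Assumed location, taken from the configuration file named above.
    with open("/etc/default/LifeKeeper") as f:
        missing = missing_snhc_settings(f.read())
    if missing:
        print("Missing node monitoring settings: " + ", ".join(missing))
    else:
        print("All node monitoring settings are present.")
```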