If the NFS export point for the DB2 instance home directory becomes unavailable while the DB2 instances are running, the system will hang until the export point becomes available again. Many system operations will not work correctly, including a system reboot. For this reason, the NFS server for the DB2 multiple partition cluster should be protected by LifeKeeper. Do not take the NFS resource out of service manually unless all partitions in the DB2 cluster have been taken out of service first. Likewise, the DB2 partitions cannot be brought into service unless the NFS resource is in service.
To avoid hanging your cluster by inadvertently stopping the NFS server, we make the following recommendations:
Use additional servers: It is highly recommended that you have a separate cluster for the NFS export point from which the DB2 instance home is mounted. The NFS export point on this cluster should be protected with the LifeKeeper NFS Server Recovery Kit.
If you do not have at least two additional servers available, you can reduce the chances of experiencing the problem described above by adding one additional server to the DB2 cluster. This additional server would export the NFS hierarchy, and one of the other nodes in the cluster would serve as its backup. In this configuration, the symptoms could still occur if the NFS hierarchy were to fail over to the backup node. The NFS export point on this cluster should be protected with the LifeKeeper NFS Server Recovery Kit.
If you cannot use additional servers: This is the least desirable option. However, if you decide to run your NFS server in the same cluster as your DB2 partitions, the NFS export point should be protected with the LifeKeeper NFS Server Recovery Kit. Note that LifeKeeper is currently not aware of the relationship between the DB2 partitions and the NFS server they depend on. Therefore, you must follow these manual procedures before stopping or starting LifeKeeper on any node in the cluster.
- To stop LifeKeeper on a single server, first make sure that the NFS server is active on another server in the cluster. Otherwise, the LifeKeeper shutdown may hang while trying to take the DB2 partitions out of service. In general, make sure that all DB2 partitions are either switched to another server or manually taken out of service before you stop LifeKeeper, so that you do not have problems when restarting LifeKeeper.
- To shut down the entire cluster, first manually take all DB2 partition resources out of service. Next, take all the DB2 NFS server resources out of service. Finally, shut down LifeKeeper.
- If you took the DB2 resources out of service before shutting down LifeKeeper, you should be able to restart LifeKeeper normally. After LifeKeeper is running, bring the NFS server resources into service, followed by any DB2 partitions you wish to restart.
- If you did not take a DB2 partition out of service before shutting down LifeKeeper, you must make sure that the NFS server resources for that partition are active elsewhere in the cluster before you restart LifeKeeper.
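The ordering rules in the list above can be sketched as a small planning helper. This is only an illustration under stated assumptions, not LifeKeeper tooling: the `ClusterState` class, the function names, and all resource tags and node names are hypothetical. The functions return human-readable steps and do not call any LifeKeeper command.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ClusterState:
    """Hypothetical snapshot: resource tag -> node where it is in service.

    A value of None means the resource is out of service everywhere.
    """
    db2_partitions: Dict[str, Optional[str]] = field(default_factory=dict)
    nfs_resources: Dict[str, Optional[str]] = field(default_factory=dict)


def full_shutdown_plan(state: ClusterState) -> List[str]:
    """Whole-cluster shutdown: DB2 partitions, then NFS, then LifeKeeper."""
    steps = [
        f"take DB2 partition {tag} out of service"
        for tag, node in sorted(state.db2_partitions.items())
        if node is not None
    ]
    steps += [
        f"take NFS resource {tag} out of service"
        for tag, node in sorted(state.nfs_resources.items())
        if node is not None
    ]
    steps.append("stop LifeKeeper on all nodes")
    return steps


def startup_plan(nfs_tags: List[str], db2_tags: List[str]) -> List[str]:
    """Restart after a clean shutdown: LifeKeeper, then NFS, then DB2."""
    steps = ["start LifeKeeper"]
    steps += [f"bring NFS resource {tag} into service" for tag in nfs_tags]
    steps += [f"bring DB2 partition {tag} into service" for tag in db2_tags]
    return steps


def safe_to_stop_lifekeeper(state: ClusterState, node: str) -> Tuple[bool, List[str]]:
    """Check the single-server rule before stopping LifeKeeper on `node`.

    The NFS server must be active on another node, and no DB2 partition
    should still be in service on the node being stopped.
    """
    reasons = []
    for tag, where in sorted(state.nfs_resources.items()):
        if where is None or where == node:
            reasons.append(f"NFS resource {tag} is not active on another node")
    for tag, where in sorted(state.db2_partitions.items()):
        if where == node:
            reasons.append(f"DB2 partition {tag} is still in service on {node}")
    return (not reasons, reasons)


if __name__ == "__main__":
    state = ClusterState(
        db2_partitions={"db2-p0": "node1", "db2-p1": "node2"},
        nfs_resources={"nfs-db2home": "node1"},
    )
    for step in full_shutdown_plan(state):
        print(step)
    print(safe_to_stop_lifekeeper(state, "node1"))
```

In this sketch, `safe_to_stop_lifekeeper` reports every violated condition rather than stopping at the first one, so an administrator can see all the resources that need to be switched over or taken out of service before stopping LifeKeeper on that node.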