The NFS Server Recovery Kit provides a highly available NFS service by working in a resource hierarchy with the Filesystem Recovery Kit (provided as part of the steeleye-lk package) and the IP Recovery Kit (steeleye-lkIP).
The kit ensures that an IP resource and a file system resource containing the shared mount point are always in service on the same server in the cluster. Clients that mount the file system using the LifeKeeper-protected IP resource can continue processing files on the volume virtually uninterrupted while the actual export service is switched between servers in the cluster (either manually or in response to a failure). Client recovery times depend on the interaction between the client and the NFS server. For example, with NFSv3, the protocol timeouts for TCP are longer than those for UDP. To determine the best transport layer protocol to use with NFS, consider the recommendations of the OS vendor, the advantages and disadvantages of each transport protocol, and your specific environment.
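As an illustration of mounting through the protected IP resource with an explicit transport protocol, the following minimal Python sketch assembles a Linux `mount` command line. The function name `nfs_mount_command`, the address `192.0.2.10`, and the paths are hypothetical placeholders, not values from this product; `vers=` and `proto=` are standard Linux NFS client mount options. The sketch only builds the command and does not run it.

```python
import shlex


def nfs_mount_command(virtual_ip, export_path, mount_point,
                      proto="tcp", nfs_version=3):
    """Build (but do not run) a mount command that targets the protected IP.

    All argument values are supplied by the caller; nothing here is
    specific to LifeKeeper.
    """
    if proto not in ("tcp", "udp"):
        raise ValueError("proto must be 'tcp' or 'udp'")
    # vers= selects the NFS protocol version, proto= the transport.
    options = f"vers={nfs_version},proto={proto}"
    return ["mount", "-t", "nfs", "-o", options,
            f"{virtual_ip}:{export_path}", mount_point]


# Hypothetical example values: a documentation-range IP and made-up paths.
command = nfs_mount_command("192.0.2.10", "/export/data", "/mnt/data")
print(shlex.join(command))
```

Because the client addresses the protected IP rather than a physical server address, the same command remains valid whichever cluster node currently holds the resources.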
Beginning with Version 7.4 of the NFS Server Recovery Kit, an NFSv4 pseudo file system export is supported, providing clients with seamless access to all exported objects on the server. Prior to Version 7.4, clients had to mount each shared server file system separately to access it. With NFSv4, the server still specifies export controls for each server directory or file system to be exported for NFS access. From these export controls, the server renders a single directory tree of all the exported data, filling in the gaps between the exported directories. This tree is known as a pseudo file system, and it starts at the NFSv4 server’s pseudo root. The pseudo file system model allows an NFSv4 client, depending on its implementation, to perform a single mount of the server’s pseudo root in order to access all of the server’s exported data.
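The way a pseudo file system fills in the gaps between exported directories can be pictured with a small conceptual model. The Python sketch below is an illustration only and is not how the NFS server or the Recovery Kit implements it; the function names `build_pseudo_tree` and `render` and the export paths are hypothetical. Directories that exist only to connect the pseudo root to an export are marked `pseudo`, and directories that are actually exported are marked `export`.

```python
from pathlib import PurePosixPath


def build_pseudo_tree(exports):
    """Merge exported paths into one tree rooted at the pseudo root."""
    root = {"exported": False, "children": {}}
    for path in exports:
        node = root
        # Drop the leading "/" component, then walk or create each directory.
        for part in PurePosixPath(path).parts[1:]:
            node = node["children"].setdefault(
                part, {"exported": False, "children": {}}
            )
        node["exported"] = True  # only the final directory is a real export
    return root


def render(node, name="/", depth=0):
    """Return indented lines describing the tree, children sorted by name."""
    marker = "export" if node["exported"] else "pseudo"
    lines = [f"{'  ' * depth}{name} ({marker})"]
    for child in sorted(node["children"]):
        lines.extend(render(node["children"][child], child, depth + 1))
    return lines


# Hypothetical export list, used only for illustration.
tree = build_pseudo_tree(["/srv/nfs/home", "/srv/nfs/projects", "/data/archive"])
print("\n".join(render(tree)))
```

In this model, a client that mounts `/` sees `srv`, `nfs`, and `data` as connecting directories and reaches all three exports through a single mount.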
All files on the file system become temporarily unavailable while a switchover or failover is in progress, but they become available again transparently when the resource transfer is complete. For a switchover, this can take between 5 and 30 seconds. For a failover, the recovery time depends on how long it takes to repair the file system. It is strongly recommended that you format the underlying disk volume with a journaling file system (JFS), which is extremely robust to failure and can be repaired in a few seconds.
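You may also choose to use a non-journaling Linux file system (ext2) as the underlying file system, but in that case, failover times will be much longer because the file system must be fully checked before it can be brought back in service.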