When a server failover or an Oracle database failure occurs, users can be severely disrupted. Typically the user’s connections to the database will be lost along with most work in progress. Upon the completion of the failover (or recovery of the Oracle database), clients will have to restart their application and reconnect to the database. With the Transparent Application Failover (TAF) feature of Oracle, this disruption can be reduced or eliminated by masking some types of failures. To configure TAF in a LifeKeeper environment, there are tasks that must be performed on both the LifeKeeper server side and the Oracle client side.
For clients to effectively take advantage of the TAF feature, the client application must use failover-aware API calls from the Oracle Call Interface (OCI). The clients must also configure the appropriate TAF support using the Oracle Net parameters in the tnsnames.ora file. TAF mode can be configured by including a FAILOVER_MODE parameter under the CONNECT_DATA section of the tnsnames.ora connect descriptor. The TAF mechanism supports several sub-parameters to control and affect the behavior of a client connection during failover. The LifeKeeper for Linux Oracle Recovery Kit supports the following TAF configuration sub-parameters:
TYPE= (SELECT or SESSION).
This value determines how TAF will handle client connection failover. When the type is set to SELECT, Oracle keeps track of all SELECT statements issued during the transition. Once a new connection is established, the SELECT statements are re-executed and the cursors are repositioned so that clients can continue to fetch rows. When the type is set to SESSION, only a new connection is created; work in progress may be lost.
METHOD= (BASIC).
With this method, TAF will attempt to reconnect only after the primary connection fails. The alternative method is PRECONNECT; LifeKeeper does not currently support the use of PRECONNECT as a method.
DELAY= (#sec).
This value is the number of seconds that TAF will wait between attempts to connect following a failure. This value should be carefully determined for your client application and environment.
RETRIES= (#number of tries).
This value is the number of times that TAF will attempt to retry a failed connection before giving up. The combination of DELAY and RETRIES must allow enough time for a complete recovery of Oracle in the event of a server failure, so that TAF is still retrying when the server failover completes and can then reconnect.
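The relationship between DELAY and RETRIES can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names and the 120-second recovery estimate are hypothetical, and it assumes the retry window is approximately DELAY multiplied by RETRIES, ignoring the time each individual connection attempt takes.

```python
def taf_retry_window(delay_seconds: int, retries: int) -> int:
    """Approximate time in seconds that TAF keeps retrying (DELAY * RETRIES)."""
    if delay_seconds < 0 or retries < 0:
        raise ValueError("DELAY and RETRIES must be non-negative")
    return delay_seconds * retries


def covers_recovery(delay_seconds: int, retries: int,
                    expected_recovery_seconds: int) -> bool:
    """Return True if the retry window is at least the expected recovery time."""
    return taf_retry_window(delay_seconds, retries) >= expected_recovery_seconds


if __name__ == "__main__":
    # DELAY=5 and RETRIES=30 are example values; the 120-second recovery
    # estimate is a hypothetical figure that you would measure in your
    # own environment.
    print("Retry window (seconds):", taf_retry_window(5, 30))
    print("Covers a 120-second recovery:", covers_recovery(5, 30, 120))
```

If the measured recovery time of Oracle in your environment is longer than the computed window, increase DELAY, RETRIES, or both.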
An excerpt from a sample tnsnames.ora file for a client system is included below.
LKproDB=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS = (PROTOCOL=TCP) (HOST=<switchableIP>) (PORT=<port number>))
    )
    (CONNECT_DATA=
      (SID=LKroDB)
      (SERVER=DEDICATED)
      (FAILOVER_MODE=
        (TYPE=SELECT)
        (METHOD=BASIC)
        (DELAY=5)
        (RETRIES=30)
      )
    )
  )
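When FAILOVER_MODE entries are maintained for many clients, it can help to validate the sub-parameters before writing them to tnsnames.ora. The following Python sketch is one possible approach, not part of LifeKeeper or Oracle: the function name failover_mode and the two constant sets are hypothetical, and the accepted values simply mirror the sub-parameters listed above as supported by the Oracle Recovery Kit.

```python
# Values taken from the list of TAF sub-parameters supported by the kit.
SUPPORTED_TYPES = {"SELECT", "SESSION"}
SUPPORTED_METHODS = {"BASIC"}  # PRECONNECT is not currently supported


def failover_mode(taf_type: str, method: str, delay: int, retries: int) -> str:
    """Build a FAILOVER_MODE clause, rejecting unsupported settings."""
    taf_type = taf_type.upper()
    method = method.upper()
    if taf_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported TYPE: {taf_type}")
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported METHOD: {method}")
    if delay < 0 or retries < 0:
        raise ValueError("DELAY and RETRIES must be non-negative")
    return (
        f"(FAILOVER_MODE=(TYPE={taf_type})(METHOD={method})"
        f"(DELAY={delay})(RETRIES={retries}))"
    )


if __name__ == "__main__":
    # Reproduces the settings used in the sample excerpt.
    print(failover_mode("SELECT", "BASIC", 5, 30))
```

The generated clause belongs inside the CONNECT_DATA section of the connect descriptor, as in the sample excerpt.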
The normal location of tnsnames.ora is $ORACLE_HOME/network/admin, although tnsnames.ora files can also be located in users' home directories. The most common listener port number is 1521. Keep in mind that if the $ORACLE_HOME directory has been installed on non-shared storage, copies of listener.ora and tnsnames.ora must exist on both systems.
On the LifeKeeper server protecting the Oracle database, the listener should be configured using a LifeKeeper-protected switchable IP address. Refer to the Configuring the Oracle Net Listener for LifeKeeper Protection section above for details on configuring Oracle Net and listener support.