Resource Policy Management in SteelEye Protection Suite for Linux and SteelEye vAppKeeper lets administrators control how resources behave during local recovery and failover (or VMware HA integration). Resource policies are managed with the lkpolicy command line interface (CLI) tool.
SteelEye Protection Suite and SteelEye vAppKeeper are designed to periodically monitor individual applications and groups of related applications, performing local recoveries or sending notifications when protected applications fail. Related applications are, for example, hierarchies in which the primary application depends on lower-level storage or network resources. When an application or resource failure occurs, the default behavior is:
Local Recovery: First, attempt local recovery of the resource or application. An attempt will be made to restore the resource or application on the local server without external intervention. If local recovery is successful, then SteelEye Protection Suite/vAppKeeper will not perform any additional action.
Failover (or VMware HA integration): Second, if a local recovery attempt fails to restore the resource or application (or the recovery kit monitoring the resource has no support for local recovery), then a failover will be initiated. Failovers can take two different forms:
SteelEye Protection Suite for Linux: In this configuration, used for high availability clusters, the failover action attempts to bring the application (and all dependent resources) into service on another server within the cluster.
SteelEye vAppKeeper: In this configuration, used for application monitoring in VMware environments, a failover action alerts VMware HA that an application failure occurred in the virtual machine (VM) guest. The typical VMware HA response is to restart the VM guest immediately, without warning, to rectify the problem. In some cases, VMware HA can also move the VM guest to a different VM host or take another action. How VMware HA handles the condition is independent of the SteelEye vAppKeeper configuration.
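The default two-step behavior described above can be sketched as a small shell function. This is only an illustrative model, not product code: the function name and its two arguments are hypothetical and exist purely to show the order of the decisions.

```shell
# Illustrative model of the default recovery behavior (hypothetical helper).
#   $1 = "yes" if the recovery kit supports local recovery, otherwise "no"
#   $2 = "ok" if the local recovery attempt succeeded, otherwise "fail"
default_recovery_action() {
    kit_supports_local=$1
    local_result=$2
    if [ "$kit_supports_local" = "yes" ] && [ "$local_result" = "ok" ]; then
        # Local recovery restored the resource, so no further action is taken.
        echo "none"
    else
        # Local recovery failed or is unsupported: a failover is initiated
        # (for vAppKeeper, this means alerting VMware HA).
        echo "failover"
    fi
}

default_recovery_action yes ok
default_recovery_action yes fail
default_recovery_action no fail
```

The three calls cover a successful local recovery, a failed local recovery, and a recovery kit with no local recovery support.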
Please see SteelEye Protection Suite Fault Detection and Recovery Scenarios or
SteelEye Protection Suite/vAppKeeper Version 7.5 supports the ability to set additional policies that modify the default recovery behavior. There are four policies that can be set for individual resources (see the section below about precautions regarding individual resource policies) or for an entire server. The recommended approach is to alter policies at the server level. The available policies are:
Failover - This policy setting can be used to turn resource failover on or off. For vAppKeeper, failover leverages VMware HA integration, which initiates a restart of the VM. (Note: For reservations to be handled correctly, Failover cannot be turned off for individual SCSI resources.)
LocalRecovery - SteelEye Protection Suite/vAppKeeper, by default, will attempt to recover protected resources by restarting the individual resource or the entire protected application prior to performing a failover. This policy setting can be used to turn local recovery on or off.
TemporalRecovery - Normally, SteelEye Protection Suite will perform local recovery of a failed resource. If local recovery fails, SteelEye Protection Suite will perform a resource hierarchy failover to another node (vAppKeeper will trigger VMware HA). If the local recovery succeeds, failover will not be performed.
There may be cases where the local recovery succeeds but, due to some irregularity in the server, the local recovery is re-attempted within a short time, resulting in multiple, consecutive local recovery attempts. This may degrade availability for the affected application.
To prevent this repetitive local recovery/failure cycle, you may set a temporal recovery policy. The temporal recovery policy allows an administrator to limit the number of local recovery attempts (successful or not) within a defined time period.
Example: If a user sets the policy definition to limit the resource to three local recovery attempts in a 30-minute time period, SteelEye Protection Suite will fail over when a third local recovery attempt occurs within the 30-minute period.
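The counting rule in this example can be sketched in shell as follows. This is a simplified model rather than the product's implementation: the function name is hypothetical, attempt times are plain minute offsets, and it assumes an attempt made exactly one full period earlier no longer counts.

```shell
# Hypothetical model of a temporal recovery policy.
# Usage: temporal_failover_time LIMIT PERIOD T1 T2 ...
#   LIMIT  = number of local recovery attempts that triggers failover
#   PERIOD = length of the window, in minutes
#   T1...  = times of local recovery attempts, in minutes, ascending
# Prints the time of the attempt that triggers failover, or "none".
temporal_failover_time() {
    limit=$1
    period=$2
    shift 2
    seen=""
    for t in "$@"; do
        seen="$seen $t"
        count=0
        for s in $seen; do
            # Count attempts inside the window that ends at attempt t
            # (assumption: the window boundary itself is excluded).
            if [ $(($t - $s)) -lt "$period" ]; then
                count=$((count + 1))
            fi
        done
        if [ "$count" -ge "$limit" ]; then
            echo "$t"
            return 0
        fi
    done
    echo "none"
}

# Three attempts inside 30 minutes: the third attempt triggers failover.
temporal_failover_time 3 30 0 10 25
# Attempts spread out so that no 30-minute window holds three of them.
temporal_failover_time 3 30 0 20 45 70
```

Following the form of the TemporalRecovery command shown later in this section, such a policy would presumably be set with recoverylimit=3 period=30, assuming the period value is expressed in minutes.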
Defined temporal recovery policies may be turned on or off. When a temporal recovery policy is off, temporal recovery processing will continue to be done and notifications will appear in the log when the policy would have fired; however, no actions will be taken.
Note: It is possible to disable failover and/or local recovery while a temporal recovery policy is also in place. This state is illogical, as the temporal recovery policy will never be acted upon if failover or local recovery is disabled.
The "meta" policies are the ones that can affect more than one other policy at the same time. These policies are usually used as shortcuts for getting certain system behaviors that would otherwise require setting multiple standard policies.
NotificationOnly - This mode allows administrators to put SteelEye Protection Suite or vAppKeeper in a "monitoring only" state. Both local recovery and failover of a resource (or all resources in the case of a server-wide policy) are affected. The user interface will indicate a Failure state if a failure is detected, but no recovery or failover action will be taken. Note: The administrator will need to correct the problem that caused the failure manually and then bring the affected resource(s) back in service to continue normal SteelEye Protection Suite operations.
Resource level policies are policies that apply to a specific resource only, as opposed to an entire resource hierarchy or server.
app
- IP
- file system
In the above resource hierarchy, app depends on both IP and file system. A policy can be set to disable local recovery or failover of a specific resource. This means that, for example, if the IP resource's local recovery fails and a policy was set to disable failover of the IP resource, then the IP resource will not fail over or cause a failover of the other resources. However, if the file system resource's local recovery fails and the file system resource policy does not have failover disabled, then the entire hierarchy will fail over.
Note: It is important to remember that resource level policies apply only to the specific resource for which they are set.
This is a simple example. Complex hierarchies can be configured, so care must be taken when setting resource-level policies.
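The effect of a resource-level Failover policy on the example hierarchy can be modeled with a short sketch. The function name and the tags IP and filesystem are hypothetical, and the model assumes that local recovery of the failed resource has already failed.

```shell
# Hypothetical model: what happens after local recovery of one resource fails.
#   $1 = tag of the resource whose local recovery failed
#   $2 = space-separated tags that have the Failover policy turned off
hierarchy_action() {
    failed=$1
    failover_disabled=$2
    for tag in $failover_disabled; do
        if [ "$tag" = "$failed" ]; then
            # The policy applies only to this resource, so it neither
            # fails over nor causes the other resources to fail over.
            echo "no failover"
            return 0
        fi
    done
    # No policy blocks failover for the failed resource.
    echo "hierarchy fails over"
}

# Failover is disabled for IP only.
hierarchy_action IP "IP"
hierarchy_action filesystem "IP"
```

The second call shows why care is needed: disabling failover for IP does nothing to stop a file system failure from failing over the entire hierarchy.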
The lkpolicy tool is the command-line tool that allows management (querying, setting, removing) of policies on servers running SteelEye Protection Suite for Linux or SteelEye vAppKeeper. lkpolicy supports setting/modifying policies, removing policies and viewing all available policies and their current settings. In addition, defined policies can be set on or off, preserving resource/server settings while affecting recovery behavior.
The general usage is:
lkpolicy [--list-policies | --get-policies | --set-policy | --remove-policy] <name value pair data...>
The <name value pair data...> differs depending on the operation and the policy being manipulated, particularly when setting policies. For example, most on/off type policies require only an --on or --off switch, but the temporal policy requires additional values to describe the thresholds.
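As a rough illustration of how the pieces of a --set-policy command line fit together, the following sketch assembles one and prints it for review instead of running it. The helper name set_policy_cmd is hypothetical; it uses only the policy names, switches, and name=value forms that appear in this topic.

```shell
# Hypothetical dry-run helper: prints an lkpolicy command line, never runs it.
# Usage: set_policy_cmd POLICY on|off [name=value ...]
set_policy_cmd() {
    if [ $# -lt 2 ]; then
        echo "usage: set_policy_cmd POLICY on|off [name=value ...]" >&2
        return 1
    fi
    policy=$1
    state=$2
    shift 2
    # Accept only the four policy names listed in this topic.
    case "$policy" in
        Failover|LocalRecovery|TemporalRecovery|NotificationOnly) ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    case "$state" in
        on|off) ;;
        *) echo "state must be on or off" >&2; return 1 ;;
    esac
    if [ $# -gt 0 ]; then
        # "$*" joins the remaining name=value pairs with single spaces.
        echo "lkpolicy --set-policy $policy --$state $*"
    else
        echo "lkpolicy --set-policy $policy --$state"
    fi
}

set_policy_cmd Failover off tag=myresource
set_policy_cmd TemporalRecovery on recoverylimit=5 period=15
```

Printing the command first makes it easy to check the name=value pairs before applying a policy change to a live server.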
The lkpolicy tool communicates with SteelEye Protection Suite and vAppKeeper servers via an API that the servers expose. This API requires authentication from clients such as the lkpolicy tool. The first time the lkpolicy tool is asked to access a SteelEye Protection Suite or vAppKeeper server, it prompts the user for credentials if the credentials for that server are not already known. These credentials take the form of a username and password, and:
Clients must have SteelEye Protection Suite/vAppKeeper admin rights. This means the username must be in the lkadmin group according to the operating system's authentication configuration (via PAM). It is not necessary to run as root, but the root user can be used since it is in the appropriate group by default.
The credentials will be stored in the credential store so they do not have to be entered manually each time the tool is used to access this server.
See Configuring Credentials for SteelEye Protection Suite or Configuring Credentials for vAppKeeper for more information on the credential store and its management with the credstore utility.
An example session with lkpolicy might look like this:
[root@thor49 ~]# lkpolicy -l -d v6test4
Please enter your credentials for the system 'v6test4'.
Username: root
Password:
Confirm password:
Failover
LocalRecovery
TemporalRecovery
NotificationOnly
[root@thor49 ~]# lkpolicy -l -d v6test4
Failover
LocalRecovery
TemporalRecovery
NotificationOnly
[root@thor49 ~]#
lkpolicy --get-policies tag=\*
lkpolicy --get-policies --verbose tag=mysql\* # all resources starting with mysql
lkpolicy --get-policies tag=mytagonly
lkpolicy --set-policy Failover --off
lkpolicy --set-policy Failover --on tag=myresource
lkpolicy --set-policy Failover --on tag=\*
lkpolicy --set-policy LocalRecovery --off tag=myresource
lkpolicy --set-policy NotificationOnly --on
lkpolicy --set-policy TemporalRecovery --on recoverylimit=5 period=15
lkpolicy --set-policy TemporalRecovery --on --force recoverylimit=5 period=10
lkpolicy --remove-policy Failover tag=steve
Note: NotificationOnly is a policy alias. Enabling NotificationOnly is the equivalent of disabling the corresponding LocalRecovery and Failover policies.
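The alias relationship in this note can be spelled out with a small sketch that prints the two standard policy commands that enabling NotificationOnly stands in for. The helper name is hypothetical, and the commands are printed rather than executed.

```shell
# Hypothetical helper: prints the standard-policy commands that enabling
# NotificationOnly is equivalent to. Nothing is executed.
#   $1 = optional scope such as tag=myresource (empty means server-wide)
notification_only_equivalent() {
    scope=${1:-}
    for policy in LocalRecovery Failover; do
        if [ -n "$scope" ]; then
            echo "lkpolicy --set-policy $policy --off $scope"
        else
            echo "lkpolicy --set-policy $policy --off"
        fi
    done
}

# Server-wide equivalent of: lkpolicy --set-policy NotificationOnly --on
notification_only_equivalent
# Equivalent for a single resource tag.
notification_only_equivalent tag=myresource
```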
© 2012 SIOS Technology Corp., the industry's leading provider of business continuity solutions, data replication for continuous data protection.