This topic describes the categories of information in the detailed status display, using the sample output from the lcdstatus command shown below. For details on how to display this information, see the LCD(1M) man page; at the command line, enter either man lcdstatus or man LCD. For status information available in the LifeKeeper GUI, see Viewing the Status of a Server or Viewing the Status of Resources.

Example of detailed status display:

Resource Hierarchy Information

Resource hierarchies for machine “wileecoyote”:

ROOT of RESOURCE HIERARCHY

apache-home.fred: id=apache-home.fred app=webserver type=apache state=ISP

initialize=(AUTORES_ISP) automatic restore to IN-SERVICE by LifeKeeper

info=/home/fred /usr/sbin/httpd

reason=restore action has succeeded

depends on resources: ipeth0-172.17.104.25,ipeth0-172.17.106.10,ipeth0-172.17.106.105

Local priority = 1

SHARED equivalency with “apache-home.fred” on “roadrunner”, priority = 10

FAILOVER ALLOWED

ipeth0-172.17.104.25: id=IP-172.17.104.25 app=comm type=ip state=ISP

initialize=(AUTORES_ISP) automatic restore to IN-SERVICE by LifeKeeper

info=wileecoyote eth0 172.17.104.25 fffffc00

reason=restore action has succeeded

these resources are dependent: apache-home.fred

Local priority = 1

SHARED equivalency with “ipeth0-172.17.104.25” on “roadrunner”, priority = 10

FAILOVER ALLOWED

ipeth0-172.17.106.10: id=IP-172.17.106.10 app=comm type=ip state=ISP

initialize=(AUTORES_ISP) automatic restore to IN-SERVICE by LifeKeeper

info=wileecoyote eth0 172.17.106.10 fffffc00

reason=restore action has succeeded

these resources are dependent: apache-home.fred

Local priority = 1

SHARED equivalency with “ipeth0-172.17.106.10” on “roadrunner”, priority = 10

FAILOVER ALLOWED

ipeth0-172.17.106.105: id=IP-172.17.106.105 app=comm type=ip state=ISP

initialize=(AUTORES_ISP) automatic restore to IN-SERVICE by LifeKeeper

info=wileecoyote eth0 172.17.106.105 fffffc00

reason=restore action has succeeded

these resources are dependent: apache-home.fred

Local priority = 1

SHARED equivalency with “ipeth0-172.17.106.105” on “roadrunner”, priority = 10

FAILOVER ALLOWED

Communication Status Information

The following LifeKeeper servers are known:

machine=wileecoyote state=ALIVE

machine=roadrunner state=DEAD (eventslcm detected failure at Wed Jun 7 15:45:14 EDT 2000)

The following LifeKeeper network connections exist:

to machine=roadrunner type=TCP addresses=192.168.1.1/192.168.105.19

state=“DEAD” priority=2 #comm_downs=0

LifeKeeper Flags

The following LifeKeeper flags are on:

shutdown_switchover

Shutdown Strategy

The shutdown strategy is set to: switchover.

Resource Hierarchy Information

LifeKeeper displays the resource status beginning with the root resource. The display includes information about all resource dependencies.

Elements common to multiple resources appear only once under the first root resource. The first line for each resource description displays the resource tag name followed by a colon (:), for example: device13557:. These are the information elements that may be used to describe the resources in the hierarchy:

  • id. Unique resource identifier string used by LifeKeeper.
  • app. Identifies the type of application; for example, the sample resource is a webserver application.
  • type. Indicates the resource class type; for example, the sample resource is an Apache application.
  • state. Current state of the resource:
    • ISP—In-service locally and protected.
    • ISU—In-service, unprotected.
    • OSF—Out-of-service, failed.
    • OSU—Out-of-service, unimpaired.
  • initialize. Specifies the way the resource is to be initialized; for example, LifeKeeper restores the application resource, but the host adapter initializes without LifeKeeper.
  • info. Contains object-specific information used by the object’s remove and restore scripts.
  • reason. If present, describes the reason the resource is in its current state. For example, an application might be in the OSU state because it is in-service (ISP or ISU) on another server. Shared resources can be active on only one of the grouped servers at a time.
  • depends on resources. If present, lists the tag names of the resources on which this resource depends.
  • these resources are dependent. If present, indicates the tag names of all parent resources that are directly dependent on this object.
  • Local priority. Indicates the failover priority value of the targeted server for this resource.
  • SHARED equivalency. Indicates the resource tag and server name of any remote resource with which this resource has a defined equivalency, along with the failover priority value of the remote server for that resource.
  • FAILOVER ALLOWED. If present, indicates that LifeKeeper is operational on the remote server identified in the equivalency on the line above, and the application is protected against failure. FAILOVER INHIBITED means that the application is not protected because either LifeKeeper has been shut down or the remote server has been stopped.
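
As an illustration of the elements above, the first line of a resource description can be split into its tag, id, app, type and state fields. The following is a minimal Python sketch, not part of LifeKeeper. It assumes the line has exactly the shape shown in the sample output and that none of the values contain spaces; the names HEADER_RE, STATE_MEANINGS and parse_resource_header are hypothetical.

```python
import re

# Assumed line shape, taken from the sample output in this topic:
#   <tag>: id=<id> app=<app> type=<type> state=<state>
HEADER_RE = re.compile(
    r"^(?P<tag>\S+):\s+id=(?P<id>\S+)\s+app=(?P<app>\S+)"
    r"\s+type=(?P<type>\S+)\s+state=(?P<state>\S+)\s*$"
)

# State codes as described in the list above.
STATE_MEANINGS = {
    "ISP": "in-service locally and protected",
    "ISU": "in-service, unprotected",
    "OSF": "out-of-service, failed",
    "OSU": "out-of-service, unimpaired",
}


def parse_resource_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    match = HEADER_RE.match(line.strip())
    return match.groupdict() if match else None


header = parse_resource_header(
    "apache-home.fred: id=apache-home.fred app=webserver type=apache state=ISP"
)
print(header["tag"], header["state"], STATE_MEANINGS[header["state"]])
```

Lines that are not resource headers, such as "Local priority = 1", return None, so the function can be used to find where each resource description begins.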

Communication Status Information

This section of the status display lists the servers known to LifeKeeper and their current state, followed by information about each communications path.

These are the communications information elements you can see on the status display:

  • State. Status of communications path. These are the possible communications state values:
    • ALIVE — Functioning normally
    • DEAD — No longer functioning normally
  • priority. The assigned priority value for the communications path. This item is displayed only for TCP paths.
  • #comm_downs. The number of times the port has failed and caused a failover. The path failure causes a failover only if no other communications paths are marked “ALIVE” at the time of the failure.
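
The failover rule in the #comm_downs description can be expressed as a small check. This Python sketch only models the rule as stated above; it does not reflect how LifeKeeper implements it. The function name and the path names in the usage lines are hypothetical.

```python
def path_failure_causes_failover(failed_path, path_states):
    """Model the rule: a path failure causes a failover only if no
    other communications path is marked ALIVE at the time of the failure.

    path_states maps a path name to "ALIVE" or "DEAD" as seen when
    failed_path fails; the entry for failed_path itself is ignored.
    """
    return not any(
        state == "ALIVE"
        for name, state in path_states.items()
        if name != failed_path
    )


# Hypothetical path names, for illustration only.
print(path_failure_causes_failover("tcp-1", {"tcp-1": "ALIVE", "tty-1": "ALIVE"}))
print(path_failure_causes_failover("tcp-1", {"tcp-1": "ALIVE", "tty-1": "DEAD"}))
```

The first call reports no failover because another path is still ALIVE; the second reports a failover because the failed path was the last one functioning.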

In addition, the status display can provide any of the following statistics maintained only for TTY communications paths:

  • wrpid. Each TTY communications path has unique reader and writer processes. The wrpid field contains the process ID for the writer process. The writer process sleeps until one of two conditions occurs:
    • Heartbeat timer expires, causing the writer process to send a message.
    • Local process requests the writer process to transmit a LifeKeeper maintenance message to the other server. The writer process transmits the message, using its associated TTY port, to the reader process on that port on the other system.
  • rdpid. Each TTY communications path has unique reader and writer processes. The rdpid field contains the process ID for the reader process. The reader process sleeps until one of two conditions occurs:
    • Heartbeat timer expires and the reader process must determine whether the predefined heartbeat intervals have expired. If so, the reader process marks the communications path as DEAD, which initiates a failover event if there are no other communications paths marked ALIVE.
    • Remote system writer process transmits a LifeKeeper maintenance message, causing the reader process to perform the protocol necessary to receive the message.
  • #NAKs. Number of times the writer process received a negative acknowledgment (NAK). A NAK message means that the reader process on the other system did not accept a message packet sent by the writer process, and the writer process had to re-transmit the message packet. The #NAKs statistic can accumulate over a long period of time due to line noise. If, however, you see the numbers increasing rapidly, you should perform diagnostic procedures on the communications subsystem.
  • #chksumerr. Number of checksum mismatches in messages between the servers. This statistic can accumulate over a long period of time due to line noise. If, however, you see the numbers increasing rapidly, you should perform diagnostic procedures on the communications subsystem.
  • #incmpltmes. Number of times the incoming message packet did not match the expected size. A high number of mismatches may indicate that you should perform diagnostic procedures on the hardware port associated with the communications path.
  • #noreply. Number of times the writer process timed out while waiting for an acknowledgment and had to re-transmit the message. Lack of acknowledgment may indicate an overloaded server or it can signal a server failure.
  • #pacresent. Number of times the reader process received the same packet. This can happen when the writer process on the sending server times out and resends the same message.
  • #pacoutseq. Number of times the reader received packets out of sequence. High numbers in this field can indicate lost message packets and may indicate that you should perform diagnostic procedures on the communications subsystem.
  • #maxretrys. Metric that increments for a particular message when the maximum retransmission count is exceeded (for NAK and noreply messages). If you see a high number in the #maxretrys field, you should perform diagnostic procedures on the communications subsystem.
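
Several of the statistics above matter mainly when they increase rapidly rather than when they are merely non-zero. One way to check this is to compare two samples of the counters taken some time apart. The Python sketch below assumes the counter values have already been collected into plain dictionaries; how they are read from the status display is outside its scope. The function names and the default threshold of one increment per minute are illustrative assumptions, not LifeKeeper recommendations.

```python
# Counter names as listed in this topic.
TTY_COUNTERS = (
    "#NAKs", "#chksumerr", "#incmpltmes", "#noreply",
    "#pacresent", "#pacoutseq", "#maxretrys",
)


def counter_rates(earlier, later, elapsed_seconds):
    """Return the per-minute increase of each counter between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        name: (later.get(name, 0) - earlier.get(name, 0)) * 60.0 / elapsed_seconds
        for name in TTY_COUNTERS
    }


def fast_growing(earlier, later, elapsed_seconds, threshold_per_minute=1.0):
    """Return the sorted names of counters growing faster than the threshold."""
    rates = counter_rates(earlier, later, elapsed_seconds)
    return sorted(name for name, rate in rates.items() if rate > threshold_per_minute)


# Two hypothetical samples taken ten minutes (600 seconds) apart.
before = {"#NAKs": 3, "#noreply": 0}
after = {"#NAKs": 33, "#noreply": 1}
print(fast_growing(before, after, 600))
```

In this example #NAKs grows by three per minute and is reported, while #noreply grows by one in ten minutes and is not. A suitable threshold depends on the line quality and load of the particular installation.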

LifeKeeper Flags

Near the end of the detailed status display, LifeKeeper provides a list of the flags set for the system. A common type is an LCD lock flag, used to ensure that other processes wait until the process holding the lock completes its action. The following is the standard LCD lock format:

!action!processID!time!machine:id

These are examples of general LCD lock flags:

  • !action!02833!701236710!server1:filesys – The creation of a file system hierarchy produces a flag in this format in the status display. The filesys designation can be a different resource type for other application resource hierarchies, or app for generic or user-defined applications.
  • Other typical flags include !nofailover!machine, !notarmode!machine, and shutdown_switchover. The !nofailover!machine and !notarmode!machine flags are internal, transient flags that LifeKeeper creates and deletes to control aspects of server failover. The shutdown_switchover flag indicates that the shutdown strategy for this server has been set to switchover, so that shutting down the server causes a switchover. See the LCDI-flag(1M) man page for more detailed information on the possible flags.
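
The standard LCD lock format can be split into its parts with a few lines of code. This Python sketch relies only on the format and example given above; the names LockFlag and parse_lock_flag are hypothetical. The process ID and time fields are kept as strings, because this topic does not define how they are encoded.

```python
from collections import namedtuple

# Fields follow the format !action!processID!time!machine:id
LockFlag = namedtuple("LockFlag", "action process_id time machine resource_id")


def parse_lock_flag(flag):
    """Split a flag of the form !action!processID!time!machine:id.

    Return None for flags that do not have this shape, such as
    shutdown_switchover or !nofailover!machine.
    """
    if not flag.startswith("!"):
        return None
    parts = flag[1:].split("!")
    if len(parts) != 4 or ":" not in parts[3]:
        return None
    action, process_id, time_field, target = parts
    machine, _, resource_id = target.partition(":")
    return LockFlag(action, process_id, time_field, machine, resource_id)


lock = parse_lock_flag("!action!02833!701236710!server1:filesys")
print(lock.machine, lock.resource_id, lock.process_id)
print(parse_lock_flag("shutdown_switchover"))
```

Returning None for flags in other formats lets a caller separate lock flags from the remaining flags in the list without raising an error.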

Shutdown Strategy

The last item on the detailed status display identifies the LifeKeeper shutdown strategy selected for this system. See Setting Server Shutdown Strategy for more information.
