This section describes issues you may encounter when using DataKeeper for Linux. Where appropriate, it explains the cause of an error and the action needed to resolve it.

Messages specific to DataKeeper for Linux are listed in the DataKeeper Message Catalog. Messages from other SPS components may also appear. For those, refer to the Combined Message Catalog, which lists all error codes (operational, administrative, and GUI) that may be encountered while using SIOS Protection Suite for Linux and, where appropriate, explains the cause of each error code and the action needed to resolve it. You can search this full listing for any error code received, or go directly to the individual Message Catalog for the appropriate SPS component.

The following table lists possible problems and suggestions.

Symptom / Suggested Action
Symptom: After the primary server panics, the DataKeeper resource goes ISP on the secondary server, but when the primary server reboots, the DataKeeper resource becomes OSF on both servers.
Suggested Action: Check the “switchback type” selected when creating your DataKeeper resource hierarchy. Automatic switchback is not supported for DataKeeper resources in this release. You can change the switchback type to “Intelligent” from the resource properties window.
Symptom: DataKeeper GUI wizard does not list a newly created partition.
Suggested Action: The Linux OS may not recognize a newly created partition until the next reboot of the system. Check the /proc/partitions file for an entry for your newly created partition. If the new partition does not appear in the file, you will need to reboot your system.
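The /proc/partitions check described above can be scripted. The following is a minimal sketch, not part of the product; the function name `partition_listed` and the partition name `sdb1` are hypothetical, and the optional second argument exists only so the function can be pointed at a sample file.

```shell
#!/usr/bin/env bash
# Return 0 if the named partition (for example "sdb1") has an entry in the
# kernel partition table file, 1 otherwise.
partition_listed() {
    local name="$1" table="${2:-/proc/partitions}"
    # The device name is the fourth column of each data row in /proc/partitions.
    awk -v n="$name" '$4 == n { found = 1 } END { exit !found }' "$table"
}

# Usage (sdb1 is a hypothetical partition name):
#   if partition_listed sdb1; then
#       echo "kernel sees sdb1"
#   else
#       echo "sdb1 not listed; a reboot is needed"
#   fi
```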
Symptom: Errors during failover.
Suggested Action: Check the status of your device. If resynchronization is in progress, you cannot perform a failover.
Symptom: Error creating a DataKeeper hierarchy on a currently mounted NFS file system.
Suggested Action: You are attempting to create a DataKeeper hierarchy on a file system that is currently exported by NFS. You will need to replicate this file system before you export it.
Symptom: Extending to a target does not prompt for “Replication Type” to allow setting asynchronous or synchronous.
Suggested Action: When the mirror was created, “no” was selected for “Enable Asynchronous Replication.” Delete the mirror and recreate it, selecting “yes” for “Enable Asynchronous Replication” when prompted.
Symptom: Installation/HADR rpm fails.
Suggested Action: See Installation for complete instructions on manually installing these files.
Symptom: NetRAID device not deleted after DataKeeper resource deletion.

Suggested Action: Deleting a DataKeeper resource will not delete the NetRAID device if the NetRAID device is mounted. You can manually unmount the device and delete it by executing:

mdadm -S <md_device> (run cat /proc/mdstat to determine the <md_device>).
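As a rough illustration of finding the <md_device> value, the sketch below extracts array names from /proc/mdstat. The function name `list_md_devices` and the device /dev/md0 in the usage comment are hypothetical. Confirm which array belongs to the deleted resource before stopping anything.

```shell
#!/usr/bin/env bash
# Print the /dev path of every md array listed in an mdstat-format file.
list_md_devices() {
    local mdstat="${1:-/proc/mdstat}"
    # Array lines start with the device name followed by " :",
    # for example "md0 : active raid1 ...".
    awk '$1 ~ /^md/ && $2 == ":" { print "/dev/" $1 }' "$mdstat"
}

# Usage (run as root; /dev/md0 is a hypothetical device):
#   list_md_devices
#   umount /dev/md0
#   mdadm -S /dev/md0
```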

Symptom: Primary server cannot bring the resource ISP when it reboots after both servers became inoperable.
Suggested Action: If the primary server becomes operable before the secondary server, you can force the DataKeeper resource online by opening the resource properties dialog, clicking the Replication Status tab, clicking the Actions button, and then selecting Force Mirror Online. Click Continue to confirm, then Finish.
Symptom: Replication Type is asynchronous instead of synchronous.
Suggested Action: Replication between two systems was initially configured for asynchronous replication, but synchronous replication is required instead. Unextend the mirror and extend it again, selecting “synchronous” when prompted for the connection.
Symptom: Replication Type is synchronous instead of asynchronous.
Suggested Action: Replication between two systems was initially configured for synchronous replication, but asynchronous replication is required instead. Unextend the mirror and extend it again, selecting “asynchronous” when prompted for the connection.
Symptom: Resources appear green (ISP) on both primary and backup servers.

Suggested Action: This is a “split-brain” scenario that can be caused by a temporary communications failure. After communications are resumed, both systems assume they are primary.

DataKeeper will not resync the data because it does not know which system was the last primary system. Manual intervention is required.

If not using a bitmap:

You must determine which server was the last backup, then take the resource out of service on that server. DataKeeper will then perform a FULL resync.

If using a bitmap (kernel 2.6.18 or earlier):

You should take both resources out of service, starting with the original backup node. Then dirty the bitmap on the primary node by executing:

$LKROOT/lkadm/subsys/scsi/netraid/bin/bitmap -d /opt/LifeKeeper/bitmap_filesys

(where /opt/LifeKeeper/bitmap_filesys is the bitmap filename). This forces a full resync when the resource is brought into service. Next, bring the resource into service on the primary node and a full resync will begin.

If using a bitmap (kernel 2.6.19 or later, Red Hat Enterprise Linux 5.4 kernel 2.6.18-164 or later, or a supported derivative of Red Hat 5.4 or later):

You must determine which server was the last backup, then take the resource out of service on that server. DataKeeper will then perform a partial resync.
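To decide which of the two bitmap cases applies, the kernel release reported by `uname -r` can be compared against the thresholds given above. The following is only a sketch: the function name `resync_mode_for_kernel` is hypothetical, it relies on `sort -V` from GNU coreutils, and it assumes that any 2.6.18-N kernel with N of 164 or higher is a Red Hat Enterprise Linux 5.4 (or derivative) kernel, which the release string alone cannot confirm.

```shell
#!/usr/bin/env bash
# Classify a kernel release string for the bitmap cases described above.
# Prints "partial" when the partial-resync procedure applies and "full" when
# the bitmap must be dirtied to force a full resync.
resync_mode_for_kernel() {
    local release="$1"
    local base="${release%%-*}"       # text before the first "-", e.g. "2.6.18"
    local rest="${release#*-}"        # text after the first "-", e.g. "164.el5"
    local build="${rest%%[!0-9]*}"    # leading digits of that text, e.g. "164"

    # sort -V orders version strings; if 2.6.19 sorts first (or ties),
    # the running kernel is 2.6.19 or later.
    if [ "$(printf '%s\n' "2.6.19" "$base" | sort -V | head -n 1)" = "2.6.19" ]; then
        echo partial
    elif [ "$base" = "2.6.18" ] && [ "${build:-0}" -ge 164 ]; then
        # ASSUMPTION: treated as a RHEL 5.4 or derivative kernel.
        echo partial
    else
        echo full
    fi
}

# Usage:
#   resync_mode_for_kernel "$(uname -r)"
```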

Symptom: Target(s) are out of sync waiting for the previous source.
Suggested Action: Connect the previous source to the cluster. If the previous source cannot rejoin the cluster in a timely manner, the targets can be reconnected with a full resync by running the command “$LKROOT/bin/mirror_action fullresync <source> <target>” on the current mirror source.
Symptom: Core – Language Environment Effects

Suggested Action: Some LifeKeeper scripts parse the output of Linux system utilities and rely on certain patterns in order to extract information. When some of these commands run under non-English locales, the expected patterns are altered and LifeKeeper scripts fail to retrieve the needed information. For this reason, the language environment variable LC_MESSAGES has been set to the POSIX “C” locale (LC_MESSAGES=C) in /etc/default/LifeKeeper.

It is not necessary to install Linux with the language set to English (any language variant available with your installation media may be chosen); the setting of LC_MESSAGES in /etc/default/LifeKeeper will only influence LifeKeeper. If you change the value of LC_MESSAGES in /etc/default/LifeKeeper, be aware that it may adversely affect the way LifeKeeper operates. The side effects depend on whether message catalogs are installed for various languages and utilities and whether they produce text output that LifeKeeper does not expect.
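To confirm that the setting has not been changed, the value can be read back from the defaults file. This is a minimal sketch that assumes the file uses plain NAME=value lines; the function name `lc_messages_setting` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the value assigned to LC_MESSAGES in a LifeKeeper defaults file.
lc_messages_setting() {
    local defaults="${1:-/etc/default/LifeKeeper}"
    # Print the text after "LC_MESSAGES=" on uncommented lines, keep the last
    # assignment, and strip optional quotes.
    sed -n 's/^[[:space:]]*LC_MESSAGES=//p' "$defaults" | tail -n 1 | tr -d "\"'"
}

# Usage:
#   [ "$(lc_messages_setting)" = "C" ] || echo "LC_MESSAGES is not set to C"
```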
Symptom: GUI – GUI login prompt may not re-appear when reconnecting via a web browser after exiting the GUI.

Suggested Action: When you exit or disconnect from the GUI applet and then try to reconnect from the same web browser session, the login prompt may not appear.

Workaround: Close the web browser, re-open the browser and then connect to the server. When using the Firefox browser, close all Firefox windows and re-open.

Symptom: DataKeeper Create Resource fails.

Suggested Action: When using DataKeeper in certain environments (e.g., virtualized environments with IDE disk emulation, servers with HP CCISS storage, or solid state devices (SSD)), an error may occur when a mirror is created:

ERROR 104052: Cannot get the hardware ID of the device “dev/hda3”

This is because LifeKeeper does not recognize the disk in question and cannot get a unique ID to associate with the device.

Workaround: Create a GUID partition and assign a unique ID to the partition. If you cannot create a GUID partition, add a disk pattern to the DEVNAME device_pattern file. For example:

  # cat /opt/LifeKeeper/subsys/scsi/resources/DEVNAME/device_pattern
  /dev/hda*
  /dev/fio*    (Fusion IO SSD)
  /dev/hio*    (PCI SSD)
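Before creating the mirror, it can help to check whether a device name is covered by a pattern in the file. The sketch below treats each line as a shell glob, which is an assumption: the matching rules LifeKeeper itself applies are not described here. It also assumes the parenthetical notes in the listing above are annotations and not file contents. The function name `device_matches_pattern` is hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if the device path matches any pattern listed in the pattern file.
device_matches_pattern() {
    local dev="$1"
    local file="${2:-/opt/LifeKeeper/subsys/scsi/resources/DEVNAME/device_pattern}"
    local pattern
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        [ -z "$pattern" ] && continue      # skip blank lines
        # An unquoted expansion in a case arm is matched as a shell glob.
        case "$dev" in
            $pattern) return 0 ;;
        esac
    done < "$file"
    return 1
}

# Usage (/dev/hda3 is the device from the error message above):
#   device_matches_pattern /dev/hda3 || echo "no pattern covers /dev/hda3"
```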
Symptom: In Multi-Site Cluster configurations, full synchronization sometimes occurs when stopping and resuming DataKeeper.
Suggested Action: This can be caused by an incorrect bitmap file location. Bitmap files must be located on a file system shared between the local nodes in the Multi-Site Cluster. This shared file system must be different from the shared file system used for replication. Once this has been corrected, the hierarchy must be removed and recreated using the correct shared file system for replication and the correct shared file system for the bitmap.
Symptom: The status of the mirror target becomes “Out of Sync” after upgrading.

Suggested Action: The use of NU devices is not recommended for LifeKeeper 9.2.2 or later. A “mirror out of sync” problem occurs in environments where DataKeeper resources are configured with NU devices.
When upgrading to LifeKeeper 9.2.2 or later, add the following settings to /etc/default/LifeKeeper if NU devices are used:


How to check whether NU devices are used:
Run the lcdstatus command. If a resource instance ID field contains a character string beginning with NU-, then NU devices are used.
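The lcdstatus check can be automated with a small filter. This is a sketch under one assumption: because the column layout of lcdstatus output is not shown here, it scans every whitespace-delimited field for the NU- prefix instead of a specific ID column. The function name `uses_nu_devices` is hypothetical.

```shell
#!/usr/bin/env bash
# Read lcdstatus output on stdin; return 0 if any whitespace-delimited field
# begins with "NU-", 1 otherwise.
uses_nu_devices() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^NU-/) found = 1 }
         END { exit !found }'
}

# Usage:
#   if lcdstatus | uses_nu_devices; then
#       echo "NU devices are in use"
#   fi
```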



