SAP HANA resource fails restore due to inconsistent or out-of-date data

The SAP HANA Recovery Kit will not bring the SAP HANA resource (database) in-service on a system that may have inconsistent or out-of-date data. To maintain data consistency, the following flags are used:

Flag Name | Description
!HANA_DATA_OUT_OF_SYNC_<tag> | This flag is used by the kit to determine that data on a system is not up-to-date or may be inconsistent. This flag will be set on all standby systems when the resource is brought in-service on a system. This flag is removed for a system when replication is established and the HSR status is “ACTIVE” to that system.
!HANA_LAST_OWNER_<tag> | This flag is used by the kit to show where the data has been in-service (Primary) and may have unsynchronized data. This flag is set on the system that has a resource in-service. When a resource is stopped (taken out-of-service) on a system and all standby systems are in-sync (HSR status “ACTIVE”) then the flag is removed.
!HANA_DATA_CONSISTENCY_UNKNOWN_<tag> | This flag is used by the kit when it is unable to determine if the data on a system is up-to-date. This flag is set on a system during startup if any system is not responding, if any system is in-service or any system has the LAST_OWNER flag. During LifeKeeper startup if all other systems are accessible, none are in-service and none have the LAST_OWNER flag then the DATA_CONSISTENCY_UNKNOWN flag is removed on all systems. This flag is removed during restore as replication is established to standby systems.
!HANA_STARTUP_DONE_<tag> | This flag is used to avoid a race between the startup routine and an in-service operation. This flag will be created when the startup script finishes to allow an in-service while LifeKeeper initialization is being done. When an in-service operation completes this flag will be removed.
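
The set and remove rules for the OUT_OF_SYNC and LAST_OWNER flags can be pictured with a small in-memory model. This is only an illustrative sketch: the `FlagModel` class and its method names are hypothetical, it does not call LifeKeeper or read real flags, and it covers only the transitions described in the table above.

```python
# Flag name templates taken from the table; <tag> is the resource tag.
OUT_OF_SYNC = "!HANA_DATA_OUT_OF_SYNC_{tag}"
LAST_OWNER = "!HANA_LAST_OWNER_{tag}"


class FlagModel:
    """Hypothetical in-memory model of per-system flags (not the kit's code)."""

    def __init__(self, nodes, tag):
        self.nodes = list(nodes)
        self.tag = tag
        self.flags = {node: set() for node in self.nodes}

    def _name(self, template):
        return template.format(tag=self.tag)

    def bring_in_service(self, primary):
        # LAST_OWNER is set on the system that has the resource in-service;
        # OUT_OF_SYNC is set on every standby system.
        self.flags[primary].add(self._name(LAST_OWNER))
        for node in self.nodes:
            if node != primary:
                self.flags[node].add(self._name(OUT_OF_SYNC))

    def replication_active(self, standby):
        # HSR status "ACTIVE" to a standby removes its OUT_OF_SYNC flag.
        self.flags[standby].discard(self._name(OUT_OF_SYNC))

    def take_out_of_service(self, primary):
        # LAST_OWNER is removed only if all standby systems are in-sync.
        standbys = [n for n in self.nodes if n != primary]
        if all(self._name(OUT_OF_SYNC) not in self.flags[n] for n in standbys):
            self.flags[primary].discard(self._name(LAST_OWNER))
```

In this model, stopping the resource on node1 before replication to node2 is active leaves the LAST_OWNER flag on node1, which is the condition behind the "last owner" error described later in this topic.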

During startup of LifeKeeper the SAP HANA Recovery Kit will check the resource status on each system in the cluster.

  • If any system has the SAP HANA resource in-service then startup will let the quickCheck process on the in-service system synchronize the data.
  • If any system in the cluster is not responding during startup, the SAP HANA Recovery Kit will wait until all systems are available to determine consistency of the data before allowing the SAP HANA resource in-service.
    • When all systems rejoin the cluster, LifeKeeper will automatically bring the resource in-service on the system set to automatically restore (has AUTORES_ISP set). If more than one server is set to automatically restore, the system administrator must decide on which system to force the resource in-service.
    • If all systems cannot rejoin the cluster in a timely manner, the administrator may choose to force the SAP HANA resource in-service. WARNING: This should only be done when the administrator knows there are no unsynchronized changes on the down system or is willing to lose those changes.
  • If there are two or more systems that have unsynchronized data, the system administrator must decide which system has the most up-to-date data and force the resource in-service on that system. This will result in all systems being synchronized with the data on the system that was forced in-service.
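
The three startup checks listed above can be summarized as a small decision function. This is a simplified sketch, not the Recovery Kit's implementation: the `SystemState` and `StartupOutcome` names are hypothetical, the order of the checks is an assumption, and the AUTORES_ISP handling and the single last-owner case are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class StartupOutcome(Enum):
    DEFER_TO_QUICKCHECK = "quickCheck on the in-service system synchronizes the data"
    WAIT_FOR_ALL_SYSTEMS = "wait until all systems are available, or an administrator forces"
    ADMIN_MUST_CHOOSE = "administrator forces in-service on the most up-to-date system"
    NO_BLOCKING_CONDITION = "none of the three listed conditions applies"


@dataclass
class SystemState:
    name: str
    responding: bool = True
    in_service: bool = False
    unsynchronized: bool = False  # system may hold changes not yet replicated


def startup_outcome(systems: Iterable[SystemState]) -> StartupOutcome:
    """Map cluster state to the outcome described in the bullets (assumed order)."""
    systems = list(systems)
    if any(s.responding and s.in_service for s in systems):
        return StartupOutcome.DEFER_TO_QUICKCHECK
    if any(not s.responding for s in systems):
        return StartupOutcome.WAIT_FOR_ALL_SYSTEMS
    if sum(s.unsynchronized for s in systems) >= 2:
        return StartupOutcome.ADMIN_MUST_CHOOSE
    return StartupOutcome.NO_BLOCKING_CONDITION
```

For example, a two-node cluster in which node1 is not responding yields `WAIT_FOR_ALL_SYSTEMS`, matching the case where the kit waits for every system before allowing the resource in-service.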

Related messages:

Trying to start HANA when “last owner” has unsynchronized data

ERROR:hana:restore:HANA-SPS_HDB00:136280:The resource HANA-SPS_HDB00 protecting SAP HANA database HDB00 has the last owner flag on node1 indicating there are changes on node1 that have not replicated to all systems.  The resource should be brought in-service on node1 to allow the data to replicate to all systems.

This situation can occur when the SAP HANA resource on node1 was taken out-of-service before all data was synchronized with all systems in the cluster. The system node2 was then started while the resource was out-of-service, which prevented node2 from determining whether its data was up-to-date. The resource should be brought in-service on node1 to synchronize data with node2.
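
The messages quoted in this topic share a colon-separated layout, so when searching a log it can help to split a line into its parts. The sketch below assumes that layout holds and that neither the tag nor the earlier fields contain a colon; the field names (`severity`, `kit`, `action`, `tag`, `code`, `text`) are labels chosen for this example, not official LifeKeeper terminology.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    severity: str  # e.g. ERROR or INFO
    kit: str       # e.g. hana
    action: str    # e.g. restore
    tag: str       # resource tag, e.g. HANA-SPS_HDB00
    code: int      # message number, e.g. 136280
    text: str      # human-readable message


def parse_entry(line: str) -> LogEntry:
    """Split one message into six fields; raises ValueError on other layouts."""
    severity, kit, action, tag, code, text = line.split(":", 5)
    return LogEntry(severity, kit, action, tag, int(code), text.strip())
```

Filtering parsed entries on `code == 136280` is one way to find every occurrence of the "last owner" message for a given tag.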

Multiple servers rebooted when SAP HANA resource is not in-service

ERROR:hana:restore:HANA-SPS_HDB00:136282:The SAP HANA database HDB00 consistency can not be determined on node2. The resource HANA-SPS_HDB00 can not be brought in-service until all servers are available to determine consistency of the data. To protect the data, LifeKeeper will not restore HANA-SPS_HDB00 on node2. To force the resource in-service use "lkcli hana force --sys node2 --tag HANA-SPS_HDB00" or the GUI option "Force In Service".

This situation can occur when node2 starts up, the SAP HANA resource is not in-service on any system, and one or more systems in the cluster are not responding. The starting server is unable to determine whether the systems that are not responding have unsynchronized changes. The SAP HANA Recovery Kit will not allow the SAP HANA resource to come in-service on node2 until all systems are available. The Recovery Kit uses the flag ‘!HANA_DATA_CONSISTENCY_UNKNOWN_<tag>’ when it cannot determine whether the database on a system is consistent or has up-to-date data. The flag is removed when consistency is determined.

If this error occurs during startup where the resource is set to automatically start on bootup (AUTORES_ISP is set in the resource instance) then the restore will automatically occur when all systems in the cluster are online.

Forcing the SAP HANA resource in-service

When the resource is forced in-service, either with the lkcli command shown in the error message or with the GUI option “Force In Service”, the following message will be displayed:

ERROR:hana:restore:HANA-SPS_HDB00:136285:The SAP HANA database HDB00 consistency can not be determined on node2.

Followed by:

INFO:hana:restore:HANA-SPS_HDB00:136286:The resource HANA-SPS_HDB00 protecting SAP HANA database HDB00 is being forced in-service.

Multiple-service with unsynchronized data

ERROR:hana:restore:HANA-SPS_HDB00:136281:The resource HANA-SPS_HDB00 protecting SAP HANA database HDB00 has the last owner flag on node1, node2 indicating there are changes on each system that were not replicated to all systems.  The resource should be brought in-service on the system with the most up-to-date data to allow the data to replicate to all systems.

This situation can occur when node1 has the resource in-service, fails, and does not reboot. The resource fails over to node2, which brings the resource in-service and operates for some time before it too fails. At this point both servers were “last owners” of the SAP HANA resource and may have unsynchronized changes. When node1 and node2 are repaired and restarted, LifeKeeper will not allow the SAP HANA resource to come in-service. The system administrator is required to force the resource in-service on the system with the best data. The Recovery Kit uses the flag ‘!HANA_LAST_OWNER_<tag>’ to determine where the database was last used as the source of the database. In this case both systems have this flag. When a resource is stopped and taken out-of-service while the data is in-sync on all servers, the last owner flag is removed.
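
When deciding which system to force in-service, it can be useful to pull the list of "last owner" systems out of the message text. The following sketch relies only on the wording of messages 136280 and 136281 as quoted in this topic; the function name is hypothetical, and the pattern would need adjusting if the message wording differs in your release.

```python
import re
from typing import List

# Matches the node list between "last owner flag on" and "indicating".
LAST_OWNER_NODES = re.compile(r"has the last owner flag on (.+?) indicating")


def last_owner_nodes(message: str) -> List[str]:
    """Return the systems named as last owners, or an empty list if none found."""
    match = LAST_OWNER_NODES.search(message)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]
```

A result with more than one system corresponds to the multiple last owner case, where the administrator must choose the system with the most up-to-date data.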

Preventing the resource from coming in-service

ERROR:hana:restore:HANA-SPS_HDB00:136284:To protect the data, LifeKeeper will not restore HANA-SPS_HDB00 on node1. To force the resource in-service use "lkcli hana force --sys node1 --tag HANA-SPS_HDB00" or the GUI option "Force In Service".

This message is displayed when LifeKeeper determines it is unsafe to bring the resource in-service. An earlier error message will indicate the unsafe issue. Follow the instructions given in the message to force the resource in-service if necessary.
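
Because the error message quotes the exact force command, a script can extract it for review instead of retyping it. The sketch below only parses the message and returns the argument list; it intentionally does not run the command, since forcing the resource in-service should remain an administrator decision. The function name is hypothetical, and the pattern assumes the message wording shown above.

```python
import re
import shlex
from typing import List, Optional

# Captures the quoted command exactly as it appears in the message text.
FORCE_COMMAND = re.compile(r'"(lkcli hana force --sys [^\s"]+ --tag [^\s"]+)"')


def force_command_from_message(message: str) -> Optional[List[str]]:
    """Return the quoted force command as an argument list, or None if absent."""
    match = FORCE_COMMAND.search(message)
    if match is None:
        return None
    return shlex.split(match.group(1))
```

For the message above, the returned list names node1 as the system and HANA-SPS_HDB00 as the tag, which can be checked against the intended target before the command is run by hand.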

Forcing the resource in-service

When the resource is forced in-service, the following message is displayed:

INFO:hana:restore:HANA-SPS_HDB00:136286:The resource HANA-SPS_HDB00 protecting SAP HANA database HDB00 is being forced in-service.
