The first cluster node installation completed without too much effort. In a normal cluster where all the cluster nodes reside in the same subnet, the installation of the second cluster node would run just as smoothly. However, because this is AWS and the nodes reside in different subnets, there are some steps needed to address the unique requirements of a multi-subnet cluster.
Before we proceed, we have to fix the A records we created earlier. It was necessary to create those A records so that the first SAP node could be created properly. However, you will see that those A records in DNS are “static”. Static records cannot be updated by WSFC, which is necessary in a multi-subnet cluster.
Delete the two A records we created earlier so that WSFC can re-register them as dynamic records. In DNS Manager, right-click each of the two A records and delete it.
In the WSFC Manager, bring the two name resources offline in each of the two cluster groups and then bring the resources back online. This process will re-register the two A records in DNS. This time they will be dynamic records.
Refresh your DNS zone. The two A records should now show a timestamp instead of “static”, and the zone should look like this.
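If you prefer to check the records from PowerShell rather than DNS Manager, a minimal sketch follows. It assumes the DnsServer module (RSAT DNS tools) is installed; the zone name, DNS server name, and host names are placeholders and not values from this article.

```powershell
# Placeholders (assumptions): replace with your own zone, DNS server, and the
# host names registered by the two cluster name resources.
$zone      = 'example.local'
$dnsServer = 'dc01.example.local'
$hostNames = 'sapdab', 'sapdabers'

foreach ($name in $hostNames) {
    # A dynamically registered record carries a timestamp; a static record does not.
    Get-DnsServerResourceRecord -ZoneName $zone -ComputerName $dnsServer -Name $name -RRType A |
        Select-Object HostName, Timestamp, TimeToLive
}
```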
By default, the time to live (TTL) on each of those A records is 20 minutes. That is much too long for a client to wait after a failover to receive the new IP address. Instead we are going to adjust the TTL to 15 seconds.
To adjust the TTL on a cluster name resource, run the following PowerShell command once for each name resource. This command can be run from either cluster node.
Get-ClusterResource -Name "SAP DAB ERS NetName" | Set-ClusterParameter -Name HostRecordTTL -Value 15
Get-ClusterResource -Name "SAP DAB NetName" | Set-ClusterParameter -Name HostRecordTTL -Value 15
A multi-subnet cluster can handle client redirection in a few ways. It’s beyond the scope of this article to discuss the differences. To handle SAP client redirection, we need to ensure the RegisterAllProvidersIP property of each cluster name resource is set to 0. Run the following PowerShell command once for each name resource. Because the setting is stored in the cluster configuration, it can be run from either cluster node.
Get-ClusterResource -Name "SAP DAB ERS NetName" | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 0
Get-ClusterResource -Name "SAP DAB NetName" | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 0
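To confirm that both parameters were stored, you can read them back. This is only a verification sketch; it uses the two name resources from this article and changes nothing.

```powershell
# Read back the two parameters set above for each cluster name resource.
$netNames = 'SAP DAB ERS NetName', 'SAP DAB NetName'

foreach ($name in $netNames) {
    Get-ClusterResource -Name $name |
        Get-ClusterParameter |
        Where-Object { $_.Name -in 'HostRecordTTL', 'RegisterAllProvidersIP' } |
        Select-Object ClusterObject, Name, Value
}
```

You should see HostRecordTTL reported as 15 and RegisterAllProvidersIP as 0 for both resources.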
It is important to take both of the cluster name resources offline and bring them online again to ensure that the changes made in the last two sections are applied.
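The offline/online cycle can also be scripted. The sketch below assumes that taking a name resource offline also takes its dependent resources offline, so it restarts the whole owning role afterwards. Run it in a maintenance window, since the SAP services in each role will be interrupted.

```powershell
# Cycle each cluster name resource so the new parameters take effect.
$netNames = 'SAP DAB ERS NetName', 'SAP DAB NetName'

foreach ($name in $netNames) {
    $resource = Get-ClusterResource -Name $name

    Stop-ClusterResource  -Name $name   # dependents of the name go offline too
    Start-ClusterResource -Name $name   # re-registers the A record in DNS

    # Bring the remaining resources in the same role back online.
    Start-ClusterGroup -Name $resource.OwnerGroup.Name
}
```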
Add IP Address Resource to Support Multi-Subnet Cluster
When configuring clusters for applications like SQL Server, the installer recognizes when the cluster is a multi-subnet cluster. However, the SAP installer does NOT recognize that fact, so we need to perform some of the configuration steps manually. One of those steps is to create the cluster IP address resources that reside in the subnet of the secondary node. The steps are as follows:
- Create the IP Resources
- Assign the IP Address
- Create the “Or” Dependency
Follow the screenshots below to complete these steps on one of the cluster nodes for each of the two cluster resource groups.
Add a new IP address.
Configure the IP address so it is associated with the Network of the secondary node. Give it one of the unused secondary addresses configured earlier.
Make the server name resource dependent on this additional IP address using the “OR” functionality.
Complete the same process for the other server name resource as shown below.
It is normal for these addresses to be offline. They will only be online if the cluster workload is running in that subnet.
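The same three steps (create the IP resource, assign the address, create the “Or” dependency) can be sketched in PowerShell for one name resource. The new resource name, IP address, subnet mask, and cluster network name below are placeholders, not values from this article. The sketch also assumes the name resource currently depends on a single IP address resource.

```powershell
# Placeholders (assumptions): adjust all of these to your environment.
$netName    = 'SAP DAB NetName'        # name resource from this article
$newIpName  = 'SAP DAB IP Address 2'   # hypothetical name for the new IP resource
$address    = '10.0.2.101'             # unused secondary address in the second subnet
$subnetMask = '255.255.255.0'
$network    = 'Cluster Network 2'      # cluster network of the secondary node

# Create the IP address resource in the same role as the name resource.
$group = (Get-ClusterResource -Name $netName).OwnerGroup.Name

Add-ClusterResource -Name $newIpName -ResourceType 'IP Address' -Group $group |
    Set-ClusterParameter -Multiple @{
        Address    = $address
        SubnetMask = $subnetMask
        Network    = $network
        EnableDhcp = 0
    }

# Extend the existing dependency with an OR on the new IP address.
$existing = (Get-ClusterResourceDependency -Resource $netName).DependencyExpression
Set-ClusterResourceDependency -Resource $netName -Dependency "$existing or [$newIpName]"

# Show the resulting dependency expression.
Get-ClusterResourceDependency -Resource $netName
```

Repeat with the "SAP DAB ERS NetName" resource and a second unused address for the other role.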
Change Cluster Resource Restart Policy
On occasion, the SAP ASCS Service will fail to start upon a switchover or failover. This most often happens because the service depends on the clustered file share being available. With the TTL set to 15 seconds, we have observed that the file share is sometimes not yet available when the ASCS service tries to start, and the service then fails to come online. Simply bringing the resource online again after the failure usually fixes the issue. However, that requires user intervention and defeats the purpose of failover clustering.
The fix to this problem is to adjust the Maximum restarts and Delay between restarts properties of the ASCS Service resource to give the ASCS service a little time to come online if the file server resource IP is not yet available. The settings pictured below are some sample settings that should be more than sufficient. If you have a large complex DNS environment, increase these parameters to meet the needs of your environment.
The other parameter that can impact the reconnection is the TTL. We set it to 15 seconds earlier. However, if you have a large AD environment that takes time to update the DNS zones, you may need to decrease the TTL even further and/or allow more restart attempts and increase the delay between restarts.
For good measure, it is recommended to do the same for your SAP Instance Resource.
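The restart policy can also be set from PowerShell through the resource's common properties. In the sketch below, the resource names are placeholders (list the real ones with Get-ClusterResource) and the values are purely illustrative, not the settings from the screenshot. To my understanding, RestartPeriod and RestartDelay are expressed in milliseconds.

```powershell
# Placeholders (assumptions): replace with the names of your ASCS service and
# SAP instance resources as shown by Get-ClusterResource.
$resourceNames = 'SAP DAB 00 Service', 'SAP DAB 00 Instance'

foreach ($name in $resourceNames) {
    $resource = Get-ClusterResource -Name $name

    # Illustrative values only; tune them for your DNS environment.
    $resource.RestartThreshold = 5        # maximum restarts in the period
    $resource.RestartPeriod    = 900000   # period for restarts, in ms (15 minutes)
    $resource.RestartDelay     = 30000    # delay between restarts, in ms (30 seconds)

    # Read the policy back to confirm the change.
    $resource | Format-List Name, RestartThreshold, RestartPeriod, RestartDelay
}
```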
Adjust Permissions on USR Folder
Give the user who is performing the installation permissions on the USR folder that was created on the replicated volume. The folder will be on the D drive (or whatever replicated volume you used).
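If you want to grant the permissions from the command line, one possible approach uses icacls. The account name is a placeholder, and granting Full control is an assumption, since the article does not specify a permission level. Use the minimum level your installation requires.

```powershell
# Placeholders (assumptions): folder path on the replicated volume and the
# account that runs the SAP installer.
$folder  = 'D:\usr'
$account = 'EXAMPLE\sapinstall'

# Grant Full control (F), inherited by subfolders (CI) and files (OI).
icacls $folder /grant "${account}:(OI)(CI)F"

# Display the resulting access control list.
icacls $folder
```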
From the secondary node, confirm that you can see the file share that was created using the server name resource.
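A quick way to check this from a PowerShell prompt on the secondary node is to list the shares exposed by the server name. The name below is a placeholder for the DNS name of your SAP server name resource.

```powershell
# Placeholder (assumption): DNS name registered by the SAP server name resource.
$serverName = 'sapdab'

# List the shares published under that name; the SAP file share should appear.
net view "\\$serverName"
```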
Run the SAPINST on the Second Node
Follow the screenshots below to complete the installation of the additional cluster nodes.
If the above screen seems to hang here for a while, click on the message and the installation should start progressing again.
The next step will time out and fail after 5 minutes.
Installing the second node moved all of the cluster resources to the second node.
To get the installer to complete after it times out with the error, you may need to move both cluster roles back to the primary node before retrying.
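Moving the roles back can be done in Failover Cluster Manager or from PowerShell. The sketch below looks up the roles through the two name resources used in this article. The primary node name is a placeholder.

```powershell
# Placeholder (assumption): replace with the name of your primary cluster node.
$primaryNode = 'PRIMARYNODE'

# Find the roles that own the two SAP name resources.
$groups = 'SAP DAB ERS NetName', 'SAP DAB NetName' |
    ForEach-Object { (Get-ClusterResource -Name $_).OwnerGroup.Name }

foreach ($group in $groups) {
    Move-ClusterGroup -Name $group -Node $primaryNode
}

# Confirm where each role is running and that it is online.
Get-ClusterGroup | Select-Object Name, OwnerNode, State
```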
Once both cluster roles are back in service on the primary node, click the Retry button in the SAP installer. The installation will then complete as shown below.