Replacing a node in a Windows Failover Cluster that uses DataKeeper Cluster Edition mirrored volumes involves making the following changes:
- Move any DataKeeper Volume resources from the node being replaced to another node in the cluster
- Using the registry editor, remove the node being replaced from the DataKeeper job for each mirrored volume. For 1×1 mirrors (two-node cluster), delete the DataKeeper job instead.
- For each mirrored volume, use EMCMD to delete mirrors from the source system to the target node which is being replaced
- Evict the node from the cluster
- Bring up the replacement node and add it to the cluster
- Use the DataKeeper GUI to re-create mirrors to the new node
There are two cases that require slightly different steps to achieve node replacement. The first case involves a cluster node that has been lost and cannot be recovered; the second is a planned replacement, where the node is still up and running beforehand.
Within those two cases are two scenarios that also require slightly different steps. The first scenario is a two-node cluster, with one mirror for each clustered volume. The second scenario is a cluster of three or more nodes, or a two-node cluster with one or more nodes outside the cluster.
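The way the two cases and two scenarios combine can be summarized in a small sketch. This is only an illustration of the decision described in this topic; the `Plan` class, the `choose_plan` function, and their parameter names are hypothetical and are not part of DataKeeper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    shut_down_node_first: bool  # Case 2: the node is still running
    job_action: str             # what Step 2 does to each DataKeeper job
    deletemirror_form: str      # EMCMD syntax used in Step 3


def choose_plan(node_is_running: bool, other_nodes_remain: bool) -> Plan:
    """Pick the procedure variant (hypothetical helper, for illustration).

    other_nodes_remain is True for Scenario 2: three or more cluster nodes,
    or a two-node cluster with one or more nodes outside the cluster.
    """
    if other_nodes_remain:
        job_action = "remove the node's lines from each job's Endpoints value"
        form = "emcmd . deletemirror <vol> <target_ip>"
    else:
        job_action = "delete each job"
        form = "emcmd . deletemirror <vol>"
    return Plan(node_is_running, job_action, form)


# Case 1, Scenario 1: lost node in a two-node cluster.
print(choose_plan(node_is_running=False, other_nodes_remain=False))
```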
Case 1 – Node is lost and not recoverable
SCENARIO 1: Two-node cluster with no nodes outside the cluster
In this example, there is a two-node DKCE cluster. The cluster nodes are:
- W19-1
- W19-2
There are two mirrored volumes – E: and F:.
Node W19-2 has been lost and is not recoverable. It will be replaced with a new node, also named W19-2.
Step 1 – Move any DataKeeper Volume resources from the node being replaced to another node in the cluster
In this example, the DataKeeper Volume resources for the E: and F: volumes are already Online on node W19-1, so no move is needed.
Step 2 – Using the registry editor, delete the jobs that contain mirrored volumes.
DataKeeper Jobs are stored in the Windows registry, in the following registry key:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\ExtMirr\Parameters\Jobs
Start the registry editor and navigate to that key.
Each DataKeeper job has an ID, which appears as a subkey under the “Jobs” key. For example, this system has two jobs, “E” and “F”. The job IDs are listed in the output of the “emcmd . getjobinfo” command.
The registry shows the two Job IDs.
Each job should be deleted, unless it contains information about mirrored volumes that are not part of the cluster. To delete the job, right-click the key whose name is the job ID, and choose “Delete”. This removes the job completely from DataKeeper on this system.
Note: Some DataKeeper jobs contain information for more than one volume.
After completing this step on one of the cluster nodes, repeat it on all other cluster nodes.
Step 3 – For each mirrored volume, use EMCMD to delete mirrors from the source system to the target node which is being replaced
To delete a mirror using EMCMD, start a CMD prompt on the mirror source node. Then change directory to the DataKeeper install directory using the command “cd /d %ExtMirrBase%”.
To delete the mirror for a mirrored volume that has only one target, run this command:
emcmd . deletemirror <vol>
In this case, run the commands:
C:\Program Files (x86)\SIOS\DataKeeper>emcmd . deletemirror e
C:\Program Files (x86)\SIOS\DataKeeper>emcmd . deletemirror f
Step 4 – Evict the node from the cluster
At this point, the node has been completely removed from DataKeeper. The next step in the replacement process is to evict the node from the cluster. You can do this with Failover Cluster Manager or with the Remove-ClusterNode PowerShell cmdlet.
Step 5 – Bring up the replacement node and add it to the cluster
Configure the new node, adding storage as appropriate. Then add it to the cluster.
Step 6 – Use the DataKeeper GUI to re-create mirrors to the new node
Start the DataKeeper GUI, connect to the new node, and create a mirror to it within the appropriate job.
SCENARIO 2: Cluster with three or more nodes, or two-node cluster with one or more nodes outside the cluster
In this example, there is a three-node DKCE cluster. The cluster nodes are:
- W19-1
- W19-2
- W19-3
There are two mirrored volumes – E: and F:.
Node W19-3 has been lost and is not recoverable. It will be replaced with a new node, also named W19-3.
Step 1 – Move any DataKeeper Volume resources from the node being replaced to another node in the cluster
In this example, the DataKeeper Volume resources for the E: and F: volumes are already Online on node W19-1, so no move is needed.
Step 2 – Using the registry editor, remove the node being replaced from the DataKeeper job for each mirrored volume.
DataKeeper Jobs are stored in the Windows registry. To modify a job that includes a node that is no longer accessible, update the registry values associated with the job.
The jobs are stored in the following registry key:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\ExtMirr\Parameters\Jobs
Start the registry editor and navigate to that key.
Each DataKeeper job has an ID, which appears as a subkey under the “Jobs” key. For example, this system has two jobs, “E” and “F”. The job IDs are listed in the output of the “emcmd . getjobinfo” command.
The registry shows the two Job IDs. Navigate into one of them; it contains three values: Name, Description, and Endpoints.
To remove a node from a job, modify the Endpoints value. Double-click the Endpoints value and find any lines containing the node that is to be removed.
In this example, the second and third lines reference W19-3 and should be removed. Highlight them, press the Delete key, and click “OK” to save the value. The value should then contain a single line (for the mirror between W19-1 and W19-2).
Repeat these steps for all jobs. When completed, the output of “emcmd . getjobinfo” will reflect the new job contents.
Note: Some DataKeeper jobs contain information for more than one volume. In those cases, the same steps should be followed – remove any lines that contain references to the node being removed.
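The line-removal rule for the Endpoints value can be sketched as a small filter: keep every line that does not mention the node being removed. The function name is hypothetical, and the sample lines use a placeholder layout rather than the real Endpoints format; only the rule itself (drop any line that contains the node) comes from the steps above.

```python
import re


def drop_node_lines(endpoint_lines, node):
    """Return the lines that do not mention `node` as a whole host name."""
    # Match the name only when it is not part of a longer host name,
    # so "W19-3" does not also match "W19-30". Matching ignores case.
    pattern = re.compile(
        r"(?<![A-Za-z0-9-])" + re.escape(node) + r"(?![A-Za-z0-9-])",
        re.IGNORECASE,
    )
    return [line for line in endpoint_lines if not pattern.search(line)]


# Placeholder lines (NOT the real Endpoints layout): one per mirror.
endpoints = [
    "W19-1 <-> W19-2",
    "W19-1 <-> W19-3",
    "W19-2 <-> W19-3",
]
print(drop_node_lines(endpoints, "W19-3"))
```

With three mirrors among W19-1, W19-2, and W19-3, removing W19-3 leaves only the W19-1/W19-2 line, which matches the single remaining line described in the example.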
After completing this step on one of the cluster nodes, repeat it on all other cluster nodes. An alternative is to export the “Jobs” key to a file, and import that key on each of the other nodes. This ensures that job information is consistent across the nodes.
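The export/import alternative can be scripted with the built-in Windows `reg` tool. The sketch below only builds the argument lists; the helper names and the file path are hypothetical. Note that `reg import` merges: it overwrites values present in the file but does not delete keys that are absent from it, so it suits the edited Endpoints values of this scenario rather than propagating job deletions.

```python
# Registry path of the DataKeeper jobs, as given in Step 2.
JOBS_KEY = r"HKLM\System\CurrentControlSet\Services\ExtMirr\Parameters\Jobs"


def export_jobs_command(reg_file):
    """Arguments for exporting the Jobs key on the node already edited."""
    # /y overwrites an existing export file without prompting.
    return ["reg", "export", JOBS_KEY, reg_file, "/y"]


def import_jobs_command(reg_file):
    """Arguments for importing the file; run locally on each other node."""
    return ["reg", "import", reg_file]


# Hypothetical file location. On Windows, each list could be passed to
# subprocess.run(..., check=True) from an elevated prompt.
print(export_jobs_command(r"C:\Temp\dk-jobs.reg"))
print(import_jobs_command(r"C:\Temp\dk-jobs.reg"))
```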
Step 3 – For each mirrored volume, use EMCMD to delete mirrors from the source system to the target node which is being replaced
To delete a mirror using EMCMD, start a CMD prompt on the mirror source node. Then change directory to the DataKeeper install directory using the command “cd /d %ExtMirrBase%”.
To delete the mirror whose target is the node being removed, run the command:
emcmd . deletemirror <vol> <target_ip>
where <vol> is the volume letter and <target_ip> is the IP address of the node being removed. In this example, run the command once for volume E and once for volume F.
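When several volumes are mirrored, it can help to generate the EMCMD command lines up front and review them before running anything. The sketch below is a hypothetical helper covering both forms used in this topic: with a target IP (Scenario 2) and without one (a mirror with only one target). The IP address shown is a documentation placeholder, not the address of a real node.

```python
def deletemirror_commands(volumes, target_ip=None):
    """Build one 'emcmd . deletemirror' command line per volume."""
    commands = []
    for vol in volumes:
        parts = ["emcmd", ".", "deletemirror", vol]
        if target_ip is not None:
            # Scenario 2 form: name the target whose mirror is deleted.
            parts.append(target_ip)
        commands.append(" ".join(parts))
    return commands


# 192.0.2.13 is a placeholder; substitute the IP address of the node
# being removed. Run the printed commands from %ExtMirrBase% on the
# mirror source node.
for line in deletemirror_commands(["E", "F"], target_ip="192.0.2.13"):
    print(line)
```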
Step 4 – Evict the node from the cluster
At this point, the node has been completely removed from DataKeeper. The next step in the replacement process is to evict the node from the cluster. You can do this with Failover Cluster Manager or with the Remove-ClusterNode PowerShell cmdlet.
Step 5 – Bring up the replacement node and add it to the cluster
Configure the new node, adding storage as appropriate. Then add it to the cluster.
Step 6 – Use the DataKeeper GUI to re-create mirrors to the new node
Start the DataKeeper GUI, connect to the new node, and create a mirror to it within the appropriate job.
Case 2 – Node is running and can be accessed prior to being replaced
If you are planning to replace a cluster node with a new one, the steps are very similar to those for Case 1 – node is lost and not recoverable. Perform the following steps in order:
- Move any DataKeeper Volume resources from the node being replaced to another node in the cluster
- Shut down the node that is going to be replaced. After this point, do NOT restart this node, because its mirror and job configuration will be invalid.
- Follow the steps described in Case 1 – node is lost and not recoverable.