5. High Availability - Non-Operational Host
Now that you have tested high availability in cases of virtual machine failure, you can examine a scenario in which the failure occurs on the host's side.
When a host's status is displayed as Non-Operational, the host is accessible from the Red Hat Enterprise Virtualization Manager but cannot serve as a member of a cluster. This can be due to a fault, for example if a logical network is down or if storage is inaccessible to the host. It can also be due to a configuration mismatch, for example if the host CPU type is incompatible with the cluster or if the host is missing a logical network.
A non-operational host cannot run new virtual machines. Any virtual machines that were running on the host before it became non-operational are migrated to other hosts in the cluster.
This section demonstrates high availability when a host is non-operational because it is disconnected from the system's external storage resource. You can examine the outcomes of two different cases: when the host is acting as the Storage Pool Manager (SPM), and when it is not.
Before you begin these demonstrations, ensure that all your virtual machines are running. In addition, check which host has been configured as the SPM by clicking Hosts on the Tree pane. In this example, the Pacific host is the SPM.
To demonstrate virtual machine high availability when the storage network is down (Host: SPM)
On the Hosts tab, select the Pacific host to display its details pane. Click the Network Interfaces subtab. You should have at least one rhevm network and one storage network, as configured in Section 5, “Configure Logical Networks”. In this example, the storage network is allocated on eth1.
Next, on the Tree pane click ISCSI-share. On the Storage tab, select the ISCSI-share domain and click Edit. On the Edit Domain dialog, click the + symbol to display the iSCSI target. Note that the storage target's address is in the same subnet as the storage network's, as seen under Address.
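If you prefer to confirm the subnet match from a shell rather than by eye, the comparison can be sketched as below. This is only an illustration: the function names `ip_to_int` and `same_subnet`, the two addresses, and the /24 prefix length are assumptions made for the example, not values from this guide. Substitute the addresses shown in the Edit Domain dialog and on the Network Interfaces subtab.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# same_subnet ADDR1 ADDR2 PREFIX
# Print "same" if both addresses share a network for the given prefix
# length, otherwise print "different".
same_subnet() (
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    if [ $(( a & mask )) -eq $(( b & mask )) ]; then
        echo same
    else
        echo different
    fi
)

# Hypothetical values: host storage interface, iSCSI target, /24 prefix.
same_subnet 192.168.10.21 192.168.10.50 24   # prints "same"
```

The check masks both addresses with the prefix and compares the resulting network parts, which is the same test you perform visually in the dialog.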
Now that you have determined the name and physical interface of your storage network, connect to the Pacific host via SSH. Check your available networks by running the following command:
[root@pacific ~]# ifconfig
Once you have confirmed the names of the storage network and its physical interface (storage and eth1 in this example), shut them both down:
[root@pacific ~]# ifdown storage
[root@pacific ~]# ifdown eth1
You have now shut down the network between the Pacific host and the ISCSI-share storage. On the Hosts tab, the Pacific host changes to Non Operational, then to Reboot.
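If the status on the Hosts tab does not change, it can help to confirm on the host that the interface really went down. The sketch below reads the link state that the Linux kernel exposes under /sys/class/net. The function name `iface_state` and its optional second argument (an alternative sysfs root, included only so the function can be tried against a scratch directory) are assumptions made for this example.

```shell
# iface_state IFACE [SYSFS_ROOT]
# Print the kernel-reported state of an interface ("up", "down", ...).
# An interface that does not exist is reported as "missing".
# SYSFS_ROOT is a hypothetical override for testing; it defaults to
# /sys/class/net.
iface_state() (
    iface=$1
    root=${2:-/sys/class/net}
    if [ -r "$root/$iface/operstate" ]; then
        cat "$root/$iface/operstate"
    else
        echo missing
    fi
)

# eth1 is the storage interface used in this example.
iface_state eth1
```

Run this soon after the ifdown commands, because a host acting as the SPM is rebooted shortly afterwards.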
Because the Pacific host was configured as the SPM, it is automatically rebooted. The highly available virtual machines are restarted on the other available host in the cluster (Atlantic, in this case), while the non-highly available ones are suspended. However, once the Pacific host is up and running again, the virtual machines which were originally running on it are migrated back to it in order to balance the workload between all hosts in the cluster.
You have now demonstrated high availability when the connection is disrupted between the storage and the host which is the SPM. However, storage disconnection can also occur with a host which is not acting as the SPM. In this case, the host moves into non-operational status, and its virtual machines migrate to other hosts in the cluster.
Previously, the Pacific host was running as the SPM. However, the role of SPM can be transferred, because it has to be filled by a running host. After Pacific had been rebooted, the Atlantic host gained the status of SPM. Therefore, in this example the Pacific host is used again, as it can now play the role of the host which is not the SPM. Before running the next procedure, migrate several virtual machines from Atlantic to Pacific.
To demonstrate virtual machine high availability when the storage network is down (Host: non-SPM)
As before, connect to the Pacific host via SSH. Check your available networks by running the following command:
[root@pacific ~]# ifconfig
Once you have confirmed the names of the storage network and its physical interface (storage and eth1 in this example), shut them both down:
[root@pacific ~]# ifdown storage
[root@pacific ~]# ifdown eth1
You have now shut down the network between the Pacific host and the ISCSI-share storage. If new data is being written onto the virtual machine's disk, the virtual machine detects that the storage connection has been lost, and pauses itself to prevent loss of data. When this happens, the Hosts tab shows that the status of the Pacific host has changed to Non Operational.
Click the VMs icon on the Tree pane to examine the virtual machines. All the virtual machines which were originally running on the Pacific host are automatically migrated to the Atlantic host. The highly available machines, which were set as high priority, are migrated before the non-highly available ones.
When virtual machines are live migrated, they do not experience any downtime. In rare cases, they are paused and then resumed on the host they have been migrated to.
The difference between these two scenarios is that when the host is the SPM, it is fenced, which terminates its virtual machines, so only the highly available virtual machines are restarted. When the host is not the SPM, all of its virtual machines are migrated in order of their assigned priority, and the host is left in non-operational mode.
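The two outcomes can be summarized as a small lookup, which may be handy as a checklist while you repeat the demonstrations. This is only a restatement of the behavior described in this section, not a query against the Manager. The function name `expected_outcome` and its argument values are invented for this example.

```shell
# expected_outcome ROLE HA
#   ROLE: "spm" or "non-spm" (role of the host that lost its storage)
#   HA:   "yes" or "no"      (whether the virtual machine is highly available)
# Prints the behavior described in this section for that combination.
expected_outcome() (
    case "$1:$2" in
        spm:yes)     echo "host fenced; VM restarted on another host" ;;
        spm:no)      echo "host fenced; VM not restarted" ;;
        non-spm:yes) echo "host non-operational; VM migrated first" ;;
        non-spm:no)  echo "host non-operational; VM migrated after highly available VMs" ;;
        *)           echo "unknown"; exit 1 ;;
    esac
)

expected_outcome spm yes       # prints "host fenced; VM restarted on another host"
expected_outcome non-spm no
```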