
6. High Availability - Non-Responsive Host

A host is deemed non-responsive when the Red Hat Enterprise Virtualization Manager cannot communicate with the Red Hat Enterprise Virtualization agent on the host. The cause can be either a networking issue or a failure on the host itself (for example, a kernel panic or a power failure) that stops all communication with the host.
When a host is non-responsive, it is fenced so that its virtual machines can safely restart on other hosts in the cluster. Fencing avoids "split brain", a situation in which communication with the host is lost while its virtual machines are still partially running, so that restarting them elsewhere could leave two instances of the same virtual machine using the same storage. The procedure below simulates this scenario: you disconnect the host's management network while its storage connection remains functional.
At this stage, the Pacific host is non-operational as its storage connection was cut in the previous section.
Figure 57. Non-operational Pacific host

Restart the host for the next demonstration. On the Tree pane, select the Pacific host. Click the Power Management button and select Restart. Restarting the host through power management fences it; when it reboots, the storage and eth1 networks come up again and the host runs as normal.
When the host's status changes to Up, migrate several virtual machines onto it. This example uses RHEL6RioGrande and RHEL6Thames (both highly available) and RHEL6Erie. Because you disabled the cluster policy at the beginning of this lab, these virtual machines do not migrate back automatically when the host comes up, so you must migrate them to the Pacific host manually.
Figure 58. Migrate virtual machines to Pacific host

To demonstrate high availability when the host connection is disrupted
  1. On the Tree pane, click Hosts. On the Hosts tab, select the Pacific host, and click the Network Interfaces subtab on the details pane. Note the name of the physical interface that carries the rhevm network; in this example it is eth0.
  2. As before, connect to the Pacific host via SSH. Disable the management network by running:
    # ifdown rhevm
    You have now shut down the network connecting the Pacific host to the Red Hat Enterprise Virtualization Manager. The next time the Manager tries to communicate with the host and fails, it triggers the automatic fencing operation.
  3. From the Tree pane, click VMs to display the Virtual Machines tab. The highly available virtual machines, RHEL6RioGrande and RHEL6Thames, have restarted on the Atlantic host. RHEL6Erie, in contrast, did not restart because it was not configured to be highly available.
    Figure 59. Highly Available virtual machines automatically migrated

  4. Finally, go to the Tree pane and click Hosts to examine the status of the hosts. After a short period, the Pacific host will be rebooted, assuming that power management was successfully configured on this host.
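Before you disable the management network in step 2, the interface name found in step 1 can be cross-checked from the host's shell. The following is a minimal sketch, assuming that the rhevm network is implemented on the host as a Linux bridge named rhevm, so that its attached interfaces are listed under the bridge's brif directory in sysfs. The function name bridge_ports and its optional second argument are hypothetical and not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: print the interfaces attached to a Linux bridge.
# Assumption: the rhevm network is a bridge device on the host.
# The optional second argument overrides the sysfs root and exists
# only so the function can be exercised against a test directory.
bridge_ports() {
    bridge=$1
    sysfs_root=${2:-/sys/class/net}
    brif_dir="$sysfs_root/$bridge/brif"

    if [ ! -d "$brif_dir" ]; then
        echo "bridge_ports: $bridge is not a bridge or does not exist" >&2
        return 1
    fi

    # Each entry under brif/ is named after one attached interface.
    for port in "$brif_dir"/*; do
        # Skip the unexpanded pattern when the bridge has no ports.
        [ -e "$port" ] || [ -L "$port" ] || continue
        basename "$port"
    done
}
```

Running bridge_ports rhevm on the Pacific host should print the same interface name that the Network Interfaces subtab shows, provided the assumption about the bridge holds.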
You have just run a demonstration in which a non-responsive host was automatically fenced and rebooted. Because you simulated a non-persistent network failure, the host recovers from the fault after it reboots. While the host is restarting, the highly available virtual machines that were running on it are restarted on another available host in the cluster. Virtual machines that are not highly available must be restarted manually.
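The outcome of the demonstration can be summarized as a small rule: whether a virtual machine comes back on its own depends only on its high availability setting. The sketch below models that rule for the three virtual machines used in this example. It is purely illustrative and does not query the Manager; the function name expected_outcome is hypothetical.

```shell
#!/bin/sh
# Illustrative model only: maps a virtual machine's high availability
# setting (yes or no) to what happens when its host is fenced.
expected_outcome() {
    case "$1" in
        yes) echo "restarted automatically on another host" ;;
        no)  echo "down until restarted manually" ;;
        *)   echo "expected_outcome: argument must be yes or no" >&2
             return 2 ;;
    esac
}

# The three virtual machines from this example, as name:setting pairs.
for entry in RHEL6RioGrande:yes RHEL6Thames:yes RHEL6Erie:no; do
    vm=${entry%%:*}   # text before the first colon
    ha=${entry#*:}    # text after the first colon
    printf '%s: %s\n' "$vm" "$(expected_outcome "$ha")"
done
```

The loop prints one line per virtual machine, which matches what the Virtual Machines tab showed in step 3: RHEL6RioGrande and RHEL6Thames restarted, RHEL6Erie did not.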