Red Hat Enterprise Virtualization HA [ha ha]

The previous post in this series on Red Hat Enterprise Virtualization (RHEV) explained that the RHEV Manager is not just a mission-critical component of the infrastructure; it’s a huge single point of failure as well.

What happens when the other major component of a RHEV infrastructure fails?  Can you rely on RHEV High Availability (HA) to quickly and reliably restart affected VMs when a RHEV Hypervisor fails? It depends – as you will see.

First, let’s make sure everyone is up to speed on HA capabilities provided by the gold standard in virtualization:

VMware HA

VMware HA is a robust feature that was first introduced with VMware Infrastructure 3 in 2006.

VMware vCenter Server is required to configure HA options and add VMware ESX hosts to a cluster, but after that vCenter is hands-off — ESX hosts communicate among themselves to reliably restart virtual machines.  In fact, VMware HA can even restart vCenter Server if it is running inside a protected VM — wrap your head around that one.

Powerful options are available for administrators, such as specifying the restart priority of virtual machines and whether or not to force VMs to power off if a host becomes isolated from the rest of the cluster.
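To make the idea of restart priority concrete, here is a minimal sketch of how a failover routine might order VMs before powering them back on. It is not VMware’s implementation: the `VM` class, the `restart_order` function, and the priority labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical priority labels; a lower rank is restarted earlier.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class VM:
    name: str
    restart_priority: str = "medium"


def restart_order(vms):
    """Return the VMs sorted so the most critical ones are restarted first.

    sorted() is stable, so VMs that share a priority keep their original order.
    """
    return sorted(vms, key=lambda vm: PRIORITY_RANK[vm.restart_priority])


vms = [VM("web-01", "low"), VM("db-01", "high"), VM("app-01")]
print([vm.name for vm in restart_order(vms)])  # ['db-01', 'app-01', 'web-01']
```

The point of the ordering is that dependencies such as a database come up before the workloads that rely on them.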

VMware has heavily invested in this technology, reducing risk for customers that virtualize with vSphere. For even more information on VMware HA, take a look at Duncan Epping’s HA Deep Dive.

RHEV HA [ha ha]

Looking at this Red Hat Enterprise Virtualization competitive comparison, you might assume that RHEV and vSphere are on equal footing when it comes to protecting virtual machines with HA:

Unsightly details behind the marketing

RHEV HA sounds great in the marketing brochure, but there are a few problems with the execution.  RHEV Manager is a single point of failure — running on a physical Windows box — and it’s also the actual brain behind HA.  Yes, RHEV-M is responsible for restarting virtual machines when a host fails.  If the manager is down, no HA for you!

That alone makes RHEV HA something less than “HA” for most production environments, but there are a few other key weaknesses:

  • HA must be manually enabled for each virtual machine — no cluster-wide settings
  • No cluster admission control — administrators must manually ensure sufficient capacity would be available in a cluster to accommodate a host failure
  • No VM restart priority to ensure the most critical workloads and dependencies are brought online first
  • Primitive split-brain protection requires IPMI or other out-of-band management interface to force a host shutdown
  • Cannot protect the RHEV Manager itself — chicken-and-egg situation
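Without admission control, the capacity check in the list above falls to the administrator. A rough sketch of that check follows, under simplifying assumptions: only memory is counted, the largest hosts are assumed to fail, and placement constraints are ignored. The `Host` class and `tolerates_host_failure` function are hypothetical, and this is not the slot-based calculation that vSphere performs.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    memory_gb: int


def tolerates_host_failure(hosts, vm_memory_gb, failures=1):
    """Return True if all VMs still fit after losing the largest `failures` hosts.

    Simplified aggregate check: memory only, no per-host packing.
    """
    if failures >= len(hosts):
        return False  # no hosts would survive
    # Sort ascending and drop the largest hosts to model the worst case.
    surviving = sorted(h.memory_gb for h in hosts)[: len(hosts) - failures]
    return sum(surviving) >= sum(vm_memory_gb)


hosts = [Host("host-a", 64), Host("host-b", 64), Host("host-c", 64)]
print(tolerates_host_failure(hosts, [16] * 8))  # 128 GB needed, 128 GB survive
print(tolerates_host_failure(hosts, [16] * 9))  # 144 GB needed, 128 GB survive
```

Even this toy version has to be re-run every time a VM is added or resized, which is the manual burden an automated admission control feature removes.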

Wow, I didn’t notice those details in the comparison brochure.

Decide

Whether your datacenter is running Windows Server or the mighty Red Hat Enterprise Linux, doesn’t it make sense to trust the proven leader in virtualization? VMware vSphere is simply the most reliable platform for consolidating workloads and building your private cloud. Going beyond exceptional HA is VMware FT – mirroring mission-critical VMs on backup hosts means zero downtime from host failures.



  1. Tony’s avatar

    Can’t argue with your findings here, but I can argue with this:

    “Going beyond exceptional HA is VMware FT – mirroring mission-critical VMs on backup hosts means zero downtime from host failures.”

    You mean mirroring mission-critical 1-vCPU VMs…


    1. Anton Zhbankov’s avatar

      I have some mission-critical apps that consume about 400-500 MHz on 3.0 GHz CPUs.
      You forgot that 1 core of top Nehalem is slightly more powerful than Pentium 2.


  2. nate’s avatar

    Hmm, looking at the vmware pricing PDF, it seems that “HA” is available in “Standard” edition, it’s also available in the higher end versions of essentials, for smaller installs I believe.

    Good post though, I’ve been a devoted vmware fan/user for about 11 years now, though mid-longer term I have grave concerns about where EMC is trying to take them, I plan to stick with vSphere for at least v4, and re-evaluate options again in 2011/2012 whenever the next refresh comes around.


    1. Eric Gray’s avatar

      Good point, Nate. VMware HA is available in every edition except Essentials. Minor oversight on the part of Red Hat marketing, I’m sure.


  3. TimC’s avatar

    You must be deathly afraid of RHEV with all this bashing…

    VMware HA is nothing more than a rebirth of Legato AAM with some tweaks for esx.

    Unless something has changed, your comment “ESX hosts communicate among themselves to reliably restart virtual machines.” isn’t entirely accurate unless you add a caveat of “under most conditions”. They can only reliably restart virtual machines if a primary node is still alive.

    Marcel has done a good job of covering the basics:
    http://up2v.wordpress.com/2009/03/04/dc28-vmware-ha-cluster-in-enterprise-deep-dive-and-best-practises/


    1. Fernando’s avatar

      Yes… you need a primary node… so if you are unlucky enough to lose all 5 primary nodes at the same time, then HA will not work.
      Very likely situation, huh?


      1. TimC’s avatar

        Depends on the definition of “likely situation”. Depending on the size of the cluster, the type of servers, etc., yes, it’s an entirely realistic scenario.

        If your definition is “happens on a weekly basis” then the answer is no. But with that sort of approach, why have redundant power, raid-6, or redundant fabrics? Screw it, let’s just pretend there’s no failure modes and hope customers never ask.


        1. Fernando’s avatar

          So, you are suggesting that losing 5 ESX hosts at the same time is realistic and might happen anytime?


          1. TimC’s avatar

            You’re suggesting a blade enclosure has never failed in the field, never will, and if it does happen, it’ll be during a scheduled outage?

            I’m not sure if you’re trolling or just have 0 experience in a datacenter. Either way, yes, I’m saying it’s entirely possible to lose 5 servers in one shot. I hope you aren’t in charge of architecture.


            1. Fernando’s avatar

              No personal attacks please, let’s keep the discussion professional.
              It is a well-known best practice to spread blades across different enclosures to avoid this problem.

              Blade enclosure failures are rare, a minimal possibility. If that happens, even with HA, you are in serious trouble, since you will lose many, many hosts at the same time, and capacity will be an issue (unless you have, let’s say, 40-50% idle capacity on your cluster).


              1. TimC’s avatar

                So why exactly do you find it important to spread blades across enclosures, but find it completely unimportant to inform end-users of the primary node issue? It would seem to me those two actions/beliefs are entirely in conflict with one another.

                Lots of things are “rare”, that doesn’t mean they should be ignored or dismissed with a sarcastic response, as you chose to do.


                1. Fernando’s avatar

                  Again, spreading blades across enclosures is a well-known practice. What is your concern here?

                  You pointed out a possible problem, and I showed that well-designed environments will never suffer from it.

                  This primary node “issue” is not a secret; it is very well known. It will happen very, very rarely, so you cannot say it is a general problem that will hit everyone.

                  And again, blade enclosure failure rates are minimal to none, provided you have a decent datacenter infrastructure.


                2. Anton Zhbankov’s avatar

                  Maybe I missed something, but as Fernando said, the primary node issue is not a secret, and I have seen no effort to conceal it from end users.
                  Actually, if the end-user admin is not too lazy to open Google, he will know that. If the admin is not too lazy to search the internet for what “slot size” means in HA Runtime Info, he will know exactly how HA works, what a primary node is, and what an HA slot is.


            2. Tony’s avatar

              LOL…I’ve seen a chassis die, which is why I balance ESX hosts across 2 chassis :-)

              I’ve seen facilities electricians cut UPS shutoff switch lines, entire racks blow BOTH circuits, sewer lines back up and flood raised floors. I dunno maybe I just have bad luck, but ANYTHING is possible!


              1. Fernando’s avatar

                Agree ! But in this case, HA will not save you !!!


                1. Fernando’s avatar

                  I mean, nothing will save you in a situation like this.


              2. Eric Gray’s avatar

                I wasn’t supposed to disclose this, but… the next version of vSphere will have an optional accessory (USB) that monitors sewage levels beneath datacenter floors. Once the configurable threshold is exceeded, an alarm sounds and a thin membrane is automatically deployed that surrounds each rack, activating a bilge pump to bring sewage down to acceptable levels (typically 1-3 cm max).


                1. Fernando’s avatar

                  Looks like the ultimate feature to enhance HA !


                2. nate’s avatar

                  Per the “losing 5 servers at once” argument I recall seeing someone blog not too long ago about trying to get vSphere to be “aware” of the underlying hardware, that is say for example you have a pair of blade enclosures which are backups for each other or something, apparently vSphere has no idea of the underlying design of the system so it could in fact put all 5 primary nodes on the same enclosure, in the unlikely event that enclosure suffered a hard fault then HA would not work.
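The enclosure concern in this thread can be put in rough numbers. The sketch below assumes, purely for illustration, that the 5 primary nodes are chosen uniformly at random from the cluster’s hosts, which may not match how HA actually elects them. The function name and the example enclosure sizes are hypothetical.

```python
from math import comb


def prob_all_primaries_in_one_enclosure(enclosure_sizes, primaries=5):
    """Probability that every primary node lands in the same enclosure.

    Assumes primaries are picked uniformly at random from all hosts.
    """
    total_hosts = sum(enclosure_sizes)
    if primaries > total_hosts:
        raise ValueError("more primaries than hosts")
    # Count the ways to pick all primaries inside any single enclosure.
    favourable = sum(comb(size, primaries) for size in enclosure_sizes)
    return favourable / comb(total_hosts, primaries)


# Two enclosures with 8 hosts each: 112 of 4368 selections share an enclosure.
print(f"{prob_all_primaries_in_one_enclosure([8, 8]):.4f}")  # 0.0256
```

Under that assumption the risk is small but not zero for two full enclosures, and it drops to zero once no enclosure holds 5 or more hosts of the cluster, for example `[4, 4, 4, 4]`.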

                  Myself I have not used HA, all of my systems are either free ESXi, or vSphere standard/essentials. For the most part “HA” is provided by redundant VMs and load balancers, with few exceptions(few DB servers and stuff, DB servers have standbys as well). I actually tried turning on HA at one point for some systems but it wouldn’t let me – not enough capacity or something. I didn’t investigate it much but what I would want to do is turn on HA for specific VMs rather than cluster wide, and only those VMs that don’t have redundant counterparts on other servers. I don’t know if that is possible or not in vSphere. Wherever possible I will always opt for redundant VMs and load balancing, more scalable, (usually) simpler to manage, you often have the ability of active-active traffic serving, and it mixes well with the physical environment as well.

                  I kind of get annoyed when people/organizations come out with some fancy new tool or process/procedure and then say OH, well it doesn’t work if you’re not fully virtualized, as if running a physical environment parallel to a virtual one is like running a mainframe next to your Linux servers or something. (Yes, I know some still run mainframes today :) )

                  Then again I’ve NEVER (honestly never) had an ESX host fail in the 3-4 years I’ve been using it (3.0.2 is when I first started using ESX, I believe). Maybe I’ve been lucky..

                  My virtual environment is very similar to my physical one whether it’s the tools I manage them with(cfengine), the tools I monitor them with (cacti+nagios), etc. I suppose I could “get more” out of my virtual infrastructure if I took to learning powershell and the like, but I just don’t see the benefits from my own environmental standpoint.


                  1. Eric Gray’s avatar

                    Nate,

                    Thanks for bringing in that additional perspective — nice pragmatic approach to integrating virtualization with your existing systems management.

                    In vSphere it is possible to tune HA so individual VMs respond differently to failures; you can probably get it configured the way you like with a little research. The other issue you mention has to do with capacity reserves — vCenter prevents you from powering on more VMs than you would be able to run effectively if a host dies.

                    Eric


              3. Anton Zhbankov’s avatar

                HA was not designed to save you from evil electrician with BIG scissors :)


      2. Anton Zhbankov’s avatar

        If all 5 primary nodes go down at the same time, there must be something more serious going on, like a whole-datacenter power outage. Actually, even a blade enclosure can’t just go down. At least mine can’t, with 6 hot-swap power supplies fed from 2 different sources.


  4. John L’s avatar

    My Red Hat KVM clusters are completely bulletproof (unless we lose both geographically separated datacenters).

    I use LVM for my virtual disks and mirror them to storage at the other datacenter, create bonded links across PCI NICs to separate switches, and create my cluster with blades in different chassis at both locations.

    And, by the way, you don’t need IPMI or out-of-band mgmt interfaces to fence nodes. SCSI persistent reservations isolate physical machines from shared storage quite elegantly.

    Think about this scenario though. Let’s assume you have an HA VMware VM running web services and the VM is fine, but the HTTP service on the guest crashes. For my KVM clusters, I can configure a cluster of virtual machines running XVM fencing and configure a clustered HTTP service, using shared storage, on them. This way, even if my KVM guests are all healthy from an OS perspective, my HTTP services migrate automatically on failure. I could also just run the HTTP directly on the physical blades using the same cluster and shared storage as my HA KVM guests.

    Everything I need for this, from nuts to bolts on the software side, you get with Red Hat. That means I have one support channel to deal with. This is really the beauty of it all.

    The per socket cost of VMware ESX is outrageously expensive when you consider the alternatives. Especially if you aren’t fully utilizing the ESX server. On top of the substantial cost of VMware, you still need to pony up for the OS and OS support you will need to put on your virtual guests. And if you have issues with your guests, don’t be surprised that if you call VMware they tell you you have an OS issue or vice versa from Microsoft, Red Hat, Sun, etc. Who needs a middle man?

    The support model for OS vendors creating their own virtualization technologies ends the finger pointing games, the bringing down of guests to install new versions of VMware tools, etc. In my opinion, VMware should be a bit nervous right now and they should be slashing the pricing model of their products to prevent further bleeding. When heavy hitters like IBM start using KVM/RHEV for their clouds like I was just reading about, the writing is on the wall.


  5. Deen’s avatar

    Hey dude, what the fuck is wrong with you? RHEV-M is a work in progress and costs a fraction of what VMware does. I suffered using VMware Infrastructure 3, but the new vSphere is better. RHEV-M is still in its infancy; give the developers some time and then compare. Most of your facts are outdated. The latest RHEV comes with KSM and other features that work well for me. Are you using RHEV-M? I am. I have implemented it in at least 10 data centers and all my customers are happy. So RHEV-M works and it is simple.


  6. Robert’s avatar

    According to this document, RHEV-M can be HA’d
    http://www.redhat.com/f/pdf/rhev/final2.2/DOC255_RH_WP_RHEV_D_2832287_0610_ma_web.pdf

    It’s also possible to provide HA for RHEV-M using RHEL’s Cluster Manager.


  7. John L.’s avatar

    On any RHEL server, KVM or RHEV-M, you can set up “shared nothing” HA clusters using DRBD (Distributed Replicated Block Device) http://www.drbd.org/.

    I used DRBD on a KVM cluster as the shared storage and I have complete HA…without even shared storage. Since I have a complete copy of the data, I am literally sharing nothing. I am pretty sure that you can’t utilize DRBD on ESX because the VMkernel handles ALL access to the VMFS datastores. The Linux kernel for the Service Console can’t get to that data.

    So, for RHEV-M, just set up DRBD on the raw partitions where RHEV-M resides and you are all set…at the block level. You will have to have access to all the VLANs at both locations, though, if you want the KVM guests to have networking when you switch over.


    1. Anton Zhbankov’s avatar

      John, DRBD is great software with one problem: you can’t buy support for it, so you get no guarantees on time-to-reply or time-to-fix.
      This is critical for some enterprises.


        1. Anton Zhbankov’s avatar

          Do they provide support in all time zones, in all languages, and 24x7? Do they have representatives in Europe and, what especially concerns me, in Russia?


          1. John L’s avatar

            They are headquartered in Austria. Not sure on the Russian support, but your English seems to be good enough on this forum. :)

            I will concede, and this is for VMware as well, if you don’t have skilled on-site IT people, you probably don’t deserve to be running on the enterprise level. DRBD is no different. For me, it is just an extra level of data protection. I still do regular backups of the primary storage. If DRBD fails, you still have access to your primary node for whatever that’s worth.


  8. Deen’s avatar

    Don’t just comment and not update. Who the hell asked you to compare with something in beta release? The brochure is for the complete 3.0 release. The things you said RHEV doesn’t have are already there. It looks like everything about VMware is the best. Are you paid by them to write this? Check the RHEV release updates and then comment.


    1. Eric Gray’s avatar

      Uh… there are no beta products mentioned in this article and the current version of RHEV is 2.2 — not 3.0.


      1. Deen’s avatar

        3.0 will be the next release, in which RHEV-M will run on the Linux platform. 2.2 is the current release. I am running 2.2 with a failover RHEV-M setup. No chicken-and-egg problem.


      2. Deen’s avatar

        I have seen VMware promotional articles saying the ESX server is not Linux-based, that it is some special OS created by VMware. Even presentations given by the VMware sales team say the same thing. I don’t go around saying VMware is no good because of such articles. ARE you running RHEV? I am running 56 blades with over 150 OSes, no problem at all.


      3. Deen’s avatar

        I am running RHEV Manager on 2 virtual os… no problem at all…


  9. truth’s avatar

    RHEV HA sounds great in the marketing brochure, but there are a few problems with the execution. RHEV Manager is a single point of failure — running on a physical Windows box — and it’s also the actual brain behind HA. Yes, RHEV-M is responsible for restarting virtual machines when a host fails. If the manager is down, no HA for you!
    >>> Incorrect, the manager can be made highly available.

    That alone makes RHEV HA something less than “HA” for most production environments, but there are a few other key weaknesses:

    * HA must be manually enabled for each virtual machine — no cluster-wide settings
    >>> So?

    * No cluster admission control — administrators must manually ensure sufficient capacity would be available in a cluster to accommodate a host failure
    >>> Are your administrators unable to perform the simple math necessary to determine if enough capacity exists to handle Virtual Machine HA?

    * No VM restart priority to ensure the most critical workloads and dependencies are brought online first
    >>> Yes, in RHEV 2.2 which is released now

    * Primitive split-brain protection requires IPMI or other out-of-band management interface to force a host shutdown
    >>> Fencing a physical host is something that is used in many cluster solutions. How is this primitive?

    * Cannot protect the RHEV Manager itself — chicken-and-egg situation
    >>> You can cluster a Windows virtual machine using KVM and Red Hat Cluster Suite to provide a highly available Manager. You could also cluster the application server and database to provide HA at that layer.

    Wow, I didn’t notice those details in the comparison brochure.
    >>> You also didn’t mention that it’s 25% the price of VMware, the next version of the hypervisor scales to 4096 cores, and it is going to leverage SELinux, something that VMware will never do.

    Have a nice day! :)


    1. Deen’s avatar

      Thanks Bro.. Finally somebody who understands RHEV. You forgot to mention that it is also OPEN SOURCE. You can request the source code from Red Hat, something VMware will never do.

      Bye..


      1. Anton Zhbankov’s avatar

        This argument always makes me laugh.

        Do you ask for all the design drawings when you buy a car?


      2. Eric Gray’s avatar

        Is that right? My understanding of Red Hat’s stated direction is to open source RHEV after the Java port is finished.

        http://www.linux-kvm.com/content/update-rhev-m-going-open-source

        Even if Red Hat would release the current source for RHEV, what would anyone do with a big pile of deprecated Windows .NET code?


    2. Eric Gray’s avatar

      The version of RHEV available when this article was written — and the one with which Red Hat was boldly claiming vSphere parity — was 2.1; the features of 2.1 as described above are 100% accurate.

      Maybe Red Hat should have waited to get all of the features actually in a shipping product before their CEO started firing shots at VMware.

      If you think capacity management is simple arithmetic, you are entitled to your opinion.

      Since VMware ESXi is a small-footprint hypervisor that contains no Linux, I would have to agree with the point about VMware not leveraging SELinux.


  10. Vishal Bhatia’s avatar

    Point is every organization makes tall claims about their Product’s features and no-one highlights the weaknesses.

    VMware claims FT is such an important feature. How many times is the customer actually told:
    * That it can have only 1 vCPU, and that too with up to 20% overhead. I’ve heard of VMware claims that FT can make RAC redundant. Imagine running a DB VM with a single vCPU and up to 20% overhead :D
    * That it requires a dedicated Gigabit Ethernet network between the physical servers, and that 10 Gigabit Ethernet should be considered if VMware FT is enabled for many virtual machines on the same host.
    * That you can’t use memory over-commit, thin provisioning, hot-plugging of devices, or even snapshots with FT, all features that VMware claims are critical for a virtual environment and charges you $$$s for.

    Xen 4.0 also provides FT and possibly with a lot less overhead and multiple vcpu support
