SR-IOV and VMware vMotion

Most hypervisors now have, or will have, support for SR-IOV networking — VMware introduced support in vSphere 5.1.  One of the downsides of using this technology, which provides direct network access to a virtual machine, is that it hinders mobility — vMotion is no longer possible.  The new Hyper-V works around this limitation by temporarily routing I/O through a virtual switch during live migration.  Ingenious, no?

Over three years ago, VMware and partners demonstrated a solution to the SR-IOV/vMotion issue — by temporarily switching from passthrough to emulation mode it was possible to migrate a VM from one host to another.  This innovation can be seen in the following demonstration:

Now, you may be asking why VMware never shipped this feature if it existed over three years ago.  I’m not privy to any specific details on the topic, but it would be sensible to say that decisions to release features are made based on priorities such as customer value.  Although hardware with SR-IOV support is now becoming widely available, the customer use cases are still very much at the edges.  Consider this: if a workload is so latency-sensitive that SR-IOV is truly required, how acceptable is it to impact service levels while I/O is rerouted during this migration period?

While this feature isn’t part of a shipping product from VMware, I’d argue that vSphere had it before Hyper-V for all intents and purposes.  Especially considering the weight Microsoft gives to the capabilities of their pre-release software.

Happy New Year!

2 comments

  1. Really?

    So… VMware showed a technology preview 3 years ago with something they are still not shipping (migration support for SR-IOV).

    Microsoft was openly working on this technology for 6+ years, see WinHEC 2006 slides online, but didn’t ship it until Hyper-V in Server 2012 last year. They always intended to support migration from the outset and shipped with it. Job done.

    Your point is exactly?

  2. Stu Fox

    Well I guess in the situation where you have to do host maintenance then you’re talking about the difference between the ability to move the workload and take a hit in latency, versus not being able to do that at all without a full outage. And just maybe your workload doesn’t need that latency 24/7, maybe you only need to maintain that during working hours or whatever.

    And going further, SR-IOV isn’t necessarily about providing lower latency, it might just be about being able to leverage hardware capability to offload operations that the hypervisor otherwise might have to process itself. That means less overhead in hypervisor operations, which means more processor time available for VMs. And if you’ve got support in your hardware for SR-IOV and can use it, but still maintain the flexibility of VM migration, then why not?

    And why is it relevant that you had built it before Microsoft? The only relevant thing is what you are shipping to customers.
