An Enemy of Scalability - Hypervisor (Virtualization) Overhead

Some time ago, we migrated one of our Meteor webapps away from our previous KVM-based Linux CentOS 7 VPS to a dedicated server (a Dell PowerEdge M610 with dual hex-core CPUs and dual SSDs in a software RAID-1 configuration).

The KVM hypervisor that ran on the former VPS was adding way too much overhead on network interrupts despite using the paravirtualized virtio network driver.

As network I/O load increased, it eventually caused a runaway escalation in which the hypervisor stole all available CPU cycles and our app suddenly became unresponsive. This happened regardless of how much CPU or I/O load Node and our MySQL database were actually generating.
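If you suspect something similar, the cycles a hypervisor takes away from a guest show up as "steal" time (the st column in top and vmstat). As a rough illustration only, and not something from our actual setup, here is a minimal Node/TypeScript sketch that reads the aggregate steal figure from /proc/stat:

```typescript
// Rough sketch: report the hypervisor "steal" share from the aggregate cpu line
// in /proc/stat. Fields: user nice system idle iowait irq softirq steal ...
import { readFileSync } from "fs";

const cpuLine = readFileSync("/proc/stat", "utf8")
  .split("\n")
  .find((line) => line.startsWith("cpu "));

if (cpuLine) {
  const [user, nice, system, idle, iowait, irq, softirq, steal] = cpuLine
    .trim()
    .split(/\s+/)
    .slice(1) // drop the "cpu" label
    .map(Number);

  const total = user + nice + system + idle + iowait + irq + softirq + steal;
  console.log(`steal: ${((steal / total) * 100).toFixed(2)}% of CPU time since boot`);
}
```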

Switching to the dedicated server eliminated all hypervisor overhead and allowed Linux to take advantage of the 8 Ethernet transmit and receive queues implemented in the NIC hardware, which lets the OS distribute its interrupt handling across the multiple CPUs in the server.

Even though the number of API calls to that Meteor webapp has more than tripled since we stopped using the VPS, our server’s load average is usually less than 0.5.

I reckon quite a few people could be blaming Meteor or even Node for scalability problems that may simply be caused by hypervisor overhead from running under a VPS.

If you have root access to your server, you can check the number of Ethernet transmit and receive queues available to you by listing the contents of the directory /sys/class/net/eth0/queues/. You will see names like tx-0, tx-1, tx-2 and rx-0, rx-1, rx-2, corresponding to the available transmit and receive queues respectively.

Note: If your Ethernet device is not named ‘eth0’, you will have to substitute the correct name into the above path.
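As an example, here is a rough Node/TypeScript sketch that counts those queues; the interface name is only an assumption, so pass your own device name if it differs:

```typescript
// Rough sketch: count the transmit/receive queues the kernel exposes for a NIC.
// "eth0" is only a default here; substitute your own interface name (e.g. ens3).
import { readdirSync } from "fs";

function countQueues(iface: string = "eth0"): void {
  const entries = readdirSync(`/sys/class/net/${iface}/queues`);
  const tx = entries.filter((name) => name.startsWith("tx-"));
  const rx = entries.filter((name) => name.startsWith("rx-"));
  console.log(`${iface}: ${tx.length} transmit queue(s), ${rx.length} receive queue(s)`);
}

countQueues();
```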

You can also see how the interrupt load is distributed across your CPUs by viewing the file /proc/interrupts.
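If you prefer a quick per-CPU total rather than reading the raw table, a sketch along these lines (same caveats as above) sums the counts in each CPU column:

```typescript
// Rough sketch: total the interrupt counts per CPU from /proc/interrupts.
import { readFileSync } from "fs";

const lines = readFileSync("/proc/interrupts", "utf8").trim().split("\n");
const cpuCount = lines[0].trim().split(/\s+/).length; // header row: CPU0 CPU1 ...

const totals = new Array(cpuCount).fill(0);
for (const line of lines.slice(1)) {
  const fields = line.trim().split(/\s+/);
  // fields[0] is the IRQ label (e.g. "25:"); the next cpuCount fields are per-CPU counts.
  for (let cpu = 0; cpu < cpuCount; cpu++) {
    const count = Number(fields[cpu + 1]);
    if (!Number.isNaN(count)) totals[cpu] += count; // skip rows without a full set of columns
  }
}

totals.forEach((total, cpu) => console.log(`CPU${cpu}: ${total} interrupts since boot`));
```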


Did you ever try VMware with an SR-IOV-capable NIC? Would be curious if it would have similar issues.

@odesey, I have no experience with that kind of environment, but the first thing I would do is use the procedure above to view the number of Ethernet transmit & receive queues that the guest Linux OS can see. If there are multiple, then there is a good chance that their interrupt handling will be spread across multiple vCPUs.