Harnessing vSphere Performance Benefits for NUMA

Non-Uniform Memory Access, or NUMA, is becoming increasingly commonplace on the next generation of very powerful servers. What does this mean for virtualization applications on the latest VMware incarnation? Serious performance increases for NUMA-equipped systems.

Jasmine McTigue

December 6, 2009

Non-Uniform Memory Access, or NUMA, is becoming increasingly commonplace on the next generation of very powerful servers. This is nothing new in the AMD product line; Opteron is a NUMA architecture, and the performance boost it delivered catapulted AMD ahead of the curve in the mid-2000s. Intel has been trying to catch up for quite some time, and the latest generation of Intel Xeon Nehalem processors sports not only NUMA but better virtualization assist (VT-x) as well. What does this mean for virtualization applications on the latest VMware incarnation? Serious performance increases for NUMA-equipped systems.

Before we start on getting the most out of your hot new Nehalem rig or that brand-new HP DL585 Opteron-equipped server, let's review what makes NUMA different.

NUMA is the logical successor to Symmetric Multiprocessing. In Symmetric Multiprocessing, or SMP, multiple processors and cores are tied to a single memory controller, and each processor has uniform, or symmetric, access to all of the available memory. Access to memory resources is limited, because all CPUs share a common bus with a fixed amount of bandwidth.
In NUMA architectures, memory resources are allocated specifically to different processors and groups of cores (multiple buses). The most common way to do this is to build a memory controller into each socket and then connect the processors with a high-speed interface: AMD did it first with HyperTransport, while the new Nehalem 3500 and 5500 series use a technology called QuickPath. The end result is a huge increase in performance for Intel chips: as much as a 100 percent improvement in processing and roughly a one-third drop in memory latency in AnandTech's preliminary benchmarks against Penryn-based systems.
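To make the locality penalty concrete, here is a toy Python sketch. The latency figures are invented for illustration, not measured Nehalem or Opteron numbers: a core reading memory attached to its own controller pays only the local cost, while a read that has to cross the QuickPath or HyperTransport link pays an extra hop.

```python
# Toy model of NUMA memory access cost. The numbers are made up for
# illustration; real latencies depend on the platform and workload.

LOCAL_LATENCY_NS = 60       # assumed cost of hitting the local memory controller
INTERCONNECT_HOP_NS = 40    # assumed extra cost of crossing QuickPath/HyperTransport

def access_latency_ns(cpu_node: int, memory_node: int) -> int:
    """Modeled latency for a CPU on cpu_node reading memory that lives on memory_node."""
    if cpu_node == memory_node:
        return LOCAL_LATENCY_NS                      # local access: no interconnect hop
    return LOCAL_LATENCY_NS + INTERCONNECT_HOP_NS    # remote access: pay for the hop

print(access_latency_ns(cpu_node=0, memory_node=0))  # 60  -- memory on the home node
print(access_latency_ns(cpu_node=0, memory_node=1))  # 100 -- memory on a remote node
```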

This is great news for virtualization, especially because our friends at VMware have built NUMA support directly into vSphere in a transparent and self-optimizing fashion. It works like this (a simplified sketch follows the list):

  • The NUMA scheduler places each guest virtual machine on a particular home node containing processor and memory resources. 

  • When memory is allocated to the virtual machine, it is allocated from the home node. 

  • The NUMA scheduler reallocates virtual machines to different home nodes whenever it is advantageous to do so, for example when allocating more memory to a virtual machine would violate memory locality. 
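Here is a minimal Python sketch of that home-node behavior, assuming a made-up topology; it is not VMware's scheduler code, just an illustration of placing a VM's memory on its home node and rehoming it when growth would break locality.

```python
# Simplified sketch of home-node placement and rehoming; not ESX code.
# The Node/VM classes and the placement policy are invented for illustration.

from dataclasses import dataclass

@dataclass
class Node:
    node_id: int
    free_mem_mb: int

@dataclass
class VM:
    name: str
    mem_mb: int
    home: Node = None

def place_on_home_node(vm: VM, nodes: list) -> None:
    """Pick a home node and allocate all of the VM's memory from it."""
    vm.home = max(nodes, key=lambda n: n.free_mem_mb)   # least-loaded node
    vm.home.free_mem_mb -= vm.mem_mb                    # memory comes from the home node

def grow_memory(vm: VM, extra_mb: int, nodes: list) -> None:
    """Give the VM more memory, rehoming it if the home node can't hold it locally."""
    if vm.home.free_mem_mb < extra_mb:
        vm.home.free_mem_mb += vm.mem_mb    # release memory from the old home node
        vm.mem_mb += extra_mb
        place_on_home_node(vm, nodes)       # move everything to a roomier node
    else:
        vm.home.free_mem_mb -= extra_mb
        vm.mem_mb += extra_mb

nodes = [Node(0, 8192), Node(1, 8192)]
vm = VM("web01", 2048)
place_on_home_node(vm, nodes)
grow_memory(vm, 4096, nodes)
print(vm.home.node_id)   # which node the VM calls home after the growth
```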

There are a few things to keep in mind to get the best performance out of the NUMA scheduler. First, make sure NUMA is turned on in the BIOS. Almost all of these machines have a memory interleaving setting: when interleaving is turned on, memory banks are interleaved and memory access becomes uniform, which means the NUMA scheduler can't operate and NUMA is effectively disabled. Second, the ESX NUMA optimizations are only enabled on systems with at least two nodes and at least two cores per node; systems that don't meet these minimum requirements are not eligible.
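A quick Python sketch of those two checks follows; the arguments are invented to show the logic and are not anything ESX actually exposes.

```python
# Sketch of the eligibility rules above; the arguments are invented for
# illustration and are not an ESX or vSphere API.

def numa_scheduler_active(interleaving_enabled: bool, cores_per_node: list) -> bool:
    """Return True if the NUMA scheduler can operate on this host."""
    if interleaving_enabled:
        return False   # node interleaving makes memory access uniform: NUMA is effectively off
    # ESX needs at least two NUMA nodes, each with at least two cores.
    return len(cores_per_node) >= 2 and all(cores >= 2 for cores in cores_per_node)

print(numa_scheduler_active(False, [4, 4]))   # True: two quad-core nodes, interleaving off
print(numa_scheduler_active(True,  [4, 4]))   # False: interleaving hides the nodes
print(numa_scheduler_active(False, [1, 1]))   # False: fewer than two cores per node
```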

As long as NUMA is working, the scheduler operates as follows (a simplified sketch of these rules appears after the list):

  • Each NUMA node has a specific number of cores, determined by the processor and memory controller in use. For example, Nehalem supports up to eight cores per socket, making the maximum node size eight cores. 

  • Virtual machines with a number of vCPUs equal to or less than the number of cores in a NUMA node will be managed by the NUMA scheduler and will see the best performance.

  • Virtual machines with more vCPUs than the NUMA node size will not be managed by the NUMA scheduler and will not benefit from its optimizations.

  • Virtual machines are allocated to NUMA nodes on startup in a round-robin fashion. 

  • Every two seconds, virtual machines are reevaluated to see if a node change would be beneficial. 

  • Administrators can force virtual machines to use a particular node through a combination of CPU and memory affinity settings for that VM. 
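
The following Python sketch pulls those rules together: round-robin placement at power-on, a node-size check that excludes wide VMs, and a periodic pass that moves a VM when another node looks clearly less loaded. The node count, node size, load metric, and threshold are all invented for illustration; this is not ESX's actual algorithm.

```python
# Simplified sketch of the placement rules above; not ESX's actual algorithm.
# Node count, node size, the load metric, and the threshold are all invented.

import itertools

NODE_CORES = 8                       # e.g. a Nehalem socket: up to 8 cores per node
NODES = [0, 1]                       # a two-node host
_next_node = itertools.cycle(NODES)  # round-robin iterator for initial placement

def initial_home_node(vcpus: int):
    """Assign a home node at power-on, or None if the VM is too wide to manage."""
    if vcpus > NODE_CORES:
        return None                  # wider than a node: not handled by the NUMA scheduler
    return next(_next_node)          # otherwise, round-robin across the nodes

def rebalance(vm_homes: dict, vms_per_node: dict) -> None:
    """Run every couple of seconds: move a VM if another node is clearly less loaded."""
    for vm, home in vm_homes.items():
        best = min(NODES, key=lambda n: vms_per_node[n])
        if vms_per_node[home] - vms_per_node[best] > 1:   # invented "advantageous" threshold
            vm_homes[vm] = best
            vms_per_node[home] -= 1
            vms_per_node[best] += 1

print(initial_home_node(4))    # 0  -- fits in a node, placed round-robin
print(initial_home_node(4))    # 1
print(initial_home_node(16))   # None -- more vCPUs than the node size
```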


This should give you a basic understanding of the ramifications of NUMA, how it is handled in vSphere, and some of the pitfalls involved in administering it. I'm going to get even deeper into the ramifications of NUMA architectures for Fibre Channel and networking technologies in the next post, so check back! Any questions and comments will be answered as quickly as I can.

For more information about NUMA and Intel's 3500 and 5500 series Nehalem processors, check out these links:

  • VMware vSphere resource management guide 
  • Intel Nehalem Microarchitecture (or go direct to the whitepaper)


About the Author

Jasmine McTigue

Principal, McTigue Analytics

Jasmine McTigue is principal and lead analyst of McTigue Analytics and an InformationWeek and Network Computing contributor, specializing in emergent technology, automation/orchestration, virtualization of the entire stack, and the conglomerate we call cloud. She also has experience in storage and programmatic integration. Jasmine began writing computer programs in Basic on one of the first IBM PCs; by 14 she was building and selling PCs to family and friends while dreaming of becoming a professional hacker. After a stint as a small-business IT consultant, she moved into the ranks of enterprise IT, demonstrating a penchant for solving "impossible" problems in directory services, messaging, and systems integration. When virtualization changed the IT landscape, she embraced the technology as an obvious evolution of service delivery even before it attained mainstream status and has been on the cutting edge ever since. Her diverse experience includes system consolidation, ERP, integration, infrastructure, next-generation automation, and security and compliance initiatives in healthcare, public safety, municipal government, and the private sector.
