Tuesday, June 29, 2010

NUMA architecture

VMware ESX is NUMA-aware and will schedule all of a VM's vCPUs on a ‘home’ NUMA node. However, if the VM container size (vCPUs and RAM) is larger than the size of a NUMA node on the physical host, NUMA crosstalk will occur. It is recommended, but not required, to configure your maximum Zimbra VM container size to fit on a single NUMA node.

For example:

* ESX host with 4 sockets, 4 cores per socket, and 64 GB of RAM
* NUMA nodes are 4 cores with 16 GB of RAM (1 socket and local memory)
* Recommended maximum VM container is 4 vCPU with 16 GB of RAM

Source:
http://wiki.zimbra.com/wiki/Performance_Recommendations_for_Virtualizing_Zimbra_with_VMware_vSphere_4
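
Not part of the Zimbra guide, but as an illustration of where the per-node numbers come from: a minimal C sketch using libnuma (build with gcc node_sizes.c -o node_sizes -lnuma) that prints the memory size of each NUMA node as seen from a Linux host. These are the node sizes the sizing rule above refers to; on the example host, each of the four nodes would report roughly 16 GB.

/* node_sizes.c - print the memory size of each NUMA node via libnuma.
 * Illustrative sketch only. */
#include <stdio.h>
#include <numa.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int max_node = numa_max_node();              /* highest node number */
    for (int node = 0; node <= max_node; node++) {
        long long free_bytes = 0;
        long long total = numa_node_size64(node, &free_bytes);
        printf("node %d: %lld MB total, %lld MB free\n",
               node, total >> 20, free_bytes >> 20);
    }
    return 0;
}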

"After all, access to remote memory may take three to five times as long as access to local memory even in a well designed NUMA system."

Source:
NUMA: Theory and Practice
http://practical-tech.com/infrastructure/numa-theory-and-practice/

Turning on NUMA merely makes the kernel aware of the topology and lets it optimize its own memory placement. If applications want any useful performance benefit, they either need to use the NUMA library for requesting memory, or the administrator needs to give them an explicit NUMA policy using the numactl tool.
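
As an illustration of the first option (my sketch, not from the quoted post): a process can ask libnuma directly for memory on a specific node. The administrator-side alternative is to bind an unmodified program externally, e.g. numactl --cpunodebind=0 --membind=0 ./app.

/* local_alloc.c - request memory from a specific NUMA node with libnuma.
 * Illustrative sketch; build with: gcc local_alloc.c -o local_alloc -lnuma */
#include <stdio.h>
#include <string.h>
#include <numa.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available\n");
        return 1;
    }

    size_t size = 64 * 1024 * 1024;          /* 64 MB working buffer */
    int node = 0;                             /* target NUMA node */

    /* Allocate the buffer from node 0's local memory. */
    void *buf = numa_alloc_onnode(size, node);
    if (buf == NULL) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    memset(buf, 0, size);                     /* touch the pages */
    printf("allocated %zu bytes on node %d\n", size, node);

    numa_free(buf, size);
    return 0;
}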

If you make use of numactl on RHEL 4, you'll get very good benefits from NUMA. If you don't use numactl, then you are probably best off using interleaving, because that will give predictable, consistent performance.

NUMA wins:

* if your application can run within only one NUMA node's worth of memory (default scheduler, or use numactl to bind it)
* for multi-process applications with small memory footprints
* if your application is aware of the Linux NUMA APIs (DB2 and a few others)

Interleaving is better:

* if memory usage greatly exceeds one NUMA node's worth of memory (i.e. large database caches etc.)
* for heavy file I/O (not direct I/O) with large files, because the Linux pagecache is not NUMA-aware
* if you analyze numastat output in RHEL 4 and your numa_miss is > a factor of 10 compared to numa_hit (see the sketch after this list)
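
As a rough illustration of that last point (my own sketch, not from the thread): on kernels that expose the counters, per-node numa_hit and numa_miss values can be read from /sys/devices/system/node/node<N>/numastat and compared; the numastat tool reports the same counters in tabular form.

/* numastat_ratio.c - compare numa_miss to numa_hit for each node.
 * Illustrative sketch reading the sysfs numastat counters. */
#include <stdio.h>
#include <string.h>

static long long read_counter(int node, const char *name)
{
    char path[128], key[64];
    long long value, result = -1;
    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/numastat", node);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;                        /* node does not exist */
    while (fscanf(f, "%63s %lld", key, &value) == 2) {
        if (strcmp(key, name) == 0) {
            result = value;
            break;
        }
    }
    fclose(f);
    return result;
}

int main(void)
{
    for (int node = 0; ; node++) {
        long long hit  = read_counter(node, "numa_hit");
        long long miss = read_counter(node, "numa_miss");
        if (hit < 0)                      /* ran out of nodes */
            break;
        printf("node %d: numa_hit=%lld numa_miss=%lld (miss/hit = %.3f)\n",
               node, hit, miss, hit ? (double)miss / hit : 0.0);
    }
    return 0;
}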

Source:
RHEL 4 and Intel vs. Opteron performance
http://markmail.org/message/vfwjd7rysqt2jahb

Both IBM and AMD NUMA systems are supported in ESX 3.0. AMD Opteron-based systems, such as the HP ProLiant DL585 Server, can be configured as a NUMA architecture. The BIOS setting for node interleaving determines whether the system behaves more like a NUMA system or more like a Uniform Memory Architecture (UMA) system. For more information, see the HP ProLiant DL585 Server Technology manual.

If node interleaving is disabled, ESX Server detects the system as NUMA and applies NUMA optimizations. If you enable node interleaving (also known as interleaved memory), ESX Server does not detect the system as NUMA.

The intelligent, adaptive NUMA scheduling and memory placement policies in VMware ESX Server 3 can manage all virtual machines transparently, so that administrators do not need to deal with the complexity of balancing virtual machines between nodes by hand. However, manual override controls are also available, and advanced administrators may prefer to control memory placement (through the Memory Affinity option) and processor utilization (through the Only Use Processors option) by hand. This may be useful, for example, if a virtual machine runs a memory-intensive workload, such as an in-memory database or a scientific computing application with a large data set. Such an application may see performance improvements if 100% of its memory is allocated locally, while virtual machines managed by the automatic NUMA optimizations often have a small percentage (5-15%) of their memory located remotely.

An administrator may also wish to optimize NUMA placement manually if the system workload is known to be simple and unchanging; for example, an eight-processor system running eight virtual machines with similar workloads would be easy to optimize by hand. Keep in mind that if you manually set the processor or memory affinity for a virtual machine, the NUMA scheduler may not be able to manage it automatically.

Please be aware that ESX NUMA scheduling and related optimizations are enabled only on dual-core systems with a total of at least four cores. On such NUMA-scheduling-enabled ESX systems, for ideal performance ensure that for each virtual machine: # of vCPUs + 1 <= # of cores per node. Virtual machines that are not managed automatically by the NUMA scheduler (single-core hosts and/or fewer than four cores in total) still run fine; they simply don't benefit from ESX Server's NUMA optimizations.
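
As a quick sanity check of that rule (my own arithmetic, reusing the 4-cores-per-node host from the Zimbra example above):

/* vcpu_rule.c - check the ESX 3 guideline: vCPUs + 1 <= cores per NUMA node.
 * Illustrative only; cores_per_node matches the example host above. */
#include <stdio.h>

int main(void)
{
    int cores_per_node = 4;                   /* cores in one NUMA node */

    for (int vcpus = 1; vcpus <= cores_per_node; vcpus++) {
        int ok = (vcpus + 1 <= cores_per_node);
        printf("%d vCPU VM: %s\n", vcpus,
               ok ? "fits the guideline" : "exceeds the guideline");
    }
    /* With 4 cores per node, a VM should have at most 3 vCPUs
     * to stay within the guideline. */
    return 0;
}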

Source:
Performance Tuning Best Practices for ESX Server 3
http://www.vmware.com/pdf/vi_performance_tuning.pdf

Further sources:

Optimizing Software Applications for NUMA - Intel® Software Network
http://software.intel.com/en-us/articles/optimizing-software-applications-for-numa/

IMPACT OF NUMA EFFECTS ON HIGH-SPEED NETWORKING WITH MULTI-OPTERON MACHINES
http://hal.archives-ouvertes.fr/docs/00/17/57/47/PDF/PDCS07.pdf

Calling It: NUMA Will Be The Shizzle In Two Years - Network Computing
http://www.networkcomputing.com/virtualization/calling-it-numa-will-be-the-shizzle-in-two-years.php

Local and Remote Memory: Memory in a Linux/ NUMA System
http://kernel.org/pub/linux/kernel/people/christoph/pmig/numamemory.pdf

VMware vSphere NUMA Imbalance Error when Upgrading from ESX 3.5 to vSphere 4
http://blog.lewan.com/2010/05/25/vmware-vsphere-numa-imbalance-error-when-upgrading-from-esx-3-5-to-vsphere-4/
