jrmunday
Commander

Performance degradation related to CPU scheduling and NUMA nodes

I have an interesting scenario (HP vs. DELL hardware) with potentially degraded performance, specific to the DELL R815 hardware, and I would like to know whether I am interpreting what I am seeing correctly, or whether I am simply being over-cautious and don't actually have an issue.

Summary:

  1. Although the DELL R815 ESXi hosts have a much higher hardware specification, they are not scheduling CPU cycles as efficiently as the HP DL585 G6 hardware. The impact is increased CPU ready time and degraded guest VM performance. This is evident with a very low number of guest VMs on the host, and worsens as the consolidation ratio is ramped up or as CPU load increases on any of the guest VMs (see the conversion sketch just after this summary for how I translate ready time into %RDY).
  2. There also appears to be an imbalance across the NUMA nodes, where a particular node is favoured and the % NUMA local memory is not as efficient as it should be (i.e. the HP hardware performs much better than the DELL hardware).
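
As an aside, here is how I convert the raw ready-time counters into the %RDY figures quoted below. This is a minimal Python sketch of the commonly cited conversion (ready milliseconds over a 20-second real-time sample, per vCPU); the numbers in the example are illustrative only.

```python
# Convert a vCenter "CPU Ready" summation value (milliseconds) into a
# percentage using the commonly cited formula:
#   ready% = ready_ms / (interval_s * 1000) * 100
# Real-time charts sample every 20 seconds; divide by the vCPU count to
# get a per-vCPU figure. The sample numbers below are illustrative only.

def cpu_ready_percent(ready_ms: float, interval_s: int = 20, vcpus: int = 1) -> float:
    """Per-vCPU CPU ready percentage for one sampling interval."""
    return ready_ms / (interval_s * 1000.0 * vcpus) * 100.0

if __name__ == "__main__":
    # e.g. a 4-vCPU guest reporting 1600 ms of ready time in a 20 s sample
    print(f"{cpu_ready_percent(1600, vcpus=4):.2f}%")  # -> 2.00%
```

As a rough community rule of thumb, sustained values above about 5% per vCPU are worth investigating.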

DELL Technical Details:

Hypervisor       : VMware ESXi 4.1.0, build 582267

Hardware specification:

Dell PowerEdge R815

- Model : AMD Opteron(tm) Processor 6174

- Processor Speed : 2.2 GHz

- Processor Sockets : 4

- Processor Cores per Socket : 12

- Logical Processors : 48

- Memory : 256 GB

esxtop performance statistics:

DELL Memory (incl. NUMA statistics):

[Screenshot: DELL_Mem_NUMA.png]

DELL CPU:

[Screenshot: DELL_CPU.png]

Observations:

  • NUMA home node #7 is favoured, rather than the load being balanced across all eight nodes
  • % NUMA local memory is allocated inefficiently
  • Very low consolidation ratio of guest VMs per host
  • Very low load on the host, yet ready time is already visible (the sketch below shows how I pull these figures out of an esxtop capture)
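
For anyone wanting to reproduce these figures, below is a minimal Python sketch of how the per-VM numbers can be pulled out of an esxtop batch-mode capture (esxtop -b -d 5 -n 60 > esxtop_stats.csv). The file name is a placeholder, and the exact column naming can vary between builds, so treat this as illustrative rather than authoritative.

```python
import csv

# Minimal sketch: scan an esxtop batch-mode capture and report the peak
# "% Ready" seen for each VM group. Column names typically look like
# "\\host\Group Cpu(gid:vmname)\% Ready", but this can vary slightly
# between ESXi builds.

CAPTURE = "esxtop_stats.csv"  # hypothetical file name

with open(CAPTURE, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    # Pick out the per-group CPU ready columns.
    rdy_cols = {i: name for i, name in enumerate(header)
                if "Group Cpu" in name and "% Ready" in name}
    peaks = {i: 0.0 for i in rdy_cols}
    for row in reader:
        for i in rdy_cols:
            try:
                peaks[i] = max(peaks[i], float(row[i]))
            except (ValueError, IndexError):
                continue  # skip blank or truncated samples

# Worst offenders first.
for i, name in sorted(rdy_cols.items(), key=lambda kv: -peaks[kv[0]]):
    print(f"{peaks[i]:6.2f}  {name}")
```

The N%L (NUMA local memory %) figures live in the memory columns of the same capture; the same filtering approach applies.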

Example of an affected guest VM:

Guest_VM.png

The DELL host is under no load whatsoever:

[Screenshot: DELL_Resource.png]

For contrast, here is what I would “expect” to see, taken from a heavily loaded HP DL585 G6 host:

HP Technical Details:

Hypervisor       : VMware ESXi 4.1.0, build 582267

HP Hardware specification:

HP ProLiant DL585 G6

- Model : Six-Core AMD Opteron(tm) Processor 8435

- Processor Speed : 2.6 GHz

- Processor Sockets : 4

- Processor Cores per Socket : 6

- Logical Processors : 24

- Memory : 128 GB

esxtop performance statistics:

HP Memory (incl. NUMA statistics):

[Screenshot: HP_Mem_NUMA.png]

HP CPU:

[Screenshot: HP_CPU.png]

Observations:

  • The HP host has a lower hardware specification than the DELL host
  • The HP host runs almost four times the number of guest VMs, yet does not suffer from the same performance issues
  • NUMA home node #0 is favoured, but the allocation of NUMA local memory is much more efficient, close to 100%
  • Much higher consolidation ratio of guest VMs per host without performance issues
  • Much higher load on the host and almost ZERO ready time

The HP host still has capacity, but is under much more load than the affected DELL host:

[Screenshot: HP_Resource.png]

In both cases (HP and DELL) we expect a certain level of ready time, but the levels seen on the DELL hardware are of concern, as is the inefficient use of NUMA local memory. This issue is not seen on the HP hardware, including earlier- and later-generation models.

So the questions are:

  1. Have I interpreted this correctly?

  2. Has anyone else seen this before? If yes, how was it resolved?

  3. What next steps can be taken to test and verify this information?

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
32 Replies
kpc
Contributor

That sounds great Jon.

I did get round to trying out three of the BIOS settings: DMA Virtualization enabled, C1E enabled, and Power set to MAX.

I did notice a small increase in boot speed but not had time to properly test yet.

I got a callback from support yesterday; they said they will look into it. Thanks for your efforts in digging into this problem, much appreciated.

Pete

kpc
Contributor

Just a quick update. I can't confirm yet whether this is linked to the issue reported, but after setting C1E to Disabled in the BIOS, my tests show a 50% increase in boot time for my XP and Linux VMs. I'm running on build 469512.

jrmunday
Commander

Thanks for the feedback Pete. I don't believe it is related, but it's useful to know the results of your testing.

Here is an interesting read regarding the C-states:

http://en.community.dell.com/techcenter/high-performance-computing/w/wiki/2288.aspx

I have decided to DISABLE this on all of my ESXi hosts.

Testing so far looks really good with the hot patch that VMware provided me, and I no longer see the NUMA issue that I saw previously. I did a clean build up to the June patch level, excluding the hot patch, and I can see the issue; as soon as I apply the hot patch, the NUMA load is evenly spread and the NUMA local memory % is almost 100% on all running guests. I'm currently rebuilding one entire cluster to this level.

For testing purposes, I could give you the hot patch that I have been provided so you can see whether you get the same benefits. I would, however, only use this for testing, and would push VMware to provide you with the same hot patch if it works for you.

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
MKguy
Virtuoso

I also noticed this behavior over a year ago, when we were still on 4.1 U1, and described it here:

http://communities.vmware.com/thread/313253

I think Dev09 made some important points there.

It was also discussed in the comments of http://frankdenneman.nl/2010/09/esx-4-1-numa-scheduling/

Looking at the stats in resxtop on our 5.0 U1 hosts today, it seems a bit better, though I still have to wonder why some VMs occasionally show less than 90% memory N%L.

Not like we have any real issue here though.

-- http://alpacapowered.wordpress.com
jrmunday
Commander

Yes, that's exactly the same issue I observed. With the hot patch that VMware provided to address this, the issue disappears: almost all VMs are 100% local, and there is no longer increased %ready time.

I wonder if this will be addressed in the general release of version 5.1, which has just been announced; I will definitely test this as soon as it's available.

http://www.vmware.com/products/vsphere/esxi-and-esx/overview.html

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
pgotsis7
Contributor

I am facing the same issue with our R815 systems. We have updated the BIOS, firmware, etc. to the latest versions, and the ESXi build number we are using is 768111.

The performance of the VMs is problematic. This environment is still under implementation, but we need to go live soon.

How can I get the hot patch to try? When should we expect the patch to reach us through the official channels?

lhoagland
Contributor

The update to my SR was:

"Here's an update on your case.

this fix is included in ESX 5.0 P04 which was released September 27 (KB 2032584), with the details of the fix in KB 2032586.

All patches get rolled up into the update release so it will also be in U2 which is scheduled for December this year"

The fix they are talking about is HotPatch for PR 875553.

Hope that helps someone. I don't see any of these in Update Manager yet, so maybe I'm missing something.

VCP 2 Certification #: 5594
jrmunday
Commander

This issue has been fixed in the latest set of patches, released 27/09/2012, and I have successfully tested the fix.

If you are running ESXi 5.0.0 build 821926, you should see the issue resolved.

See: http://www.vmware.com/patchmgr/findPatch.portal (Build 821926)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=203258...

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=203259...

PR787454: After performing vMotion operations on virtual machines, a NUMA imbalance might occur with all the virtual machines being assigned the same home node on the destination server.
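
If you want to confirm patch levels across a whole cluster before and after remediation, something like the following works. It's a minimal sketch using pyVmomi (the Python bindings for the vSphere API); the vCenter hostname and credentials are placeholders, and certificate verification is disabled purely for lab convenience.

```python
# Hedged sketch: list each host's product name and build number via the
# vSphere API, so you can confirm the whole cluster is on 821926 or later.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; validate certs in prod
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="changeme", sslContext=ctx)  # placeholder credentials
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        about = host.config.product  # AboutInfo: version and build strings
        print(f"{host.name}: {about.fullName} (build {about.build})")
    view.Destroy()
finally:
    Disconnect(si)
```

Update Manager's compliance view tells you the same thing, but this is handy for a quick before/after diff.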

Let me know if you need any additional information.

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
jrmunday
Commander

Yes, the fix is included in this patch release (successfully tested):

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=203258...

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
pgotsis7
Contributor

We had temporarily worked around the problem by setting the NUMA configuration as described in the previous posts and adjusting the server BIOS. We have also upgraded to 821926. Do we need to set NUMA back to the original values? In the meantime, I'm using the quick check below to see which VMs still carry manual overrides.
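
This is a minimal pyVmomi sketch; I'm assuming the workaround was applied as per-VM advanced settings (extraConfig), and the vCenter hostname and credentials are placeholders.

```python
# Hedged sketch: list VMs that still carry manual numa.* advanced settings
# (extraConfig overrides), assuming that is how the workaround was applied.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="changeme", sslContext=ctx)  # placeholder credentials
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        if not vm.config:  # skip inaccessible/orphaned VMs
            continue
        overrides = [ov for ov in (vm.config.extraConfig or [])
                     if ov.key.lower().startswith("numa.")]
        if overrides:
            print(vm.name)
            for ov in overrides:
                print(f"  {ov.key} = {ov.value}")
    view.Destroy()
finally:
    Disconnect(si)
```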

lakey81
Enthusiast

FYI for anyone with this issue on 4.1: VMware told me it can't be fixed (or they won't fix it) on 4.1, so we're just SOL.

lhoagland
Contributor

Also successfully tested here as of last night. After the patch, overall CPU load has gone from 60-70% to 14-24% per host, and as expected %RDY has dropped into single digits or lower.

Thanks jrmunday! Without your great write-ups and your persistence with VMware support, these systems would have gone back to Dell and everyone here would have been left with a bad impression of AMD.

Latney Hoagland

VCP 2 Certification #: 5594
jrmunday
Commander

No problem at all. I'm glad your issues are resolved and that I could finally contribute something to the community.

Let me know if you use the Dell vCenter plugin and need any help getting it working with OME / vCenter (including SNMP). It was a real pain to get working (with all the firmware update features) but is actually really simple once you know what needs doing; I just haven't had time to document this for others.

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77