Hello
I'm running a few experiments with ESX and I want to manually specify the affinity of the vCPUs and memory of a virtual machine. Specifically, I have a two-socket server and I want ESX to allocate the memory of a given VM on the remote node. I use the affinity controls in the Resources section of the VM Properties and set them as follows:
CPU Affinity: 0-1
Memory Affinity: Node 1
My impression was that this would force the vCPUs onto node 0 and the VM's memory onto node 1. However, running esxtop, I can see that all the memory is in fact allocated on node 1, but N%L shows 100. I interpret that as saying all the memory for this VM is allocated on the local node. Does this mean that the CPU affinity I specified is incorrect or does not work somehow? Is there something overriding the specified affinities?
Just to be on the safe side, I have also set Numa.AutoMemAffinity to 0 (in the Advanced Settings section). Here is the relevant portion of esxtop:
NAME             NHN                               NMIG  NRMEM  NLMEM    N%L  GST_ND0  OVD_ND0  GST_ND1  OVD_ND1  MEMSZ
javaserver0-1.1  0,1,2,3,4,5,6,7,8,9,10,11,12,13,  0     0.00   1024.00  100  0.00     4.34     1024.00  8.16     1024.00
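To make the reading of N%L concrete, here is a minimal Python sketch. It assumes N%L is simply NLMEM as a share of NLMEM + NRMEM, where "local" and "remote" are measured relative to the VM's current home node rather than relative to the node chosen under Memory Affinity. The function name percent_local is made up for illustration and is not part of esxtop.

```python
def percent_local(nlmem_mb, nrmem_mb):
    """Share of a VM's memory that is local to its home node, in percent.

    Assumption: N%L = NLMEM / (NLMEM + NRMEM) * 100.
    """
    total = nlmem_mb + nrmem_mb
    if total == 0:
        return 0.0  # no memory accounted yet; avoid dividing by zero
    return 100.0 * nlmem_mb / total


# Values taken from the esxtop row above: NLMEM = 1024.00, NRMEM = 0.00
print(percent_local(1024.00, 0.00))
```

With the values above this gives 100, which matches the N%L column: everything sits on node 1, and node 1 is also being treated as the home node, so none of it counts as remote.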
I would appreciate any help in explaining how I can force ESX to use the remote memory rather than local.
Thanks,
Amin
Not sure if the text is garbled or something, but there appears to be something wrong with your NUMA home node (NHN): "0,1,2,3,4,5,6,7,8,9,10,11,12,13,". Normally that would just be "1", since you've constrained it to node 1 only. Did you shut the VM down, make the changes, and then start it back up?
Sorry about the garbled text. Yes, that surprised me as well: it shows 0,1,2,3,4,5,6,7,8,9,10,11,12,13 as the NUMA home node, and that's what caused me to suspect that CPU affinity is not working.
Yes, I turned off the VM, changed the settings, and then turned it back on. It doesn't matter what I set the CPU affinity to, it always shows the same string in esxtop. However when I change memory affinity I can see it moving from node0 to node1 and vice versa, but the CPU seems to be moving with it as well, hence memory always stays local, which is the opposite of what I want!
Amin
Well, playing with the settings again, I turned Numa.AutoMemAffinity back on in the Numa section of the Advanced Settings, and now it seems that CPU affinity is working. I no longer see a list of 0 to 13, I just see 0 and 1, and I can force all memory allocation to go remote. Thanks!
Hello,
I encountered another situation regarding allocating memory on the remote node. It seems that after a while ESX starts to move the memory of some of the virtual machines back to the local node, even though the affinity pins the memory to the remote node. On one occasion I saw up to 70% of a VM's memory moved to the local node. I was wondering why that is the case and how I can disable it.
Thanks,
Amin
Can you post the VMX file for the VM you're working on? That would make it easier to see what all is currently configured for the VM.
The vmx file is inlined below. For what it's worth, I'm trying this on VMmark 1.1.1. I noticed that 70% of the javaserver's memory and 15% of the fileserver's memory was moved to the local node during the run. The other VMs had 0-5% of their memory on the local node. The affinity settings were set for all of them to force the use of remote memory. The host is a two-socket server with 8 physical cores (16 logical with hyper-threading) per socket.
Amin
.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "7"
pciBridge0.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
nvram = "javaserver0-1.1.1.nvram"
virtualHW.productCompatibility = "hosted"
powerType.powerOff = "soft"
powerType.powerOn = "hard"
powerType.suspend = "hard"
powerType.reset = "soft"
displayName = "javaserver0-1.1.1"
extendedConfigFile = "javaserver0-1.1.1.vmxf"
floppy0.present = "TRUE"
numvcpus = "2"
scsi0.present = "TRUE"
scsi0.sharedBus = "none"
scsi0.virtualDev = "lsilogic"
memsize = "1024"
scsi0:0.present = "TRUE"
scsi0:0.fileName = "javaserver0-1.1.1.vmdk"
scsi0:0.deviceType = "scsi-hardDisk"
ide1:0.present = "TRUE"
ide1:0.clientDevice = "TRUE"
ide1:0.deviceType = "atapi-cdrom"
ide1:0.startConnected = "FALSE"
floppy0.startConnected = "FALSE"
floppy0.fileName = ""
floppy0.clientDevice = "TRUE"
ethernet0.present = "TRUE"
ethernet0.virtualDev = "e1000"
ethernet0.networkName = "VMmark Net"
ethernet0.addressType = "generated"
guestOS = "winnetenterprise-64"
uuid.location = "56 4d 78 8e d3 23 ff 61-1a 99 3a 48 39 9e 29 2e"
uuid.bios = "56 4d 78 8e d3 23 ff 61-1a 99 3a 48 39 9e 29 2e"
vc.uuid = "52 d0 ac 00 8b c2 3f 33-0c 54 e0 d2 b7 5f 8d 28"
ide1:0.fileName = ""
ethernet0.generatedAddress = "00:0c:29:9e:29:2e"
vmci0.id = "1378771856"
tools.syncTime = "FALSE"
cleanShutdown = "FALSE"
replay.supported = "FALSE"
unity.wasCapable = "TRUE"
sched.swap.derivedName = "/vmfs/volumes/4fd73629-0e676e00-72fe-001e673d56b0/javaserver0-1.1.1/javaserver0-1.1.1-918fcd45.vswp"
replay.filename = ""
scsi0:0.redo = ""
pciBridge0.pciSlotNumber = "17"
pciBridge4.pciSlotNumber = "21"
pciBridge5.pciSlotNumber = "22"
pciBridge6.pciSlotNumber = "23"
pciBridge7.pciSlotNumber = "24"
scsi0.pciSlotNumber = "16"
ethernet0.pciSlotNumber = "32"
vmci0.pciSlotNumber = "33"
ethernet0.generatedAddressOffset = "0"
hostCPUID.0 = "0000000d756e65476c65746e49656e69"
hostCPUID.1 = "000206d70020080017bee3ffbfebfbff"
hostCPUID.80000001 = "0000000000000000000000012c100800"
guestCPUID.0 = "0000000d756e65476c65746e49656e69"
guestCPUID.1 = "000206d700010800829822030febfbff"
guestCPUID.80000001 = "00000000000000000000000128100800"
userCPUID.0 = "0000000d756e65476c65746e49656e69"
userCPUID.1 = "000206d700200800029822030febfbff"
userCPUID.80000001 = "00000000000000000000000128100800"
evcCompatibilityMode = "FALSE"
vmotion.checkpointFBSize = "4194304"
sched.cpu.affinity = "0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15"
sched.mem.affinity = "1"
sched.cpu.htsharing = "any"
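As a quick cross-check of the two affinity lines in the file above, here is a small Python sketch that parses VMX-style `key = "value"` lines and compares the NUMA nodes covered by sched.cpu.affinity with those in sched.mem.affinity. The helper names are made up for illustration. The mapping of logical CPUs to nodes is an assumption: it supposes 16 logical CPUs per node, numbered contiguously (0-15 on node 0, 16-31 on node 1), which may not match how ESX actually numbers them on this host.

```python
import re


def parse_vmx(text):
    """Return a dict of key -> value from VMX-style `key = "value"` lines."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r'\s*([^=\s]+)\s*=\s*"(.*)"\s*$', line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def nodes_for_cpus(cpu_ids, lcpus_per_node):
    """Map logical CPU ids to node ids.

    ASSUMPTION: logical CPUs are numbered contiguously within each node.
    """
    return sorted({cpu // lcpus_per_node for cpu in cpu_ids})


# Only the relevant lines from the VMX file above
vmx_text = '''
numvcpus = "2"
memsize = "1024"
sched.cpu.affinity = "0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15"
sched.mem.affinity = "1"
'''

settings = parse_vmx(vmx_text)
cpus = [int(c) for c in settings["sched.cpu.affinity"].split(",") if c.strip()]
cpu_nodes = nodes_for_cpus(cpus, lcpus_per_node=16)  # 16 per node is assumed
mem_nodes = sorted(int(n) for n in settings["sched.mem.affinity"].split(",") if n.strip())

# True means the CPU nodes and memory nodes do not overlap,
# i.e. the configuration asks for remote memory only.
print(cpu_nodes, mem_nodes, set(cpu_nodes).isdisjoint(mem_nodes))
```

Under that numbering assumption, the CPU affinity covers only node 0 and the memory affinity only node 1, so the configuration itself does request remote memory.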
Can you also confirm the version/build of ESX used as well as the total number of LCPUs for your ESX host?
Total logical CPUs on the host are 32 (I see this when I want to specify CPU affinity).
I'm using ESXi 5.0.0 VMKernel Release Build 623860
Amin
FYI, We're looking into this internally....
Playing with the settings again, I set Numa.PageMigEnabled to 0, and now I see only a tiny amount of local memory in use, although it is not completely zero. All my VMs (as in VMmark 1.1.1) are now using mostly memory on the remote node, as the affinity settings dictate. Please let me know if there is any other setting that I should change. I was also wondering why the local memory usage is not completely zero.
Thanks,
Amin
Can you please provide some stats to help us understand the issue?
Output file of `schedsnapshot <dir> 300 –notrace`
Thanks,
I will try to do this as soon as I get a chance, but as I mentioned earlier, when I set Numa.PageMigEnabled to zero in the Advanced Settings I get the correct behaviour: memory is allocated on the remote node and stays there. There is still a residue in local memory (generally less than 1MB), but I think I can live with that. Do you want me to run the command with Numa.PageMigEnabled on or off?
When I try to run this command in the ESX shell I get an error: sh: schedsnapshot: not found. Can you please provide more information on exactly how to do this?
Thanks,
Amin
Hmm. OK let's try another way.
Either in your console or via SSH, cd into a LUN with some free space (~100MB should do it). Then run:
vsi_traverse
If it's under 50MB, you should be able to attach it here. Otherwise, I'll send you instructions on how to upload it to our FTP site.
I ran the command successfully, and the resulting file is 47 MB. I still cannot upload it via the forum: the progress bar in the upload section gets stuck at 0%. Here are the details anyway. Please let me know how I can get the resulting file to you.
VM name: javaserver0-1.1.1
Numa.PageMigEnabled is turned on in the Advanced Settings
At the time I ran the command, around 7MB of memory was moved from remote node to local node
As I mentioned earlier, if I turn off Numa.PageMigEnabled, I don't see memory moving to local node anymore.