Hello
I am experiencing VM power-on failures on vSphere 7 (the same happens on the ESXi host) after adding an NVIDIA GPU as a PCIe device.
- ESXi Task result
- vSphere Power On Failures
- ESXi Host Configure
- VM Configure
There are no NVIDIA GRID vGPU profiles either (the profile list is empty).
I cannot even check the HCL for the Supermicro vendor in Lifecycle Manager.
There is no vendor add-on or anything else for Supermicro.
What should I do about this?
Thank you.
Both L40s show 0 bytes. Is the GPU manager installed correctly?
Try running "nvidia-smi" on the console to check.
Thank you for the check.
The GPU manager was not installed, so I installed it and rebooted the host.
Now I can see the vGPU profiles on the L40 GPUs!
The memory of each L40 GPU also shows 44.9 GB now.
But the VM still cannot start, neither with Direct I/O nor with a vGPU profile.
++
I tried
but I still get the same error and cannot start the guest VM with the GPU.
Have a look at the vmware.log; it might give a clue as to why the power-on is failing.
Considering that each L40s has 48 GB of VRAM, it might be the MMIO size: 2 x 48 GB = 96 GB. The MMIO size has to be a power of 2 (32 GB, 64 GB, 128 GB, etc.), so 96 GB rounds up to 128 GB.
Assuming the VM is already configured for EFI virtual firmware, you could try adding/editing the vmx with the following lines to increase the MMIO size.
pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "128"
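The sizing rule above can be sketched as a small calculation. This is only a minimal sketch, assuming the rule exactly as stated in this thread (add up the VRAM of all passed-through GPUs, then round up to the next power of two, with 32 GB as the smallest value). The function name `mmio_size_gb` is hypothetical and not part of any VMware tool.

```python
def mmio_size_gb(gpu_vram_gb, minimum_gb=32):
    """Round the combined VRAM of all passthrough GPUs up to a power of two.

    gpu_vram_gb: list of VRAM sizes in GB, one entry per GPU.
    minimum_gb:  smallest size considered (assumed to be 32, per the thread).
    """
    total = sum(gpu_vram_gb)
    size = minimum_gb
    while size < total:
        size *= 2  # doubling from a power of two stays a power of two
    return size


if __name__ == "__main__":
    # Two 48 GB cards: 96 GB total, which rounds up to 128.
    print(mmio_size_gb([48, 48]))
```

The result is the number that would go into pciPassthru.64bitMMIOSizeGB.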
Do you want to assign the L40 with passthrough or with GPU profiles?
With passthrough you assign the whole L40 to one VM; with profiles, many VMs can share one L40.
AFAIK you don't need the GPU manager on the host for passthrough.
If you want GPU profiles, the GPU manager is required; you also need valid NVIDIA GRID licenses and an NVIDIA license server in CLS or DLS mode.
The VM's Configuration Parameters that you mentioned were set up already.
What is the vmware.log and where can I find it?
The event logs in vSphere and the ESXi host client do not show any details for the VM power-on failure.
(vCenter is provisioned as a VM on one of the ESXi hosts in the cluster.)
I want to assign the L40 with passthrough first.
But with passthrough enabled, the VM cannot start (without the GPU manager on the ESXi host).
The vmware.log files should be in the same location where the VM is stored.
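Once the log has been copied off the datastore, a quick filter can narrow down the lines worth reading. The following is a minimal sketch, not an official tool: the keyword list and the function name `find_suspect_lines` are assumptions chosen for illustration, and the actual wording of the failure in vmware.log may differ.

```python
import re
import sys

# Assumed keywords; adjust them to whatever the failing power-on actually logs.
PATTERN = re.compile(r"pcipassthru|mmio|nvidia|error|fail", re.IGNORECASE)


def find_suspect_lines(log_path):
    """Return (line_number, text) pairs for log lines that match PATTERN."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if PATTERN.search(line):
                hits.append((number, line.rstrip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py /path/to/vmware.log  (script name is hypothetical)
    for number, text in find_suspect_lines(sys.argv[1]):
        print(f"{number}: {text}")
```

The last matching lines before the VM powers off are usually the most relevant ones to post here.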
Have you tried "Dynamic DirectPath I/O"?
Is the VM configured for EFI boot?
Have you reserved all memory for the VM?
I already tried Dynamic DirectPath I/O too.
EFI boot and the memory reservation are also set up.
I found that NVIDIA vGPU does not support ESXi 7, but ESXi 8 would work with the L40.
Supported Products :: NVIDIA Virtual GPU Software Documentation
But I cannot find any reason why passthrough mode is still not working.
I am now installing the GPU in a bare-metal Windows host (workstation), so I will check the vmware.log later.
Thank you
According to the link, the L40 is supported with both ESXi 7 and 8.
I have 20 hosts with 3 Tesla cards each, running on ESXi 7.0U3.
BTW: your vCenter is 7.0a from May 2020 and your host is 7.0u3o from Sep 2023?
My vCenter is VMware vCenter Server Appliance 7.0.0.10300
and the ESXi Host is ESXi-7.0U3g-20328353-standard
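To make the version gap between these two strings explicit, here is a rough sketch that compares them. It rests on two assumptions that are not confirmed in this thread: that the third component of the vCenter version string is the update level (so 7.0.0.x is older than 7.0 U3), and that vCenter is expected to be at the same or a newer release than the hosts it manages. All function names are hypothetical.

```python
import re


def vcenter_release(version):
    """'7.0.0.10300' -> (7, 0, 0): major, minor, assumed update level."""
    major, minor, update = version.split(".")[:3]
    return int(major), int(minor), int(update)


def esxi_release(image_profile):
    """'ESXi-7.0U3g-20328353-standard' -> (7, 0, 3)."""
    match = re.match(r"ESXi-(\d+)\.(\d+)(?:U(\d+))?", image_profile)
    if match is None:
        raise ValueError(f"unrecognised image profile: {image_profile}")
    major, minor, update = match.groups()
    return int(major), int(minor), int(update or 0)


def vcenter_is_behind(vcenter_version, esxi_profile):
    """True when the vCenter release tuple is older than the host's."""
    return vcenter_release(vcenter_version) < esxi_release(esxi_profile)


if __name__ == "__main__":
    print(vcenter_is_behind("7.0.0.10300", "ESXi-7.0U3g-20328353-standard"))
```

For an authoritative answer, check the VMware interoperability matrix rather than relying on this comparison.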