Going by the functions reported in the stack, it looks like the host crashed in the AHCI driver module during extensive logging. Do you have any logs set to verbose or debug level? Have any other recent changes been made to your environment?
Cheers,
Supreet
Actually this is a new installation, and this pink screen is happening every 2-3 days. I have not set any logs to the levels you mention.
What hardware are you running ESXi on here?
It is a PowerEdge T640.
Have you checked that the BIOS and firmware on this server are completely up to date?
Hello, I was wondering if you found a resolution to this issue? We are running VMware ESXi 6.7 on a PowerEdge T640 and it just started presenting this very same error.
Dear Sir,
I am seeing this same screen.
I had been using ESXi for 14 days when it appeared.
Thank you,
Nisit
Did you find the solution for this PSOD?
I experienced the exact same issue. In my case it was caused by a faulty driver/controller/device (AHCI / DVDROM). I solved it by disabling 'vmw_ahci' and 'ahci' drivers since I didn't use the DVDROM anyway. The SSD and HDD are on a separate RAID controller, so if that's the same in your case:
esxcli system module set --enabled=false --module=vmw_ahci
esxcli system module set --enabled=false --module=ahci
and reboot the server. The servers which have the same experience are stable now.
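Before disabling the modules, it may be worth confirming which adapters actually use the AHCI driver, so you don't cut off a disk that sits behind one. The following is only a sketch: ahci_adapters is a made-up helper name, and it assumes the usual column layout of 'esxcli storage core adapter list' (HBA name first, driver second).

```shell
# Hypothetical helper (not part of ESXi): reads the output of
# `esxcli storage core adapter list` on stdin and prints the HBA
# names whose driver column is vmw_ahci or ahci.
ahci_adapters() {
    awk '$2 == "vmw_ahci" || $2 == "ahci" { print $1 }'
}

# Intended use on the host (ESXi shell or SSH session):
#   esxcli storage core adapter list | ahci_adapters
```

If the only device behind the listed adapters is the DVD drive, disabling the modules should not affect your datastores.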
I also wrote an article about this on my blog, feel free to read:
Solving PSOD 'Panic Requested by another PCPU' - Jume - My Virtualization Blog
Thank you for this post.
After crashing 3 times in 60 minutes, the server has been stable for 24 hours now.
Moderator note: Moved to ESXi
I have a PSOD very similar to yours, however I am running a Cisco UCS blade. Did you find out anything for your PSOD? Cisco pointed me here, but I don't have any AHCI drivers installed.
I just got this same PSOD running on a UCS blade, firmware 4.04(d) and ESXi 6.7 Update 3. Did you find anything for this?
Hi ElizabethFoster,
Your PSOD screen shows Memory Controller Read Error messages which suggest hardware problems. If the firmware is up-to-date and supported, I would next recommend running hardware diagnostics.
--
Darius
I know this is an old post but it's what comes up when you google the issue. There's a KB article that covers exactly what is going on - https://kb.vmware.com/s/article/67560 but it is exceptionally poorly written.
There are two ways to fix this: either remove/replace the CD/DVD drive or disable AHCI, assuming AHCI isn't being used for anything else. Unfortunately this bug can't be fixed via firmware updates, so these drives are trash.
The fix is exactly what bouke posted: disable AHCI using the following commands, then reboot the host:
esxcli system module set --enabled=false --module=vmw_ahci
esxcli system module set --enabled=false --module=ahci
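After the reboot you can check that the change stuck. A rough sketch, assuming the three-column output of 'esxcli system module list' (Name, Is Loaded, Is Enabled); module_state is a hypothetical helper name, not an ESXi command.

```shell
# Hypothetical helper: reads `esxcli system module list` output on stdin
# and prints the loaded/enabled state of the module named in $1.
module_state() {
    awk -v m="$1" '$1 == m { print $1 ": loaded=" $2 ", enabled=" $3 }'
}

# Intended use on the host:
#   esxcli system module list | module_state vmw_ahci
#   esxcli system module list | module_state ahci
```

Both modules should report enabled=false once the setting has taken effect. If a module is not present on your build, the helper simply prints nothing.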
If you want to confirm you have the problematic drive, you can run this PowerCLI command to find your CD/DVD drive model.
Get-VMHost | Where-Object {$_ | Get-ScsiLun -LunType cdrom} | Select-Object Name,@{N="Vendor";E={$_ | Get-ScsiLun -LunType cdrom | Select-Object -ExpandProperty Vendor}},@{N="Model";E={$_ | Get-ScsiLun -LunType cdrom | Select-Object -ExpandProperty Model}}
If it says DU-8A5LH then it's a ticking time bomb waiting to PSOD at any moment (even when not using the drive).
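If you don't have PowerCLI handy, the same check can probably be done from the ESXi shell. This is a sketch under the assumption that 'esxcli storage core device list' prints an indented 'Model:' line per device; affected_drives is a hypothetical helper name.

```shell
# Hypothetical helper: reads `esxcli storage core device list` output on
# stdin and prints any Model lines that mention the DU-8A5LH drive.
affected_drives() {
    grep -E '^ *Model:.*DU-8A5LH'
}

# Intended use on the host:
#   esxcli storage core device list | affected_drives
```

No output means no device reported that model string.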