VMware Cloud Community
alex777
Contributor

system crash (ESX3.5)

Hi all!

ESX 3.5 (build 64607) server on a Supermicro X7DBU with 2 Xeon E5335 CPUs and 16 GB RAM. The datastore is a YA-12SAES3 SAS storage array connected through an LSI fas 3442x.

Everything worked well for a few days. Then some of the VMs became inaccessible (they do not answer ping, and there is no access through the service console in VIC). I tried to power them off through VIC, but it timed out. I tried to reboot the server (through VIC and with reboot from the shell), but that did not work either.

In the end I rebooted the server by pressing the reset button.

The server booted normally and the VMs started.

This crash has now happened three times.

Attached is a piece of the vmkernel log from the moment of the crash.

I would like to understand the reason for this crash.

3 Replies
Texiwill
Leadership

Hello,

I suggest you open a VMware Support Call, but in my experience most crashes are caused by faulty hardware. I tend to do the following:

1) Upgrade firmware to levels acceptable to VMware ESX

2) Make sure the BIOS settings are correct for VMware ESX

3) Run vendor diagnostics for at least 24 hours (48 preferably)

4) Run memtest86 for at least 24 hours (48 preferably)

If everything passes, then I would consider it an ESX-related bug. Some diagnostics need to run for a long time before they show anything, because you may have a temperature problem. We once had a problem with the glue used to hold the heatsink to the CPU, and it caused no end of issues.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education. CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354, as well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
kjb007
Immortal

You seem to be having a LOT of SCSI errors. As Texiwill states, you should open an SR. What kind of storage are you using? Is it all local disk? Is your hardware on the HCL?
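
To see when the SCSI errors cluster, a rough count per hour can help before sending the log to support. This is only a minimal sketch: the function name `count_scsi_lines` is made up for illustration, the sample lines are invented (they are not from the attached log), and it assumes each line starts with a syslog-style "Mon DD HH:MM:SS" timestamp, which may not match your log exactly.

```python
from collections import Counter


def count_scsi_lines(lines, keyword="scsi"):
    """Count lines mentioning `keyword` (case-insensitive), grouped by hour."""
    per_hour = Counter()
    for line in lines:
        if keyword in line.lower():
            # Assumption: each line starts with a syslog-style
            # "Mon DD HH:MM:SS" stamp, so the first 9 characters
            # ("Mon DD HH") identify the month, day and hour.
            per_hour[line[:9]] += 1
    return per_hour


# Made-up sample lines for illustration only; not taken from the attached log.
sample = [
    "Mar 12 10:15:01 host vmkernel: SCSI: hypothetical error text",
    "Mar 12 10:20:00 host vmkernel: unrelated message",
    "Mar 12 11:01:00 host vmkernel: hypothetical scsi timeout text",
]

for hour, count in sorted(count_scsi_lines(sample).items()):
    print(hour, count)
```

To run it against a real log, pass an open file object instead of `sample`, since iterating over a file yields its lines. A sudden jump in the hourly count shortly before the VMs stopped responding would point toward the storage path rather than the guests.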

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
alex777
Contributor

I have already opened an SR, but so far they have not been able to help.

The storage is a YA-12SAES3, connected to the ESX host by SAS.
