VMware Cloud Community
jlorang
Enthusiast

ESXi 7 long boot times after SAN replacement

Updated the SAN and now the two hosts take a very long time to boot (30-45 minutes). I took a screenshot of the point where the server hangs. It seems the "nsleep returned 4" step has to run and time out before the boot moves on. The hosts do eventually boot, but I would like to figure out why the delay occurs. Both hosts are running ESXi 7 update M, so it is the latest and greatest that version 7 has to offer. Thank you in advance for the help.


Accepted Solutions
jlorang
Enthusiast

Thank you all for the suggestions. I ended up removing all of the static and dynamic storage targets and re-adding them. After re-adding the SAN to the hosts, everything is back to normal. Thank you all for the help.
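
For anyone who wants to script the same remove-and-re-add step, here is a minimal dry-run sketch. It only builds and prints esxcli commands for review; nothing is executed. The adapter name, portal addresses, and IQN are hypothetical placeholders, not values from this thread, and the function names are made up for illustration.

```python
# Dry-run sketch: build the esxcli commands that drop stale iSCSI targets,
# add the new array's discovery address, and rescan the adapter.
# Nothing is executed. All names and addresses are hypothetical placeholders.

def remove_static_target(adapter, address, iqn):
    return ("esxcli iscsi adapter discovery statictarget remove "
            f"--adapter={adapter} --address={address} --name={iqn}")

def remove_dynamic_target(adapter, address):
    return ("esxcli iscsi adapter discovery sendtarget remove "
            f"--adapter={adapter} --address={address}")

def add_dynamic_target(adapter, address):
    return ("esxcli iscsi adapter discovery sendtarget add "
            f"--adapter={adapter} --address={address}")

def build_plan(adapter, stale_static, stale_dynamic, new_dynamic):
    """Return the commands in order: remove static, remove dynamic, add, rescan."""
    plan = [remove_static_target(adapter, addr, iqn) for addr, iqn in stale_static]
    plan += [remove_dynamic_target(adapter, addr) for addr in stale_dynamic]
    plan += [add_dynamic_target(adapter, addr) for addr in new_dynamic]
    plan.append(f"esxcli storage core adapter rescan --adapter={adapter}")
    return plan

if __name__ == "__main__":
    for cmd in build_plan(
        adapter="vmhba64",
        stale_static=[("192.0.2.10:3260", "iqn.2001-05.com.example:old-array")],
        stale_dynamic=["192.0.2.10:3260"],
        new_dynamic=["192.0.2.20:3260"],
    ):
        print(cmd)
```

Review the printed commands against the targets actually listed on the host before running any of them in the ESXi shell.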

 

 

8 Replies
a_p_
Leadership

Could you please clarify "Updated the SAN"?
Did you just install an update, or did you replace the storage system?

André

jlorang
Enthusiast

Thank you for the very fast reply.

To clarify, the SAN was replaced, and a Storage vMotion was performed from the old SAN to the new SAN.

a_p_
Leadership

What kind of SAN do you use (FC, iSCSI, NAS)?

Did you unmap/detach the old storage system's LUNs from the ESXi hosts, and also clean up the dynamic/static targets if you are using software iSCSI?

André

jlorang
Enthusiast

Hello - The SAN is connected via iSCSI. The old SAN has been removed from the network, and I did remove the iSCSI targets from both ESXi hosts.

a_p_
Leadership

Did you remove/detach the old LUNs before shutting down the old storage system?

Does esxcli storage core device detached list show detached LUNs?

Do you have LUNs that are used as RDMs, e.g. for a Microsoft cluster?
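
If the detached list does show leftovers from the old array, the cleanup can be sketched as below. This is a dry run that only builds command strings. The sample listing is an assumed layout with a made-up device ID, so check it against what the host actually prints. The helper names are hypothetical.

```python
# Sketch: pull device IDs out of `esxcli storage core device detached list`
# output and build the matching `detached remove` commands. Nothing is executed.
# SAMPLE_OUTPUT is an assumed layout with a made-up device ID.

SAMPLE_OUTPUT = """\
Device UID                            State
------------------------------------  -----
naa.60000000000000000000000000000001  off
"""

# Common ESXi device identifier prefixes; header and separator lines match none.
ID_PREFIXES = ("naa.", "eui.", "t10.", "mpx.")

def detached_device_ids(listing):
    """Return the first column of every line that looks like a device ID."""
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(ID_PREFIXES):
            ids.append(fields[0])
    return ids

def cleanup_commands(listing):
    """Build one removal command per detached device found in the listing."""
    return [
        f"esxcli storage core device detached remove -d {dev}"
        for dev in detached_device_ids(listing)
    ]

if __name__ == "__main__":
    for cmd in cleanup_commands(SAMPLE_OUTPUT):
        print(cmd)
```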

André

pashnal
Enthusiast

Hi,

Please check whether you have RDMs presented to these hosts and, if so, reserve them using the KB below. This will reduce the boot time, since scanning the RDMs takes time.

 

https://kb.vmware.com/s/article/1016106 
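
The approach in that KB is to flag each RDM LUN as perennially reserved so the host does not wait on it during boot and rescan. A minimal dry-run sketch, assuming you already know the RDM device IDs (the ID below is a made-up placeholder and the function name is hypothetical):

```python
# Dry-run sketch: build the per-device commands that flag RDM LUNs as
# perennially reserved, plus a follow-up command to inspect each device.
# Nothing is executed. The device ID used below is a made-up placeholder.

def reserve_commands(device_ids):
    cmds = []
    for dev in device_ids:
        # Set the flag on the device.
        cmds.append(
            f"esxcli storage core device setconfig -d {dev} --perennially-reserved=true"
        )
        # List the device afterwards to confirm the flag took effect.
        cmds.append(f"esxcli storage core device list -d {dev}")
    return cmds

if __name__ == "__main__":
    for cmd in reserve_commands(["naa.60000000000000000000000000000002"]):
        print(cmd)
```

The flag is set per host, so the same commands would need to be run on every host that sees the RDM.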

Please give a thumbs up!

Thanks 

pmichelli
Hot Shot

I want to say this is likely normal depending on the SAN you are using.  I have worked with dozens of vendors over the years.  Some allow ESXi to boot and mount the iSCSI very fast while others seem to take a few extra minutes.

I am currently managing Dell EMC SCv3020 arrays and there is about a 1-2 minute delay during boot while it brings up the iSCSI. I have 2 sites doing the same thing. One presents 40 LUNs, the other only 5, and the delay is the same.

I never really dug too deep into it because in the end it just works once the OS has loaded. I have a feeling (at least in my situation) that it is trying to mount something from the passive controller and eventually timing out. Veeam ONE sends me an "error" that it could not initialize one path. EMC told me this was expected behaviour based on how we were configured.

When I managed 3PAR or Hitachi storage, the volumes would mount almost instantly.

What array type did you migrate to?  In your situation, 35 minutes is excessive and likely something is not properly configured. Have you worked with the storage vendor to diagnose this?

Example of what I see:

Alarm: Connection to iSCSI storage target failure
Status: Error
Previous status: Reset/resolved
Time: 7/11/2023 1:42:22 PM
Details:
Fired by event: esx.problem.storage.iscsi.target.connect.error
Event description: Login to iSCSI target iqn.2002-03.com.compellent.somenumberstring on vmhba66 @ vmk2 failed. The iSCSI initiator could not establish a network connection to the target.
Initiated by: Not Set
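
When there are many of these alarms, it can help to pull out which target, adapter, and VMkernel port each one names. A minimal sketch that parses the event description quoted above; the regular expression is an assumption based on that single sample message, not a documented format.

```python
import re

# Event description copied from the alarm above.
EVENT = (
    "Login to iSCSI target iqn.2002-03.com.compellent.somenumberstring "
    "on vmhba66 @ vmk2 failed. The iSCSI initiator could not establish "
    "a network connection to the target."
)

# Assumed message shape, inferred from the one sample above.
PATTERN = re.compile(
    r"Login to iSCSI target (?P<target>\S+) "
    r"on (?P<adapter>vmhba\d+) @ (?P<vmk>vmk\d+) failed"
)

def parse_login_failure(description):
    """Return target, adapter, and vmk as a dict, or None if the text does not match."""
    match = PATTERN.search(description)
    return match.groupdict() if match else None

if __name__ == "__main__":
    print(parse_login_failure(EVENT))
```

With the VMkernel port known, a reachability check such as vmkping -I vmk2 followed by the target portal's IP address is a reasonable next step.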
