VMware Cloud Community
brianj2279
Contributor

Cannot add datastore or do a rescan

I'm having this issue on all 5 hosts in a cluster. I'm running ESX 3.5 Update 1, connected to an EMC CX380 SAN via FC. When I try to add a datastore, or even just do a rescan, I get the error message: "The request failed because the remote server took too long to respond." I logged into the ESX host, ran tail -f /var/log/vmkernel, and am constantly seeing:

Mar 26 16:01:39 SEAESX05 vmkernel: 0:21:29:27.601 cpu3:1048)SCSI: 593: Queue for device vml.020002000060060160c5d31b0062078309b94edd11524149442035 is being blocked to check for hung SP.
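For anyone chasing the same message, here is a minimal Python sketch that counts how often each device shows up with the "hung SP" message, so you can see which LUNs are affected. It is an illustration only: the regular expression is modelled on the single log line quoted above (other ESX builds may word it differently), and the function name count_hung_sp is made up for this example.

```python
import re
from collections import Counter

# Pattern modelled on the one vmkernel line quoted in this post (assumption:
# other ESX builds may format the message differently).
HUNG_SP = re.compile(
    r"Queue for device\s+(?P<device>vml\.[0-9a-fA-F]+)\s+"
    r"is being\s+blocked to check for hung SP"
)


def count_hung_sp(lines):
    """Return a Counter mapping device id -> number of 'hung SP' messages."""
    counts = Counter()
    for line in lines:
        match = HUNG_SP.search(line)
        if match:
            counts[match.group("device")] += 1
    return counts
```

To use it, copy /var/log/vmkernel off the host, open the copy, and pass the file object to count_hung_sp; calling most_common() on the result lists the noisiest devices first.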

The LUN that I am trying to add shows as available to the host, but we cannot add it. I've also tried trespassing the LUN and still get the same error message. The only possible fix I've found is this:

iSCSI or FC Device Rescan Times Out or Takes Very Long if LUNS are Unavailable

Description: If you remove all LUNs from an iSCSI or FC device, all paths to the LUNs become unavailable. If you then perform a rescan, an operation has timed out error message appears shortly after the rescan starts. The server remains unresponsive until the rescan is complete. Rescan may take 15-60 minutes. The operation has timed out message also appears when you remove the last static target. The server remains unresponsive until the remove LUN task is complete, which may take about 10 minutes.

Workaround: Assuming the all-paths-down condition was planned, a workaround would be to:

  1. Shut down the ESX Server.

  2. Remove/unassign the LUNs from the array management software.

  3. Power on the ESX Server.

I'm pretty much stumped on this one.

3 Replies
christianZ
Champion

I guess you haven't presented the luns through the owning SP - each lun should be presented through the same SP with the same lun id.

habibalby
Hot Shot

Hello,

Sorry to hijack this thread!

I'm having the same issue with my four hosts, but mine is a bit different because I'm hitting it with RDMs: a SQL cluster across boxes. All the hosts see these LUNs. I can add a datastore or do a rescan only while the LUNs are not presented to the host. Otherwise, browsing the datastore, opening the Add Storage wizard to add a new datastore, and rescanning work only on the host where the active SQL node is running. For example, if the RDM LUNs are presented to host1 and host2 and the SQL VM is running on host1, I can browse the datastores, rescan, and add new storage only on host1. I cannot do the same on host2.

But if I remove those RDM LUNs from host2, host3 and host4, I can rescan, add LUNs other than the RDMs presented to the VM, and open the Add Storage wizard.

Best Regards,

Hussain Al Sayed


If you find this information useful, please award points for "correct" or "helpful".

emcclend
Enthusiast

I had issues similar to both problems here: I could not add a datastore because the operation would time out, and I had long boot times. I'm also using RDMs, for 2 MSCS clusters across boxes. Below is a quote from my previous post, in which I found an answer that helped me. Changing the SCSI Retry value shortened my boot time and allowed me to add datastores again without the timeout issue.

Previous Post Answer:

I have been doing some research and I think I have made some progress. I found out that I have been getting a lot of SCSI errors in the VMkernel log. I did some more digging and found out that I can change the SCSI retries from 80 to 10, and it did wonders for my reboot time. Instead of taking 20 minutes to boot up, it now takes less than 5 minutes. Much better. I made the change in the host Configuration -> Advanced Settings -> SCSI -> SCSI Retry. 80 was the default and 10 was suggested as a good value. I will be keeping an eye on what other effects the change may have, but so far it has helped with boot times.
