VMware Cloud Community
Martin_Horton
Contributor

Odd NFS Datastore behavior

I was running a small lab with 2 vmhosts (6.7) with a vCenter Server appliance. All the Datastores were stored on a Synology NAS and accessed via NFS. This has been running happily for several years.

The other day, due to persistent failures of the UPS (now fixed), all devices lost power several times. When I was able to restart following repair of the UPS, the vCenter Server Appliance (VCSA) VM would start, but even though the console showed it having the right IP, I could never access it. After numerous attempts to install a new VCSA, all of which failed, I eventually abandoned the effort.

Because all the networking was defined in a DSwitch, which can only be administered through the VCSA, I eventually reset one host, reinstalled ESXi 6.7, and attached a Datastore from the Synology NAS. The Datastore attaches and can be browsed as one would expect. However, after I register a VM from that Datastore and start it, the VM continues to run, but any attempt to browse the Datastore results in an error that simply states that an error occurred and to try again later. After that, browsing never works again.

Where can I find a log file that might yield useful diagnostic information?

It is worth noting that the NFS share can be browsed on the Synology, and I can even browse it from my Windows 10 computer, where it all appears to be correct.

The vmhost is running 6.7.0 Update 3 (Build 15160138). Also, I was trying to install VCSA from VMware-VCSA-all-6.7.0-17713310.iso. Can anyone see any reason why that install would fail?

Any help will be much appreciated.

3 Replies
sparky-
Contributor

Sounds like the reference mapping changed on the NFS, so that /vmfs/volumes/some-GUID is no longer what it once was. I have had that happen a couple of times due to power failure. If you ls /vmfs/volumes, the NFS mount will show in red.
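
In case the colored ls output is hard to read over SSH, here is a minimal sketch of the same check as a script. It assumes the datastore names under /vmfs/volumes are symlinks to the GUID-named mount points and that `readlink` is available in the host's shell. The function name `find_stale_links` is just something made up for this example. It only reports symlinks whose target no longer exists, so it would not catch a mount that is present but unresponsive.

```shell
#!/bin/sh
# Report symlinks in a directory whose targets no longer resolve.
# find_stale_links is a hypothetical helper name, not an ESXi command.
find_stale_links() {
    dir="$1"
    for entry in "$dir"/*; do
        # -L: entry is a symlink; ! -e: its target does not exist
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            echo "STALE: $entry -> $(readlink "$entry")"
        fi
    done
}
```

On the host this would be called as `find_stale_links /vmfs/volumes`. No output means every name still points at an existing mount point.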

Martin_Horton
Contributor

pic2.png

Above is the output of the ls command run on vmhost2, the one that is NOT working.

Below is the output of the ls command run on vmhost1, the one that is working.

Some extra points worth noting: the shares VM2, VMs and VCSA are all on the same volume on the Synology, and both hosts are running identical software. Before I did this, I restarted the Synology and both hosts just to make sure no leftover issue was at fault.

VM2 and VMs can be browsed, but VCSA cannot. It yields that absurd "Error. Please try later" message.

pic1.png

  

Martin_Horton
Contributor

I finally got this to work. I have no explanation as to why, but I will tell you what I did and see if anyone can explain why it made a difference.

Because the Synology holds all the data and is only accessed by the VMHosts, I put them on a separate VLAN. In order for the VMhost to access the Synology, I created a vmKernel NIC on that VLAN with no services enabled. This is exactly the setup I have at a customer's site, where it is working flawlessly, and it has been working here in this lab for years.

What I did was delete that vmKernel NIC, which forced the host to use the Management vmKernel NIC instead. Now the Datastores are stable. Can anyone explain why this could conceivably have been a problem? My guess is that it had NOTHING to do with the problem and was a coincidence.
