Hello.
I have a home lab that I set up very quickly, because it was only meant for testing different operating systems. Yesterday I tried to connect to one of the VMs and it was unresponsive. I tried to reboot the VM and then to shut it down from the web console, but it wasn't responding, so I rebooted the host.
After that I got the following:
and on Datastores:
To be honest, I have no clue what is going on.
My setup is an old desktop PC with 16 GB of RAM, one 128 GB SSD and a regular 1 TB SATA hard drive, running ESXi 6.5.
I would really appreciate more input on where to start troubleshooting. None of the VMs held important information, but some of them had interesting settings that I was trying out, and I would love not to have to start that testing all over again.
Here are the results of some commands that can provide more information.
[claudio@kremlin:/vmfs/volumes] esxcli storage vmfs lockmode list
Volume Name  UUID                                 Type    Locking Mode  ATS Compatible  ATS Upgrade Modes  ATS Incompatibility Reason
-----------  -----------------------------------  ------  ------------  --------------  -----------------  ---------------------------
datastore1   586e2861-e58b01e0-d312-1866da25b4a2  VMFS-5  ATS+SCSI      false           None               Device does not support ATS
[claudio@kremlin:/vmfs/volumes] esxcli storage core device list
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M
Display Name: Local ATA Disk (t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M)
Has Settable Display Name: true
Size: 122104
Device Type: Direct-Access
Multipath Plugin: NMP
Devfs Path: /vmfs/devices/disks/t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M
Vendor: ATA
Model: LITEON CV3-8D128
Revision: 10B
SCSI Level: 5
Is Pseudo: false
Status: on
Is RDM Capable: false
Is Local: true
Is Removable: false
Is SSD: true
Is VVOL PE: false
Is Offline: false
Is Perennially Reserved: false
Queue Full Sample Size: 0
Queue Full Threshold: 0
Thin Provisioning Status: yes
Attached Filters:
VAAI Status: unknown
Other UIDs: vml.0100000000545730575644363035353038353638333032354d4c4954454f4e
Is Shared Clusterwide: false
Is Local SAS Device: false
Is SAS: false
Is USB: false
Is Boot USB Device: false
Is Boot Device: true
Device Max Queue Depth: 31
No of outstanding IOs with competing worlds: 32
Drive Type: unknown
RAID Level: unknown
Number of Physical Drives: unknown
Protection Enabled: false
PI Activated: false
PI Type: 0
PI Protection Mask: NO PROTECTION
Supported Guard Types: NO GUARD SUPPORT
DIX Enabled: false
DIX Guard Type: NO GUARD SUPPORT
Emulated DIX/DIF Enabled: false
t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826
Display Name: Local ATA Disk (t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826)
Has Settable Display Name: true
Size: 476940
Device Type: Direct-Access
Multipath Plugin: NMP
Devfs Path: /vmfs/devices/disks/t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826
Vendor: ATA
Model: WDC WD5000AAKX-7
Revision: 1H19
SCSI Level: 5
Is Pseudo: false
Status: on
Is RDM Capable: false
Is Local: true
Is Removable: false
Is SSD: false
Is VVOL PE: false
Is Offline: false
Is Perennially Reserved: false
Queue Full Sample Size: 0
Queue Full Threshold: 0
Thin Provisioning Status: unknown
Attached Filters:
VAAI Status: unsupported
Other UIDs: vml.0100000000202020202057442d574d41595558303232383236574443205744
Is Shared Clusterwide: false
Is Local SAS Device: false
Is SAS: false
Is USB: false
Is Boot USB Device: false
Is Boot Device: false
Device Max Queue Depth: 31
No of outstanding IOs with competing worlds: 32
Drive Type: unknown
RAID Level: unknown
Number of Physical Drives: unknown
Protection Enabled: false
PI Activated: false
PI Type: 0
PI Protection Mask: NO PROTECTION
Supported Guard Types: NO GUARD SUPPORT
DIX Enabled: false
DIX Guard Type: NO GUARD SUPPORT
Emulated DIX/DIF Enabled: false
[claudio@kremlin:/vmfs/volumes] esxcli storage vmfs extent list
Volume Name  VMFS UUID                            Extent Number  Device Name                                                                 Partition
-----------  -----------------------------------  -------------  --------------------------------------------------------------------------  ---------
datastore1   586e2861-e58b01e0-d312-1866da25b4a2              0  t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M          3
datastore1   586e2861-e58b01e0-d312-1866da25b4a2              0  t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826          2
[claudio@kremlin:/vmfs/volumes] esxcli storage filesystem list
Error getting data for filesystem on '/vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2': Cannot open volume: /vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2, skipping.
How did you configure the storage and the datastore?
Is the datastore backed by a physical device or by software-defined storage?
Is the storage local or external?
As I said, it started out as a test installation, so basically I added the 1 TB SATA hard drive to the 128 GB SSD (both are local physical drives) and expanded the datastore across them. It worked fine for the last 3 months.
Did you use all your physical disks to expand datastore1?
That is IMHO a very bad idea.
Anyway - check whether all your physical disks still have a valid partition table using the tool partedUtil.
Thanks, Continuum.
Yes, I know it is a bad idea, but as it wasn't a production system, I didn't stop to read the docs (I know...).
Reading /var/log/vmkernel.log, I see the following:
2017-04-11T11:40:28.771Z cpu3:65559)ScsiDeviceIO: 2962: Cmd(0x4395009d9680) 0x28, CmdSN 0xe from world 66690 to dev "t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x4 0x44 0x0.
2017-04-11T11:40:28.771Z cpu1:67754)WARNING: FS3J: 1634: Failed to reserve space for journal on 586e2861-e58b01e0-d312-1866da25b4a2 : I/O error
2017-04-11T11:40:28.774Z cpu1:67754)Vol3: 3090: Failed to get object 28 type 1 uuid 586e2861-e58b01e0-d312-1866da25b4a2 FD 0 gen 0 :I/O error
2017-04-11T11:40:28.774Z cpu1:67754)WARNING: Fil3: 1361: Failed to reserve volume f530 28 1 586e2861 e58b01e0 6618d312 a2b425da 0 0 0 0 0 0 0
2017-04-11T11:40:28.774Z cpu1:67754)Vol3: 3090: Failed to get object 28 type 2 uuid 586e2861-e58b01e0-d312-1866da25b4a2 FD 4 gen 1 :I/O error
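The first of these lines carries the SCSI status that the later FS3J/Vol3 I/O errors follow from. As a rough aid for reading it, here is a minimal Python sketch that pulls the opcode, H/D/P status and sense triplet out of such a line and names them. The lookup tables are deliberately partial and only cover a handful of standard SCSI codes; the function and table names are made up for this example and are not part of any VMware tool.

```python
import re

# Partial lookup tables (assumption: only a few common codes from the SCSI standards).
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
DEVICE_STATUS = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY",
                 0x18: "RESERVATION CONFLICT"}
SENSE_KEYS = {0x2: "NOT READY", 0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR",
              0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION", 0xB: "ABORTED COMMAND"}
ASC_ASCQ = {(0x44, 0x00): "INTERNAL TARGET FAILURE"}

# Matches the ScsiDeviceIO failure format seen in the vmkernel.log excerpt above.
LINE_RE = re.compile(
    r'Cmd\(0x[0-9a-f]+\) (?P<op>0x[0-9a-f]+),.*?to dev "(?P<dev>[^"]+)" failed '
    r'H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+) '
    r'Valid sense data: (?P<key>0x[0-9a-f]+) (?P<asc>0x[0-9a-f]+) (?P<ascq>0x[0-9a-f]+)'
)


def name_of(table, code):
    """Return the symbolic name for a code, or a marker if the table lacks it."""
    return table.get(code, "unknown (0x%x)" % code)


def decode_scsi_failure(line):
    """Decode one ScsiDeviceIO failure line; returns None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    num = {k: int(v, 16) for k, v in m.groupdict().items() if k != "dev"}
    return {
        "device": m.group("dev"),
        "opcode": name_of(OPCODES, num["op"]),
        "host_status": num["h"],
        "device_status": name_of(DEVICE_STATUS, num["d"]),
        "plugin_status": num["p"],
        "sense_key": name_of(SENSE_KEYS, num["key"]),
        "additional_sense": ASC_ASCQ.get(
            (num["asc"], num["ascq"]),
            "unknown (0x%x/0x%x)" % (num["asc"], num["ascq"])),
    }


SAMPLE = ('2017-04-11T11:40:28.771Z cpu3:65559)ScsiDeviceIO: 2962: Cmd(0x4395009d9680) 0x28, '
          'CmdSN 0xe from world 66690 to dev '
          '"t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826" '
          'failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x4 0x44 0x0.')

if __name__ == "__main__":
    for field, value in sorted(decode_scsi_failure(SAMPLE).items()):
        print("%-17s %s" % (field, value))
```

Read this way, the line says that a read sent to the WDC disk came back with a check condition and a hardware-error sense key, which points at the disk or its cabling rather than at the VMFS metadata alone. Treat that as a hint to verify, not as a diagnosis.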
My datastore seems to span two partitions on two different disks. Both disks seem to be fine according to partedUtil.
[claudio@kremlin:~] esxcli storage vmfs extent list
Volume Name  VMFS UUID                            Extent Number  Device Name                                                                 Partition
-----------  -----------------------------------  -------------  --------------------------------------------------------------------------  ---------
datastore1   586e2861-e58b01e0-d312-1866da25b4a2              0  t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M          3
datastore1   586e2861-e58b01e0-d312-1866da25b4a2              0  t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826          2
[claudio@kremlin:~] partedUtil getptbl /dev/disks/
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M vml.0100000000202020202057442d574d41595558303232383236574443205744
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:1 vml.0100000000202020202057442d574d41595558303232383236574443205744:2
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:2 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:3 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:1
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:5 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:2
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:6 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:3
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:7 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:5
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:8 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:6
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:9 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:7
t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:8
t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826:2 vml.0100000000545730575644363035353038353638333032354d4c4954454f4e:9
[claudio@kremlin:~] partedUtil getptbl /dev/disks/t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M
gpt
15566 255 63 250069680
1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128
5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
2 7086080 15472639 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
3 15472640 250069646 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
[claudio@kremlin:~] partedUtil getptbl /dev/disks/t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826
gpt
60801 255 63 976773168
2 128 976773128 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
[claudio@kremlin:~] esxcfg-scsidevs -m
VmFileSystem: SlowRefresh() failed: Cannot open volume: /vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2
t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:3 /vmfs/devices/disks/t10.ATA_____LITEON_CV32D8D1282D11_SATA_128GB__________TW0WVD6055085683025M:3 586e2861-e58b01e0-d312-1866da25b4a2 0 datastore1
t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826:2 /vmfs/devices/disks/t10.ATA_____WDC_WD5000AAKX2D753CA1________________________WD2DWMAYUX022826:2 586e2861-e58b01e0-d312-1866da25b4a2 1 datastore1
[claudio@kremlin:~] df -h
VmFileSystem: SlowRefresh() failed: Cannot open volume: /vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2
Error when running esxcli, return status was: 1
Errors:
Error getting data for filesystem on '/vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2': Cannot open volume: /vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2, skipping.
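For anyone who wants to sanity-check the two partedUtil getptbl tables above by hand, here is a small Python sketch that parses that output and does the sector arithmetic. It assumes 512-byte logical sectors and the column order shown in the output (partition number, start sector, end sector, type GUID, type name, attribute); the function names are invented for this example.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def parse_getptbl(text):
    """Parse the text printed by 'partedUtil getptbl <device>' into a dict."""
    rows = [line.split() for line in text.strip().splitlines()]
    label = rows[0][0]                      # e.g. "gpt"
    total_sectors = int(rows[1][3])         # geometry line: C H S total-sectors
    partitions = []
    for fields in rows[2:]:
        start, end = int(fields[1]), int(fields[2])
        partitions.append({
            "number": int(fields[0]),
            "start": start,
            "end": end,
            "type": fields[4],
            "bytes": (end - start + 1) * SECTOR_BYTES,  # end sector is inclusive
        })
    return {"label": label, "total_sectors": total_sectors, "partitions": partitions}


def partitions_fit(table):
    """True if every partition ends before the last sector reported for the disk."""
    return all(p["start"] <= p["end"] < table["total_sectors"]
               for p in table["partitions"])


WDC_TABLE = """\
gpt
60801 255 63 976773168
2 128 976773128 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
"""

if __name__ == "__main__":
    table = parse_getptbl(WDC_TABLE)
    print("disk bytes:", table["total_sectors"] * SECTOR_BYTES)
    for part in table["partitions"]:
        print("partition", part["number"], part["type"], part["bytes"], "bytes")
    print("partitions fit on disk:", partitions_fit(table))
```

For the WDC disk this works out to roughly 500 GB, which is in line with the Size: 476940 reported by esxcli storage core device list. A table that parses cleanly only shows that the partition layout is intact; it says nothing about whether the sectors underneath can still be read.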
Hi Claudio,
the error messages you posted tell me that this will not be a trivial task.
We will very likely have no problem recovering the small files - vmx file, vmsd file if necessary, vmdk descriptor files.
The large binary files will be the tricky part. Here is a rough overview of how I would proceed:
- small files: dump the VMFS header of the disk that is extent number 1, then run strings against the dump and redirect the output to a file (strings dump > file). This will very likely help us with the small files.
- large files: plan A: check whether you can still run the ESXi command
vmkfstools -p 0 name-flat.vmdk > /tmp/map-for-name-vmdk.txt
Please try vmkfstools -p 0 against all flat and all delta vmdks.
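Running vmkfstools -p 0 by hand against every flat and delta vmdk gets tedious, so here is a minimal Python sketch that only builds the command lines for one VM directory and leaves running them to you. It assumes the directory can still be listed at all, which may not be the case on a volume that refuses to open, and it assumes the usual *-flat.vmdk and *-delta.vmdk file names. The function name and the output naming scheme (modelled on the /tmp/map-for-name-vmdk.txt example above) are my own choices.

```python
import glob
import os
import shlex


def mapping_commands(vm_dir, out_dir="/tmp"):
    """Build one 'vmkfstools -p 0' command line per flat/delta vmdk in vm_dir."""
    commands = []
    for pattern in ("*-flat.vmdk", "*-delta.vmdk"):
        for path in sorted(glob.glob(os.path.join(vm_dir, pattern))):
            base = os.path.basename(path)[:-len(".vmdk")]
            out_file = os.path.join(out_dir, "map-for-%s.txt" % base)
            # Quote both paths so names with spaces survive the shell.
            commands.append("vmkfstools -p 0 %s > %s"
                            % (shlex.quote(path), shlex.quote(out_file)))
    return commands


if __name__ == "__main__":
    import sys
    # Usage: python mapping.py /vmfs/volumes/<datastore>/<vm-directory>
    for command in mapping_commands(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(command)
```

The script only prints the commands, so you can review them and paste them into the ESXi shell one at a time. Descriptor files (plain name.vmdk) are skipped on purpose because they do not match either pattern.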
Also check whether you can still follow this KB entry:
Collecting and applying raw metadata dumps on VMFS volumes using DD (Data Description) (1020645) | V...
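If the strings tool is not at hand on the machine where a raw dump ends up, the "strings dump > file" step for the small files can be approximated in Python. This is only a sketch of the idea: it pulls runs of printable ASCII out of a binary blob, which is where text files such as vmx files and vmdk descriptors would show up. The function name, the minimum length of 4 and the sample bytes are assumptions made for this example.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters found in data."""
    pattern = re.compile(b"[\\x20-\\x7e]{" + str(min_len).encode("ascii") + b",}")
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        # Usage: python strings_sketch.py <dump-file> > strings.txt
        with open(sys.argv[1], "rb") as handle:
            blob = handle.read()
    else:
        # Made-up sample bytes, only to show the behaviour.
        blob = b'\x00\x01.encoding = "UTF-8"\x00ab\x00displayName = "test"\xff'
    for text in printable_strings(blob):
        print(text)
```

Reading the whole dump into memory is fine for a header-sized dump; for anything larger it would be better to process the file in chunks.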
At the moment I still have no idea what caused the extents to disappear - that is a detail we should find out first.
Claudio - if the data is reproducible, I would suggest that you forget the recovery plans and regard this as an expensive lesson for the future:
do not use extents - NEVER.
If the data is too important to discard everything - call me via Skype.
I may be able to help you ....
Ulli
Thanks continuum.
Indeed, I have learned a lesson here.
I tried the KB entry that you sent me, but at every step I got:
VmFileSystem: SlowRefresh() failed: Cannot open volume: /vmfs/volumes/586e2861-e58b01e0-d312-1866da25b4a2
So I couldn't recover anything.
At this point I decided to reformat the drives and create a new datastore (this time without extents).
Thanks so much for your time and effort.
Greetings!
I have exactly the same problem.
My ESXi version is 6.7 Update 3 and the datastore is VMFS 6. What can I do in this case? The KB article linked above carries a note:
Not applicable to VMFS 6.