Hi
We are running ESX 3.5 on a number of IBM 3650s connected via fibre to an IBM DS3400 SAN running in HA. I recently had to install the latest server to connect to this setup and ran into a few issues.
I made the mistake of removing the SAN partitions from the partitioning list when I installed ESX 3.5 on a server already connected to the SAN. My reasoning was that I didn't want the installer to touch the SAN: it should install to the local mirror on the server, and I would add the SAN via the VirtualCenter interface later.
Unfortunately, what I have now is a server that can't see the volumes on the SAN properly. To explain:
Under Storage Adapters, I can see all the LUNs from the SAN (an esxcfg-mpath -l also shows everything, similar to my other 'working' servers).
Under Storage, I see only my local storage.
If I try to add storage, it shows me the LUNs, but when I select a LUN it shows as empty (which it isn't) and wants to partition and format it. I don't want to risk that, because I can't afford to lose the VMs currently on that volume.
Is there a way to just add the paths under /vmfs/volumes so this host sees the SAN volumes? My other ESX servers on the SAN all show the volumes under /vmfs/volumes.
For completeness: I have tried reinstalling ESX and even recreating the server's RAID, but the original partition list that included the VMFS volumes never appeared again during reinstallation; it now just shows 'free'.
Any help would be appreciated
Tiaan
Hi Lightbulb
Thanks for the tip, but unfortunately it didn't make a difference (it wasn't set, so I set it). I did a rescan of the Storage Adapters and then a refresh of the storage; nothing changed, so I tried to add the storage again and it still shows as empty. Perhaps I'm not following the correct process?
I attach a few screenshots (consolidated into new.gif) -
New - Storage Adapters.gif - List of Storage Adapters and paths
New - Storage.gif - current Storage settings
New - Storage add 1.gif - List of Luns
New - Storage add 2.gif - Show as empty
and then my existing servers (which can all see the SAN volumes):
Existing - Storage Adapters.gif
Existing - Storage.gif - showing the same LUNs and their free space
I am worried that perhaps I need to set something on the SAN for the HBA rather than something in VMware.
Are you saying that you deleted the existing VMFS LUNs that other servers were using during the install of a new SAN server?
If so, that data is most likely gone, and it is only a matter of time before the other ESX hosts referencing those LUNs purple-screen.
I had a similar issue when our SAN admin got confused and deleted a handful of my VMFS LUNs. It happened on a Friday. Right after it happened, our Exchange VM started getting corruption errors but still worked. Sunday night the ESX hosts all purple-screened and we lost most of the data. It seems that even though the LUNs are deleted, they can stay up for a while because of the way ESX handles VMFS caching.
-MattG
Hi MattG
Luckily, no. What I seem to have deleted is the configuration for the ESX host to mount those volumes; the data (luckily for me) is still there, as all my other ESX hosts can see it just fine. But for some reason this particular ESX server thinks all the drives are empty (as my screenshots showed) and wants to create partitions. I am wary of doing this, as I don't want to lose the data I do have if something goes wonky.
So, to recap:
My existing ESX boxes all see the SAN volumes fine and no data has been lost.
The new ESX box sees the SAN LUNs but thinks the volumes are empty, and thus wants to partition and format them rather than just adding the existing volumes.
A total server reinstall (including recreating the RAID sets on the server itself) did not bring back the partition view in the ESX installer that included the volumes on the SAN; it now shows only free space, unlike my original install, which listed VMFS partitions (which I thought the installer was going to create, and thus removed). So I am wondering whether some configuration on the SAN for the two HBAs in the new server is wonky. (My SAN knowledge is not at a level to make a definitive declaration.)
Tiaan
I am still not clear. You installed a new server that was attached to the SAN during the install. On the partition screen the installer said that it saw other LUNs and asked if you wanted to delete them, and you chose yes?
-MattG
Tiaan, I think you need to be very careful; what you are describing just doesn't sound right. You can't remove the SAN connections during the install. It does sound like you wiped the partition table, but because the other hosts are still active they haven't detected this...
Can you post the output of "fdisk -l"?
Duncan
VMware Communities User Moderator
-
If you find this information useful, please award points for "correct" or "helpful".
D,
Exactly. The other servers will work for a while, but if you were to reboot them (or even a VM on an affected LUN) they would fail.
Deleting the LUNs from under ESX also deletes the partition tables, which is why the new ESX host can see the LUNs but doesn't know what to do with them.
-MattG
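MattG's point can be demonstrated on a scratch file: for legacy VMFS, ESX looks for an MBR partition entry of type fb, and the type byte of the first partition slot sits at byte offset 450. Zeroing the 64-byte table leaves everything past it untouched, yet the disk then reads as blank. A minimal sketch with plain Linux tools, run only against a throwaway file, never a real LUN:

```shell
# Build a one-sector scratch "MBR" with a VMFS-style (0xfb) entry in slot 1.
dd if=/dev/zero of=mbr.img bs=512 count=1 2>/dev/null
printf '\xfb' | dd of=mbr.img bs=1 seek=450 conv=notrunc 2>/dev/null
od -An -tx1 -j450 -N1 mbr.img   # slot-1 type byte: fb -> seen as VMFS

# Zero the 64-byte partition table (offsets 446-509), as a bad install can.
# Everything beyond the table is untouched, but the type byte is now 00,
# so the disk presents as empty.
dd if=/dev/zero of=mbr.img bs=1 seek=446 count=64 conv=notrunc 2>/dev/null
od -An -tx1 -j450 -N1 mbr.img   # slot-1 type byte: 00 -> seen as blank
```

The VMFS data itself starts well past the table, which is why the other hosts (working from cached state) keep serving it for a while.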
MattG/Duncan
Unfortunately for me, it looks like you're absolutely right... doing fdisk -l on one of the machines that still shows the drives and VMs reveals no partitions on the SAN LUNs...
The FDISK output:
Disk /dev/sda: 71.9 GB, 71999422464 bytes
255 heads, 63 sectors/track, 8753 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 650 5116702+ 83 Linux
/dev/sda3 651 8348 61834126 fb Unknown
/dev/sda4 8349 8753 3253162+ f Win95 Ext'd (LBA)
/dev/sda5 8349 8486 1108453+ 82 Linux swap
/dev/sda6 8487 8740 2040223+ 83 Linux
/dev/sda7 8741 8753 104391 fc Unknown
Disk /dev/sdb: 549.7 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Disk /dev/sdc: 549.7 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Disk /dev/sdd: 400.5 GB, 400505700352 bytes
255 heads, 63 sectors/track, 48692 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Disk /dev/sde: 401.5 GB, 401590452224 bytes
255 heads, 63 sectors/track, 48823 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Disk /dev/sdf: 20 MB, 20971520 bytes
1 heads, 40 sectors/track, 1024 cylinders
Units = cylinders of 40 * 512 = 20480 bytes
Disk /dev/sdf doesn't contain a valid partition table
Disk /dev/sdg: 585.1 GB, 585111699456 bytes
255 heads, 63 sectors/track, 71135 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Disk /dev/sdh: 20 MB, 20971520 bytes
1 heads, 40 sectors/track, 1024 cylinders
Units = cylinders of 40 * 512 = 20480 bytes
Disk /dev/sdh doesn't contain a valid partition table
The above is from one of the servers still seeing the data.
Is there possibly a way to write the 'in memory' partition table back to disk? Or is my only solution to copy EVERYTHING off (2 TB, which will take some time) and redo?
Tiaan
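As a sanity check on the output above: fdisk's cylinder counts follow directly from each disk's byte size and the reported geometry (255 heads x 63 sectors/track x 512-byte sectors), which at least confirms the LUNs themselves are intact and full-sized; only the partition tables are gone. For example:

```shell
# cylinders = bytes / (heads * sectors-per-track * sector-size)
echo $(( 549755813888 / (255 * 63 * 512) ))   # /dev/sdb -> 66837
echo $(( 400505700352 / (255 * 63 * 512) ))   # /dev/sdd -> 48692
```

Both match the cylinder counts fdisk printed, so the damage is confined to the first sector of each LUN.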
Call support ASAP. They can recreate the partitions. If there was nothing else written to that area of disk they may be able to get some or all of the data back.
-MattG
Thanks guys, I logged the call with VMware and am waiting for their feedback.
I am pleased to tell you that the issue is now resolved, with assistance from IBM VMware support. If any of you are interested in what had to be done (I did find a rather nice article online that gives the basic steps, similar to what the VMware guys did), I'd be happy to share.
What we had to do was potentially destructive, but happily we got through it without losing anything.
Thanks to everyone for their helpful advice, and to Donovan and Greg on the vendor side for helping to resolve this issue.
I'd appreciate seeing some information on what exactly recovered this. Recovery from disaster is always worth knowing about before it happens to someone else.