VMware Cloud Community
tiaanm
Contributor

Mounting SAN volumes post-install

Hi

We are running ESX 3.5 on a number of IBM 3650s connected via fibre to an IBM DS3400 SAN running in HA. I recently had to install the latest server to connect to this setup and ran into a few issues.

I made the mistake of removing the SAN partitions from the partitioning list when I installed ESX 3.5 on a server already connected to the SAN. My reasoning was that I didn't want the installer to touch the SAN, that it should install to the local mirror on the server, and that I would add the SAN via the VirtualCenter interface later.

Unfortunately, what I have now is a server that can't see the volumes on the SAN properly. To explain:

Under Storage Adapters, I can see all the LUNs from the SAN (an esxcfg-mpath -l also shows everything, similar to my other 'working' servers).

Under Storage, I see only my local storage.

If I try to add storage, it shows me the LUNs, but when I select a LUN it shows as empty (which it isn't) and wants to partition and format it. I don't want to risk that, because I can't afford to lose the current VMs on that volume.

Is there a way to just add the paths to /vmfs/volumes so this host can see the SAN volumes? My other ESX servers on the SAN all show the volumes under /vmfs/volumes.
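On my working hosts, the volumes show up under /vmfs/volumes as UUID directories with the datastore labels as symlinks, so the check I'm comparing against looks something like this from the service console (a sketch; "SAN-LUN1" is a made-up label):

# List mounted VMFS volumes; labels are symlinks to the UUID directories
ls -l /vmfs/volumes
# Query the filesystem attributes of one volume by its label
vmkfstools -P /vmfs/volumes/SAN-LUN1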

For completeness, I have tried reinstalling ESX and even recreating the server's RAID, but the original partition list that included the VMFS volumes never appeared again during reinstallation; now it just shows 'free'.

Any help would be appreciated

Tiaan

Lightbulb
Virtuoso

Your other hosts can see the VMFS volumes, correct?

You may need to set EnableResignature=1 in the advanced settings of your ESX host (see attached), then rescan storage.
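From the service console, that's roughly the following (a sketch; vmhba1 stands in for your actual FC adapter):

# Enable volume resignaturing (same as the VI Client advanced setting)
esxcfg-advcfg -s 1 /LVM/EnableResignature
# Rescan the HBA so ESX re-reads the LUNs
esxcfg-rescan vmhba1
# Turn it back off afterwards so later rescans don't resignature volumes
esxcfg-advcfg -s 0 /LVM/EnableResignature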

tiaanm
Contributor

Hi Lightbulb

Thanks for the tip, but unfortunately it didn't make a difference (it wasn't set, so I set it). I did a rescan of the Storage Adapters and then a refresh of the storage; nothing changed, so I tried to add the storage again, and it still shows as empty. Perhaps I'm not following the correct process?
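For completeness, the service-console equivalent of that rescan/refresh is roughly this (a sketch; vmhba1 stands in for the actual adapter name):

# Rescan the HBA for new LUNs and VMFS volumes
esxcfg-rescan vmhba1
# List the vmhba-to-VMFS mappings; a LUN with no entry here has no VMFS that ESX recognises
esxcfg-vmhbadevs -m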

I attach a few screenshots (consolidated into new.gif):

New - Storage Adapters.gif - List of Storage Adapters and paths

New - Storage.gif - current Storage settings

New - Storage add 1.gif - List of Luns

New - Storage add 2.gif - Shows as empty

and then my existing servers (which, yes, can all see the SAN volumes):

Existing - Storage Adapters.gif

Existing - Storage.gif - showing the same LUNs and their free space

I am worried that perhaps I need to set something on the SAN for the HBA rather than something in VMware.

MattG
Expert

Are you saying that you deleted the existing VMFS LUNs that other servers were using during the install on the new server attached to the SAN?

If so, that data is most likely gone, and it is only a matter of time before the other ESX hosts that are referencing those LUNs purple-screen.

I had a similar issue when our SAN admin got confused and deleted a handful of my VMFS LUNs. It happened on a Friday. Right after it happened, our Exchange VM started getting corruption errors but still worked. Sunday night the ESX hosts all purple-screened and we lost most of the data. It seems that even though the LUNs are deleted, they can stay up for a certain amount of time because of the way ESX handles VMFS caching.

-MattG

tiaanm
Contributor

Hi MattG

Luckily, no; what I seem to have deleted is the configuration for ESX to mount those volumes. The data (luckily for me) is still there, as all my other ESX hosts can see it just fine. But for some reason this particular ESX server thinks all the drives are empty (as my screenshots showed) and wants to create partitions. I am wary of doing this, as I don't want to lose the data I do have if something goes wonky.

So, to recap:

My existing ESX boxes all see the SAN volumes fine, and no data has been lost.

The new ESX box sees the SAN LUNs but thinks the volumes are empty, and thus wants to partition and format them rather than just mounting the existing volumes.

A total server reinstall (including recreating the RAID sets on the server itself) did not bring back the partition view that originally included the SAN volumes; the installer now only shows free space, unlike the first install, where I saw VMFS partitions that I thought the installer was going to create, and therefore removed. So I am wondering whether some configuration on the SAN for the two HBAs connected from the new server is wonky. (My SAN knowledge is not at a level to make a definitive declaration.)

Tiaan

MattG
Expert

I am still not clear. You installed a new server that was attached to the SAN during the install. On the partitioning screen, the installer said it saw other LUNs and asked if you wanted to delete them, and you chose yes?

-MattG

depping
Leadership

Tiaan, I think you need to be very careful; what you are describing just doesn't sound right. You can't remove the SAN connections during the install. It does sound like you wiped the partition tables, but because the other hosts are still active, they haven't detected it yet...

Can you post the output of "fdisk -l"?

Duncan

VMware Communities User Moderator

MattG
Expert

D,

Exactly. The other servers will work for a while, but if you were to reboot them (or even a VM on an affected LUN), they would fail.

Deleting the LUNs from under ESX will also delete the partition tables, which would be why the new ESX host can see the LUNs but doesn't know what to do with them.

-MattG

tiaanm
Contributor

MattG/Duncan

Unfortunately for me, it looks like you're absolutely right... running fdisk -l on one of the machines that still shows the drives and VMs reveals no partitions on the SAN LUNs...

The fdisk -l output:

Disk /dev/sda: 71.9 GB, 71999422464 bytes
255 heads, 63 sectors/track, 8753 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14         650     5116702+  83  Linux
/dev/sda3             651        8348    61834126   fb  Unknown
/dev/sda4            8349        8753     3253162+   f  Win95 Ext'd (LBA)
/dev/sda5            8349        8486     1108453+  82  Linux swap
/dev/sda6            8487        8740     2040223+  83  Linux
/dev/sda7            8741        8753      104391   fc  Unknown

Disk /dev/sdb: 549.7 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/sdc: 549.7 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/sdd: 400.5 GB, 400505700352 bytes
255 heads, 63 sectors/track, 48692 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/sde: 401.5 GB, 401590452224 bytes
255 heads, 63 sectors/track, 48823 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/sdf: 20 MB, 20971520 bytes
1 heads, 40 sectors/track, 1024 cylinders
Units = cylinders of 40 * 512 = 20480 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 585.1 GB, 585111699456 bytes
255 heads, 63 sectors/track, 71135 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/sdh: 20 MB, 20971520 bytes
1 heads, 40 sectors/track, 1024 cylinders
Units = cylinders of 40 * 512 = 20480 bytes

Disk /dev/sdh doesn't contain a valid partition table

The above is from one of the servers still seeing the data.

Is there possibly a way to write an 'in-memory' partition table back to disk? Or is my only option to copy EVERYTHING off (2 TB, which will take some time) and redo it?

Tiaan

MattG
Expert

Call support ASAP. They can recreate the partitions. If nothing else was written to that area of the disk, they may be able to get some or all of the data back.

-MattG

depping
Leadership

Call support indeed; it should be fairly easy to mark the partitions as VMFS again, though. They can help you.
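For reference, recreating a partition and marking it as VMFS is roughly the following fdisk session (a sketch only; /dev/sdb stands in for an affected LUN, and the 128-sector start offset is what ESX 3.x normally uses when it creates VMFS3 partitions; this is destructive if done wrong, so only do it with support on the line):

# Recreate the partition table on the affected LUN
fdisk /dev/sdb
#   n            -> new primary partition 1, accept the default cylinders
#   t, fb        -> set the partition type to fb (VMware VMFS)
#   x, b, 1, 128 -> expert mode: move the start of partition 1 to sector 128,
#                   the offset ESX 3.x uses for VMFS3 partitions
#   w            -> write the table and exit
# Then rescan so the host detects the volume again
esxcfg-rescan vmhba1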

Duncan

VMware Communities User Moderator

tiaanm
Contributor

Thanks, guys. I logged the call with VMware and am waiting for their feedback.

tiaanm
Contributor

I am pleased to tell you that the issue is now resolved, with assistance from IBM/VMware support. If any of you are interested in what had to be done (I found a rather nice article online that gives the basic steps, similar to what the VMware guys did), I'd be happy to share.

What we had to do was potentially destructive, but happily we got through it without losing anything.

Thanks to everyone for their helpful advice, and to Donovan and Greg on the vendor side for helping resolve this issue.

Josh26
Virtuoso

I'd appreciate seeing some information on what exactly recovered this. Recovery from disaster is always worth knowing about before it happens to someone else.
