VMware Cloud Community
bobd591
Contributor

VMware ESXi 6.0.0 with 2 servers down

Hi everybody;

 

I hope all is well.

I am the IT technician at a daycare. Yesterday we had a brief power outage, and both virtual servers on our VMware ESXi 6.0.0 host are down. One is the domain controller/file server. Both run Windows Server 2012 R2 on VMware ESXi 6.0.0 (VMkernel release build 5050593) on a Dell PowerEdge R530. The VMware interface boots all the way in, and the RAID looks healthy. (I'm checking again now...)

First problem: I have no password for the root user on the ESXi host.

The first thing I would try is a reinstall of the ESXi 6.0 software to reset the password and hopefully re-attach the servers. If anybody has a lead on where I can download a bootable version, that would be sincerely appreciated...

The second thing I may try, from https://www.starwindsoftware.com/blog/vmware/forgot-esxi-root-password-no-problems-4-ways-reset/, is the "Resetting root password on the standalone ESXi hosts" option, which boots into an Ubuntu shell.

Am I in the ballpark, or out to lunch?

The goal is of course to preserve and re-attach the domain controller server. Since both servers are down, I am hoping it is an ESXi configuration problem, with the configuration blown away in the power failure.

I am a complete newbie with VMware ESXi, and welcome any advice on how to get these servers back up. 

 

Thanks very much in advance!

Bob Donaldson

12 Replies
e_espinel
Virtuoso

Hello. If you have access to ESXi 6.0, you could try the following passwords: Passw0rd (zero instead of the letter o), Passw0rd, PASSW0RD. You can also try to remove the password by following the steps in the link you provided. If you continue to have problems, please contact me directly at my email: eespinel@gmail.com
Enrique Espinel
Senior Technical Support on IBM, Lenovo, Veeam Backup and VMware vSphere.
VSP-SV, VTSP-SV, VTSP-HCI, VTSP
Please mark my comment as Correct Answer or assign Kudos if my answer was helpful to you, Thank you.
bobd591
Contributor

Hi all;

Thank you e_espinel11. I booted from an Ubuntu desktop disk and was able to reset the password and get into ESXi.

The problem is that the 2 VMs show as invalid... I am reading about this. Should I back up the VMs first? Any pointers on how I can fix this or where to start learning are appreciated!

Here are 2 nuggets of information I have found so far, with much thanks to those who posted them:

1-I used ddrescue on an Ubuntu live USB to copy the .vmdk, -flat.vmdk, and .vmx files to a backup location and then copied those back into the datastore.

2-If you are seriously interested in evaluating the damage the power failure did to your VMs, use vmfs6-tools or UFS Explorer.
Start troubleshooting with debugvmfs, which is part of vmfs-tools. That tool will typically tell you what is wrong, or crash when trying to open the block device.
 

Thanks;

 

bobd

bobd591
Contributor

Hi;

I am new to VMware and ESXi.

I was able to change the password and gain root access to our VMware ESXi 6.0.0 host, on which we have 2 servers that are down.

I used the "Resetting root password on the standalone ESXi hosts" option at https://www.starwindsoftware.com/blog/forgot-esxi-root-password-no-problems-4-ways-reset

I have logged on with SSH and have been unable to locate any .vmx files or virtual machines. The datastore shows only 2 install .iso images. The attached images show the status of our virtual machines and storage devices:

I am desperately trying to get the servers back online, especially the domain controller... any pointers or help with where to start would be sincerely appreciated.

 

Thanks in Advance!

bobd

 

Kinnison
Commander

Hello,


From your second screenshot, your datastores are in a "degraded" state, so first of all I would use the iDRAC tool to check for, or rule out, errors related to your volumes. In another thread of yours you say that your RAID volumes seem healthy, but I have some doubts.


Regards,
Ferdinando

bobd591
Contributor

Thank you Ferdinando.

The RAID looks good (see image)

raid state.jpg

This is encouraging, as I am hoping to restore the server with no data loss... I am trying to organize my troubleshooting, so I will focus on fixing the "degraded" state of the volumes as reported by ESXi.

I did see one thread where someone solved this with esxcli commands over SSH, as described at https://communities.vmware.com/t5/ESXi-Discussions/Disks-degraded-in-ESXi/m-p/1404465#M134443

I am a little shy about running these commands right off the bat, but I feel better now... I can do this!

 

Any insights appreciated! Thank you all!

bobd 

Kinnison
Commander

Hello @bobd591,


From your last screenshot, using a "RAID 0" layout for your volumes exposes you to a real risk of (complete) data loss if any of various kinds of faults occurs. With a setup like that you should be sure to have frequent and robust backups; otherwise, if this happens, it won't be any fun at all.


Regards,
Ferdinando

bobd591
Contributor

Ferdinando;

I agree-I inherited this setup...

I do have all the data, and system state backed up.
I was able to get the Storage Device status to normal...(see image)

new status.jpg

I will focus on getting the proper things backed up. The problem is, the staff is breathing down my neck to get the domain controller/file server up...

Thanks;

 

bobd

e_espinel
Virtuoso

Hello.
To summarize your case: you had a server running ESXi 6.0 with two internal physical disks (one of 200GB and another of 2.56TB), and you configured a datastore on each disk. This means that you had two datastores.
You upgraded to version 7 and now you see only one datastore, but under devices you still see the two initial disks.
In the attached image you can see datastore1. You can select this datastore and then browse datastore1 to see what is inside it.

 

Enrique Espinel
Kinnison
Commander

Hello,


I understand your situation. The staff should also understand that, beyond the "RAID 0 issue", the ESXi 6.0 build in use is also quite dated and, in any case, now well beyond the "technical guidance" phase. Let's just say it's a potentially "unpleasant" combination.
However, these are just my personal opinions.


The important thing at the moment is that you are able to start those virtual machines.


Regards,
Ferdinando

bobd591
Contributor

Hi Ferdinando;

Just to re-summarize:

We have a Dell PowerEdge R530 running ESXi 6.0 with 3 physical SCSI disks of 1 TB each. The logical volumes are one of 200GB (old Exchange server; this volume is not needed) and another of 2.56TB (domain controller/file server, badly needed), in RAID 0, healthy.

In ESXi there is one datastore that the previous tech set up, which now has only 2 .iso files in it (Exchange, Win 2012 Server). The storage devices (logical) in ESXi now show status normal. We are still on ESXi 6.0.

I am now verifying the size of the virtual machines I can find via SSH, and backing up those and the relevant files (.vmx, etc.) to an external drive.
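
For anyone following along, the size check before a backup could be scripted roughly as below. This is only a sketch, not something taken from this thread: it assumes Python 3 is available and that the datastore (or a copy of it) is readable as an ordinary directory tree. The function name, the suffix list, and the path /vmfs/volumes/datastore1 are illustrative assumptions.

```python
import os

# Suffixes of files that commonly make up a VM on a datastore (assumed list;
# "-flat.vmdk" data files are matched by the ".vmdk" suffix).
VM_SUFFIXES = (".vmx", ".vmdk", ".nvram", ".vmsd", ".vmxf")


def inventory_vm_files(root):
    """Return a sorted list of (path, size_in_bytes) for VM-related files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(VM_SUFFIXES):
                path = os.path.join(dirpath, name)
                found.append((path, os.path.getsize(path)))
    return sorted(found)


if __name__ == "__main__":
    # Placeholder path; adjust to wherever the datastore is actually mounted.
    for path, size in inventory_vm_files("/vmfs/volumes/datastore1"):
        print("{:>12.2f} GiB  {}".format(size / 1024**3, path))
```

An empty result would match what was reported earlier (only .iso files visible), which suggests checking whether the volume holding the VMs is mounted at all before copying anything.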

In between, I am troubleshooting the existing virtual machines in ESXi whose status is "Invalid".

 

Hey-thanks a lot for looking at my stuff! I really appreciate it. I hope all is well with you.

Sincerely;

bobd

Kahnawake, Quebec
Canada

https://www.stepxstep.ca/

 

bobd591
Contributor

Here are the contents of /dev/disks:

-rw------- 1 root root 200.0G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad
-rw------- 1 root root 4.0M Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:1
-rw------- 1 root root 4.0G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:2
-rw------- 1 root root 192.6G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:3
-rw------- 1 root root 250.0M Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:5
-rw------- 1 root root 250.0M Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:6
-rw------- 1 root root 110.0M Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:7
-rw------- 1 root root 286.0M Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:8
-rw------- 1 root root 2.5G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:9
-rw------- 1 root root 2.5T Nov 17 19:52 naa.6d09466022823d0021c3e5740ea9fe97
-rw------- 1 root root 2.5T Nov 17 19:52 naa.6d09466022823d0021c3e5740ea9fe97:1
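
A listing like the one above can be grouped by device to make it easier to see which partitions are large enough to hold a datastore. The following is a minimal sketch, assuming the nine-column layout shown above and binary size units; the function names are made up for illustration, and SAMPLE is just a few lines copied from the listing.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# A few lines copied from the listing above.
SAMPLE = """
-rw------- 1 root root 200.0G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad
-rw------- 1 root root 4.0M Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:1
-rw------- 1 root root 4.0G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:2
-rw------- 1 root root 192.6G Nov 17 19:52 naa.6d09466022823d0021c3e56f0e5cf2ad:3
-rw------- 1 root root 2.5T Nov 17 19:52 naa.6d09466022823d0021c3e5740ea9fe97
-rw------- 1 root root 2.5T Nov 17 19:52 naa.6d09466022823d0021c3e5740ea9fe97:1
"""


def parse_size(text):
    """Convert a size such as '192.6G' to bytes (binary units assumed)."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def group_partitions(listing):
    """Map each device name to {'size': bytes, 'partitions': {number: bytes}}."""
    devices = {}
    for line in listing.strip().splitlines():
        fields = line.split()
        if len(fields) < 9:  # skip anything that is not a full listing row
            continue
        size, name = parse_size(fields[4]), fields[-1]
        device, _, part = name.partition(":")
        entry = devices.setdefault(device, {"size": None, "partitions": {}})
        if part:
            entry["partitions"][int(part)] = size
        else:
            entry["size"] = size
    return devices


def largest_partition(entry):
    """Return the partition number with the largest size for one device entry."""
    return max(entry["partitions"], key=entry["partitions"].get)


if __name__ == "__main__":
    for device, entry in group_partitions(SAMPLE).items():
        print(device, "largest partition:", largest_partition(entry))
```

On the 200G device the largest partition is number 3, and the 2.5T device carries a single partition. Whether those partitions actually contain intact VMFS volumes is a separate question that this sketch cannot answer.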

 

bobd

Kinnison
Commander

Hello bobd591,


So, if I understand correctly, via SSH you are saving everything possible related to those two virtual machines.


Take this with a pinch of salt, because it can entail certain risks and consequently cause further damage, but to make a long story short: I once found myself in a situation substantially similar to yours, and the remedy that actually worked was to restart the system, because it was not possible to get anywhere through the CLI. Perhaps it would also be a good idea to explain to the "staff" that they are working in a "production" context in the daycare field, where, in the event of service disruptions, the relatives of the guests do not take long "to become nastier than vipers" (forgive me for the joke).


Keep in mind the firmware level of that system: although the system has been around for several years now, firmware updates have been released over time, some of them relatively recent. I'd take a look.


A good weekend,
Ferdinando
