VMware Cloud Community
MarkUnisys
Contributor

vCenter High Availability continuous reboot

We have vSphere 6.5 and are running the vCenter Server Appliance (VCSA) in High Availability mode. It had been working fine all along. Now, all of a sudden, we cannot log on to vCenter. It appears that the active and peer vCenter nodes keep rebooting and keep switching which one is active. At times when trying to log on, the response is that the website is not available. Other times it indicates that a failover is in progress, and other times it indicates that it is initializing, as though vCenter services are starting. Very occasionally I can log on to vCenter, but never long enough to troubleshoot what is going on before I lose connectivity when it seems to reboot or start transitioning services to the peer. It seems to be stuck in this loop.

Has anyone seen anything like this?

In addition, I can, for the most part, access the VCSA administration page on port 5480. There is not much help there, but it always indicates Overall Health: Alert.
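Since the symptoms above change from one attempt to the next, it can help to record when each interface goes up or down, so the cadence of the loop can be matched against the logs later. The following is only a rough sketch using the Python standard library: the `watch` and `port_open` helpers are hypothetical names, port 5480 is the administration page mentioned above, and port 443 for the normal vCenter login page is an assumption. A successful TCP connection only shows that something is listening, not that vCenter services are healthy.

```python
import socket
import time


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, ports=(443, 5480), interval=5.0, samples=10,
          probe=port_open, sleep=time.sleep):
    """Probe the ports repeatedly and return a list of state changes.

    Each entry is (timestamp, {port: reachable}). Only changes are recorded,
    so a stable appliance yields a single entry.
    """
    previous = None
    changes = []
    for _ in range(samples):
        state = tuple(probe(host, p) for p in ports)
        if state != previous:
            changes.append((time.strftime("%H:%M:%S"), dict(zip(ports, state))))
            previous = state
        sleep(interval)
    return changes
```

Running `watch` against the vCenter HA address for a few minutes should give a short list of timestamps showing when the login page and the administration page appear and disappear.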

Message was edited by: Mark Werner

3 Replies
daphnissov
Immortal

I've seen something like those symptoms. Unfortunately, the vCHA feature lacks a lot of troubleshooting and CLI options to fix unknown behavior like this, so it's almost always best to remove the HA configuration and nodes and redeploy.

DanVari00
Contributor

Mark,

We are having exactly the same issue, but with a newly installed cluster. The hosts are updated to 6.5 U1, as is the appliance.

Between failovers you have about 30 seconds to disable automatic failover and set the vCenter HA cluster to maintenance mode. We have collected the logs and opened a support case to investigate this.
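Because that window is short, scripting the mode change may be easier than clicking through the client. What follows is a minimal sketch, not a tested procedure. It assumes pyVmomi is installed and that the vCenter HA manager is reachable as `content.failoverClusterManager` with a `setClusterMode_Task(mode=...)` method accepting `"maintenance"`; please verify both against the vSphere API reference for your build. The `enter_vcha_maintenance` function and its `connect` parameter are hypothetical names introduced for illustration.

```python
import time


def enter_vcha_maintenance(connect, attempts=30, delay=1.0):
    """Retry until vCenter answers, then request vCenter HA maintenance mode.

    `connect` is a hypothetical zero-argument callable that returns a
    connected pyVmomi ServiceInstance (for example a lambda wrapping
    pyVim.connect.SmartConnect with your own host and credentials).
    Returns the task object handed back by the API call.
    """
    last_error = None
    for _ in range(attempts):
        try:
            si = connect()
            # Assumption: vCenter HA manager as exposed by the vSphere 6.5 API.
            manager = si.content.failoverClusterManager
            return manager.setClusterMode_Task(mode="maintenance")
        except Exception as exc:  # vCenter is probably mid-reboot; try again
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"vCenter never answered: {last_error}")
```

The retry loop is there because a login attempt may fail several times before one lands inside the window. The returned task still has to complete before the next failover, so this only improves the odds.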

I'll try to keep this post updated with news from support.

MarkUnisys
Contributor

I would be interested if you hear anything back.

After spending some time trying to troubleshoot, I basically had to do what @daphnissov suggested. I had to power off the witness and peer nodes, and then destroy vCenter HA via the CLI. I was reluctant to do that, as I had no vCenter backup and I really didn't want to end up with no vCenter. That ended up being my only option. Thankfully it worked. I have been busy with other things and have not enabled vCenter HA again yet, so I am just running a standalone vCenter. I'd really like to know what happened and why it happened before I enable vCenter HA again.

