A few weeks back I upgraded to VCSA 8.0 U2a, and I have file-based backups running every night. My NAS shut down unexpectedly the other night, and afterwards the VCSA services would not start, with the error below:
Error executing start on service applmgmt. Details {
    "detail": [
        {
            "id": "install.ciscommon.service.failstart",
            "translatable": "An error occurred while starting service '%(0)s'",
            "args": [
                "applmgmt"
            ],
            "localized": "An error occurred while starting service 'applmgmt'"
        }
    ],
    "componentKey": null,
    "problemId": null,
    "resolution": null
}
Service-control failed. Error: {
    "detail": [
        {
            "id": "install.ciscommon.service.failstart",
            "translatable": "An error occurred while starting service '%(0)s'",
            "args": [
                "applmgmt"
            ],
            "localized": "An error occurred while starting service 'applmgmt'"
        }
    ],
    "componentKey": null,
    "problemId": null,
    "resolution": null
}
I get this error for just about every service when I try to start them.
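For reference, the checks I'm running look roughly like this, from the appliance's Bash shell (standard VCSA service-control CLI; the specific service name is just the one from the error above):

```shell
# Show the state of all VCSA services
service-control --status --all

# Try starting just the failing service to isolate the error
service-control --start applmgmt

# Or attempt to start everything and see which service fails first
service-control --start --all
```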
Now, I figure "heck, I got's backups, no worries," so I deleted the VCSA VM, deployed a new one, and restored from backup. Well, the restore never completes, because the services won't start and fail with the same error above. I have tried several different backups and they all hit the same issue.
Of course, I also have to "tweak" the partition sizes just to get the restore going, because a couple of partitions are way bigger than they should be; I suspect this is an artifact of having upgraded from VCSA 6.7 through 7 and then to 8. I plan to fix this by migrating to a new appliance once I get things up and running again, but I can't even get to that point.
Any thoughts on what the heck is going on? This is my lab, but it's a pre-test for work: if I can't get a restore to function, I need to decide whether to upgrade my production VCSA 7.0 U3p to 8 at all, or whether this is truly a bug in 8.0 U2a that is fixed in later versions.
@Tibmeister - Have you checked the partition state when this is happening? E.g. log in via SSH, run df -h to check usage, and make sure none of the partitions are full?
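Something like this, from the appliance shell (the /storage/* mount points below follow the standard VCSA layout, so adjust to what your df output actually shows):

```shell
# Check filesystem usage across the appliance; VCSA splits its data
# over several /storage/* mounts (db, seat, log, core, etc.)
df -h

# A full log or database partition is a common reason services
# refuse to start, so these are worth a closer look
df -h /storage/log /storage/seat /storage/db
```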
That was one of the first things I did. Outside of /, everything is under 5% utilized. I really don't want to have to rebuild a new vCenter and go through all the tweaking...
I guess a brand new rebuild is in order, which does not bode well for needing to do this in my production environment.
Brand new vCenter, up for 24 hours, had an iSCSI blip, and it's trashed. This was a freshly built 8.0 U2b deployment, so what gives? Has vCenter really become such a sensitive flower? If so, then we need to be able to run it bare metal, because things happen with storage that isn't 100% local.
After several reboots the VCSA came back to life. Very strange indeed.
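For anyone hitting this after an unclean shutdown or a storage blip: one thing worth checking before rebooting blindly is filesystem damage on the appliance disks. A rough sketch (the device name below is an example only and will differ per deployment; checking the root filesystem itself requires booting into rescue mode instead):

```shell
# Look for filesystem or I/O errors in the kernel log after the event
dmesg | grep -i -e ext4 -e "I/O error"

# List block devices and mount points to identify the affected filesystem
lsblk

# Unmount and repair a damaged data partition, e.g. /storage/log
# (device name is an example; take it from the lsblk/df output)
umount /storage/log
e2fsck -y /dev/mapper/log_vg-log
```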