VMware Cloud Community
mgiurgeu
Contributor
Contributor
Jump to solution

VAMI backups failing

v 8.0.1.00200

manual or automatic backups have started to fail regularly after working fine for years; /var/log/vmware/applmgmt/backup.log showing:

2023-07-27T15:33:15.101 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 1, stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-27T15:33:15.101 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 1, but expected exit codes are [0].
2023-07-27T15:33:15.101 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [Proc::GetProcsStatus:Proc.py:361] ERROR: rc: 0, stderr: b'putting file - as \\vCSA\\vCenter\\sn_vcenter\\M_8.0.1.00200_20230727-193211_\\confi
g_files.tar.gz (6651.9 kb/s) (average 6651.9 kb/s)\n'
2023-07-27T15:33:15.102 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:336] ERROR: Failed configuration files backup- rc: 1; stdout: ; stderr:/usr/bin/tar: storage/db/vmware-observabi
lity/stellar.db: file changed as we read it
; exception:
2023-07-27T15:33:15.102 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:347] ERROR: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
Traceback (most recent call last):
File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ConfigFiles.py", line 340, in BackupConfigFiles
status)
util.Common.BackupRestoreError: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::VerifyProcStatusAndGetArchive:Proc.py:158] ERROR: Error at process ConfigFilesBackup; rc:1.
2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::VerifyProcStatusAndGetArchive:Proc.py:162] ERROR: stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::VerifyProcStatusAndGetArchive:Proc.py:172] INFO: Following error message isn't localized:
stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::main:BackupManager.py:591] ERROR: BackupManager encountered an exception: Hit exception inside process ConfigFilesBackup:
2023-07-27T15:33:16.111 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::HandleBackupCleanup:BackupManager.py:469] INFO: Cleaning up the backup job.
2023-07-27T15:33:16.111 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:405] INFO: Cleaning up all running backup child processes.
2023-07-27T15:33:46.129 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::CleanupChildProcesses:Proc.py:253] ERROR: Child process failed to terminate
2023-07-27T15:33:46.139 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:412] INFO: Cleaning up all sizeFiles.
2023-07-27T15:33:46.140 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up ConfigFiles.
2023-07-27T15:33:46.140 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up ComponentScripts.
2023-07-27T15:33:46.141 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up VCDB.
2023-07-27T15:33:46.141 [20230727-193211-21860503] [MainProcess:PID-33374] [VCDB::BackupVCDBCleanup:VCDB.py:2109] INFO: Finishing full database backup
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [VCDB::BackupVCDBCleanup:VCDB.py:2118] INFO: Successfully cleaned up for VCDB backup.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up Lotus.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [Lotus::BackupLotusCleanup:Lotus.py:140] INFO: Successfully completed Lotus cleanup.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up VTSDB.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [VTSDB::BackupVTSDBCleanup:VTSDB.py:237] INFO: Successfully cleaned up for VTSDB backup.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:429] INFO: Cleaning up failed backup files.
2023-07-27T15:33:46.158 [20230727-193211-21860503] [MainProcess:PID-33374] [SMBClientStorageIOLib::_run_cmd:SMBClientStorageIOLib.py:23] INFO: Running Command: ['/usr/bin/smbclient', '//nas/Public/', '-A', '/dev/shm/smbCredentialFilebpfs7yn
f', '--send-buffer', '100M', '-t', '140', '-c', 'deltree vCSA/vCenter/sn_vcenter/M_8.0.1.00200_20230727-193211_']
2023-07-27T15:33:46.552 [20230727-193211-21860503] [MainProcess:PID-33374] [SMBClientStorageIOLib::_process_cmd_output:SMBClientStorageIOLib.py:97] ERROR: Command failed with error: NT_STATUS_SHARING_VIOLATION
2023-07-27T15:33:46.553 [20230727-193211-21860503] [MainProcess:PID-33374] [SMBClientStorageIOLib::_process_cmd_output:SMBClientStorageIOLib.py:104] ERROR: smb cmd failed. RC: 1, Err: , Cmd: ['/usr/bin/smbclient', '//nas/Public/', '-A', '/d
ev/shm/smbCredentialFilebpfs7ynf', '--send-buffer', '100M', '-t', '140', '-c', 'deltree vCSA/vCenter/sn_vcenter/M_8.0.1.00200_20230727-193211_']
2023-07-27T15:33:46.553 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:433] WARNING: Ignore to clean up failed backup files, due to issue: 'Plugin error occurred. ErrCode: 5, Args: ()'.
2023-07-27T15:33:46.554 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::HandleBackupCleanup:BackupManager.py:477] ERROR: Failed to cleanup the backup job. Error: Failed to clean up the backup job.
2023-07-27T15:33:46.866 [20230727-193211-21860503] [MainProcess:PID-33374] [vpxdevent_lib::dispatch_events:vpxdevent_lib.py:275] INFO: Event com.vmware.applmgmt.backup.job.failed.event successfully posted to http://localhost:8085/sdk
2023-07-27T15:33:46.870 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::main:BackupManager.py:618] INFO: Backup job failed.

0 Kudos
1 Solution

Accepted Solutions
Kinnison
Commander
Commander
Jump to solution

Hello,


Consider if it may be appropriate to update to the 8.0U1C build released in recent days, from the release notes:


"vCenter backup might fail due to a database error

During vCenter backup, config files are copied onto a backup server. A write operation on any of the vCenter database files during the copy operation causes it to fail. As a result, the backup operation also fails. This issue is resolved in this release. With the fix, instead of directly copying files, vCenter makes a separate DB dump first and then completes the backup".


That would be consistent (a reasonable assumption) with the error telling you that the "stellar.db" file underwent changes as it was being read.


Regards,
Ferdinando

View solution in original post

0 Kudos
16 Replies
Sachchidanand
Expert
Expert
Jump to solution

Have you made any recent changes like upgrading vCenter? after that you are facing this issue?

As per logs, smb cmd command failed, seems at target end either it locked the files or any stale session is there.

what is your target? Linux or Windows? try resetting smb service on the target then run the backup again.

Regards,

Sachchidanand

0 Kudos
mgiurgeu
Contributor
Contributor
Jump to solution

Yes, I think an upgrade caused this, but I'm not sure which. I've been backing up to the same device, using the same protocol for years. There were no changes on the destination. If I initiate the backup manually, I can see the files being copied on the NAS but then the whole process stops around 92%. I tried switching between SMB and FTP, same result.

I'm not sure what to make of the repeated "file changed as we read it" messages in the backup log

 

0 Kudos
maksym007
Expert
Expert
Jump to solution

Strange that it fails. VAMI backup is a straightforward config and not complicated

0 Kudos
mgiurgeu
Contributor
Contributor
Jump to solution

The backup starts fine, I can see the files being created and increasing in size, and then around 95% it fails. vum.gz keeps growing and it is during this phase when the whole thing crashes and all files are cleaned up (deleted). Studying backup.log as best as I can, here is the relevant part, but I don't know what to make of it:

 

2023-07-28T16:19:19.314 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 1, stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-28T16:19:19.314 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 1, but expected exit codes are [0].
2023-07-28T16:19:19.315 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [Proc::GetProcsStatus:Proc.py:361] ERROR: rc: 0, stderr: b'putting file - as \\vCenter\\sn_vcenter.conestogo.on.ca\\M_8.0.1.00200_20230728-201812_\\config_file
s.tar.gz (6315.7 kb/s) (average 6315.7 kb/s)\n'
2023-07-28T16:19:19.315 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:336] ERROR: Failed configuration files backup- rc: 1; stdout: ; stderr:/usr/bin/tar: storage/db/vmware-observabi
lity/stellar.db: file changed as we read it
; exception:
2023-07-28T16:19:19.316 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:347] ERROR: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
Traceback (most recent call last):
File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ConfigFiles.py", line 340, in BackupConfigFiles
status)
util.Common.BackupRestoreError: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-28T16:19:20.137 [20230728-201812-21860503] [MainProcess:PID-36085] [Proc::VerifyProcStatusAndGetArchive:Proc.py:158] ERROR: Error at process ConfigFilesBackup; rc:1.
2023-07-28T16:19:20.137 [20230728-201812-21860503] [MainProcess:PID-36085] [Proc::VerifyProcStatusAndGetArchive:Proc.py:162] ERROR: stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-28T16:19:20.137 [20230728-201812-21860503] [MainProcess:PID-36085] [Proc::VerifyProcStatusAndGetArchive:Proc.py:172] INFO: Following error message isn't localized:
stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it

2023-07-28T16:19:20.138 [20230728-201812-21860503] [MainProcess:PID-36085] [BackupManager::main:BackupManager.py:591] ERROR: BackupManager encountered an exception: Hit exception inside process ConfigFilesBackup:

0 Kudos
Kinnison
Commander
Commander
Jump to solution

Hello,


Consider if it may be appropriate to update to the 8.0U1C build released in recent days, from the release notes:


"vCenter backup might fail due to a database error

During vCenter backup, config files are copied onto a backup server. A write operation on any of the vCenter database files during the copy operation causes it to fail. As a result, the backup operation also fails. This issue is resolved in this release. With the fix, instead of directly copying files, vCenter makes a separate DB dump first and then completes the backup".


That would be consistent (a reasonable assumption) with the error telling you that the "stellar.db" file underwent changes as it was being read.


Regards,
Ferdinando

0 Kudos
Ajay1988
Expert
Expert
Jump to solution

Can you please try the below and report the status. Also better to report the issue via SR with VMware as this is with 8.0 version. 

Take a backup of appliance.conf file .

1. vi /etc/applmgmt/appliance/appliance.conf
2. Change
"fileTransferMode": "parallel",
to
"fileTransferMode": "serial",
3. Restart applmgmt: service-control --restart applmgmt

If you think your queries have been answered
Mark this response as "Correct" or "Helpful".

Regards,
AJ
mgiurgeu
Contributor
Contributor
Jump to solution

Thank you very much for the suggestion.

This seems to have fixed it, at least in manual mode. I'll keep an eye on it to see if it's working following the schedule.

Sorry for asking, but what is "SR with VMware" (for reporting issues)?

MG

 

0 Kudos
mgiurgeu
Contributor
Contributor
Jump to solution

Unfortunately, this failed almost every other time I tried to run the job manually, so I didn't wait for the backup schedule. I decided to upgrade to 8.0U1C and that fixed the problem. I ran it several times after the upgrade and it worked flawless every single time (reverted fileTransferMode back to parallel).

 

Thanks again for your suggestion, I really appreciate you effort

0 Kudos
mgiurgeu
Contributor
Contributor
Jump to solution

I can confirm that updating to 8.0U1C fixed the problem with the backup. Thank you very much for the suggestion.

0 Kudos
Sachchidanand
Expert
Expert
Jump to solution

SR is Service Request with vmware. Follow the KB to open SR with vmware:

https://kb.vmware.com/s/article/2006985

Regards,

Sachchidanand

0 Kudos
Kinnison
Commander
Commander
Jump to solution

Hello,


IMHO the method proposed by @Ajay1988 can be useful for those who may find themselves in the same situation and, for one reason or another, cannot update to the latest version of the vCenter object or are not in a position to involve technical support.


Regards,
Ferdinando

0 Kudos
PabloEcipi
Contributor
Contributor
Jump to solution

kinnison
Good day
in one of the vcenter "stellar.db" in the folder "/storage/db/vmware-observability" has a size of 1.8 Gby . perform 2 tests
1) I stopped the services
service-control --stop observability
service-control --stop observability-vapi
make the backup
start the services
service-control --start observability
service-control --start observability-vapi
The backup was done successfully
2) I stopped the services
rename the 2 files in the folder
start the services
the file was created with size 0
the backup was successful
But the file doesn't grow

Is there a way to clean this "stellar.db" file?

Thank you very much in advance

0 Kudos
Kinnison
Commander
Commander
Jump to solution

Hello PabloEcipi,


I Beg your pardon but IMHO this is a question that @Ajay1988 or some other colleague of his is/are better entitled to answer. Personally I haven't looked into it too much because I haven't encountered "backup" problems yet and in any case my vCenter object is already updated to version 8.0U1c which should prevent them anyway.


And if I have to be honest, for now I'm careful not to mess with vCenter databases of any kind, let's say that my "main" vCenter object for how it works is absolutely unsupported and therefore I prefer to stay away from "other" potential problems.


Regards,
Ferdinando

0 Kudos
XServerCloud
Contributor
Contributor
Jump to solution

I have the same issue after updating to 8.0.2. Changing parallel to serial have been resolved the issue. Thanks!

likeahoss
Enthusiast
Enthusiast
Jump to solution

I patched 7.0.3d to 7.0.3m, and FIPS is enabled, I ran into this issue, and this workaround fixed it.

 

Thank you @Ajay1988 

 

0 Kudos
WilsonIT
Contributor
Contributor
Jump to solution

I was also having this issue and Ajay1988's suggestion fixed it for me, too.