v 8.0.1.00200
manual or automatic backups have started to fail regularly after working fine for years; /var/log/vmware/applmgmt/backup.log showing:
2023-07-27T15:33:15.101 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 1, stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-27T15:33:15.101 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 1, but expected exit codes are [0].
2023-07-27T15:33:15.101 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [Proc::GetProcsStatus:Proc.py:361] ERROR: rc: 0, stderr: b'putting file - as \\vCSA\\vCenter\\sn_vcenter\\M_8.0.1.00200_20230727-193211_\\confi
g_files.tar.gz (6651.9 kb/s) (average 6651.9 kb/s)\n'
2023-07-27T15:33:15.102 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:336] ERROR: Failed configuration files backup- rc: 1; stdout: ; stderr:/usr/bin/tar: storage/db/vmware-observabi
lity/stellar.db: file changed as we read it
; exception:
2023-07-27T15:33:15.102 [20230727-193211-21860503] [ConfigFilesBackup:PID-33481] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:347] ERROR: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
Traceback (most recent call last):
File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ConfigFiles.py", line 340, in BackupConfigFiles
status)
util.Common.BackupRestoreError: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::VerifyProcStatusAndGetArchive:Proc.py:158] ERROR: Error at process ConfigFilesBackup; rc:1.
2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::VerifyProcStatusAndGetArchive:Proc.py:162] ERROR: stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::VerifyProcStatusAndGetArchive:Proc.py:172] INFO: Following error message isn't localized:
stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-27T15:33:16.109 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::main:BackupManager.py:591] ERROR: BackupManager encountered an exception: Hit exception inside process ConfigFilesBackup:
2023-07-27T15:33:16.111 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::HandleBackupCleanup:BackupManager.py:469] INFO: Cleaning up the backup job.
2023-07-27T15:33:16.111 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:405] INFO: Cleaning up all running backup child processes.
2023-07-27T15:33:46.129 [20230727-193211-21860503] [MainProcess:PID-33374] [Proc::CleanupChildProcesses:Proc.py:253] ERROR: Child process failed to terminate
2023-07-27T15:33:46.139 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:412] INFO: Cleaning up all sizeFiles.
2023-07-27T15:33:46.140 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up ConfigFiles.
2023-07-27T15:33:46.140 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up ComponentScripts.
2023-07-27T15:33:46.141 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up VCDB.
2023-07-27T15:33:46.141 [20230727-193211-21860503] [MainProcess:PID-33374] [VCDB::BackupVCDBCleanup:VCDB.py:2109] INFO: Finishing full database backup
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [VCDB::BackupVCDBCleanup:VCDB.py:2118] INFO: Successfully cleaned up for VCDB backup.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up Lotus.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [Lotus::BackupLotusCleanup:Lotus.py:140] INFO: Successfully completed Lotus cleanup.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:420] INFO: Cleaning up VTSDB.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [VTSDB::BackupVTSDBCleanup:VTSDB.py:237] INFO: Successfully cleaned up for VTSDB backup.
2023-07-27T15:33:46.157 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:429] INFO: Cleaning up failed backup files.
2023-07-27T15:33:46.158 [20230727-193211-21860503] [MainProcess:PID-33374] [SMBClientStorageIOLib::_run_cmd:SMBClientStorageIOLib.py:23] INFO: Running Command: ['/usr/bin/smbclient', '//nas/Public/', '-A', '/dev/shm/smbCredentialFilebpfs7yn
f', '--send-buffer', '100M', '-t', '140', '-c', 'deltree vCSA/vCenter/sn_vcenter/M_8.0.1.00200_20230727-193211_']
2023-07-27T15:33:46.552 [20230727-193211-21860503] [MainProcess:PID-33374] [SMBClientStorageIOLib::_process_cmd_output:SMBClientStorageIOLib.py:97] ERROR: Command failed with error: NT_STATUS_SHARING_VIOLATION
2023-07-27T15:33:46.553 [20230727-193211-21860503] [MainProcess:PID-33374] [SMBClientStorageIOLib::_process_cmd_output:SMBClientStorageIOLib.py:104] ERROR: smb cmd failed. RC: 1, Err: , Cmd: ['/usr/bin/smbclient', '//nas/Public/', '-A', '/d
ev/shm/smbCredentialFilebpfs7ynf', '--send-buffer', '100M', '-t', '140', '-c', 'deltree vCSA/vCenter/sn_vcenter/M_8.0.1.00200_20230727-193211_']
2023-07-27T15:33:46.553 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::Cleanup:BackupManager.py:433] WARNING: Ignore to clean up failed backup files, due to issue: 'Plugin error occurred. ErrCode: 5, Args: ()'.
2023-07-27T15:33:46.554 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::HandleBackupCleanup:BackupManager.py:477] ERROR: Failed to cleanup the backup job. Error: Failed to clean up the backup job.
2023-07-27T15:33:46.866 [20230727-193211-21860503] [MainProcess:PID-33374] [vpxdevent_lib::dispatch_events:vpxdevent_lib.py:275] INFO: Event com.vmware.applmgmt.backup.job.failed.event successfully posted to http://localhost:8085/sdk
2023-07-27T15:33:46.870 [20230727-193211-21860503] [MainProcess:PID-33374] [BackupManager::main:BackupManager.py:618] INFO: Backup job failed.
Hello,
Consider if it may be appropriate to update to the 8.0U1C build released in recent days, from the release notes:
"vCenter backup might fail due to a database error
During vCenter backup, config files are copied onto a backup server. A write operation on any of the vCenter database files during the copy operation causes it to fail. As a result, the backup operation also fails. This issue is resolved in this release. With the fix, instead of directly copying files, vCenter makes a separate DB dump first and then completes the backup".
That would be consistent (a reasonable assumption) with the error telling you that the "stellar.db" file underwent changes as it was being read.
Regards,
Ferdinando
Have you made any recent changes like upgrading vCenter? after that you are facing this issue?
As per logs, smb cmd command failed, seems at target end either it locked the files or any stale session is there.
what is your target? Linux or Windows? try resetting smb service on the target then run the backup again.
Regards,
Sachchidanand
Yes, I think an upgrade caused this, but I'm not sure which. I've been backing up to the same device, using the same protocol for years. There were no changes on the destination. If I initiate the backup manually, I can see the files being copied on the NAS but then the whole process stops around 92%. I tried switching between SMB and FTP, same result.
I'm not sure what to make of the repeated "file changed as we read it" messages in the backup log
Strange that it fails. VAMI backup is a straightforward config and not complicated
The backup starts fine, I can see the files being created and increasing in size, and then around 95% it fails. vum.gz keeps growing and it is during this phase when the whole thing crashes and all files are cleaned up (deleted). Studying backup.log as best as I can, here is the relevant part, but I don't know what to make of it:
2023-07-28T16:19:19.314 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 1, stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-28T16:19:19.314 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 1, but expected exit codes are [0].
2023-07-28T16:19:19.315 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [Proc::GetProcsStatus:Proc.py:361] ERROR: rc: 0, stderr: b'putting file - as \\vCenter\\sn_vcenter.conestogo.on.ca\\M_8.0.1.00200_20230728-201812_\\config_file
s.tar.gz (6315.7 kb/s) (average 6315.7 kb/s)\n'
2023-07-28T16:19:19.315 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:336] ERROR: Failed configuration files backup- rc: 1; stdout: ; stderr:/usr/bin/tar: storage/db/vmware-observabi
lity/stellar.db: file changed as we read it
; exception:
2023-07-28T16:19:19.316 [20230728-201812-21860503] [ConfigFilesBackup:PID-36187] [ConfigFiles::BackupConfigFiles:ConfigFiles.py:347] ERROR: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
Traceback (most recent call last):
File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ConfigFiles.py", line 340, in BackupConfigFiles
status)
util.Common.BackupRestoreError: Failed configuration files backup
Underlying process status. rc: 1
stdout:
stderr: /usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-28T16:19:20.137 [20230728-201812-21860503] [MainProcess:PID-36085] [Proc::VerifyProcStatusAndGetArchive:Proc.py:158] ERROR: Error at process ConfigFilesBackup; rc:1.
2023-07-28T16:19:20.137 [20230728-201812-21860503] [MainProcess:PID-36085] [Proc::VerifyProcStatusAndGetArchive:Proc.py:162] ERROR: stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-28T16:19:20.137 [20230728-201812-21860503] [MainProcess:PID-36085] [Proc::VerifyProcStatusAndGetArchive:Proc.py:172] INFO: Following error message isn't localized:
stderr:Failed configuration files backup
/usr/bin/tar: storage/db/vmware-observability/stellar.db: file changed as we read it
2023-07-28T16:19:20.138 [20230728-201812-21860503] [MainProcess:PID-36085] [BackupManager::main:BackupManager.py:591] ERROR: BackupManager encountered an exception: Hit exception inside process ConfigFilesBackup:
Hello,
Consider if it may be appropriate to update to the 8.0U1C build released in recent days, from the release notes:
"vCenter backup might fail due to a database error
During vCenter backup, config files are copied onto a backup server. A write operation on any of the vCenter database files during the copy operation causes it to fail. As a result, the backup operation also fails. This issue is resolved in this release. With the fix, instead of directly copying files, vCenter makes a separate DB dump first and then completes the backup".
That would be consistent (a reasonable assumption) with the error telling you that the "stellar.db" file underwent changes as it was being read.
Regards,
Ferdinando
Can you please try the below and report the status. Also better to report the issue via SR with VMware as this is with 8.0 version.
Take a backup of appliance.conf file .
1. vi /etc/applmgmt/appliance/appliance.conf
2. Change
"fileTransferMode": "parallel",
to
"fileTransferMode": "serial",
3. Restart applmgmt: service-control --restart applmgmt
Thank you very much for the suggestion.
This seems to have fixed it, at least in manual mode. I'll keep an eye on it to see if it's working following the schedule.
Sorry for asking, but what is "SR with VMware" (for reporting issues)?
MG
Unfortunately, this failed almost every other time I tried to run the job manually, so I didn't wait for the backup schedule. I decided to upgrade to 8.0U1C and that fixed the problem. I ran it several times after the upgrade and it worked flawless every single time (reverted fileTransferMode back to parallel).
Thanks again for your suggestion, I really appreciate you effort
I can confirm that updating to 8.0U1C fixed the problem with the backup. Thank you very much for the suggestion.
SR is Service Request with vmware. Follow the KB to open SR with vmware:
https://kb.vmware.com/s/article/2006985
Regards,
Sachchidanand
Hello,
IMHO the method proposed by @Ajay1988 can be useful for those who may find themselves in the same situation and, for one reason or another, cannot update to the latest version of the vCenter object or are not in a position to involve technical support.
Regards,
Ferdinando
kinnison
Good day
in one of the vcenter "stellar.db" in the folder "/storage/db/vmware-observability" has a size of 1.8 Gby . perform 2 tests
1) I stopped the services
service-control --stop observability
service-control --stop observability-vapi
make the backup
start the services
service-control --start observability
service-control --start observability-vapi
The backup was done successfully
2) I stopped the services
rename the 2 files in the folder
start the services
the file was created with size 0
the backup was successful
But the file doesn't grow
Is there a way to clean this "stellar.db" file?
Thank you very much in advance
Hello PabloEcipi,
I Beg your pardon but IMHO this is a question that @Ajay1988 or some other colleague of his is/are better entitled to answer. Personally I haven't looked into it too much because I haven't encountered "backup" problems yet and in any case my vCenter object is already updated to version 8.0U1c which should prevent them anyway.
And if I have to be honest, for now I'm careful not to mess with vCenter databases of any kind, let's say that my "main" vCenter object for how it works is absolutely unsupported and therefore I prefer to stay away from "other" potential problems.
Regards,
Ferdinando
I have the same issue after updating to 8.0.2. Changing parallel to serial have been resolved the issue. Thanks!
I patched 7.0.3d to 7.0.3m, and FIPS is enabled, I ran into this issue, and this workaround fixed it.
Thank you @Ajay1988
I was also having this issue and Ajay1988's suggestion fixed it for me, too.