After updating from 7.0.2.00500 to 7.0.3.00500, the scheduled backups to SMB stopped working. Neither the vCenter nor the NAS is domain-joined. This setup worked before the update. A manual backup works fine (3.5 GB), but scheduled backups fail after a few hundred MB.
This is the error from /storage/log/vmware/applmgmt/backup.log:
2022-04-07T23:20:36.252 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::_backup_wal_files:VCDB.py:877] INFO: Backup WAL files from /storage/archive/vpostgres
2022-04-07T23:20:36.252 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::_list_wal_files:VCDB.py:727] INFO: Found 0 WAL files.
2022-04-07T23:20:36.252 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::_list_wal_files:VCDB.py:727] INFO: Found 0 WAL files.
2022-04-07T23:20:36.252 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::_verify_wal_archived:VCDB.py:737] INFO: Checking the WAL file has been archived: 000000010000002F000000CA.gz
2022-04-07T23:20:36.252 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::_verify_wal_archived:VCDB.py:748] INFO: Verified the wal file archived at: /storage/archive/vpostgres/000000010000002F000000CA.gz
2022-04-07T23:20:36.252 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::_archive_current_wals:VCDB.py:1027] INFO: Dispatching 1 WAL files into backup archive: wal_backup_1.tar.gz
2022-04-07T23:20:36.253 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [backupRestoreDispatch::dispatchFiles:backupRestoreDispatch.py:276] INFO: tarCmd = ['/usr/bin/tar', '-cz', '-C', '/', '--ignore-failed-read', '-T', '/var/log/vmware/applmgmt/wal_files_4yoi1xib.lst', '--warning', 'no-file-ignored', '--warning', 'no-file-changed', '--warning', 'no-file-removed']
2022-04-07T23:20:36.329 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 2, stderr: /usr/bin/tar: -: Cannot write: Broken pipe
/usr/bin/tar: Error is not recoverable: exiting now
2022-04-07T23:20:36.329 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::GetProcsStatus:Proc.py:332] INFO: Skip to report the error.
2022-04-07T23:20:36.329 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 2, but expected exit codes are [0, 1].
2022-04-07T23:20:36.329 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 1, stderr: Traceback (most recent call last):
File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/plugins/../util/Calculate.py", line 59, in <module>
main(sys.argv[1], sys.argv[2], sys.argv[3])
File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/plugins/../util/Calculate.py", line 46, in main
stdout_obj.write(data)
BrokenPipeError: [Errno 32] Broken pipe
2022-04-07T23:20:36.329 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::GetProcsStatus:Proc.py:332] INFO: Skip to report the error.
2022-04-07T23:20:36.330 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 1, but expected exit codes are [0].
2022-04-07T23:20:36.330 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [Proc::UpdateExceptionStatus:Proc.py:383] ERROR: Checksum not generated at /dev/shm/backupRestoreSumFile-20220407-215906-19480866-6zw58xil
2022-04-07T23:20:36.330 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::run:VCDB.py:1111] ERROR: Failed to backup WAL files.
2022-04-07T23:20:36.330 [20220407-215906-19480866] [VCDB-WAL-Backup:PID-30577] [VCDB::run:VCDB.py:1112] ERROR: Failed to dispatch WAL files.
Underlying process status. rc: 19
stdout:
stderr:
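For anyone wading through a similar backup.log, here is a minimal Python sketch that keeps only the ERROR entries and attaches continuation lines (tar stderr, tracebacks) to the entry they belong to. The line layout is inferred from the excerpt above, not from any VMware documentation, and the names `parse_entries`, `errors_only` and `print_errors` are made up for this example.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Layout assumed from the excerpt above:
# <timestamp> [<job id>] [<component>] [<source>] <LEVEL>: <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\S+) \[(?P<job>[^\]]+)\] "
    r"\[(?P<component>[^\]]+)\] \[(?P<source>[^\]]+)\] "
    r"(?P<level>[A-Z]+): (?P<message>.*)$"
)


class Entry(NamedTuple):
    ts: str
    component: str
    source: str
    level: str
    message: str


def parse_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per timestamped line.

    Lines without a timestamp (stderr output, tracebacks) are appended
    to the message of the preceding entry.
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE_RE.match(line)
        if m:
            if current is not None:
                yield current
            current = Entry(
                m.group("ts"),
                m.group("component"),
                m.group("source"),
                m.group("level"),
                m.group("message"),
            )
        elif current is not None:
            current = current._replace(message=current.message + "\n" + line)
    if current is not None:
        yield current


def errors_only(lines: Iterable[str]) -> List[Entry]:
    """Return only the entries logged at ERROR level."""
    return [e for e in parse_entries(lines) if e.level == "ERROR"]


def print_errors(path: str) -> None:
    """Print the ERROR entries of the log file at `path`."""
    with open(path, errors="replace") as fh:
        for e in errors_only(fh):
            print(f"{e.ts} {e.source}\n  {e.message}")
```

Calling `print_errors("/storage/log/vmware/applmgmt/backup.log")` on the appliance should list the tar "Broken pipe" failure and the errors that follow it, without the INFO noise in between.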
What I've checked/tried:
- Checked that there are no snapshots present.
- Renamed the "vcenter" folder inside the SMB share to vcenter-old and let the scheduled backup create a new one. It creates the new folder but still fails after some data and files have been written.
- Manual backup using the Scheduled settings works.
- Directory Server State: (a previous time this caused problems because it was in a Standalone state)
/usr/lib/vmware-vmafd/bin/dir-cli state get
Directory Server State: Normal (3)
- Checked that backupMarker.txt is not present in /etc/vmware/.
- Checked file permissions in /storage/archive/vpostgres.
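The last two checks in the list can be repeated quickly with a small Python sketch like the one below. It only looks at the two paths mentioned above; the function names `marker_present` and `describe_permissions` are hypothetical helpers for this example, not part of any VMware tooling.

```python
import os
import stat
from typing import List, Tuple


def marker_present(marker_path: str = "/etc/vmware/backupMarker.txt") -> bool:
    """Return True if the backup marker file exists at `marker_path`."""
    return os.path.exists(marker_path)


def describe_permissions(
    directory: str = "/storage/archive/vpostgres",
) -> List[Tuple[str, str, int, int]]:
    """Return (name, mode string, uid, gid) for each entry in `directory`."""
    rows = []
    for name in sorted(os.listdir(directory)):
        st = os.stat(os.path.join(directory, name))
        # stat.filemode renders the mode the way `ls -l` does, e.g. -rw-r--r--
        rows.append((name, stat.filemode(st.st_mode), st.st_uid, st.st_gid))
    return rows
```

Run as root on the appliance, `marker_present()` should return False when no marker file is left behind, and `describe_permissions()` gives a listing to compare against a system where scheduled backups still work.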
1 week later, still only able to make manual backups. I guess I'll need to make a ticket.
I fixed it with a suggestion I found online, by adjusting the time the schedule starts. I set it an hour later and it started working again.
I just tried adjusting the time based on your suggestion and it finally worked again. I would be curious to learn why it works at some times and not others. Thanks Marco.