Bug #23029
closed
katello-backup does incremental backup of pulp data when checksum check is failing
Description
Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1488525
Description of problem:
Referring to upstream code, but the problem is the same in Red Hat Satellite 6.2.
When `katello-backup` is run with `--online-backup`, the following part of the code is executed:
FileUtils.cd '/var/lib/pulp' do
  puts "Backing up Pulp data... "
  matching = false
  until matching
    checksum1 = run_cmd("find . -printf '%T@\n' | md5sum")
    create_pulp_data_tar
    checksum2 = run_cmd("find . -printf '%T@\n' | md5sum")
    matching = (checksum1 == checksum2)
  end
end
See https://github.com/Katello/katello-packaging/blob/master/katello/katello-backup#L269
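To illustrate what the two checksums compare, here is a minimal pure-Ruby sketch of the same idea: hash the modification time of every entry under a directory and see whether the hash changes. `mtime_fingerprint` is a hypothetical name and is not part of `katello-backup`, which shells out to `find` and `md5sum` instead. The resulting digest is not byte-identical to the shell pipeline's output.

```ruby
require 'digest'
require 'find'

# Hypothetical stand-in for: find . -printf '%T@\n' | md5sum
# Hashes the mtime of every entry below +root+ in traversal order.
def mtime_fingerprint(root)
  digest = Digest::MD5.new
  Find.find(root) do |path|
    # lstat does not follow symlinks, matching find's default behaviour.
    # '%s.%N' prints epoch seconds with nanoseconds, similar to find's %T@.
    digest << File.lstat(path).mtime.strftime("%s.%N\n")
  end
  digest.hexdigest
end
```

If the fingerprint taken before the `tar` run differs from the one taken after it, at least one file or directory was modified, added or removed while the archive was being written.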
The important part is the following:
checksum1 = run_cmd("find . -printf '%T@\n' | md5sum")
create_pulp_data_tar
checksum2 = run_cmd("find . -printf '%T@\n' | md5sum")
We make sure that nothing changed while the `tar` command was running, so that the backup is consistent. This is OK. But if something did change, we re-run `create_pulp_data_tar`, and that re-run does not produce a full backup because the following options are set in the `tar` command:
run_cmd("tar --selinux --create --file=#{File.join(@dir, 'pulp_data.tar')} --exclude=var/lib/pulp/katello-export --listed-incremental=#{File.join(@dir, '.pulp.snar')} --transform 's,^,var/lib/pulp/,S' -S *")
See https://github.com/Katello/katello-packaging/blob/master/katello/katello-backup#L349
According to the `tar` man page, `--listed-incremental` creates the snapshot file if it does not exist; if the file already exists, `tar` uses it to perform an incremental backup:
"Handle new GNU-format incremental backups. FILE is the name of a snapshot file, where tar stores additional information which is used to decide which files changed since the previous incremental dump and, consequently, must be dumped again. If FILE does not exist when creating an archive, it will be created and all files will be added to the resulting archive"
Since neither the archive name nor the snapshot file name changes between runs, the re-run overwrites the first archive, and we end up with a `pulp` backup that contains only the changes that happened between the first and second run.
This should not happen; we should make sure a full backup is taken whenever the `tar` command is re-run. Another option would be to use dynamic file names, so that the first `pulp` tar is not overwritten by the second run and its data is not lost.
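A minimal sketch of the first option, assuming the simplest approach of deleting the snapshot file before every attempt so that `tar` always starts a full (level 0) dump. The helper name `full_backup_until_stable`, its keyword arguments and the `max_attempts` limit are hypothetical and are not taken from `katello-backup` or from the eventual fix. The checksum and archive steps are passed in as callables so the retry logic can be read on its own.

```ruby
require 'fileutils'

# Hypothetical retry loop: repeats the archive step until the fingerprint
# taken before and after it is identical, and removes the tar snapshot
# file before each attempt so that no attempt is incremental.
def full_backup_until_stable(snapshot_file:, fingerprint:, archive:, max_attempts: 5)
  1.upto(max_attempts) do |attempt|
    # Without this, a second run would reuse the .snar file written by the
    # first run and archive only the files that changed in between.
    FileUtils.rm_f(snapshot_file)

    before = fingerprint.call
    archive.call
    after = fingerprint.call

    return attempt if before == after
  end
  raise "data kept changing during #{max_attempts} backup attempts"
end
```

In `katello-backup` terms, `archive` would wrap `create_pulp_data_tar`, `fingerprint` would wrap the `find ... | md5sum` call, and `snapshot_file` would be `File.join(@dir, '.pulp.snar')`. Bounding the number of attempts is an addition of this sketch; the original loop retries indefinitely.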
Version-Release number of selected component (if applicable):
- satellite-6.2.11-2.0.el7sat.noarch
How reproducible:
- Always
Steps to Reproduce:
1. Check out https://bugzilla.redhat.com/show_bug.cgi?id=1478047 where we found a possible cause that would trigger a re-run of the `tar` command that backs up `/var/lib/pulp`
Actual results:
A nearly empty `pulp` backup is created when a slight modification happens during the first run of the `tar` command that backs up `/var/lib/pulp`. When the command is re-run, the snapshot file from the first run already exists, so only the changes since then are archived. These may be very small, which leaves an incomplete backup.
Expected results:
A complete backup is available after a `katello-backup` run, no matter whether the `tar` command ran once, twice or more often.
Additional info:
See https://bugzilla.redhat.com/show_bug.cgi?id=1478047 which is related and which is the reason why we found the problem.
Updated by The Foreman Bot almost 7 years ago
- Status changed from New to Ready For Testing
- Pull request https://github.com/theforeman/foreman-packaging/pull/2313 added
Updated by John Mitsch almost 7 years ago
- Release set to 338
Updated by Jonathon Turel over 6 years ago
- Status changed from Ready For Testing to Closed