Bug #23029

katello-backup does incremental backup of pulp data when checksum check is failing

Added by Christine Fouant over 3 years ago. Updated about 3 years ago.

Backup & Restore
Target version:
Bugzilla link:
Fixed in Releases:
Found in Releases:


Cloned from

Description of problem:

Referring to upstream code, but the problem is the same in Red Hat Satellite 6.2.

If we run `katello-backup` with `--online-backup`, the following part of the code is executed:

'/var/lib/pulp' do
  puts "Backing up Pulp data... "
  matching = false
  until matching
    checksum1 = run_cmd("find . -printf '%T@\n' | md5sum")
    checksum2 = run_cmd("find . -printf '%T@\n' | md5sum")
    matching = (checksum1 == checksum2)


The important part is the following:

checksum1 = run_cmd("find . -printf '%T@\n' | md5sum")
checksum2 = run_cmd("find . -printf '%T@\n' | md5sum")
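For readers less familiar with the `find` pipeline, here is a minimal pure-Ruby sketch of the same idea: hash the modification time of every entry under a directory, so that two hashes taken at different moments differ if anything was touched in between. The name `mtime_checksum` is hypothetical and not part of `katello-backup`; the hash value will not equal the one produced by the shell pipeline, since ordering and formatting differ.

```ruby
require 'digest/md5'
require 'find'

# Hypothetical stand-in for `find . -printf '%T@\n' | md5sum`:
# returns an MD5 over the modification times of all entries below `root`.
def mtime_checksum(root)
  digest = Digest::MD5.new
  # Sort the paths so the result does not depend on directory traversal order.
  Find.find(root).sort.each do |path|
    # lstat: record a symlink's own mtime instead of following it.
    digest << File.lstat(path).mtime.strftime("%s.%N\n")
  end
  digest.hexdigest
end
```

Taking this checksum once before and once after the archive step, and comparing the two values, is what the `matching` flag in the excerpt expresses.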

We make sure that nothing has changed while the `tar` command was running, so that the backup is consistent. This is OK. But if the checksums do differ, we re-run the `tar` command, and that re-run is what goes wrong, because the `tar` command is run with the `--listed-incremental` option:

run_cmd("tar --selinux --create --file=#{File.join(@dir, 'pulp_data.tar')} --exclude=var/lib/pulp/katello-export --listed-incremental=#{File.join(@dir, '.pulp.snar')} --transform 's,^,var/lib/pulp/,S' -S *")

See in

According to the `tar` man page, `--listed-incremental` creates the snapshot file if it does not exist; if the file already exists, `tar` uses it to produce an incremental archive.

"Handle new GNU-format incremental backups. FILE is the name of a snapshot file, where tar stores additional information which is used to decide which files changed since the previous incremental dump and, consequently, must be dumped again. If FILE does not exist when creating an archive, it will be created and all files will be added to the resulting archive"

Since the archive name does not change between runs, the re-run overwrites the first archive, and we end up with a `pulp` backup that contains only the changes made between the first and the second run.

This should not happen; we should make sure that a full backup is taken whenever we re-run the `tar` command. Another option would be to use dynamic archive file names, so that the first `pulp` tar file, which holds all the data, is not overwritten by the second run.
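A minimal sketch of the first option, assuming it is acceptable to delete the snapshot file before every attempt so that `tar` always starts from a full (level 0) dump. The helper `backup_until_consistent` and its `checksum:` and `archive:` callables are hypothetical; they stand in for the `find ... | md5sum` and `tar` invocations quoted above and are not the actual `katello-backup` code.

```ruby
require 'fileutils'

# Hypothetical helper: repeats the archive step until the checksum taken
# before it equals the checksum taken after it. Returns the attempt count.
def backup_until_consistent(snapshot_file, checksum:, archive:)
  attempts = 0
  loop do
    attempts += 1
    # Remove any snapshot left behind by a previous attempt. Without it,
    # --listed-incremental makes tar archive every file again instead of
    # only the changes since the last attempt.
    FileUtils.rm_f(snapshot_file)
    before = checksum.call
    archive.call
    after = checksum.call
    return attempts if before == after
  end
end
```

With this shape, a retry still overwrites `pulp_data.tar`, but the overwriting archive is itself a full backup, so nothing is lost.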

Version-Release number of selected component (if applicable):

- satellite-6.2.11-2.0.el7sat.noarch

How reproducible:

- Always

Steps to Reproduce:
1. Check out where we found a possible cause that would trigger a re-run of the `tar` command to backup `/var/lib/pulp`

Actual results:

A nearly empty `pulp` backup is created when a slight modification happens during the first run of the `tar` command that backs up `/var/lib/pulp`. When the command is re-run, the incremental snapshot file already exists, so only the changes since the first run are archived. These may be very small, which leaves an incomplete backup.

Expected results:

A complete backup is available after `katello-backup` has run, no matter whether the `tar` command ran once, twice, or more often.

Additional info:

See which is related and which is the reason why we found the problem.


#1 Updated by The Foreman Bot over 3 years ago

  • Status changed from New to Ready For Testing
  • Pull request added

#2 Updated by John Mitsch over 3 years ago

  • Legacy Backlogs Release (now unused) set to 338

#3 Updated by Jonathon Turel about 3 years ago

  • Status changed from Ready For Testing to Closed
