Bug #34957

Manifest refresh randomly fails with "No such file or directory" when having multiple dynflow workers

Added by Adam Ruzicka 10 months ago. Updated 10 months ago.

Target version:
Bugzilla link:
Fixed in Releases:
Found in Releases:
Red Hat JIRA:


Cloned from

Description of problem:
Manifest refresh randomly fails on a Satellite with multiple dynflow workers with the error:

Error: No such file or directory @ rb_sysopen - /tmp/

The reason is tricky:
- ManifestRefresh task determines filename for the new manifest file as /tmp/#{rand}.zip
- UpstreamExport dynflow step is asked to export the new manifest to that file
- subsequent Import dynflow step is asked to read the file and process the update further

The dynflow steps can be processed by different dynflow workers, which run as separate systemd services. Unfortunately, each service uses its own private temp directory, like:


So, when the UpstreamExport step is executed by one dynflow worker, it puts the zip file into that worker's private temp directory. If we are unlucky, the Import step is picked up by another worker, which does not find the file in its own private temp directory.
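The failure mode can be illustrated with a small simulation. This is only a sketch: `SimulatedWorker` and `demo` are hypothetical names, and the private temp directory of each systemd service is imitated here with a separate directory from `Dir.mktmpdir`, not with a real private /tmp mount.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a dynflow worker whose service has a private
# temp directory: every "/tmp/..." path is resolved inside that directory.
class SimulatedWorker
  def initialize(private_tmp)
    @private_tmp = private_tmp
  end

  # Map a path as the worker sees it to where the file really lives.
  def resolve(path)
    File.join(@private_tmp, path.sub(%r{\A/tmp/}, ''))
  end

  # Stand-in for the UpstreamExport step: write the manifest zip.
  def export(path)
    File.write(resolve(path), 'manifest data')
  end

  # Stand-in for the Import step: read the manifest zip back.
  def import(path)
    File.read(resolve(path))
  end
end

# Export on worker A, then import on worker A and on worker B.
# Returns the content read by A and the error raised by B.
def demo
  Dir.mktmpdir do |tmp_a|
    Dir.mktmpdir do |tmp_b|
      worker_a = SimulatedWorker.new(tmp_a)
      worker_b = SimulatedWorker.new(tmp_b)
      path = "/tmp/#{rand}.zip"

      worker_a.export(path)
      same_worker = worker_a.import(path)
      other_worker = begin
        worker_b.import(path)
      rescue Errno::ENOENT => e
        e # the file exists only in worker A's private directory
      end
      [same_worker, other_worker]
    end
  end
end
```

Importing on the exporting worker succeeds, while importing on the other worker raises `Errno::ENOENT`, the same error class behind the "No such file or directory" message above.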

This means that with 3 dynflow workers, there is only a 1/3 probability that the manifest refresh succeeds.
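The arithmetic can be written out as follows. This assumes the Import step is picked by one of the workers uniformly at random, independently of which worker ran UpstreamExport; the report does not describe the actual scheduling, and the function names are hypothetical.

```ruby
# Probability that Import lands on the same worker as UpstreamExport,
# assuming a uniform random pick among `workers` workers (an assumption).
def refresh_success_probability(workers)
  raise ArgumentError, 'need at least one worker' if workers < 1

  Rational(1, workers)
end

# Probability that the refresh fails under the same assumption.
def refresh_failure_probability(workers)
  1 - refresh_success_probability(workers)
end
```

For 3 workers this gives 1/3 for success and 2/3 for failure, which matches the reproducibility reported below.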

We need to use a static/shared tmp file instead.
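One possible shape of such a change, as a sketch only: build the file name under a directory that every worker can see instead of under /tmp. The helper `shared_manifest_path` and its `shared_dir` argument are hypothetical, and this report does not say which directory the actual fix uses.

```ruby
require 'fileutils'
require 'securerandom'

# Hypothetical helper: return a unique manifest path under `shared_dir`,
# a directory assumed to be visible to all dynflow workers.
def shared_manifest_path(shared_dir)
  FileUtils.mkdir_p(shared_dir) # make sure the shared directory exists
  File.join(shared_dir, "#{SecureRandom.hex(8)}.zip")
end
```

Because the path no longer depends on a per-service /tmp, the worker running Import would resolve it to the same file that UpstreamExport wrote.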

Version-Release number of selected component (if applicable):
Satellite 6.10.5

How reproducible:
About 2/3 of attempts fail when having 3 dynflow workers

Steps to Reproduce:
1. Set up Satellite with 3 dynflow workers, e.g. per
2. Import a manifest
3. Repeatedly refresh it:
hammer subscription refresh-manifest --organization-id=1
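Step 3 can be automated with a small loop that counts failed attempts. This is a sketch: `count_refresh_failures` and `REFRESH_COMMAND` are hypothetical names, and it assumes `hammer` exits with a non-zero status when the refresh fails.

```ruby
# Command from step 3 above.
REFRESH_COMMAND = %w[hammer subscription refresh-manifest --organization-id=1].freeze

# Run the refresh `attempts` times and return the number of failures.
# `runner` is any callable returning a truthy value on success; by default
# it shells out to hammer (system returns false or nil on failure).
def count_refresh_failures(attempts, runner = -> { system(*REFRESH_COMMAND) })
  attempts.times.count { !runner.call }
end

# Example use on a Satellite host (not executed here):
#   puts count_refresh_failures(10)
```

The injectable `runner` makes it possible to try the counting logic without a Satellite at hand.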

Actual results:
3. randomly fails with the error:
Error: No such file or directory @ rb_sysopen - /tmp/

In such a case, the zip file can be spotted under the private temp dir of a worker's service, like:

Expected results:
Manifest refresh always succeeds.

Additional info:

Associated revisions

Revision dabeda40
Added by Adam Ruzicka 10 months ago

Fixes #34957 - Put manifest into a shared temp directory (#10129)

On production deployments, dynflow workers have private tmp directories,
meaning they cannot use /tmp as a place for shared data. This could lead
to manifest refresh failing on scaled-up deployments.


#1 Updated by The Foreman Bot 10 months ago

  • Status changed from New to Ready For Testing
  • Pull request added

#2 Updated by Ian Ballou 10 months ago

  • Triaged changed from No to Yes
  • Target version set to Katello 4.5.0
  • Category set to Subscriptions

#3 Updated by The Foreman Bot 10 months ago

  • Fixed in Releases Katello 4.5.0 added

#4 Updated by Adam Ruzicka 10 months ago

  • Status changed from Ready For Testing to Closed
