Bug #39327
Refresh RollingCV Repo task hangs when assigning multiple environments to RollingCV
Description
If a RollingCV has multiple lifecycle environments and you sync a repository included in that RollingCV, there is a high chance that one of the Refresh RollingCV Repo tasks will hang. The hung task keeps the repository locked, so it can no longer be interacted with, which prevents future syncs and modifications. The task remains in state: planned, with progress at 100% and result pending (picture in attachments). In the Dynflow console everything appears completed, and the sync itself finishes, but the stuck task prevents future syncs.
Resulting states:
plan.state :stopped, plan.result :success
task.state 'planned', task.result 'pending'
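The mismatch above can be expressed as a simple predicate. The following is a minimal sketch only: `TaskSnapshot` and `is_wedged` are hypothetical names invented for illustration, not part of Katello, Foreman Tasks, or Dynflow, and the sample rows are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSnapshot:
    """Hypothetical record pairing a Dynflow plan with its task row."""
    label: str
    plan_state: str
    plan_result: str
    task_state: str
    task_result: str


def is_wedged(snap: TaskSnapshot) -> bool:
    """True when the plan finished but the task row never caught up."""
    plan_done = snap.plan_state == "stopped" and snap.plan_result == "success"
    task_open = snap.task_state == "planned" and snap.task_result == "pending"
    return plan_done and task_open


# Made-up sample data: one healthy task and one showing the reported states.
snapshots = [
    TaskSnapshot("refresh-env-1", "stopped", "success", "stopped", "success"),
    TaskSnapshot("refresh-env-2", "stopped", "success", "planned", "pending"),
]
stuck = [s.label for s in snapshots if is_wedged(s)]
print(stuck)
```

A check of this shape only identifies tasks that match the reported symptom; it says nothing about why the task row was not updated.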
Working theory:
When a library source syncs and has multiple rolling clones across environments / Rolling CVs, RefreshRollingRepo is spawned once per clone in parallel. The clone-update path likely ends with a callback or event keyed on the shared root repo (Pulp publication regen? applicability? content count?). When N siblings try to deliver that event in parallel, only one’s wrapper-state-update completes cleanly; the others’ completion events get lost. Sources never wedge because they aren’t competing with siblings on shared state.
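The kind of race this theory describes can be illustrated with a toy model. This is a sketch under stated assumptions, not Katello code: the shared record, `finish_task`, `run_interleaved`, and `run_serialized` are all hypothetical, and the real shared state (if any) has not been identified. Each sibling's completion is modeled as a non-atomic read-modify-write, and the interleaving is forced deterministically with generators rather than real threads.

```python
def finish_task(shared, task_id):
    """Model one sibling's completion as a non-atomic read-modify-write."""
    seen = set(shared["finished"])          # read the shared record
    yield                                   # another sibling may run here
    shared["finished"] = seen | {task_id}   # write back a possibly stale copy


def run_interleaved(task_ids):
    """Worst case: every sibling reads before any sibling writes."""
    shared = {"finished": set()}
    steps = [finish_task(shared, t) for t in task_ids]
    for step in steps:
        next(step)        # all siblings read the empty record
    for step in steps:
        next(step, None)  # each write overwrites the previous one
    return shared["finished"]


def run_serialized(task_ids):
    """Each sibling completes its read and write before the next starts."""
    shared = {"finished": set()}
    for t in task_ids:
        for _ in finish_task(shared, t):
            pass
    return shared["finished"]


siblings = range(1, 7)  # six clones, chosen arbitrarily for the example
print(sorted(run_interleaved(siblings)))  # only the last writer survives
print(sorted(run_serialized(siblings)))   # all six completions recorded
```

The interleaved run is the extreme case where all but one completion is lost; a real race would lose a varying number. The serialized run shows why a single Refresh task, with no siblings, would not hit the problem in this model.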
The current workaround is to assign the RollingCV only to Library, which generates only one Refresh RollingCV Repo task.
Issue was also discussed on the Foreman forum:
https://community.theforeman.org/t/refresh-rollingcv-repo-task-hangs-causing-repos-to-not-up/46243
Files
Updated by Jeremy Lenz 27 days ago
- Target version set to Katello 4.21.0
- Triaged changed from No to Yes
Updated by Quirin Pamp 16 days ago (edited)
I am pretty sure this is related to the fact that we are starting parallel Refresh actions as "async" tasks here: https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/sync.rb#L73
This is the only place where we do this. I forgot why exactly we chose async tasks here (but not elsewhere).
I was also able to reproduce the issue on the first try by having a rolling CV with 6 associated environments and syncing some new content into the relevant repo. I ended up with two of the six tasks stuck at 100% complete, in state planned with result pending.
Edit: I believe we made this an async task because, during the planning phase, we don't yet have the information needed to tell whether the refresh is required.
Updated by Ian Ballou 6 days ago
- Target version changed from Katello 4.21.0 to Katello 4.21.1
Updated by Ian Ballou 6 days ago
- Target version changed from Katello 4.21.1 to Katello 5.0.0