Improve speed of manifest refresh by running RefreshIfNeeded steps concurrently
Description of problem:
During a manifest refresh, all enabled (Red Hat) repositories are refreshed via the Actions::Pulp3::Orchestration::Repository::RefreshIfNeeded dynflow step. That step is called sequentially, once for each such repo.
Since pulp can easily parallelize this independent work, we can speed up the manifest refresh by calling the dynflow steps concurrently.
To avoid scalability issues similar to those seen with Capsule sync, let's limit the concurrency to blocks of :foreman_proxy_content_batch_size tasks.
Proposed patch (I will raise a PR for it later on):
@@ -42,10 +42,14 @@ module Actions
def plan_refresh_repos(import_products_action, org)
repositories = ::Katello::Repository.in_default_view.in_product(::Katello::Product.redhat.in_org(org))
- repositories.each do |repo|
- :dependency => import_products_action.output)
+ repositories.in_groups_of(Setting[:foreman_proxy_content_batch_size], false) do |repo_batch|
+ concurrence do
+ repo_batch.each do |repo|
+ :dependency => import_products_action.output)
With 60 (Red Hat) repos enabled, the patch reduced the manifest refresh time from 100 seconds to approximately 60 seconds (most of the remaining time was spent in candlepin). So, in general, the running time can be improved by tens of percent, depending on the number of enabled RH repos.
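For illustration only, the batching idea can be sketched outside of dynflow in plain Ruby. This is a minimal sketch, not the Katello code: refresh_repo and refresh_in_batches are hypothetical names, threads stand in for dynflow's concurrence block, and each_slice is used in place of in_groups_of(n, false), which likewise yields unpadded groups. It assumes one batch is finished before the next one starts.

```ruby
# Hypothetical stand-in for the real refresh step; it only records the call.
def refresh_repo(repo, log, lock)
  lock.synchronize { log << repo }
end

# Batches are processed one after another; the repos inside
# each batch are refreshed concurrently (threads used as an assumption here).
def refresh_in_batches(repos, batch_size)
  log = []
  lock = Mutex.new
  batches = repos.each_slice(batch_size).to_a
  batches.each do |batch|
    threads = batch.map { |repo| Thread.new { refresh_repo(repo, log, lock) } }
    threads.each(&:join) # wait for the whole batch before starting the next one
  end
  [batches, log]
end

repos = (1..10).map { |i| "repo-#{i}" }
batches, log = refresh_in_batches(repos, 4)
puts batches.map(&:size).inspect   # sizes of the batches
puts log.sort == repos.sort        # every repo refreshed exactly once
```

With 10 repos and a batch size of 4, at most 4 refreshes are in flight at any time, which is the kind of cap that :foreman_proxy_content_batch_size is meant to provide.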