Bug #21727

very slow publishing of a content view with filters containing many errata

Added by Partha Aji about 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Content Views
Target version:
Difficulty:
Triaged:
Bugzilla link:
Fixed in Releases:
Found in Releases:

Description

Description of problem:
When a Content View has a filter that includes or excludes thousands of errata, publishing the CV takes too long (2+ minutes for each of the CV's repos that has the filter applied).

For example, for a CV with 10 repos with such filters, planning the task takes approx. 30 minutes, while executing it (including steps such as CopyRpm or DistributorPublish) takes just a few minutes.

That is bad for two reasons:
1) Overall performance is poor, because planning the task takes several times longer than executing it.
2) The user sees practically nothing for most of the task's lifecycle. When checking why the publish task takes so long, the task details are empty (since the task is still in planning).

Particular code that takes so long:

https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 (clause_gen.generate)

that is this method:

https://github.com/Katello/katello/blob/master/app/lib/katello/util/filter_clause_generator.rb#L9-L12

The method is inefficient when its arguments contain thousands of errata.

Today, a workaround is to use the opposite filter: instead of "include all errata older than a month", use "exclude any newer errata" (and deal with packages outside errata). However, this workaround will become less and less applicable as the overall number of errata in a repo grows over time.
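To make the workaround concrete, here is a minimal sketch in plain Ruby. It is not Katello code: the `Erratum` struct, the erratum IDs, the package names and the cutoff date are all made up for illustration. It only shows that the exclude filter has to list fewer errata, and that the two filters differ for packages that belong to no erratum.

```ruby
require 'date'
require 'set'

# Hypothetical data model, for illustration only.
Erratum = Struct.new(:id, :issued, :packages)

errata = [
  Erratum.new('ERRATUM-1', Date.new(2017, 1, 10), %w[bash-1]),
  Erratum.new('ERRATUM-2', Date.new(2017, 6, 5), %w[glibc-2]),
  Erratum.new('ERRATUM-3', Date.new(2017, 11, 20), %w[bash-2])
]
# vim-1 is in the repo but belongs to no erratum.
repo_packages = Set.new(%w[bash-1 bash-2 glibc-2 vim-1])
cutoff = Date.new(2017, 11, 1)

# Split errata into those issued before the cutoff and those issued on or after it.
older, newer = errata.partition { |e| e.issued < cutoff }

# "Include all errata older than the cutoff": the filter lists the older errata.
include_result = older.flat_map(&:packages).to_set

# "Exclude any newer errata": the filter lists only the newer errata.
exclude_result = repo_packages - newer.flat_map(&:packages)

puts "include filter lists #{older.size} errata, exclude filter lists #{newer.size}"
puts "only in exclude result: #{(exclude_result - include_result).to_a.inspect}"
```

In this toy data the exclude filter also keeps `vim-1`, which is the "packages outside errata" case that has to be dealt with separately.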

How reproducible:
100%

Steps to Reproduce:
1. Sync several larger repos with many errata.
2. Create a CV and add all the repos to it.
3. Add a filter "include all errata older than date X.Y." where the date is just a month old, so that most errata are included in the CV.
4. Click to publish the CV.
5. Check how long the CV takes to publish (and when the task leaves the planning phase, i.e. starts executing the very first step).

Actual results:
5. CV publish takes 30+ minutes; most of the time is spent in planning (the very first dynflow step is kicked off only after a long time).

Expected results:
5. A reasonably short planning phase.

Additional info:
Add some debugging statements around the line https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 to see that the delay is right there.

NOTE: the main issue is that 'planning' of the task takes a very long time, not the task that runs the copy itself.
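One possible shape for such a debugging statement is a small timing wrapper. The `timed` helper below is hypothetical (it is not part of Katello), and the block passed to it is a stand-in for the real `clause_gen.generate` call; treat it as a sketch only.

```ruby
# Hypothetical helper (not part of Katello): runs a block, prints how long it
# took in wall-clock seconds, and returns the block's result unchanged.
def timed(label, out: $stderr)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  out.puts(format('%s took %.3fs', label, elapsed))
  result
end

# Stand-in for the real call; in the action this would wrap clause_gen.generate.
clauses = timed('clause_gen.generate') do
  sleep(0.05)
  { 'filter' => 'stub' }
end
```

Because the helper returns the block's value, it can be wrapped around the existing call without changing the surrounding code.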

Associated revisions

Revision 6c54a7fa (diff)
Added by Partha Aji about 4 years ago

Fixes #21727 - Faster errata filtered CV publish

This commit tries to speed up the publishing of content views via Errata
filters by using Katello Database instead of requesting that
information from pulp especially while planning.
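As a rough illustration of why reading from a local database can be cheaper than asking a remote service during planning, the sketch below counts round trips. It assumes the cost comes from one request per erratum, which this issue does not state; `RemoteStub` and both helper methods are hypothetical and are neither Pulp's API nor the code from the commit.

```ruby
# Hypothetical stand-in for a remote content service; each call counts as one
# round trip. This is not Pulp's API.
class RemoteStub
  attr_reader :requests

  def initialize(packages_by_erratum)
    @packages_by_erratum = packages_by_erratum
    @requests = 0
  end

  def packages_for(erratum_id)
    @requests += 1
    @packages_by_erratum.fetch(erratum_id, [])
  end
end

# One remote request per erratum id.
def packages_via_remote(remote, erratum_ids)
  erratum_ids.flat_map { |id| remote.packages_for(id) }.uniq
end

# Same lookup against data that is already available locally (a Hash here,
# standing in for a database table); no remote requests are made.
def packages_via_local(local_index, erratum_ids)
  erratum_ids.flat_map { |id| local_index.fetch(id, []) }.uniq
end

# Made-up sample data.
data = {
  'ERRATUM-1' => %w[bash-1],
  'ERRATUM-2' => %w[glibc-2 bash-1],
  'ERRATUM-3' => %w[bash-2]
}
ids = data.keys
remote = RemoteStub.new(data)

remote_result = packages_via_remote(remote, ids)
local_result = packages_via_local(data, ids)
puts "remote round trips: #{remote.requests}, same result: #{remote_result == local_result}"
```

With thousands of errata per filter, the remote variant would issue thousands of requests before the first dynflow step can start, while the local variant produces the same list without leaving the process.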

History

#1 Updated by Partha Aji about 4 years ago

  • Bugzilla link set to 1509965

#2 Updated by Partha Aji about 4 years ago

  • Target version set to 236

#3 Updated by The Foreman Bot about 4 years ago

  • Status changed from New to Ready For Testing
  • Pull request https://github.com/Katello/katello/pull/7079 added

#4 Updated by Justin Sherrill about 4 years ago

  • Legacy Backlogs Release (now unused) set to 329

#5 Updated by Partha Aji about 4 years ago

  • % Done changed from 0 to 100
  • Status changed from Ready For Testing to Closed
