Bug #21727

closed

very slow publishing of a content view with filters containing many errata

Added by Partha Aji almost 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Content Views
Target version:
Difficulty:
Triaged:
Fixed in Releases:
Found in Releases:

Description

Description of problem:
When a Content View has a filter that includes or excludes thousands of errata, publishing the CV takes too long (2+ minutes for each repo in the CV that has the filter applied).

For example, for a CV with 10 repos with such filters, planning the task takes approx. 30 minutes, while executing it (including CopyRpm or DistributorPublish) takes just a few minutes.

This is bad for two reasons:
1) overall performance is poor (planning the task takes several times longer than executing it)
2) the user sees practically nothing for most of the task lifecycle. When they check why the publish task is taking so long, the task details are empty (since the task is still in planning).

The code that takes so long:

https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 (clause_gen.generate)

which calls this method:

https://github.com/Katello/katello/blob/master/app/lib/katello/util/filter_clause_generator.rb#L9-L12

The method is inefficient when its arguments contain thousands of errata.
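The report does not say why the method scales badly. As a rough sketch only, and assuming the cost comes from doing one lookup per erratum (an assumption, not something confirmed here), the difference between per-item and batched lookups can be illustrated as follows. `FakeErrataStore`, `package_names_one_by_one`, and `package_names_batched` are hypothetical names invented for this illustration; none of them is Katello code.

```ruby
# Hypothetical illustration only; this is NOT Katello's implementation.
# The store stands in for any backend where each call is one round trip.
class FakeErrataStore
  attr_reader :queries

  def initialize(packages_by_erratum)
    @packages_by_erratum = packages_by_erratum
    @queries = 0
  end

  # Counts one round trip per call, however many ids are passed.
  def packages_for(errata_ids)
    @queries += 1
    errata_ids.flat_map { |id| @packages_by_erratum.fetch(id, []) }
  end
end

# Slow shape: one round trip per erratum.
def package_names_one_by_one(store, errata_ids)
  errata_ids.flat_map { |id| store.packages_for([id]) }.uniq
end

# Cheaper shape: one round trip per batch of errata.
def package_names_batched(store, errata_ids, batch_size: 500)
  errata_ids.each_slice(batch_size).flat_map { |ids| store.packages_for(ids) }.uniq
end
```

With 2000 errata, the first shape makes 2000 round trips and the second makes 4 at the default batch size, while both return the same package list. Whether the actual fix takes this form is not stated in this report.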

Today, a workaround exists: use the opposite filter, i.e. instead of "include all errata older than a month", use "exclude any newer errata" (and deal with packages outside errata). However, this workaround will become less and less applicable as the overall number of errata in a repo grows over time.

How reproducible:
100%

Steps to Reproduce:
1. Sync several larger repos with many errata.
2. Create a CV and add all the repos to it.
3. Add a filter "include all errata older than date X.Y." where the date is just a month old, so that most errata are included in the CV.
4. Click to publish the CV.
5. Check how long the publish takes (and when the task leaves the planning phase and starts executing the very first step).

Actual results:
5. CV publish takes 30+ minutes; most of the time is spent in planning (the very first Dynflow step is kicked off only after a long time).

Expected results:
5. A reasonably short planning phase.

Additional info:
Add some debugging statements around the line https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/clone_yum_content.rb#L17 to see that the delay is right there.

NOTE: the main issue is that 'planning' of the task takes a very long time, not the task that runs the copy itself.
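The debugging statements suggested under "Additional info" could be as small as the sketch below. It uses Ruby's standard `Benchmark.realtime` to time a block and log the result. The `timed` helper and its label are hypothetical names for illustration and are not part of Katello.

```ruby
require 'benchmark'
require 'logger'

# Hypothetical helper: runs the block, logs how long it took,
# and returns the block's own result unchanged.
def timed(label, logger)
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  logger.debug(format('%s took %.2fs', label, elapsed))
  result
end
```

Assuming a logger is available at that point in the action (for example `Rails.logger` in a Rails application), the call at the linked line could be wrapped as `timed('clause_gen.generate', Rails.logger) { clause_gen.generate }`. If the planning delay is in that call, the logged time should account for most of the 2+ minutes per repo.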

Actions #1

Updated by Partha Aji almost 7 years ago

  • Bugzilla link set to 1509965
Actions #2

Updated by Partha Aji almost 7 years ago

  • Target version set to 236
Actions #3

Updated by The Foreman Bot almost 7 years ago

  • Status changed from New to Ready For Testing
  • Pull request https://github.com/Katello/katello/pull/7079 added
Actions #4

Updated by Justin Sherrill almost 7 years ago

  • Release set to 329
Actions #5

Updated by Partha Aji almost 7 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100