Bug #18376
showing Content -> Errata with "Installable" checkbox checked hakes too much time and memory when on scale
Description
Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1417642
Description of problem:
With 5 errata and 6138 clients registered, showing Content -> Errata with "Installable" checkbox checked hakes too much time and memory
Version-Release number of selected component (if applicable):
satellite-6.2.6-2.0.el7sat.noarch
How reproducible:
always
Steps to Reproduce:
1. Have 6138 clients registered, each with 5 installable erratas
(same 5 in our case)
2. Go to Content -> Errata
Actual results:
Takes more than 8 minutes to load the page, Ruby RSS memory goes from 4 to 10 GB
Expected results:
Should be faster
Additional info:
Related issues
Associated revisions
History
#1
Updated by Thomas McKay over 5 years ago
- Subject changed from showing Content -> Errata with "Installable" checkbox checked hakes too much time and memory when on scale to showing Content -> Errata with "Installable" checkbox checked hakes too much time and memory when on scale
- Target version set to 161
#2
Updated by Partha Aji over 5 years ago
- Assignee changed from Walden Raines to Partha Aji
#3
Updated by Partha Aji over 5 years ago
I noticed duplicated inner join that is causing the craziness..
SELECT DISTINCT COUNT
FROM "katello_errata"
INNER JOIN "katello_repository_errata" ON "katello_repository_errata"."erratum_id" = "katello_errata"."id"
INNER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."erratum_id" = "katello_errata"."id"
INNER JOIN "katello_content_facets" ON "katello_content_facets"."id" = "katello_content_facet_errata"."content_facet_id"
INNER JOIN "katello_content_facet_errata" "content_facet_errata_katello_errata" ON "content_facet_errata_katello_errata"."erratum_id" = "katello_errata"."id"
..........
Notice the double inner join
INNER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."erratum_id" = "katello_errata"."id"
Followed by
INNER JOIN "katello_content_facet_errata" "content_facet_errata_katello_errata" ON "content_facet_errata_katello_errata"."erratum_id" = "katello_errata"."id"
.
Removing that one redundant change made query 500 times faster (43633 -> 84 ms) in my test db with 400 hosts. So the fix is probably along those lines
The Content => Errata => (applicable + installed) runs the equivalent of
::Katello::Erratum.installable_for_hosts(h).applicable_to_hosts(h) which generates that double join.
Should have a fix on this hopefully soon.
#4
Updated by The Foreman Bot over 5 years ago
- Status changed from New to Ready For Testing
- Pull request https://github.com/Katello/katello/pull/6587 added
#5
Updated by Justin Sherrill over 5 years ago
- Legacy Backlogs Release (now unused) set to 219
#6
Updated by Partha Aji over 5 years ago
- Status changed from Ready For Testing to Closed
- % Done changed from 0 to 100
Applied in changeset katello|bd158ce127aac9e6becbe28b97b533ec79947295.
#7
Updated by John Mitsch over 5 years ago
- Has duplicate Bug #18652: Listing Applicable hosts for any errata takes long time at scale added
Fixes #18376 - Improved Installable Errata search
Go to Content -> Errata and select the Installable check box
Prior to this commit the "installable_for_hosts" method would create a
introduce an extra 'inner join' causing the query to perform painfully
slow. This commit fixes that by selectively adding that duplicate join
table only if required.
The installable repo query also had a duplicate join for the katello
errata table. Which got removed in this commit
Also fixed the unit tests to highlight the point that "Installable
Errata" is determined by the repositories bound to that host.
Basically the query says "Find all applicable errata on this host.
Filter this list further by only including those repos that are enabled
on this host"
Fixed the unit tests to something that make more sense.
Installable errata is determined purely by the repos that are bound to
the host. It does not really matter what environment/content view
this host belongs. CVE gives the master list of repos available to the
host of which only a subset are enabled or bound.