Bug #37842
opengetting hosts list performs redundantly huge query over duplicated host IDs
Description
Clone from https://issues.redhat.com/browse/SAT-28060
Description of problem:
Running a query like "get me all hosts that has some upgradeable package" is VERY imperformant in scale. The reason is the huge psql query built under scoped_search does repeat Host IDs VERY many times (hundreds to thousands). Just the one WHERE clause can be 17MB long text.
That makes the psql query and consequent API response very slow.
How reproducible:
100%
Is this issue a regression from an earlier version:
Probably not.
Steps to Reproduce:
1. Scale your environment by many Hosts with some package downgraded. Like run:
dnf install sos -y; dnf downgrade sos -y
and then "clone" this Host many times via re-registering it under different DMI UUID, by repeatedly running script:
uuid=$(uuidgen)
short=$(hostname -s)
domain=YOUR.DOMAIN
echo "{\"dmi.system.uuid\": \"${uuid}\"}" > /etc/rhsm/facts/uuid.facts
hostnamectl set-hostname ${short}.${uuid%%-*}.${domain}
subscription-manager clean
subscription-manager register --activationkey YOUR_AK --org YOUR_ORG
2. Once having hundreds to thousands of such Hosts, run query like:
curl -X GET -u admin:PASSWORD -H "content-type: application/json" -H "Accept: application/json" "https://$(hostname -f)/api/hosts?per_page=99999&search=content_view_id%3D1%26applicable_rpms%3E0&order=name&thin=true" > hosts.json
(supply proper CV ID, I used `1`).
It seems that `applicable_rpms` is needed, the more other katello search options use (`lifecycle_environment_id`, `activation_key_id` etc), the more evident result you get. Try all mentioned, and you can grab your coffee.
Actual behavior:
duration of such query is improperly HUGE. It grows quadratically wrt # of Hosts, each option mutliplies the execution time. HUGE psql query is logged (see below), postgres consumes a lot of CPU and RAM.
Expected behavior:
Quicklier response based on smaller query.
Business Impact / Additional info:
postgres logs query like:
2024-09-17 23:22:19 CEST LOG: duration: 59159.345 ms execute <unnamed>: SELECT DISTINCT "hosts"."name" AS alias_0, "hosts"."id" FROM "hosts" LEFT OUTER JOIN "katello_content_facets" ON "katello_content_facets"."host_id" = "hosts"."id" LEFT OUTER JOIN "katello_content_view_environment_content_facets" ON "katello_content_view_environment_content_facets"."content_facet_id" = "katello_content_facets"."id" LEFT OUTER JOIN "katello_content_view_environments" ON "katello_content_view_environments"."id" = "katello_content_view_environment_content_facets"."content_view_environment_id" LEFT OUTER JOIN "katello_content_views" ON "katello_content_views"."id" = "katello_content_view_environments"."content_view_id" LEFT OUTER JOIN "katello_environments" ON "katello_environments"."id" = "katello_content_view_environments"."environment_id" WHERE "hosts"."type" = $1 AND ((("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts" INNER JOIN "katello_content_facets"
ON "hosts"."id" = "katello_content_facets"."host_id"
INNER JOIN "katello_content_view_environment_content_facets"
ON "katello_content_facets"."id" = "katello_content_view_environment_content_facets"."content_facet_id"
INNER JOIN "katello_content_view_environments"
ON "katello_content_view_environment_content_facets"."content_view_environment_id" = "katello_content_view_environments"."id"
INNER JOIN "katello_content_views"
ON "katello_content_view_environments"."content_view_id" = "katello_content_views"."id"
WHERE "katello_content_views"."id" = '1' )) AND ("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts" INNER JOIN "katello_content_facets"
ON "hosts"."id" = "katello_content_facets"."host_id"
INNER JOIN "katello_content_view_environment_content_facets"
ON "katello_content_facets"."id" = "katello_content_view_environment_content_facets"."content_facet_id"
INNER JOIN "katello_content_view_environments"
ON "katello_content_view_environment_content_facets"."content_view_environment_id" = "katello_content_view_environments"."id"
INNER JOIN "katello_environments"
ON "katello_content_view_environments"."environment_id" = "katello_environments"."id"
WHERE "katello_environments"."id" = '1' )) AND (hosts.id IN (1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1009,1006,1007,1007,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1006,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1007,1006,..
Please make the list uniq