Bug #38003
closed
Deleting a CV version does not scale when a product has too many repos (cloned in CVs)
Description
Description of problem:
When a product has many repos and each repo is present in many CV versions, an attempt to delete a larger CV version (of, say, 100 repos) takes a lot of time and memory.
In the customer case behind this report, a puma worker planning a Foreman task consumed 11GB of memory.
On my reproducer, I easily reached 6-8GB, and planning took 15 minutes.
With a simple change, memory consumption can be reduced to roughly 200MB of RAM and planning time to roughly 20 seconds.
The key "not scaling well" factor is https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/destroy.rb#L72-L77, which runs during deletion of each repository.
Replacing that loop with a single ActiveRecord query avoids manipulating individual repository objects and saves a lot of time and memory.
How reproducible:
100%
Is this issue a regression from an earlier version:
yes (performance regression since the ACS feature was added)
Steps to Reproduce:
1. Create a product with 200 repos:
{code}
hammer product create --organization-id 1 --name ZOO_product
for i in $(seq 1 200); do
echo "repository create --organization-id 1 --product ZOO_product --name ZOO_repo_${i} --content-type yum --download-policy on_demand --url https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/zoo/"
done | hammer shell
{code}
2. Create one CV with 100 of the repos:
{code}
hammer content-view create --organization-id 1 --name CV_zoo_manyrepos
for i in $(seq 1 100); do
echo "content-view add-repository --organization-id 1 --name CV_zoo_manyrepos --product ZOO_product --repository ZOO_repo_${i}"
done | hammer shell
{code}
3. Create another CV with all 200 repos (or even with a completely disjoint set of repos, like "seq 101 200"):
{code}
hammer content-view create --organization-id 1 --name CV_zoo_HUGErepos
for i in $(seq 1 200); do
echo "content-view add-repository --organization-id 1 --name CV_zoo_HUGErepos --product ZOO_product --repository ZOO_repo_${i}"
done | hammer shell
{code}
4. Publish some versions of both CVs - the more the better:
{code}
for i in $(seq 1 10); do
echo "content-view publish --organization-id 1 --name CV_zoo_manyrepos"
echo "content-view publish --organization-id 1 --name CV_zoo_HUGErepos"
done | hammer shell
{code}
5. Restart the `foreman` service, so that the memory usage needed to process the request below is easy to see.
6. Delete a version of either CV, e.g. via the WebUI.
7. Monitor memory usage of the puma workers, and look for the following in /var/log/foreman/production.log:
{code}
2024-11-11T15:21:53 [I|app|05628762] Started PUT "/katello/api/content_views/107/bulk_delete_versions" for ::1 at 2024-11-11 15:21:53 +0100
2024-11-11T15:21:53 [I|app|05628762] Processing by Katello::Api::V2::ContentViewsController#bulk_delete_versions as JSON
2024-11-11T15:21:53 [I|app|05628762] Parameters: {"bulk_content_view_version_ids"=>{"included"=>{"ids"=>[2492]}, "excluded"=>{}}, "id"=>"107", "content_view"=>{"id"=>"107"}, "api_version"=>"v2"}
..
2024-11-11T15:32:55 [I|app|05628762] Completed 202 Accepted in 662543ms (Views: 98.4ms | ActiveRecord: 20959.9ms | Allocations: 222030027)
{code}
Note the PUT request, its duration, and its Allocations count.
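To compare runs before and after a change, the duration and allocation count can be pulled out of the "Completed" lines automatically. The following is a minimal sketch, assuming the log line layout matches the excerpt above; the function name `summarize_completed` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize "Completed" lines from production.log.
# Reads log text on stdin and prints, per matching line:
#   <request id> <duration in whole seconds>s <allocation count> allocations
# Assumes the field layout shown in the log excerpt above.
summarize_completed() {
  sed -n -E 's/.*\[I\|app\|([0-9a-f]+)\] Completed [0-9]+ .* in ([0-9]+)ms .*Allocations: ([0-9]+)\).*/\1 \2 \3/p' |
    awk '{ printf "%s %ds %d allocations\n", $1, $2 / 1000, $3 }'
}
```

Run as `summarize_completed < /var/log/foreman/production.log`. For the excerpt above it prints `05628762 662s 222030027 allocations`; lines that are not "Completed" lines are ignored.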
Actual behavior:
Planning the task takes many minutes (662 seconds in the example above) and consumes more than 5GB of memory (the exact figures depend heavily on the number of repos and CV versions).
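To put a number on the memory figure, the resident memory of the puma workers can be sampled while the task is being planned. This is a rough sketch only: the function name `puma_rss_mib` is made up, and it assumes the workers show "puma: cluster worker" in their process title, which may differ on a given setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum resident memory of puma worker processes.
# Reads "<rss in KiB> <command line>" rows on stdin, so it can be fed
# from ps or from a saved sample, and prints the total in MiB.
# Assumption: workers carry "puma: cluster worker" in their process title.
puma_rss_mib() {
  awk '/puma: cluster worker/ { kib += $1 } END { printf "%d MiB\n", kib / 1024 }'
}
```

For a live sample, feed it from ps, e.g. `ps -e -o rss= -o args= | puma_rss_mib`, repeated every few seconds during planning.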
Expected behavior:
Planning takes less than a minute, and the puma process uses little memory.
Business Impact / Additional info:
I will provide a patch / PR soon.
Updated by The Foreman Bot about 1 month ago
- Status changed from New to Ready For Testing
- Assignee set to Pavel Moravec
- Pull request https://github.com/Katello/katello/pull/11214 added
Updated by Partha Aji about 1 month ago
- Target version changed from Katello 4.15.0 to Katello 4.14.2
- Triaged changed from No to Yes
Updated by Chris Roberts 28 days ago
- Category set to Content Views
- Status changed from Ready For Testing to Closed
Updated by The Foreman Bot 27 days ago
- Pull request https://github.com/Katello/katello/pull/11237 added
Updated by The Foreman Bot 21 days ago
- Pull request https://github.com/Katello/katello/pull/11242 added