Project

General

Profile

Bug #15283

Slow response times due to Foreman locking query

Added by Bryan Kearney over 5 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Difficulty:
Triaged:
Bugzilla link:

Description

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1334924
Using Satellite 6.1.7, we experienced extremely slow UI and API performance (response times measured in the minutes, tons of ancillary failures due to lock contention). On the server, we hit the Passenger process limit on Foreman instances which were each tied to a Postgres query using 100% CPU continuously for tens of minutes.

We added memory and reduced concurrency to avoid swapping and increased Postgres's shared_buffers & work_mem but the queries were still taking minutes to execute.

The long running queries were all the same Foreman lock check, something of this form:

SELECT COUNT FROM "foreman_tasks_locks" INNER JOIN "foreman_tasks_tasks" ON "foreman_tasks_tasks"."id" = "foreman_tasks_locks"."task_id" WHERE "foreman_tasks_locks"."name" = 'read' AND "foreman_tasks_locks"."resource_id" = 529 AND "foreman_tasks_locks"."resource_type" = 'Katello::System' AND (foreman_tasks_tasks.state != 'stopped') AND ("foreman_tasks_locks"."task_id" NOT IN ('ab80a792-a95e-455a-a0cf-23801e84eba2'));

A quick EXPLAIN showed that triggers a table scan:

foreman=# EXPLAIN ANALYZE SELECT COUNT FROM "foreman_tasks_locks" INNER JOIN "foreman_tasks_tasks" ON "foreman_tasks_tasks"."id" = "foreman_tasks_locks"."task_id" WHERE "foreman_tasks_locks"."name" = 'read' AND "foreman_tasks_locks"."resource_id" = 529 AND "foreman_tasks_locks"."resource_type" = 'Katello::System' AND (foreman_tasks_tasks.state != 'stopped') AND ("foreman_tasks_locks"."task_id" NOT IN ('ab80a792-a95e-455a-a0cf-23801e84eba2'));
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16218.46..16218.47 rows=1 width=0) (actual time=115714.130..115714.131 rows=1 loops=1)
-> Nested Loop (cost=0.00..16218.46 rows=1 width=0) (actual time=7013.001..115714.110 rows=2 loops=1)
Join Filter: ((foreman_tasks_locks.task_id)::text = (foreman_tasks_tasks.id)::text)
Rows Removed by Join Filter: 3029738
-> Index Scan using index_foreman_tasks_locks_on_resource_type_and_resource_id on foreman_tasks_locks (cost=0.00..23.13 rows=1 width=37) (actual time=0.111..16.142 rows=1615 loops=1)
Index Cond: (((resource_type)::text = 'Katello::System'::text) AND (resource_id = 529))
Filter: (((task_id)::text <> 'ab80a792-a95e-455a-a0cf-23801e84eba2'::text) AND ((name)::text = 'read'::text))
Rows Removed by Filter: 1615
-> Seq Scan on foreman_tasks_tasks (cost=0.00..16171.96 rows=1869 width=37) (actual time=0.020..71.448 rows=1876 loops=1615)
Filter: ((state)::text <> 'stopped'::text)
Rows Removed by Filter: 371338
Total runtime: 115714.238 ms
(12 rows)

After adding a couple of indexes, the response times are significantly improved:

foreman=# CREATE INDEX ON foreman_tasks_tasks (id, state);
CREATE INDEX
foreman=# CREATE INDEX ON foreman_tasks_locks (name, resource_type, resource_id);
CREATE INDEX
foreman=# EXPLAIN ANALYZE SELECT COUNT FROM "foreman_tasks_locks" INNER JOIN "foreman_tasks_tasks" ON "foreman_tasks_tasks"."id" = "foreman_tasks_locks"."task_id" WHERE "foreman_tasks_locks"."name" = 'read' AND "foreman_tasks_locks"."resource_id" = 529 AND "foreman_tasks_locks"."resource_type" = 'Katello::System' AND (foreman_tasks_tasks.state != 'stopped') AND ("foreman_tasks_locks"."task_id" NOT IN ('ab80a792-a95e-455a-a0cf-23801e84eba2'));
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=69.76..69.77 rows=1 width=0) (actual time=30.810..30.811 rows=1 loops=1)
-> Nested Loop (cost=0.00..69.75 rows=1 width=0) (actual time=30.787..30.807 rows=2 loops=1)
-> Index Scan using foreman_tasks_locks_name_resource_type_resource_id_idx on foreman_tasks_locks (cost=0.00..22.91 rows=7 width=37) (actual time=0.149..1.926 rows=1615 loops=1)
Index Cond: (((name)::text = 'read'::text) AND ((resource_type)::text = 'Katello::System'::text) AND (resource_id = 529))
Filter: ((task_id)::text <> 'ab80a792-a95e-455a-a0cf-23801e84eba2'::text)
-> Index Only Scan using foreman_tasks_tasks_id_state_idx on foreman_tasks_tasks (cost=0.00..6.68 rows=1 width=37) (actual time=0.018..0.018 rows=0 loops=1615)
Index Cond: (id = (foreman_tasks_locks.task_id)::text)
Filter: ((state)::text <> 'stopped'::text)
Rows Removed by Filter: 1
Heap Fetches: 1057
Total runtime: 30.873 ms
(11 rows)

(As an aside, I don't know whether ~300k tasks is considered excessive but https://bugzilla.redhat.com/show_bug.cgi?id=1289384 makes me glad 6.2 will include foreman_tasks:cleanup)


Related issues

Has duplicate foreman-tasks - Bug #16584: foreman_tasks_tasks table needs index on task stateDuplicate2016-09-16

Associated revisions

Revision 4868e9df (diff)
Added by Ivan Necas almost 5 years ago

Fixes #15283 - add more indexes the locks

We are searching on various lock attributes to find out if we don't
have any active task locking the resource. If the amount of locks
it too high, it can lead to performance degradation.

History

#1 Updated by Ivan Necas over 5 years ago

  • Project changed from Foreman Remote Execution to foreman-tasks

#2 Updated by Ivan Necas about 5 years ago

  • Target version set to 127
  • Status changed from New to Assigned

#3 Updated by The Foreman Bot about 5 years ago

  • Status changed from Assigned to Ready For Testing
  • Assignee set to Ivan Necas
  • Pull request https://github.com/theforeman/foreman-tasks/pull/202 added

#4 Updated by Ivan Necas almost 5 years ago

  • Status changed from Ready For Testing to Closed

#5 Updated by Shlomi Zadok almost 5 years ago

  • Has duplicate Bug #16584: foreman_tasks_tasks table needs index on task state added

#6 Updated by The Foreman Bot almost 5 years ago

  • Pull request https://github.com/theforeman/foreman-tasks/pull/205 added

Also available in: Atom PDF