Bug #1546
report::expire slow with many reports
| Status: | New | Start: | ||
|---|---|---|---|---|
| Priority: | Normal | Due date: | ||
| Assigned to: | - | % Done: | 0% |
|
| Category: | Reporting | |||
| Target version: | Bug scrub | |||
| Backlog: | No | Difficulity: | ||
| Votes: | 2 (View) |
Description
report::expire is causing me issues, one of which is its very slow to run (~20 minutes, and thats before it ultimately errors).
Ohad believes this is due to table scans:
Ok, one simple thing that I found out, was that we were using the following query:
Report Load (16588.0ms) SELECT id FROM `reports` WHERE (reports.id
= 0) AND (created_at < '2011-06-16 14:13:02') ORDER BY reports.id ASC
LIMIT 1000
if you run the following query in mysql console using explain:
explain SELECT id FROM `reports` WHERE (reports.id >= 0) AND (created_at < '2012-06-16 14:13:02') ORDER BY reports.id ASC LIMIT 1000;
You would see that the query was forced to scan the entire reports table and was not using the indexes.
it should be a "bit" faster to change from created_at to reported_at (as created_at has no index at all). I assume that the reason why using the older code was better for you (as it was using reported_at which has an index) and using the in batches was just a side effect.
we need to figure out the correct index for making this process a bit faster, in my limited tests, creating index on multiple columns (id and reported_at) were not very helpful.
in your case, creating an index is not trivial (as you have a large set of data) as the indexes usually locks the db.
https://groups.google.com/d/msg/foreman-users/_Mn4oxXmP7E/T3ByUZ9YTcIJ
History
Updated by Jacob McCann about 1 year ago
To me the slowness of the expire seems more to do with the giant arrays being generated from the loops: https://groups.google.com/d/msg/foreman-users/_Mn4oxXmP7E/lSKWXpW3JoQJ
Updated by Greg Sutcliffe 6 months ago
- Target version set to Bug scrub
- Start deleted (
03/22/2012)
