report::expire slow with many reports
|Assigned to:||-||% Done:||
|Target version:||Bug scrub|
report::expire is causing me issues, one of which is its very slow to run (~20 minutes, and thats before it ultimately errors).
Ohad believes this is due to table scans:
Ok, one simple thing that I found out, was that we were using the following query:
Report Load (16588.0ms) SELECT id FROM `reports` WHERE (reports.id
= 0) AND (created_at < '2011-06-16 14:13:02') ORDER BY reports.id ASC
if you run the following query in mysql console using explain:
explain SELECT id FROM `reports` WHERE (reports.id >= 0) AND (created_at < '2012-06-16 14:13:02') ORDER BY reports.id ASC LIMIT 1000;
You would see that the query was forced to scan the entire reports table and was not using the indexes.
it should be a "bit" faster to change from created_at to reported_at (as created_at has no index at all). I assume that the reason why using the older code was better for you (as it was using reported_at which has an index) and using the in batches was just a side effect.
we need to figure out the correct index for making this process a bit faster, in my limited tests, creating index on multiple columns (id and reported_at) were not very helpful.
in your case, creating an index is not trivial (as you have a large set of data) as the indexes usually locks the db.
Updated by Jacob McCann about 1 year ago
To me the slowness of the expire seems more to do with the giant arrays being generated from the loops: https://groups.google.com/d/msg/foreman-users/_Mn4oxXmP7E/lSKWXpW3JoQJ