rake reports:expire abuses memory and network bandwidth
Noticed this specifically from the reports:expire process:
SELECT * FROM `reports` WHERE (created_at < '2011-02-22 15:52:29' and status = 0)
There is no need to send the entire row. With thousands of hosts producing 50 reports a day each, this job transfers gigabytes of data over the network from the SQL server (and into the rake process's memory) every time it runs, just to delete reports. This should be optimized, as it can cause serious memory and network usage problems when processing a large number of reports.
#1 Updated by Kal McFate about 8 years ago
Just for example:
rake reports:expire days=12
rake reports:expire days=10
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5830 root 19 0 1300m 1.2g 3484 R 95.7 30.7 28:56.57 /usr/bin/ruby /usr/bin/rake reports:expire days=10
That is on 2 days of data: 30 minutes in, and still going.
#2 Updated by Ohad Levy about 8 years ago
Yes, you are correct.
The main reason we need to pull all reports is to know whether we need to delete the related log, message and source records.
Maybe we could improve the finder SQL to make it faster.
One thing we could do quickly is to read 1000 records or so at a time instead of loading them all into memory.
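To make the batching idea concrete, here is a rough sketch in plain Ruby. It is not the actual Foreman code: the `Store` struct, `expired_ids` and `expire_reports` are hypothetical names, and an in-memory array stands in for the database. The point is only the shape of the loop: fetch a bounded batch of ids (not full rows), delete the dependent records, delete the reports, repeat. In ActiveRecord something like `find_in_batches` with a `:batch_size` would presumably play the role of the loop, but that is an assumption about how it would be wired in.

```ruby
# Hypothetical in-memory stand-in for the reports table and one related table.
# `reports` holds hashes with :id, :created_at and :status; `logs` holds
# hashes with :report_id pointing back at a report.
Store = Struct.new(:reports, :logs)

# Return at most `limit` ids of expired reports, never whole rows.
# Stands in for a query along the lines of:
#   SELECT id FROM reports WHERE created_at < ? AND status = 0 LIMIT ?
def expired_ids(store, cutoff, limit)
  store.reports
       .select { |r| r[:created_at] < cutoff && r[:status] == 0 }
       .first(limit)
       .map { |r| r[:id] }
end

# Delete expired reports batch by batch, so the caller never holds more
# than `batch_size` ids at once. Returns the number of reports deleted.
def expire_reports(store, cutoff, batch_size = 1000)
  deleted = 0
  loop do
    ids = expired_ids(store, cutoff, batch_size)
    break if ids.empty?

    # Remove dependent rows first, then the reports themselves.
    store.logs.reject! { |log| ids.include?(log[:report_id]) }
    store.reports.reject! { |r| ids.include?(r[:id]) }
    deleted += ids.size
  end
  deleted
end
```

The batch size of 1000 is just the figure suggested above; memory use should scale with the batch size rather than with the total number of expired reports.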