Project

General

Profile

Actions

Bug #15675

closed

reports:expire is slow and affects the performance of the application

Added by Nacho Barrientos almost 8 years ago. Updated almost 6 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Performance
Target version:
Difficulty:
Triaged:
Fixed in Releases:
Found in Releases:

Description

Hi,

We're currently running Foreman 1.11.2 and it's impossible for us to expire Puppet reports using the standard way to do it (via the Rake task 'reports:expire') as the SQL queries that it issues are slow and generate locks in the reports table. This situation slows down our whole Puppet infrastructure as the masters start to wait too much for Foreman to process reports that they don't have time to digest new requests. This increments the backlog, increasing the catalog compilation time and the latency of basically all the requests that the agents perform.

We have a query killer in place in the database (a procedure, basically) that kills queries longer than 5 minutes, therefore every night the rake task is killed and no reports are cleaned-up. We do this essentially to protect our Puppet infra.

The query itself that's killed is the following:

DELETE FROM `logs` WHERE `logs`.`id` IN (SELECT id FROM (SELECT `logs`.`id` FROM `logs` INNER JOIN `reports` ON `reports`.`id` = `logs`.`report_id` WHERE `logs`.`report_id` IN (SELECT `reports`.`id` FROM `reports`  WHERE `reports`.`type` IN ('ConfigReport') AND (reports.created_at < '2016-07-04 03:30:08'))  ORDER BY logs.id) __active_record_temp)

which I guess comes from:

    Log.joins(:report).where(:report_id => where(cond)).delete_all (app/models/report.rb:L82)

Our current workaround is the following:

1) Disable the query killer
2) Initiate a transaction and use a simplified query:
2.1) START TRANSACTION
2.2) DELETE FROM `logs` WHERE `logs`.`report_id` IN (SELECT `reports`.`id` FROM `reports` WHERE `reports`.`type` IN ('ConfigReport') AND (reports.created_at < '2016-07-04 03:30:08'));
2.3) COMMIT the transaction (this does not generate any lock until the transaction can be committed and the performance is not degraded during the 'preparing' stage)
3) Now that all the logs of expired exports are deleted, execute again reports:expire so it has less work to do. This normally succeeds and creates no impact.
4) Enable the query killer

With this strategy we can kind of work around the problem but, can you think of any way to make it more efficient (and less harmless) to the application to expire reports using the Rake task?

Another option that we have is to declare a downtime every time that we expire reports but we'd rather not to go for this :)

The rake task is triggered by a nightly cron in our case:

30 5 * * * (/usr/sbin/foreman-rake reports:expire ) >> /var/log/foreman/reports-expire.log 2>&1

We're adding ~10 million entries to 'logs' per day and 340k to 'reports'. We're currently unable to expire 24 hours worth of reports without putting the infrastructure at risk.

Thanks for your help and time!

Actions #1

Updated by Dominic Cleal almost 8 years ago

  • Category set to Performance
Actions #2

Updated by Tomer Brisker over 7 years ago

  • Status changed from New to Assigned
  • Assignee set to Tomer Brisker
Actions #3

Updated by Anonymous over 7 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100
Actions #4

Updated by Dominic Cleal over 7 years ago

  • translation missing: en.field_release set to 161
Actions #5

Updated by Daniel Lobato Garcia over 7 years ago

  • Pull request https://github.com/theforeman/foreman/pull/3661 added

No idea what happened but the PR was not linked to the issue this time

Actions #6

Updated by Daniel Lobato Garcia over 7 years ago

  • Target version set to 1.7.1
Actions #7

Updated by Daniel Lobato Garcia over 7 years ago

  • Target version deleted (1.7.1)
Actions #8

Updated by Daniel Lobato Garcia over 7 years ago

  • Target version set to 1.7.1
Actions #9

Updated by Daniel Lobato Garcia over 7 years ago

  • Target version changed from 1.7.1 to 1.6.2
Actions #10

Updated by Daniel Lobato Garcia over 7 years ago

  • Target version changed from 1.6.2 to 1.7.1
Actions

Also available in: Atom PDF