Bug #23623

Break report expiration into batches

Added by Lukas Zapletal 2 months ago. Updated 5 days ago.

Status:Closed
Priority:Normal
Assignee:Lukas Zapletal
Category:Reporting
Target version:1.19.0
Difficulty: Team Backlog:
Triaged:No Fixed in Releases:1.19.0
Bugzilla link: Found in Releases:
Pull request:https://github.com/theforeman/foreman/pull/5587

Description

The problem we are facing with reports not being deleted and slowed down report importing is concurrency. If you want to delete many records in one transaction, the RDBMS must lock whole table and this blocks further inserts or updates. The same problem is on PostgreSQL.

You can indeed use any SQL query which will do the job of getting rid of the records. Inner join is something that is exceptionally problematic on MySQL, but it’s super slow on PostgreSQL as well.

The recommended solution to this long-standing issue is to break the deletion into chunks. Each chunk needs to be just hundreds of thousands of records, then do sleep (dozens of seconds or minutes) so the locked transactions have a chance to complete. You need to find the right balance of deletion of records so you actually keep up with incoming records.

To do that, we really need to use joins because whole chain of joined records must be deleted in a single transaction (we don’t have ON DELETE CASCADE set for those). Or if MySQL supports cascade deletion, it is worth setting that and then it is as easy as deleting batch of records from reports table but this was problematic with Rails AFAIK.

Currently the SQL statements are:

DELETE FROM “logs” WHERE “logs”.“report_id” IN (SELECT “reports”.“id” FROM “reports” WHERE “reports”.“type” IN (‘ConfigReport’) AND (reports.created_at < ‘2018-05-10 13:36:21’))

DELETE FROM “messages” WHERE (id not IN (SELECT DISTINCT message_id FROM “logs”))

DELETE FROM “sources” WHERE (id not IN (SELECT DISTINCT source_id FROM “logs”))

DELETE FROM “reports” WHERE “reports”.“type” IN (‘ConfigReport’) AND (reports.created_at < ‘2018-05-10 13:36:21’)

Associated revisions

Revision 4156621b
Added by Lukas Zapletal 5 days ago

Fixes #23623 - expiration rake task in batches

History

#1 Updated by The Foreman Bot 2 months ago

  • Assignee set to Lukas Zapletal
  • Status changed from New to Ready For Testing
  • Pull request https://github.com/theforeman/foreman/pull/5587 added

#2 Updated by Lukas Zapletal 5 days ago

  • % Done changed from 0 to 100
  • Status changed from Ready For Testing to Closed

#3 Updated by Marek Hulán 5 days ago

  • Triaged set to No
  • Target version set to 1.19.0
  • Fixed in Releases 1.19.0 added

Also available in: Atom PDF