Bug #10133
closed
Massive db deadlocks in postgres from hosts_counter updates with counter_cache_fix.rb
Added by Chuck Schweizer almost 10 years ago. Updated over 6 years ago.
Description
https://gist.github.com/csschwe/4cc4d9be58e1cb96ec6c
After updating to Foreman 1.8 rc3, I am seeing a massive number of DB deadlocks. This issue was not present in 1.7.
Updated by Chuck Schweizer almost 10 years ago
This is in a 40K node environment.
Updated by Ohad Levy almost 10 years ago
quick google shows a few things:
- http://stackoverflow.com/questions/11911087/ruby-on-rails-is-a-counter-cache-transaction-safe
- https://github.com/magnusvk/counter_culture
Updated by Tomer Brisker almost 10 years ago
- Related to Bug #5692: Puppet environment counters not updated added
Updated by Tomer Brisker almost 10 years ago
- Category set to Database
Which PostgreSQL version are you using?
This sounds like it might be related to a problem that was fixed in 9.3: http://mina.naguib.ca/blog/2010/11/22/postgresql-foreign-key-deadlocks.html
Updated by Lukas Zapletal almost 10 years ago
There are three users in the comments complaining that version 9.3 is even worse and that it was not fixed for them :-(
Alvaro Herrera describes the solution as introducing a new lock mode, SELECT ... FOR KEY SHARE. Would that mean you need both PostgreSQL 9.3 and a newer Rails that takes advantage of that approach? Or some change in Foreman would be required, I assume.
Updated by Tomer Brisker almost 10 years ago
This specific deadlock should be prevented when we upgrade to Rails 4, as it is caused by a workaround for a bug in cached counters that existed only in Rails 3
Updated by Lukas Zapletal almost 10 years ago
Oh, I see. Maybe we could make this workaround optional so users with heavy load can turn it off?
Updated by Tomer Brisker almost 10 years ago
- Status changed from New to Assigned
- Assignee set to Tomer Brisker
The counter_cache fix was already in 1.7, so I'm trying to understand what caused this.
Chuck, what operation causes the deadlocks? Did you upgrade anything other than Foreman?
Updated by Tomer Brisker almost 10 years ago
Digging into the log, it would seem the deadlock is caused by a race between the counter_cache_fix and Rails' original update_counters, both trying to update the same counter at the same time. Will continue investigating.
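For readers unfamiliar with how two writers can deadlock, here is a minimal sketch in plain Ruby. It is not Foreman code and does not reproduce the exact mechanism from the log, which this thread does not spell out. It assumes one common pattern: two transactions each need row locks on the same two counter rows but take them in opposite order. The function name `simulate_locking` and the row names are hypothetical.

```ruby
# Hypothetical, database-free simulation of row locking (assumption: each
# transaction acquires its row locks one at a time and releases them all
# when it commits). Returns :completed or :deadlock.
def simulate_locking(tx_orders)
  owner = {}                                # row => index of the tx holding it
  position = Array.new(tx_orders.size, 0)   # next lock each tx will request

  loop do
    progressed = false
    tx_orders.each_with_index do |rows, tx|
      next if position[tx] >= rows.size     # this tx has already committed
      row = rows[position[tx]]
      next unless owner[row].nil? || owner[row] == tx  # blocked by another tx

      owner[row] = tx
      position[tx] += 1
      progressed = true
      # Once a tx holds all its locks it commits and releases them.
      owner.delete_if { |_, holder| holder == tx } if position[tx] == rows.size
    end

    done = position.each_with_index.all? { |pos, tx| pos == tx_orders[tx].size }
    return :completed if done
    return :deadlock unless progressed      # everyone is waiting on someone else
  end
end

# Two writers touching the same rows in opposite order (hypothetical row names).
opposite = [[:environment_counter, :hostgroup_counter],
            [:hostgroup_counter, :environment_counter]]

puts simulate_locking(opposite)               # deadlock
puts simulate_locking(opposite.map(&:sort))   # completed
```

In this toy model, sorting the rows so every writer locks them in the same order is enough to avoid the cycle. Whether that applies to the real counter updates depends on which locks Rails and the workaround actually take, which is an open question here.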
Updated by Chuck Schweizer almost 10 years ago
My environment is a fully updated RHEL 6 install using the foreman installer.
foreman 1.8 rc3
postgres 8.4
The foreman server is only set up to receive reports and facts from the puppet masters; it is not acting as a puppet server or external node configurator.
From what I can tell the uploading of reports and facts from the 40K nodes, through the puppet masters, is causing the deadlocks. Commenting out the logic that updates the DB in counter_cache_fix.rb made the deadlocks stop.
The only thing that was changed going from foreman 1.7.1 to 1.8 rc3 was foreman. Nothing else on the system was updated or changed. The foreman install was run after installing the 1.8 rc3 rpms.
Updated by The Foreman Bot over 9 years ago
- Status changed from Assigned to Ready For Testing
- Pull request https://github.com/theforeman/foreman/pull/2362 added
- Pull request deleted ()
Updated by Daniel Lobato Garcia over 9 years ago
Hi Chuck,
Tomer has prepared a proposed fix for this issue - https://github.com/theforeman/foreman/pull/2362
Could you report if it works for your case?
Thanks!
Updated by Chuck Schweizer over 9 years ago
After reducing the number of puppet masters in my environment I have been unable to reproduce the issue.
Daniel Lobato Garcia wrote:
Hi Chuck,
Tomer has prepared a proposed fix for this issue - https://github.com/theforeman/foreman/pull/2362
Could you report if it works for your case?
Thanks!
Updated by Dominic Cleal over 9 years ago
- Status changed from Ready For Testing to New
- Assignee deleted (Tomer Brisker)
If anybody reproduces this, we'll retry the patch.
Updated by Tomer Brisker over 9 years ago
- Has duplicate Bug #11232: Occassional error in tasks when importing facts from foreman-chef added
Updated by The Foreman Bot over 9 years ago
- Status changed from New to Ready For Testing
Updated by Marek Hulán over 9 years ago
- Assignee set to Tomer Brisker
- Release set to 72
Updated by Anonymous over 9 years ago
- Status changed from Ready For Testing to Closed
- % Done changed from 0 to 100
Applied in changeset 7fad1fa0e253e793511df1cde24d8b1885d640c4.
Updated by Tomer Brisker over 9 years ago
- Has duplicate Bug #5990: multiple calls to create or update domain throws deadlock error added
Updated by Tomer Brisker over 9 years ago
- Related to Bug #12241: Counter cache update didn't pick up changes from after_commit callback added
Updated by Tomer Brisker over 8 years ago
- Related to Bug #7246: Remove counter workaround for #5692 on upgrade to rails 4.x added