Bug #31540


Command exceeded timeout while Installer executes foreman-rake db:migrate

Added by Justin Sherrill about 4 years ago. Updated about 4 years ago.

Status: Closed
Priority: High
Category: Content Views
Target version:
Difficulty:
Triaged: Yes
Fixed in Releases:
Found in Releases:

Description

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1904963

Description of problem:
During an upgrade from 6.7.5 to 6.8.2 (Snap 2), the db:migrate step in the installer fails because it doesn't finish within 5 minutes (the default timeout for a Puppet exec).

[DEBUG 2020-12-04T13:14:09 main] Exec[foreman-rake-db:migrate](provider=posix): Executing '/usr/sbin/foreman-rake db:migrate'
[DEBUG 2020-12-04T13:14:09 main] Executing with uid=foreman: '/usr/sbin/foreman-rake db:migrate'
[ERROR 2020-12-04T13:19:09 main] Command exceeded timeout
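
The timestamps in the excerpt above are exactly 300 seconds apart, which matches the 5-minute default. A minimal Python sketch of that check follows; the regular expression is an assumption about the line layout seen in this excerpt, not an installer API, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Two lines copied from the installer log excerpt above.
LOG = """\
[DEBUG 2020-12-04T13:14:09 main] Executing with uid=foreman: '/usr/sbin/foreman-rake db:migrate'
[ERROR 2020-12-04T13:19:09 main] Command exceeded timeout
"""

# Assumed layout: "[LEVEL YYYY-MM-DDTHH:MM:SS main] message"
LINE_RE = re.compile(r"^\[(\w+) (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) \w+\] (.*)$")


def parse(line):
    """Split one log line into (level, timestamp, message)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    level, stamp, message = match.groups()
    return level, datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S"), message


def elapsed_seconds(log_text):
    """Seconds between the first and the last entry in log_text."""
    entries = [parse(line) for line in log_text.splitlines() if line.strip()]
    return (entries[-1][1] - entries[0][1]).total_seconds()


if __name__ == "__main__":
    print(elapsed_seconds(LOG))  # 300.0
```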

One interesting detail is that Puppet does NOT kill the process, so db:migrate keeps running in the background, and attempts to re-run the installer result in other, confusing errors (ActiveRecord::ConcurrentMigrationError, Foreman::PermissionMissingException; the latter seems to happen on the "unless" command).
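
For contrast, here is a minimal Python sketch of a timeout that also terminates the child when it expires. It only illustrates the mechanism on a POSIX system and is not how Puppet implements exec timeouts; `run_with_timeout` is a hypothetical helper. Whether terminating a migration mid-run is desirable is a separate question.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it was killed on timeout."""
    # start_new_session=True gives the child its own session and process
    # group (POSIX only), so the group id equals the child's pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Signal the whole group so helper processes do not linger
        # in the background after the timeout.
        os.killpg(proc.pid, signal.SIGTERM)
        proc.wait()  # reap the child so no zombie is left behind
        return None


if __name__ == "__main__":
    slow = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(slow, timeout_s=0.5))  # None
```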

The migration run took roughly 6 hours on my setup, and I could finish the upgrade just fine after that.

Looking at the PostgreSQL activity during the migration, I've seen the following query over and over again:

postgres=# select backend_start,xact_start,query_start,query from pg_stat_activity where state='active';

2020-12-04 13:15:07.457939-05 | 2020-12-04 13:15:08.027364-05 | 2020-12-04 14:07:56.01994-05  | SELECT COUNT(*) FROM "katello_rpms" WHERE "katello_rpms"."id" IN (SELECT "katello_repository_rpms"."rpm_id" FROM "katello_repository_rpms" WHERE "katello_repository_rpms"."repository_id" IN (SELECT "katello_repositories"."id" FROM "katello_repositories" WHERE "katello_repositories"."content_view_version_id" = $1 AND (environment_id is NULL)))
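
Seeing the same parameterized query repeatedly suggests one COUNT per content view version. The Python sketch below compares that per-version query with a single grouped query, using an in-memory SQLite database as a stand-in for PostgreSQL. The table and column names come from the query above; the schema is heavily simplified, the sample rows are invented for illustration, and the grouped query is an assumption about one possible approach, not the fix that was applied in Katello.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE katello_repositories (
    id INTEGER PRIMARY KEY,
    content_view_version_id INTEGER,
    environment_id INTEGER
);
CREATE TABLE katello_rpms (id INTEGER PRIMARY KEY);
CREATE TABLE katello_repository_rpms (repository_id INTEGER, rpm_id INTEGER);

-- Illustrative rows only: repository 3 has an environment and is excluded.
INSERT INTO katello_repositories VALUES (1, 10, NULL), (2, 10, NULL), (3, 10, 5), (4, 20, NULL);
INSERT INTO katello_rpms VALUES (1), (2), (3), (4), (5);
INSERT INTO katello_repository_rpms VALUES (1, 1), (1, 2), (2, 2), (3, 3), (4, 3), (4, 4), (4, 5);
""")

# Same shape as the query seen in pg_stat_activity: one round trip per version.
PER_VERSION = """
SELECT COUNT(*) FROM katello_rpms WHERE katello_rpms.id IN (
    SELECT katello_repository_rpms.rpm_id FROM katello_repository_rpms
    WHERE katello_repository_rpms.repository_id IN (
        SELECT katello_repositories.id FROM katello_repositories
        WHERE katello_repositories.content_view_version_id = ?
          AND (environment_id IS NULL)))
"""

# One grouped query for all versions. Assumes every rpm_id refers to an
# existing katello_rpms row; versions with no RPMs are absent from the result.
ALL_VERSIONS = """
SELECT r.content_view_version_id, COUNT(DISTINCT rr.rpm_id)
FROM katello_repositories r
JOIN katello_repository_rpms rr ON rr.repository_id = r.id
WHERE r.environment_id IS NULL
  AND r.content_view_version_id IS NOT NULL
GROUP BY r.content_view_version_id
"""


def count_one(version_id):
    """RPM count for a single content view version."""
    return conn.execute(PER_VERSION, (version_id,)).fetchone()[0]


def count_all():
    """RPM counts for every content view version, as {version_id: count}."""
    return dict(conn.execute(ALL_VERSIONS).fetchall())


if __name__ == "__main__":
    print(count_one(10), count_one(20))  # 2 3
    print(count_all())                   # {10: 2, 20: 3}
```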

As the migrations ran without a terminal attached, I don't know which migration triggered that, but looking at the Katello migrations between 6.7.5 and 6.8.2, I think it could be one of these:
db/migrate/20191204214919_add_content_view_version_counts.rb (most probably!)
db/migrate/20200129172534_add_epoch_version_release_arch_to_katello_installed_packages.rb
db/migrate/20200501155054_installed_package_unique_nvrea.rb

I think there are actually two bugs here:
1. The installer should not assume db:migrate can finish in 5 minutes; this is not realistic.
2. The migration should not require 6 hours to finish.

Version-Release number of selected component (if applicable):
Satellite 6.8.2 Snap 2

How reproducible:
100%

Steps to Reproduce:
1. take a 6.7.5 dogfood server (big db, rather slow-ish storage)
2. try to upgrade to 6.8.2

Actual results:
Fails

Expected results:
Succeeds ;-)

Additional info:
This is probably related to, but not exactly the same as, #1888983.
