Bug #17905

katello:upgrade_check aborts on systems without an UUID

Added by Ivan Necas over 2 years ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: Upgrades
Target version:
Difficulty:
Triaged: Yes
Bugzilla link:
Pull request:
Fixed in Releases:
Found in Releases:

Description

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1409795

Satellite 6.1.11 (not .10 as in the BZ version, seems there is no .11?)

Description of problem:
While running "foreman-rake katello:upgrade_check" on a big customer database, rake would abort while scanning the systems:

$ foreman-rake katello:preupgrade_content_host_check --trace
  • Invoke katello:preupgrade_content_host_check (first_time)
  • Invoke environment (first_time)
  • Execute environment
    API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
  • Execute katello:preupgrade_content_host_check
    Calculating Host changes on upgrade. This may take a few minutes.
    rake aborted!
    Expect initializer to return hash if a group of attributes is defined by lazy_accessor
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/app/lib/katello/lazy_accessor.rb:177:in `run_initializer'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/app/lib/katello/lazy_accessor.rb:154:in `lazy_attribute_get'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/app/lib/katello/lazy_accessor.rb:74:in `block (2 levels) in lazy_accessor'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:48:in `block in get_systems_with_facts'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:46:in `each'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:46:in `get_systems_with_facts'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:18:in `ensure_one_system_per_hostname'
    /opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:103:in `block (2 levels) in <top (required)>'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:205:in `call'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:205:in `block in execute'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:200:in `each'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:200:in `execute'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:158:in `block in invoke_with_call_chain'
    /opt/rh/ruby193/root/usr/share/ruby/monitor.rb:211:in `mon_synchronize'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:151:in `invoke_with_call_chain'
    /opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:144:in `invoke'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:116:in `invoke_task'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `block (2 levels) in top_level'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `each'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `block in top_level'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:133:in `standard_exception_handling'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:88:in `top_level'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:66:in `block in run'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:133:in `standard_exception_handling'
    /opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:63:in `run'
    /opt/rh/ruby193/root/usr/bin/rake:32:in `<main>'
    Tasks: TOP => katello:preupgrade_content_host_check

The code in question seems to be:
systems.each do |system|
  begin
    facts = system.facts
    unless facts
      systems_to_remove.push(system)
    end
  rescue RestClient::Exception
    systems_to_remove.push(system)
  end
end

Line 48 is "facts = system.facts".

After adding a tactical "puts system.inspect" just before the system.facts line, we could identify the bad system:

#<Katello::System id: 6891, uuid: nil, name: "hostname", description: "Initial Registration Params", location: "None", environment_id: 4, created_at: "2016-11-15 12:58:01", updated_at: "2016-11-15 12:58:01", type: "Katello::System", content_view_id: 15, host_id: nil>
rake aborted!

Looking into PostgreSQL revealed that we actually had two systems with that symptom:

foreman=# select * from katello_systems where uuid is null;
  id  | uuid |   name    |         description         | location | environment_id |         created_at         |         updated_at         |      type       | content_view_id | host_id
------+------+-----------+-----------------------------+----------+----------------+----------------------------+----------------------------+-----------------+-----------------+---------
 6891 |      | hostname  | Initial Registration Params | None     |              4 | 2016-11-15 12:58:01.246012 | 2016-11-15 12:58:01.246012 | Katello::System |              15 |
 6262 |      | hostname2 | Initial Registration Params | None     |              4 | 2016-09-26 09:06:38.945969 | 2016-09-26 09:06:38.945969 | Katello::System |              16 |
(2 rows)

PostgreSQL also showed two more systems with those hostnames, this time with proper UUIDs.
It seems the initial registration of those hosts went badly and they were re-registered.

After erasing the two broken systems from the DB, the upgrade_check ran fine.

I think upgrade_check.rake needs more error handling: I would expect it to catch these bad systems and tell me about them, not choke on them.
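
As a rough illustration of the kind of error handling meant here, the loop could check for a missing UUID up front and rescue more broadly than RestClient::Exception, collecting the bad systems with a reason instead of aborting. This is only a sketch under assumptions, not the actual Katello code: FakeSystem is a stand-in for Katello::System, partition_systems is a hypothetical helper, and the system with id 7001 and its UUID are made up for the example.

```ruby
# Stand-in for Katello::System; the real model loads facts through lazy_accessor.
FakeSystem = Struct.new(:id, :uuid, :name) do
  def facts
    # Assumption: imitate the failure from the trace for records without a UUID.
    if uuid.nil?
      raise 'Expect initializer to return hash if a group of attributes is defined by lazy_accessor'
    end
    { 'network.hostname' => name }
  end
end

# Hypothetical helper: split systems into usable ones and ones to report.
def partition_systems(systems)
  usable = []
  faulty = []
  systems.each do |system|
    begin
      if system.uuid.nil?
        # Never touch facts for a record that has no UUID.
        faulty << [system, 'missing UUID']
      elsif system.facts
        usable << system
      else
        faulty << [system, 'no facts']
      end
    rescue StandardError => e
      # Deliberately broader than the RestClient::Exception rescue in the task.
      faulty << [system, e.message]
    end
  end
  [usable, faulty]
end

systems = [
  FakeSystem.new(6891, nil, 'hostname'),
  FakeSystem.new(7001, 'hypothetical-uuid', 'hostname') # made-up re-registered host
]

usable, faulty = partition_systems(systems)
faulty.each do |system, reason|
  puts "Skipping system #{system.id} (#{system.name}): #{reason}"
end
```

With this shape the task could print the list of faulty systems at the end and still finish the check for the rest. Whether rescuing StandardError is acceptable in the real task is a design decision for the maintainers.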

Version-Release number of selected component (if applicable):
Satellite 6.1.11

How reproducible:
Always, but no idea how the initial problematic host was created

Steps to Reproduce:
1. Create a Katello::System without a UUID
2. Run foreman-rake katello:upgrade_check

Actual results:
rake aborted

Expected results:
The faulty system is reported

NOTE:

This seems to be a consequence of a record whose orchestration task never finished successfully.

History

#1 Updated by Justin Sherrill over 2 years ago

  • Subject changed from katello:upgrade_check aborts on systems without an UUID to katello:upgrade_check aborts on systems without an UUID
  • Legacy Backlogs Release (now unused) set to 114
