Bug #27160
Closed
Foreman fails to find IP address for boot server if main resolver is down
Description
Foreman seems to fail to fail over correctly between the resolvers listed in /etc/resolv.conf, so host builds fail when the first listed resolver is unavailable.
The error:
Failed to set Build on host.example.com: ["failed to detect boot server: ERF50-9294 [Foreman::WrappedException]: Unable to find IP address for 'dhcp.example.com' ([Net::Error]: execution expired)", "Error connecting to 'example.com' domain DNS servers: ns2.example.com - check query_local_nameservers and dns_conflict_timeout settings"]
The backtrace:
failed to detect boot server: ERF50-9294 [Foreman::WrappedException]: Unable to find IP address for 'dhcp.example.com' ([Net::Error]: execution expired)
Foreman::WrappedException: ERF50-9294 [Foreman::WrappedException]: Unable to find IP address for 'dhcp.example.com' ([Net::Error]: execution expired)
/usr/share/foreman/app/services/nic_ip_resolver.rb:24:in `rescue in to_ip_address'
/usr/share/foreman/app/services/nic_ip_resolver.rb:15:in `to_ip_address'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:91:in `boot_server'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:126:in `dhcp_attrs'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:103:in `build_dhcp_record'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:30:in `block in dhcp_records'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:29:in `map'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:29:in `dhcp_records'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:202:in `dhcp_conflict_detected?'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:426:in `block in make_lambda'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:236:in `block in halting_and_conditional'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:517:in `block in invoke_after'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:517:in `each'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:517:in `invoke_after'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:133:in `run_callbacks'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:816:in `_run_validation_callbacks'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activemodel-5.2.1/lib/active_model/validations/callbacks.rb:118:in `run_validations!'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activemodel-5.2.1/lib/active_model/validations.rb:339:in `valid?'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activerecord-5.2.1/lib/active_record/validations.rb:67:in `valid?'
/usr/share/foreman/app/models/concerns/orchestration.rb:86:in `valid?'
/usr/share/foreman/app/models/host/managed.rb:935:in `trigger_nic_orchestration'
...
Updated by Lukas Zapletal over 5 years ago
- Category set to DNS
- Status changed from New to Assigned
- Assignee set to Lukas Zapletal
- Triaged changed from No to Yes
Hello, we use the Ruby Timeout class in our codebase, which is not a good approach. The Ruby DNS resolver does read all "nameserver" entries and tries them one after another. However, its default timeouts are high (5, 10, 20, 40 seconds), so it probably does not get a reply fast enough and the global timeout (in our codebase) fires first.
The solution is to remove the extra timeout we have there and rely only on the Ruby DNS resolver timeout. No idea why it wasn't done this way from day one. Try the patch and let me know if it solves the problem.
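To illustrate the interaction described above, here is a minimal Ruby sketch. It is not the actual patch. It assumes the resolver waits the first entry of its retry schedule on each nameserver in turn before moving to the next one. The helper names, the 3-second outer timeout used in the note below, and the 192.0.2.x addresses are all hypothetical.

```ruby
require "resolv"

# Retry schedule quoted in the comment above (seconds per round).
DEFAULT_SCHEDULE = [5, 10, 20, 40].freeze

# Seconds spent waiting on dead resolvers before the first working one
# (at 0-based position first_working_index) is queried in round one.
# Assumption: each dead resolver costs one full first-round timeout.
def seconds_before_failover(first_working_index, schedule = DEFAULT_SCHEDULE)
  first_working_index * schedule.first
end

# True when an outer timeout of global_timeout seconds would fire before
# the resolver ever reaches a working nameserver.
def outer_timeout_preempts_failover?(global_timeout, first_working_index,
                                     schedule = DEFAULT_SCHEDULE)
  global_timeout <= seconds_before_failover(first_working_index, schedule)
end

# Hypothetical helper: build a resolver that carries its own retry schedule,
# so no Timeout.timeout wrapper is needed around the lookup.
def build_resolver(nameservers, timeouts)
  dns = Resolv::DNS.new(nameserver: nameservers)
  dns.timeouts = timeouts # Resolv::DNS#timeouts= accepts a number or an array
  dns
end

# Example lookup (needs network access, so it is left commented out):
# build_resolver(["192.0.2.1", "192.0.2.2"], [2, 4]).getaddress("dhcp.example.com")
```

With one dead resolver listed first, `outer_timeout_preempts_failover?(3, 1)` is true: the outer timeout expires during the 5-second wait on the first nameserver, so the second one is never tried. Dropping the wrapper and giving the resolver a shorter schedule of its own avoids that race.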
Updated by The Foreman Bot over 5 years ago
- Status changed from Assigned to Ready For Testing
- Pull request https://github.com/theforeman/foreman/pull/6861 added
Updated by Anonymous over 5 years ago
- Status changed from Ready For Testing to Closed
Applied in changeset e75700fd8d376933db02197c03323d03bf77a627.
Updated by The Foreman Bot over 5 years ago
- Pull request https://github.com/theforeman/foreman/pull/6954 added
Updated by Amir Fefer over 5 years ago
- Related to Bug #27585: default dns timeout value is nil added
Updated by Amir Fefer over 5 years ago
I think it's very non-trivial for a user (like me) to understand how to use this setting. Also, this looks a lot like implementation details leaking into the settings page. Should we find a better way to represent it?