Foreman fails to find IP address for boot server if main resolver is down
Foreman does not seem to fail over correctly between the resolvers listed in /etc/resolv.conf, so host builds fail when the first listed resolver is unavailable.
Failed to set Build on host.example.com: ["failed to detect boot server: ERF50-9294 [Foreman::WrappedException]: Unable to find IP address for 'dhcp.example.com' ([Net::Error]: execution expired)", "Error connecting to 'example.com' domain DNS servers: ns2.example.com - check query_local_nameservers and dns_conflict_timeout settings"]
failed to detect boot server: ERF50-9294 [Foreman::WrappedException]: Unable to find IP address for 'dhcp.example.com' ([Net::Error]: execution expired)
Foreman::WrappedException: ERF50-9294 [Foreman::WrappedException]: Unable to find IP address for 'dhcp.example.com' ([Net::Error]: execution expired)
/usr/share/foreman/app/services/nic_ip_resolver.rb:24:in `rescue in to_ip_address'
/usr/share/foreman/app/services/nic_ip_resolver.rb:15:in `to_ip_address'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:91:in `boot_server'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:126:in `dhcp_attrs'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:103:in `build_dhcp_record'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:30:in `block in dhcp_records'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:29:in `map'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:29:in `dhcp_records'
/usr/share/foreman/app/models/concerns/orchestration/dhcp.rb:202:in `dhcp_conflict_detected?'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:426:in `block in make_lambda'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:236:in `block in halting_and_conditional'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:517:in `block in invoke_after'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:517:in `each'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:517:in `invoke_after'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:133:in `run_callbacks'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:816:in `_run_validation_callbacks'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activemodel-5.2.1/lib/active_model/validations/callbacks.rb:118:in `run_validations!'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activemodel-5.2.1/lib/active_model/validations.rb:339:in `valid?'
/usr/share/foreman/vendor/ruby/2.3.0/gems/activerecord-5.2.1/lib/active_record/validations.rb:67:in `valid?'
/usr/share/foreman/app/models/concerns/orchestration.rb:86:in `valid?'
/usr/share/foreman/app/models/host/managed.rb:935:in `trigger_nic_orchestration'
...
Updated by Lukas Zapletal over 4 years ago
- Category set to DNS
- Status changed from New to Assigned
- Assignee set to Lukas Zapletal
- Triaged changed from No to Yes
Hello, we use Ruby's Timeout class in our codebase, which is not a good approach. The Ruby DNS resolver does read all "nameserver" entries and tries them one after another. However, its default timeouts are high (5, 10, 20, 40 seconds), so it probably does not get a reply fast enough, and the global timeout in our codebase fires first.
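The interaction described above can be reproduced without any DNS traffic. The following is only a sketch: `FakeSequentialResolver` is a hypothetical stand-in for a resolver that tries name servers in order, and the timings are scaled down to fractions of a second; none of it is Foreman code.

```ruby
require 'timeout'

# Hypothetical stand-in for a resolver that tries name servers in order.
# A server that is down consumes the whole per-attempt timeout before the
# next server is tried.
class FakeSequentialResolver
  def initialize(servers, per_attempt_timeout)
    @servers = servers
    @per_attempt_timeout = per_attempt_timeout
  end

  def getaddress(_hostname)
    @servers.each do |server|
      return '192.0.2.10' if server[:up] # documentation-range address

      sleep @per_attempt_timeout         # wait for an answer that never arrives
    end
    raise 'no name server answered'
  end
end

servers  = [{ name: 'ns1', up: false }, { name: 'ns2', up: true }]
resolver = FakeSequentialResolver.new(servers, 0.3)

# Outer timeout shorter than a single attempt: the second server is never tried.
wrapped_result =
  begin
    Timeout.timeout(0.1) { resolver.getaddress('dhcp.example.com') }
  rescue Timeout::Error
    :timed_out
  end

# No outer timeout: the lookup fails over to the second server.
plain_result = resolver.getaddress('dhcp.example.com')
```

In this sketch `wrapped_result` ends up as `:timed_out` while `plain_result` is the address returned by the second server, which mirrors the "execution expired" error in the report.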
The solution is to remove the extra timeout we have there and rely only on the Ruby DNS resolver's own timeout. No idea why it wasn't done this way from day one. Try the patch and let me know whether it solves the problem.
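For illustration, relying on the resolver's own timeouts could look roughly like the sketch below. This is not the actual patch: `build_resolver`, the `to_ip_address` helper shown here, and the timeout values are assumptions made for the example. Only `Resolv::DNS.new`, `Resolv::DNS#timeouts=`, `Resolv::DNS#getaddress` and `Resolv::ResolvError` come from Ruby's standard library.

```ruby
require 'resolv'

# Build a resolver from the system configuration with shorter per-attempt
# timeouts. The values are illustrative only, not Foreman's settings.
def build_resolver(timeouts = [2, 4])
  dns = Resolv::DNS.new     # uses /etc/resolv.conf by default
  dns.timeouts = timeouts   # seconds to wait per attempt
  dns
end

# Hypothetical helper: no outer Timeout.timeout block, so the resolver is
# free to move on to the next "nameserver" entry by itself.
def to_ip_address(hostname, resolver = build_resolver)
  resolver.getaddress(hostname).to_s
rescue Resolv::ResolvError => e
  # Raised when none of the configured servers produced an answer.
  raise "Unable to find IP address for '#{hostname}' (#{e.message})"
end
```

Calling `to_ip_address('dhcp.example.com')` would then wait at most as long as the resolver's own retry schedule allows, instead of being cut off by a separate global timer.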