Project

General

Profile

Bug #23523

Infoblox DHCP gives unreliable free IPs

Added by Sébastien Bernard over 2 years ago. Updated 6 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
DHCP plugin
Difficulty:
Triaged:
Yes
Bugzilla link:
Pull request:
Fixed in Releases:
Found in Releases:

Description

Since 1.17 upgrade, foreman dhcp proxy changed the way it gives free adresses.
We're seeing now random adresses from setup ranges being given.
However, we had randoms errors when creating hosts saying that the adresses used for the creation were not available.

What I'm seeing now is that foreman_proxy uses now the Proxy::DHCP::FreeIps::find_free_ips to get the new ips.
I'm not really sure this use the infoblox URL to get a new IPs from the infoblox reservation.

Instead it tries to generate a new random address from the <from -> to> range and tries to ping it with tcp_pingable? function or icmp_pingable? function.
However, the logic behind seems to be a little flawed.
Every error returned by the TCPSocket.new will be considered as adddresse available.
It assumes also that the host should be pingable from the foreman host.

I don't really know which module should be corrected, the smart_proxy or the infoblox_dhcp_plugin.
So I'm opening this issue on the main component.

smart-proxy-patch-dhcp-free-ips.diff smart-proxy-patch-dhcp-free-ips.diff 1.21 KB Sébastien Bernard, 05/17/2018 10:57 PM

Related issues

Related to Smart Proxy - Refactor #22496: Suggest new IP does not workClosed2018-02-02
Related to Smart Proxy - Bug #20173: Concurrent calls to Subnet#unused_ip may return the same ip addressClosed2017-06-30
Related to Smart Proxy - Feature #23406: Foreman should pre allocate ip addresses in IPAM to prevent conflictsDuplicate2018-04-26

History

#1 Updated by Dmitri Dolguikh over 2 years ago

Could you confirm that hosts in the ip address range in question are reachable (no firewall interference) via tcp pings (port 7) and/or icmp pings? Reachability via pings is required for correct operation of free_ip functionality.

#2 Updated by Adam Winberg over 2 years ago

Hm, relying on icmp to decide if an ip address is available for use or not seems like a bad idea to me. If a host is down, it wont work. If an ip address is reserved for future use, it wont work. If icmp is not allowed from the dhcp proxy to the subnet, it wont work. I just see a lot of cases where this would fail, which seriously disrupts automation pipelines. Any link to the discussion where this decision was made? I don't see anything in the release notes, and this seems like a pretty big change that would affect a lot of customers.

We use InfoBlox which has worked very smoothly in this aspect, by Foreman using infoblox api methods to get next available ip address.

#3 Updated by Dmitri Dolguikh over 2 years ago

  • Tracker changed from Bug to Support

relying on icmp to decide if an ip address is available for use or not seems like a bad idea to me

This is a pretty common approach. Both dhcpd and MS dhcp servers by default use icmp pings to determine address availability (although it can be switched off).

> Any link to the discussion where this decision was made?

There was no discussion about this: the functionality was always present in the isc_dhcp provider and is now reused in other providers.

We cannot rely on built-in "free address" functionality (which is what was used before) as it fails in cases when more than one request for an ip address is made in parallel/rapid succession -- it always returns the same address which breaks provisioning workflows. In cases where address pools mix addresses managed by Foreman with addresses managed manually or via other systems, there is no other way for Foreman to determine address availability other than by using icmp/tcp pings, which is the reason we rely on them. Ideally, Foreman should have an address pool solely dedicated to it.

#4 Updated by Adam Winberg over 2 years ago

There was no discussion about this: the functionality was always present in the isc_dhcp provider and is now reused in other providers.

I would have thought that to be because there was a lack of a better alternative for that provider.

We cannot rely on built-in "free address" functionality (which is what was used before) as it fails in cases when more than one request for an ip address is made in parallel/rapid succession -- it always returns the same address which breaks provisioning workflows.

I don't see how thats worse/different from the 'new' approach which can and will fail under certain, not uncommon, circumstances (as outlined in previous posts) as well.

In cases where address pools mix addresses managed by Foreman with addresses managed manually or via other systems, there is no other way for Foreman to determine address availability other than by using icmp/tcp pings, which is the reason we rely on them.

Well, you cant rely on icmp/tcp pings either, can you? It's proven by this issue! The information about address availability is in InfoBlox, so the natural thing I think would be to just ask InfoBlox to do the work for you (as it worked pre 1.17) or at the very least provide this as an option and let the user decide which method to use.

Ideally, Foreman should have an address pool solely dedicated to it.

Thats an unreasonable requirement to make in many larger environments.

As it is for us, I haven't even set up from/to ip addresses in my subnets in Foreman. I dont have to, since that stuff is already configured in InfoBlox. With this change all that has to be configured twice - once in InfoBlox when setting up a subnet there, and then once again in Foreman. If any changes are made in InfoBlox, they have to be made in Foreman as well. And InfoBlox admins and Foreman admins are often not the same team, so I see a lot of risk for misaligned configuration between the two.

#5 Updated by Adam Winberg over 2 years ago

An option if you want to maintain some internal logic in Foreman getting final decision which ip to use might be to just ask InfoBlox for all ip addresses in the subnet:

/ipv4address?network=10.0.0.0/24

This returns all ip addresses in the subnet with information about reserved ranges, ip status (used/unused), mac address and so forth. So you could pick an unused ip from that output (which would essentially replace the random selection being made today) and then reserve it internally in Foreman so it doesn't get picked again. That would both prevent errors when creating multiple hosts simultaneously as well as prevent that Foreman suggests 'bad'/unusable ip addresses.

#6 Updated by Adam Winberg over 2 years ago

I understand the theory about problems 'when more than one request for an ip address is made in parallel/rapid succession', but are there any actual bug reports regarding this? I would be interested in reading those.

#7 Updated by Dmitri Dolguikh over 2 years ago

#8 Updated by Dmitri Dolguikh over 2 years ago

  • Related to Bug #20173: Concurrent calls to Subnet#unused_ip may return the same ip address added

#9 Updated by Dmitri Dolguikh over 2 years ago

  • Related to Feature #23406: Foreman should pre allocate ip addresses in IPAM to prevent conflicts added

#10 Updated by Dmitri Dolguikh over 2 years ago

Please see "related issues" for the history of the issue being discussed. Additional discussion can be found in PRs associated with the tickets.

#11 Updated by Sébastien Bernard over 2 years ago

Just a few comments to put back the issue in context.
  1. - I find the new way of assigning the ips a step forward in the good direction. I've been dealing with application that may call host creation in automated way through the api, and had prevent this application from calling the host creation in parallel. Now this problem is gone.
  2. - The problem it solved is real.

That's all the good points.

Now on the negative side,
  1. - setup deployment in production ceased to work, that's called a regression.
  2. - there is zero mention of this change in the release notes

If it was my first time dhcp setup and I couldn't figure how to make it work, I'll be ok with your classification as Support.
Now upgrade to 1.17 makes the host creation totaly unreliable. Host could fails to be created with address conflict on a random basis.
Worse than that the less free address remaining, the more the host creation trough foreman will fails.
As this modifications breaks perfectly working setup, this should be marked as a bug and treated as such.

Now, if you don't mind, I would like to go forward and find a way to improve the reliability of the ip reservation while keeping the problem it was supposed to fix.

IMHO, the Proxy::DHCP::FreeIps::find_free_ips method tries hard to get different address each time it's called and not gave it back in following calls.
However, the ip it returns is sometimes not free. Why ? Because it doesn't check at the end the dhcp providers.
The ip foreman gets back will be used to create a DNS or DHCP Lease record further in the workflow. If IP is not checked against the provider, sooner or later the problem will arise with conflicts raise by one provider.

I may propose two solutions :
  1. - synchronize the @allocated_ips with the DHCP_Provider.
    As mentioned, InfoBlox have some way to return the entire block of unused IP to use as a base for random ip selection.
    isc_dhcp may have an equivalent with dhcp-leases. The main benefit is you call it once before the loop and then you're sure about the free ips in the range. Drawback, it adds some duration calling the provider.
  2. - add an extra check or add parameters to check the ip against differents methods of validation :
    1. - ping ip
    2. - tcp port 7
    3. - check if dns reverse give a FQDN or nothing
    4. - call a script to check if the IP is free
    5. - call a class to check if the IP is free, each provider may override the generic version with a more appropriate version

I have attached the modification I made in production to fix the ip allocation. Since I'm using the dhcp Infoblox to create DNS entries, I'm just checking if reverse address is resolvable or not through the DNS.

My .02 euros contribution.

#12 Updated by Lukas Zapletal 6 months ago

  • Triaged changed from No to Yes
  • Category set to DHCP plugin
  • Subject changed from infoblox DHCP gives unreliable free ips. to Infoblox DHCP gives unreliable free IPs
  • Project changed from Smart Proxy to Infoblox
  • Tracker changed from Support to Bug

Hello guys,

I am trying to understand the problem here. So correct me if I am wrong.

Previously, unused_ip call made a request to Infoblox to find a free IP. This was refactored to use our common DHCP module code that handles this on its own via #22622. This caused regression for deployments where those free ranges are defined in Infoblox and not in Foreman. However I struggle to understand, why this is a problem.

The code in common DHCP module fetches all IPs from a subnet and then tries to ping the next available IP address. The "pinging" can now be optionally turned off as some customers complained about it. If we ignore the pinging (which is a bit weird to be honest I agree with that), the IP address should be free because Infoblox does not have any knowledge about it. So why it fails? Why you see a conflict? I must be missing something, I assume that the range is set properly in Foreman Subnet tho.

I would like to resolve this, the DNS check is a nice addition we chould possibly implement as an "alternative way" of checking free IPs, but this should be fixed properly.

FYI we are finishing External IPAM implementation in Foreman core which will allow to create smart-proxy "external IPAM" plugin providing IPAM capabilities when a host is being created. This is IMHO the correct way of doing this, writing such plugin should be quite easy. See https://github.com/theforeman/foreman/pull/7113 for more details and the API.

Also available in: Atom PDF