Bug #19623

Changes to vmware vm gives 'Could not find network X on VMWare compute resource'

Added by Adam Winberg 8 months ago. Updated about 1 month ago.

Status:Closed
Priority:Normal
Assigned To:-
Category:Compute resources - VMware
Target version:-
Difficulty:
Bugzilla link:
Found in release:1.15.0
Pull request:https://github.com/theforeman/foreman/pull/4611, https://github.com/theforeman/foreman/pull/4689, https://github.com/theforeman/foreman/pull/4721
Story points:-
Velocity based estimate:-
Release:1.16.0
Release relationship:Auto

Description

I have created a vm on our vsphere cluster from Foreman without problems, everything looks fine. But if I try to update the host in Foreman (any update, parameter change, class change, etc) I get the following error:

ERF42-6324 [Foreman::Exception]: Could not find network dvportgroup-56 on VMWare compute resource

The problem is that the portgroup for the vm is not 'dvportgroup-56', and the correct portgroup is displayed on the Interfaces tab when I edit the host. Digging a bit further in our vmware environment, it seems that we have several portgroups that have been migrated from 'Windows vCenter' to 'vCSA'. These portgroups have different values for the 'Key' and 'Uid' attributes: the 'Uid' value matches the one that Foreman displays on the Interfaces tab, but the 'Key' value matches the one in the error message. If we create a new portgroup these two attributes have the same value, so this may be a fringe case where under certain circumstances the two attributes in vmware are not necessarily the same.

It seems that Foreman is looking at different attributes at different times, which may be a bug. The same attribute (I believe the Key value is best suited) should be used throughout the code to identify the network in vmware. Unfortunately I haven't been able to track down where in the code the different attributes are used; it might be in the fog-vsphere gem.

Associated revisions

Revision cccc26e7
Added by wiad 7 months ago

Fixes #19623 - add comparison to portkey for vmware networks

Revision 23abc7dc
Added by Klaas Demter 6 months ago

refs #19623 - fix if key method is not present for network

credit to jsherrill for the one line solution :)

History

#1 Updated by Adam Winberg 8 months ago

Apparently there is a third portgroup attribute in vmware that may be of relevance - 'Id'. This matches the value displayed on the Interfaces tab of the host in Foreman. So for this host the output for the portgroup in vmware is as follows:

Name          : VLAN300
Key           : dvportgroup-56
Notes         : R-TestDev
NumPorts      : 512
Datacenter    : NetApp-datacenter
PortBinding   : Static
ExtensionData : VMware.Vim.DistributedVirtualPortgroup
Id            : DistributedVirtualPortgroup-dvportgroup-491
Uid           : /VIServer=co\user@vcenter:443/DistributedPortgroup=DistributedVirtualPortgroup-dvportgroup-491/
Client        : VMware.VimAutomation.ViCore.Impl.V1.VimClient

The Interfaces tab shows that the host is connected to the portgroup "dvportgroup-491"

#2 Updated by Adam Winberg 8 months ago

Ok, talked to VMware and according to them it is perfectly normal that a portgroup has different values in these attributes under certain circumstances (for example after migrating from another vcenter) and that it shouldn't cause any problems. Unfortunately it causes problems in Foreman, but as long as Foreman consistently uses the 'Key' attribute in vmware to identify the portgroup it should be fine.

#3 Updated by Adam Winberg 8 months ago

After doing some crude debugging in the 'select_nic' function in app/models/concerns/fog_extensions/vsphere/server.rb I may be able to clarify the problem somewhat:

The select_nic function uses the 'nic_attrs' array, which in my case contains the following:

{"network"=>"dvportgroup-491", "type"=>"VirtualVmxnet3"}

It also uses the 'vm_network' object which has these attributes (among others):

vm_network._ref = dvportgroup-491
vm_network.name = VLAN300
vm_network.key  = dvportgroup-56

Here it is clear that there are two different portgroup IDs (56 and 491) for the same portgroup. The correct one to use should be the one that is used as key for the portgroup itself, in this case 'dvportgroup-56'. I haven't been able to track down the origin of the values in the 'nic_attrs' (nic.compute_attributes) array, but in my mind the 'network' key in this array should be 'dvportgroup-56', not 'dvportgroup-491'.
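The mismatch described above can be reproduced in isolation. The following is only a minimal sketch: `PortgroupInfo` and `find_by` are hypothetical names invented for illustration, not Foreman or fog-vsphere code, and the values are the ones quoted in this comment.

```ruby
# Hypothetical stand-in for a vSphere network object; only the three
# attributes quoted in this comment are modelled.
PortgroupInfo = Struct.new(:_ref, :name, :key)

vlan300 = PortgroupInfo.new("dvportgroup-491", "VLAN300", "dvportgroup-56")
all_networks = [vlan300]

# Look up a network by a single attribute (helper invented for this sketch).
def find_by(networks, attribute, value)
  networks.detect { |network| network.public_send(attribute) == value }
end

found_by_ref = find_by(all_networks, :_ref, "dvportgroup-491")
found_by_key = find_by(all_networks, :key, "dvportgroup-56")
missed       = find_by(all_networks, :_ref, "dvportgroup-56")

# Both identifiers reach the same object, but only through "their" attribute.
puts found_by_ref.equal?(found_by_key)
# Comparing the key value against _ref finds nothing.
puts missed.inspect
```

A lookup that compares only one attribute fails as soon as it is handed the value of the other one, which is the kind of failure the "Could not find network" error points to.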

#4 Updated by Adam Winberg 8 months ago

After a lot of debugging and digging I have finally found a way to make this work the way I need it to. Basically I want my networks in vsphere to be identified by 'portgroupkey'. As it is now, different attributes for the network in vsphere are used under different circumstances, which normally works fine since the attributes hold the same value, but as I have shown in previous comments this is not always true.

So, when Foreman queries vsphere for all networks the '_ref' attribute for the network is used as an id. But when Foreman creates a NIC and the network is a 'DistributedVirtualPortgroup' then Foreman uses the 'key' attribute of the network, which is required for the DistributedVirtualSwitchPortConnection function (used in the 'create_nic_backing' function in create_vm.rb from tfm-rubygem-fog-vsphere). As I have said many times, these two attributes normally hold the same value, but in my case they differ for a lot of networks, which causes problems when trying to update or delete a host in Foreman.

It's all very confusing to me, which is why I'm doing such a bad job of explaining it; it's hard to keep track of everything. But here's what I've done to make it work (it probably is not the 'right' way to do it, but it might make it clearer what I'm trying to achieve and help someone from the dev team to create a more permanent fix):

In fog-vsphere/lib/fog/vsphere/requests/compute/list_networks.rb:
Adjusted the 'network_attributes' def so it uses the portgroupkey as id if the network is a distributedvirtualportgroup:

26a27,32
>           raw_network = get_raw_network(network.name, datacenter)
>           if raw_network.kind_of? RbVmomi::VIM::DistributedVirtualPortgroup
>             id = raw_network.key
>           else
>             id = managed_obj_id(network)
>           end
28c34
<             :id            => managed_obj_id(network),
---
>             :id            => id,

In app/models/concerns/fog_extensions/vsphere/server.rb:
Added this line in the 'select_nic' def to make it compare against the portgroupkey attribute as well:

42a43
>         vm_network ||= all_networks.detect { |network| network.key == nic_attrs['network'] }

This works for me. Hopefully someone sees this and understands what the hell I'm talking about.
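The id selection from the list_networks.rb change above can be sketched without a vCenter connection. This is an illustration only: `SketchNetwork`, `SketchDistributedPortgroup` and `network_id` are hypothetical stand-ins for `RbVmomi::VIM::Network`, `RbVmomi::VIM::DistributedVirtualPortgroup` and the patched part of `network_attributes`, and "network-12" / "VM Network" are made-up values for a non-distributed network.

```ruby
# Stand-ins for the RbVmomi network classes, so the sketch runs offline.
class SketchNetwork
  attr_reader :ref, :name

  def initialize(ref:, name:)
    @ref = ref
    @name = name
  end
end

class SketchDistributedPortgroup < SketchNetwork
  attr_reader :key

  def initialize(ref:, name:, key:)
    super(ref: ref, name: name)
    @key = key
  end
end

# Same branch as in the patch: use the portgroup key for distributed
# portgroups and the managed object reference for everything else.
def network_id(raw_network)
  if raw_network.kind_of?(SketchDistributedPortgroup)
    raw_network.key
  else
    raw_network.ref
  end
end

migrated = SketchDistributedPortgroup.new(ref: "dvportgroup-491", name: "VLAN300", key: "dvportgroup-56")
standard = SketchNetwork.new(ref: "network-12", name: "VM Network")

puts network_id(migrated)
puts network_id(standard)
```

With this selection the migrated portgroup is listed under its key, so the id shown for the network and the id used when creating the NIC backing would agree.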

#5 Updated by Marek Hulán 7 months ago

I'm not a VMware expert, but it sounds reasonable. Would you mind opening a PR with the change? The suggested patch also seems good to me. If you create a PR, chances are that more experienced VMware guys will review it. If you want to send a patch but don't know how, here's help: https://www.theforeman.org/contribute.html#SubmitPatches

#6 Updated by Nathan Ward 7 months ago

This is a good catch Adam. I'm not sure, but I think we're being impacted by this as well - new hosts we create are being put on to incorrect dvportgroups and they show up as "dvportgroup-XX" rather than their friendly name.

I've submitted an issue to the Fog project here: https://github.com/fog/fog-vsphere/issues/92

#7 Updated by The Foreman Bot 7 months ago

  • Status changed from New to Ready For Testing
  • Pull request https://github.com/theforeman/foreman/pull/4611 added

#8 Updated by Adam Winberg 7 months ago

Created PR for the change in Foreman code:
https://github.com/theforeman/foreman/pull/4611

#9 Updated by Daniel Lobato Garcia 7 months ago

  • Release set to 1.16.0

#10 Updated by Anonymous 7 months ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100

#11 Updated by Klaas D 6 months ago

Hey,
this would need a version bump for fog-vsphere to 1.11.1 as well, right?

Greetings
Klaas

#12 Updated by Daniel Lobato Garcia 6 months ago

Nope, https://github.com/theforeman/foreman/pull/4611 fixed it without needing to update the fog-vsphere version

#13 Updated by Adam Winberg 6 months ago

As far as I can tell, you need the fix on the fog-vsphere side as well, which is put into 1.11.1:
https://github.com/fog/fog-vsphere/pull/94

#14 Updated by Klaas D 6 months ago

Just changing the line from https://github.com/theforeman/foreman/pull/4611 doesn't help with 1.15 here. Does this patch have prerequisites in 1.16 that I need to backport as well?

#15 Updated by Klaas D 6 months ago

#16 Updated by Klaas D 6 months ago

I have another problem with this fix: I seem to have networks that are not of the type DistributedVirtualPortgroup, and they don't have a key. This produces an error message like: "undefined method `key' for #<RbVmomi::VIM::Network:0x0000000eb3fa48>"
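The failure with mixed network types can be shown in a small sketch. The two structs below are hypothetical stand-ins for a plain network and a distributed portgroup, "network-12" / "VM Network" are made-up values, and the `respond_to?` guard is only one possible way to avoid the error; it is not necessarily what the follow-up pull requests do.

```ruby
# Hypothetical stand-ins: a plain network has no key method at all.
PlainNetworkSketch = Struct.new(:_ref, :name)
PortgroupSketch    = Struct.new(:_ref, :name, :key)

networks = [
  PlainNetworkSketch.new("network-12", "VM Network"),
  PortgroupSketch.new("dvportgroup-491", "VLAN300", "dvportgroup-56")
]
wanted = "dvportgroup-56"

# Unguarded comparison: calling key on the plain network raises NoMethodError.
unguarded_error =
  begin
    networks.detect { |network| network.key == wanted }
    nil
  rescue NoMethodError => e
    e
  end

# Guarded comparison: networks without a key method are skipped.
guarded = networks.detect do |network|
  network.respond_to?(:key) && network.key == wanted
end

puts unguarded_error.class
puts guarded.name
```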

#17 Updated by The Foreman Bot 6 months ago

  • Pull request https://github.com/theforeman/foreman/pull/4689 added

#18 Updated by The Foreman Bot 6 months ago

  • Pull request https://github.com/theforeman/foreman/pull/4721 added

#19 Updated by Adam Winberg about 1 month ago

This is closed and you reference it in the release notes for 1.16 as a fixed bug, but you do not ship a sufficiently recent fog-vsphere rpm to make it work. The rpm you have is

tfm-rubygem-fog-vsphere-1.9.2-1.el7.noarch

And as stated several times in this bug, you need 1.11 to make it work. So I would like this bug to be reopened because it won't work without the newer fog-vsphere version.
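The version gap is easy to misread, because "1.9.2" sorts after "1.11.1" when the two are compared as plain strings. A minimal check with `Gem::Version`, assuming 1.11.1 (the version named in comment #13) as the minimum:

```ruby
require "rubygems"

shipped  = Gem::Version.new("1.9.2")   # version of the shipped rpm
required = Gem::Version.new("1.11.1")  # assumed minimum, taken from comment #13

# Segment-wise comparison: 9 is less than 11, so the shipped gem is older.
shipped_is_older = shipped < required

# Plain string comparison goes character by character and gets it wrong.
string_says_older = "1.9.2" < "1.11.1"

puts shipped_is_older
puts string_says_older
```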
