Bug #19623
closed
Changes to vmware vm gives 'Could not find network X on VMWare compute resource'
Description
I have created a vm on our vsphere cluster from Foreman without problems, everything looks fine. But if I try to update the host in Foreman (any update, parameter change, class change, etc) I get the following error:
ERF42-6324 [Foreman::Exception]: Could not find network dvportgroup-56 on VMWare compute resource
The problem is that the portgroup for the vm is not 'dvportgroup-56', and the correct portgroup is displayed on the Interfaces tab when I edit the host. Digging a bit further in our vmware environment, it seems that we have several portgroups that have been migrated from 'Windows vCenter' to 'vCSA'. These portgroups have different values in the 'Key' and 'Uid' attributes: the 'Uid' value matches the one that Foreman displays on the Interfaces tab, but the 'Key' value matches the one in the error message. If we create a new portgroup, these two attributes have the same value, so this may be a fringe case where under certain circumstances the two attributes in vmware are not necessarily the same.
It seems that Foreman is looking at different attributes at different times, which may be a bug. The same attribute (I believe the Key value is best suited) should be used throughout the code to identify the network in vmware. Unfortunately I haven't been able to track down where in the code the different attributes are used; it might be in the fog-vsphere gem.
Updated by Adam Winberg over 7 years ago
Apparently there is a third portgroup attribute in vmware that may be of relevance - 'Id'. This matches the value displayed on the Interfaces tab of the host in Foreman. So for this host the output for the portgroup in vmware is as follows:
Name          : VLAN300
Key           : dvportgroup-56
Notes         : R-TestDev
NumPorts      : 512
Datacenter    : NetApp-datacenter
PortBinding   : Static
ExtensionData : VMware.Vim.DistributedVirtualPortgroup
Id            : DistributedVirtualPortgroup-dvportgroup-491
Uid           : /VIServer=co\user@vcenter:443/DistributedPortgroup=DistributedVirtualPortgroup-dvportgroup-491/
Client        : VMware.VimAutomation.ViCore.Impl.V1.VimClient
The Interfaces tab shows that the host is connected to the portgroup "dvportgroup-491"
Updated by Adam Winberg over 7 years ago
Ok, talked to VMware and according to them it is perfectly normal that a portgroup has different values in these attributes under certain circumstances (for example after migrating from another vcenter) and that it shouldn't cause any problems. Unfortunately it causes problems in Foreman, but as long as Foreman consistently uses the 'Key' attribute in vmware to identify the portgroup it should be fine.
Updated by Adam Winberg over 7 years ago
After doing some crude debugging in the 'select_nic' function in app/models/concerns/fog_extensions/vsphere/server.rb I may be able to clarify the problem somewhat:
The select_nic function uses the 'nic_attrs' array, which in my case contains the following:
{"network"=>"dvportgroup-491", "type"=>"VirtualVmxnet3"}
It also uses the 'vm_network' object which has these attributes (among others):
vm_network._ref = dvportgroup-491
vm_network.name = VLAN300
vm_network.key = dvportgroup-56
Here it is clear that there are two different portgroup IDs (56 and 491) for the same portgroup. The correct one to use should be the one that is used as key for the portgroup itself, in this case 'dvportgroup-56'. I haven't been able to track down the origin of the values in the 'nic_attrs' (nic.compute_attributes) array, but in my mind the 'network' key in this array should be 'dvportgroup-56', not 'dvportgroup-491'.
Updated by Adam Winberg over 7 years ago
After a lot of debugging and digging I have finally found a way to make this work the way I need it to. Basically I want my networks in vsphere to be identified by 'portgroupkey'. As it is now, different attributes for the network in vsphere are used under different circumstances, which normally works fine since the attributes hold the same value, but as I have shown in previous comments this is not always true.
So, when Foreman queries vsphere for all networks, the '_ref' attribute of the network is used as an id. But when Foreman creates a NIC and the network is a 'DistributedVirtualPortgroup', Foreman uses the 'key' attribute of the network, which is required for the DistributedVirtualSwitchPortConnection function (used in the 'create_nic_backing' function in create_vm.rb from tfm-rubygem-fog-vsphere). As I have said many times, these two attributes normally hold the same value, but in my case they differ for a lot of networks, which causes problems when trying to update or delete a host in Foreman.
It's all very confusing to me, which is why I'm doing such a bad job of explaining it; it's hard to keep track of everything. But here's what I've done to make it work (it is probably not the 'right' way to do it, but it might make it clearer what I'm trying to achieve and help someone from the dev team create a more permanent fix):
In fog-vsphere/lib/fog/vsphere/requests/compute/list_networks.rb:
Adjusted the 'network_attributes' def so it uses the portgroupkey as id if the network is a distributedvirtualportgroup:
26a27,32
> raw_network = get_raw_network(network.name, datacenter)
> if raw_network.kind_of? RbVmomi::VIM::DistributedVirtualPortgroup
>   id = raw_network.key
> else
>   id = managed_obj_id(network)
> end
28c34
< :id => managed_obj_id(network),
---
> :id => id,
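The idea behind this change can be sketched in isolation. This is a minimal illustration only, not the fog-vsphere code: `PlainNetwork`, `DistributedPortgroup` and `network_id` are hypothetical stand-ins for the RbVmomi objects and for 'network_attributes', and 'network-12' / 'VM Network' are made-up values. The portgroup values are the ones from the output earlier in this thread.

```ruby
# Hypothetical stand-ins for the vSphere network objects (not the RbVmomi classes).
PlainNetwork = Struct.new(:ref, :name)
DistributedPortgroup = Struct.new(:ref, :name, :key)

# Pick the identifier for a network: the portgroup key for distributed
# portgroups, the managed object reference for everything else.
def network_id(network)
  if network.is_a?(DistributedPortgroup)
    network.key
  else
    network.ref
  end
end

# A migrated portgroup whose reference and key differ, as described above.
migrated = DistributedPortgroup.new('dvportgroup-491', 'VLAN300', 'dvportgroup-56')
# A made-up standard network that has no key at all.
standard = PlainNetwork.new('network-12', 'VM Network')

puts network_id(migrated)
puts network_id(standard)
```

With this selection the migrated portgroup is identified by its key and the standard network by its reference, so the same identifier would be used when listing networks and when creating a NIC.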
In app/models/concerns/fog_extensions/vsphere/server.rb:
Added this line in the 'select_nic' def to make it compare against the portgroupkey attribute as well:
43d42
< vm_network ||= all_networks.detect { |network| network.key == nic_attrs['network'] }
This works for me. Hopefully someone sees this and understands what the hell I'm talking about.
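A rough sketch of what such a fallback lookup does, assuming a simplified network object. `VmNetwork` and `find_network` are hypothetical names used for illustration; the real 'select_nic' method in Foreman does more than this.

```ruby
# Hypothetical stand-in for a network as seen by Foreman (not the real fog model).
VmNetwork = Struct.new(:ref, :name, :key)

# Find the network a NIC refers to, trying the managed object reference
# first and falling back to the portgroup key.
def find_network(all_networks, nic_attrs)
  wanted = nic_attrs['network']
  all_networks.detect { |network| network.ref == wanted } ||
    all_networks.detect { |network| network.key == wanted }
end

networks = [VmNetwork.new('dvportgroup-491', 'VLAN300', 'dvportgroup-56')]

by_ref  = find_network(networks, { 'network' => 'dvportgroup-491' })
by_key  = find_network(networks, { 'network' => 'dvportgroup-56' })
missing = find_network(networks, { 'network' => 'dvportgroup-1' })
```

Both identifiers resolve to the same VLAN300 portgroup, while an unknown identifier still yields nil so the caller can raise the 'Could not find network' error.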
Updated by Marek Hulán over 7 years ago
I'm not a VMware expert, but it sounds reasonable. Would you mind opening a PR with the change? The suggested patch also looks good to me. If you create a PR, chances are that more experienced VMware people will review it. If you want to send a patch but don't know how, here's help: https://www.theforeman.org/contribute.html#SubmitPatches
Updated by Nathan Ward over 7 years ago
This is a good catch Adam. I'm not sure, but I think we're being impacted by this as well - new hosts we create are being put on to incorrect dvportgroups and they show up as "dvportgroup-XX" rather than their friendly name.
I've submitted an issue to the Fog project here: https://github.com/fog/fog-vsphere/issues/92
Updated by The Foreman Bot over 7 years ago
- Status changed from New to Ready For Testing
- Pull request https://github.com/theforeman/foreman/pull/4611 added
Updated by Adam Winberg over 7 years ago
Created PR for the change in Foreman code:
https://github.com/theforeman/foreman/pull/4611
Updated by Daniel Lobato Garcia over 7 years ago
- Release set to 240
Updated by Anonymous over 7 years ago
- Status changed from Ready For Testing to Closed
- % Done changed from 0 to 100
Applied in changeset cccc26e703d5352982cda2cd426c214356f222c9.
Updated by Klaas D over 7 years ago
Hey,
this would need a version bump for fog-vsphere to 1.11.1 as well, right?
Greetings
Klaas
Updated by Daniel Lobato Garcia over 7 years ago
Nope, https://github.com/theforeman/foreman/pull/4611 fixed it without needing to update the fog-vsphere version
Updated by Adam Winberg over 7 years ago
As far as I can tell, you need the fix on the fog-vsphere side as well, which is put into 1.11.1:
https://github.com/fog/fog-vsphere/pull/94
Updated by Klaas D over 7 years ago
Just changing the line from https://github.com/theforeman/foreman/pull/4611 doesn't help with 1.15 here. Does this patch have prerequisites in 1.16 that I need to backport as well?
Updated by Klaas D over 7 years ago
Using both https://github.com/theforeman/foreman/pull/4611 and https://github.com/fog/fog-vsphere/pull/94 solves the issue on my Foreman 1.15.
Updated by Klaas D over 7 years ago
I have another problem with this fix: I seem to have networks that are not of the type DistributedVirtualPortgroup, and they don't have a key. This produces an error message like: "undefined method `key' for #<RbVmomi::VIM::Network:0x0000000eb3fa48>"]}"
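One possible way to avoid that kind of NoMethodError is to check whether the object has a key before reading it. This is only a sketch under assumptions: `StandardNetwork`, `Portgroup` and `key_or_nil` are hypothetical names, 'network-12' / 'VM Network' are made-up values, and this is not necessarily how the follow-up pull requests handle it.

```ruby
# Hypothetical stand-ins: a standard network has no key, a distributed portgroup does.
StandardNetwork = Struct.new(:ref, :name)
Portgroup = Struct.new(:ref, :name, :key)

# Return the portgroup key when the object has one, otherwise nil,
# instead of raising NoMethodError on networks without a key.
def key_or_nil(network)
  network.respond_to?(:key) ? network.key : nil
end

networks = [
  StandardNetwork.new('network-12', 'VM Network'),
  Portgroup.new('dvportgroup-491', 'VLAN300', 'dvportgroup-56')
]

keys = networks.map { |network| key_or_nil(network) }
```

A comparison against the key then simply never matches for standard networks, and the lookup can fall through to the other attributes.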
Updated by The Foreman Bot over 7 years ago
- Pull request https://github.com/theforeman/foreman/pull/4689 added
Updated by The Foreman Bot over 7 years ago
- Pull request https://github.com/theforeman/foreman/pull/4721 added
Updated by Adam Winberg almost 7 years ago
This is closed and you reference it in the release notes for 1.16 as a fixed bug, but you do not ship a sufficiently recent fog-vsphere rpm to make it work. The rpm you have is
tfm-rubygem-fog-vsphere-1.9.2-1.el7.noarch
And as stated several times in this bug, you need 1.11 to make it work. So I would like this bug to be reopened, because it won't work without the newer fog-vsphere version.
Updated by Klaas D almost 7 years ago
https://github.com/theforeman/foreman/pull/4721 (last comment added)
Updated by Marc 'Zugschlus' Haber over 6 years ago
I have a 1.16 installation that is suffering from this issue because fog-vsphere is too old. Is there a workaround? Can I update fog-vsphere independently of Foreman without losing the ability to use the regular Foreman update process?
Any hints will be appreciated.