Bug #9170
closedWrong nic order in libvirt domain
Description
Hi,
We are testing the new interfaces code that just got merged to develop, and are running a test instance off of 43c4bd72e4646cc9b2b4cb0533e84e08825eadda.
So far, it's pretty great, but we found a small issue with libvirt compute resources.
The scenario is:
- go to the New Host page
- fill in Host information, and choose a libvirt compute resource to deploy on
- go to the Interfaces tab
- edit the default interface, choose a domain, subnet, ip, leave the default information as-is
- choose a bridge from the list, say bridgeA
- submit
- add a new interface, choose a domain, subnet, ip
- choose a different bridge from the list, say bridgeB
- submit
- fill in the remaining information and save the host, which will get built
The libvirt domain created for the new host has 2 NICs, but they are in the "wrong order", the first NIC is bridged to bridgeB and the second to bridgeA. Plus, the MAC addresses that get saved to the database once the domain has been created are mixed up.
The other data, however, looks right, with the proper bridge name in the NIC attributes.
Updated by Dominic Cleal almost 10 years ago
- Related to Tracker #2409: Networking added
Updated by Dominic Cleal almost 10 years ago
- Related to Refactor #7456: Extract primary interface from host added
Updated by Dominic Cleal almost 10 years ago
- Translation missing: en.field_release deleted (
28)
Updated by Marek Hulán almost 10 years ago
Greg is there something we can do about this (except fixing fog so it returns custom ids back)?
Updated by Greg Sutcliffe almost 10 years ago
- Status changed from New to Need more information
Marek, Eric, I can't reproduce this I'm afraid. I've created several hosts with various configurations of bridges and in every case the correct interface gets the correct bridge. Here's one example:
Submitted data:
nic1 (primary: true, provision true) on bridge virbr4
nic2 (no flags) on bridge virbr3
libvirt data:
virbr4 nic has mac 52:54:00:59:b7:61
virbr3 nic has mac 52:54:00:60:82:97
rails console:
[3] pry(main)> Host.first.interfaces
Host::Managed Load (0.8ms) SELECT "hosts".* FROM "hosts" WHERE "hosts"."type" IN ('Host::Managed') LIMIT 1
Nic::Base Load (0.2ms) SELECT "nics".* FROM "nics" WHERE "nics"."host_id" = 3 ORDER BY identifier
=> [#<Nic::Managed id: 5, mac: "52:54:00:59:b7:61", ip: "192.168.130.24", type: "Nic::Managed", name: "foo5.primary.sapphire.local", host_id: 3, subnet_id: 1, domain_id: 1, attrs: {}, created_at: "2015-02-09 17:21:39", updated_at: "2015-02-09 17:21:39", provider: nil, username: nil, password: nil, virtual: false, link: true, identifier: "", tag: "", attached_to: "", managed: false, mode: "balance-rr", attached_devices: "", bond_options: "", primary: true, provision: true, compute_attributes: {"type"=>"bridge", "bridge"=>"virbr4", "model"=>"virtio"}>,
#<Nic::Managed id: 6, mac: "52:54:00:60:82:97", ip: "192.168.131.2", type: "Nic::Managed", name: "bar5.provisioning.sapphire.local", host_id: 3, subnet_id: 2, domain_id: 2, attrs: {}, created_at: "2015-02-09 17:21:39", updated_at: "2015-02-09 17:21:39", provider: nil, username: nil, password: nil, virtual: false, link: true, identifier: "", tag: "", attached_to: "", managed: false, mode: "balance-rr", attached_devices: "", bond_options: "", primary: false, provision: false, compute_attributes: {"type"=>"bridge", "bridge"=>"virbr3", "model"=>"virtio"}>]
As you can see, the primary nic in the DB has the :61 mac, and the other nic has the :97 mac.
Do you guys have a further notes for a reliable reproducer, or a public system I can log into to see the bug firsthand?
Updated by Anonymous almost 10 years ago
Greg Sutcliffe wrote:
Marek, Eric, I can't reproduce this I'm afraid. I've created several hosts with various configurations of bridges and in every case the correct interface gets the correct bridge. Here's one example:
Greg, thank you for looking into this.
As you can see, the primary nic in the DB has the :61 mac, and the other nic has the :97 mac.
That is weird... I can reproduce the issue every time I create a virtual machine, and now do so with the bridge in the "wrong" order so that the libvirt domain gets created properly.
Do you guys have a further notes for a reliable reproducer, or a public system I can log into to see the bug firsthand?
Unfortunately, our foreman deployment cannot be reached from outside the corporate network. I could change the logging level, or trace any request/response you think could be interesting. I tried to find where, in the code, the nic attributes were used to create the libvirt domain but couldn't do so. If you tell me where, I could at least add a few statements in foreman and/or fog to see what the data looks like.
Updated by Greg Sutcliffe almost 10 years ago
Ok, so if you up the logs to debug, you should see the assignments between fog nic objects and foreman nics being logged. That logging comes from `app/models/concerns/orchestration/compute.rb:288` - you could drop some more logger.debug lines in there or at line 281 outside the loop. From there the code goes to `app/models/concerns/fog_extensions/libvirt/server.rb:42` so that's another place where you could try to dig into what's happening.
Updated by Anonymous almost 10 years ago
Greg, thank you for your help.
After adding a heavy dose of logging, I found the issue to be in the `add_interfaces_to_compute_attrs` method (https://github.com/theforeman/foreman/blob/develop/app/models/concerns/orchestration/compute.rb#L259). As the Hash is later sorted according to the key (https://github.com/theforeman/foreman/blob/develop/app/models/compute_resource.rb#L274), the nics end up in reverse order. We may have this behaviour because we are using ruby 2.1.5 to run foreman, I'm not sure. I'm not sure either why the `object_id` is used as a key in the Hash :)
If I change the code in `add_interfaces_to_compute_attrs` to rely on the `interfaces` Hash order (since it's guaranteed to be stable since ruby 1.9), it works properly. I just used a regular `index`, starting at `0`, for the key.
Also, the code in `nested_attributes_for` seems to expect keys with `new_` in their name, I'm not sure why.
Do you think replacing `object_id` as an ordered key would be a proper solution to the issue?
Updated by Anonymous almost 10 years ago
This patch fixed the issue for me: https://github.com/cloudwatt/foreman/commit/280aec3a02330994c52d57b6fb56fafafd4d6e7c
Updated by Marek Hulán almost 10 years ago
Would you mind creating a PR? You can find instructions at our wiki page if needed.
Updated by Anonymous almost 10 years ago
Marek, I don't mind but I'm definitely not sure about the impacts of the change, especially since the issue seems to only happen in my environment (and may be due to the use of ruby 2.1).
Updated by Marek Hulán almost 10 years ago
If you make a PR, we can run our test suite and someone will review and test it sooner or later. We might get feedback from others whether this is good or bad idea. TBH I'm not entirely sure how nic.object_id could fail.
Updated by Anonymous almost 10 years ago
Alright then, I will make a PR :) Thank you for your help.
Updated by The Foreman Bot almost 10 years ago
- Status changed from Need more information to Ready For Testing
- Pull request https://github.com/theforeman/foreman/pull/2191 added
- Pull request deleted (
)
Updated by Daniel Lobato Garcia almost 10 years ago
I cannot reproduce this either, I can record a screencast of what I do if needed
Updated by Marek Hulán almost 10 years ago
Even though I was unable to reproduce, I think this can potentially happen, http://projects.theforeman.org/issues/9170#note-8 explains it pretty good. Since none of use was able to reproduce I requested some proof from reporter, which might take some time but I'd consider this as 1.8 blocker. Not setting release to it, as I don't know how long can that take.
Updated by Dominic Cleal over 9 years ago
- Translation missing: en.field_release set to 50
- Assignee set to Anonymous
Updated by Anonymous over 9 years ago
- Status changed from Ready For Testing to Closed
- % Done changed from 0 to 100
Applied in changeset 8aeac1b7b89fab21e0b92f54eb460fcc04415e07.