Bug #9576
closed
ibnetdiscover is applying non-CA port GUID name maps as expected but ignoring CA port GUID names
Description
Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1192740
A customer is asking about the behaviour of ibnetdiscover and wondering whether it is caused by a bug. We are not familiar enough with IB to determine this immediately. Any guidance or pointers would be appreciated.
Here are three test cases.
File "node-name-map.txt" in the CWD contains records of the form '{GUID} "{name}"', as per historic 'ibnetdiscover --node-name-map {mapfile}' use. All map records take the form "{name}[IB{plane}]", and nodes "service5" and "r6i3n15" are SM-active on both locally connected fabrics. The infiniband-diags default config is used and is unchanged from stock (i.e., everything is commented out), so all defaults should apply. The default name map for CAs should be "{node} mthca_0", because the first link-up device located will always be 0 and all CAs are Mellanox via mlx4. The default name map for SWs should be the vendor strings, which are all Mellanox MT47s.
An ibnetdiscover run for ports without the name map reports the test GUIDs as "service5 mthca_0", "r6i3n15 mthca_0", and "MT47396 Infiniscale-III Mellanox Technologies", as expected. Applying the map results in all SWs being reported as "{name}[IB{plane}]", as I'd expect. Service5 and r6i3n15, along with all other CAs, are still reported as "{name} mthca_0".
Why is the node name map being applied only to the SW (or non-CA) GUIDs, and how do I get ibnetdiscover to apply it uniformly to all GUIDs?
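For anyone reproducing this, the '{GUID} "{name}"' record format described above can be loaded with a few lines of Python. This is only a sketch for inspecting the map file, not how ibnetdiscover itself reads it; the function name `parse_node_name_map` is made up for illustration, and skipping blank lines and '#' comment lines is an assumption about the file.

```python
import re

# One record per line: a hexadecimal GUID followed by a double-quoted name.
RECORD = re.compile(r'^\s*(0x[0-9a-fA-F]+)\s+"([^"]*)"')


def parse_node_name_map(text):
    """Return {guid_as_int: name} for every '{GUID} "{name}"' record in text."""
    names = {}
    for line in text.splitlines():
        # Assumption: blank lines and '#' comment lines carry no records.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = RECORD.match(line)
        if match:
            names[int(match.group(1), 16)] = match.group(2)
    return names


# Records copied from the node-name-map.txt excerpts in this report.
sample = '''\
0x0002c9020028e8bd "service5[IB=0]"
0x0002c9020028e8be "service5[IB=1]"
0x0800690000004e8c "r6i3sw3[IB=0]"
'''
name_map = parse_node_name_map(sample)
print(name_map[0x0800690000004E8C])
```

Keying the dictionary on the integer value of the GUID avoids mismatches caused by letter case or leading zeros when the map is later compared against tool output.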
- test node service5 is a standalone CA with two adapters, each with two ports:
##
[root@r1lead ~]# grep service5 node-name-map.txt
0x0002c9020028e8bd "service5[IB=0]"
0x0002c9020028e8be "service5[IB=1]"
0x0002c9020028e5d5 "service5[IB=2]"
0x0002c9020028e5d6 "service5[IB=3]"
- test node r6i3n15 is a diskless backplane-connected CA with one embedded twin-port adapter:
##
[root@r1lead ~]# grep r6i3n15 node-name-map.txt
0x003048c9c9040001 "r6i3n15[IB=0]"
0x003048c9c9040002 "r6i3n15[IB=1]"
- r6i3sw3 is a 24-port IB switch in the last slot of the last IRU of rack 6 (in this case
- I'm using the node-level GUID of the switch's backplane rather than a port-level GUID):
##
[root@r1lead ~]# grep r6i3sw3 node-name-map.txt
0x0800690000004e8c "r6i3sw3[IB=0]"
- the standalone CAs, without mapping, return connected GUIDs with default names, as they should:
##
[root@r1lead ~]# ibnetdiscover -p |grep service5
SW 810 23 0x0800690000004de4 4x DDR - CA 878 1 0x0002c9020028e8bd ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'service5 mthca0' )
CA 879 2 0x0002c9020028e8be 4x DDR - SW 405 23 0x0800690000004e22 ( 'service5 mthca0' - 'MT47396 Infiniscale-III Mellanox Technologies' )
CA 878 1 0x0002c9020028e8bd 4x DDR - SW 810 23 0x0800690000004de4 ( 'service5 mthca0' - 'MT47396 Infiniscale-III Mellanox Technologies' )
SW 405 23 0x0800690000004e22 4x DDR - CA 879 2 0x0002c9020028e8be ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'service5 mthca0' )
- the same, with mapping added, and only the SWs are mapped -- the CAs still have their unmapped default names:
##
[root@r1lead ~]# ibnetdiscover -p --node-name-map node-name-map.txt |grep service5
SW 810 23 0x0800690000004de4 4x DDR - CA 878 1 0x0002c9020028e8bd ( 'r3i3sw2[IB=1]' - 'service5 mthca0' )
CA 879 2 0x0002c9020028e8be 4x DDR - SW 405 23 0x0800690000004e22 ( 'service5 mthca0' - 'r3i3sw3[IB=0]' )
CA 878 1 0x0002c9020028e8bd 4x DDR - SW 810 23 0x0800690000004de4 ( 'service5 mthca0' - 'r3i3sw2[IB=1]' )
SW 405 23 0x0800690000004e22 4x DDR - CA 879 2 0x0002c9020028e8be ( 'r3i3sw3[IB=0]' - 'service5 mthca0' )
- the backplane-attached CAs, without mapping, are correct like the standalone CAs were:
##
[root@r1lead ~]# ibnetdiscover -p |grep r6i3n15
SW 539 10 0x0800690000004c58 4x DDR - CA 168 2 0x003048c9c9040002 ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'r6i3n15 mlx4_0' )
CA 168 2 0x003048c9c9040002 4x DDR - SW 539 10 0x0800690000004c58 ( 'r6i3n15 mlx4_0' - 'MT47396 Infiniscale-III Mellanox Technologies' )
CA 8 1 0x003048c9c9040001 4x DDR - SW 651 4 0x0800690000004e8c ( 'r6i3n15 mlx4_0' - 'MT47396 Infiniscale-III Mellanox Technologies' )
SW 651 4 0x0800690000004e8c 4x DDR - CA 8 1 0x003048c9c9040001 ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'r6i3n15 mlx4_0' )
- with mapping added, they do the same thing as the previous case -- SWs are mapped, CAs aren't:
##
[root@r1lead ~]# ibnetdiscover -p --node-name-map node-name-map.txt |grep r6i3n15
SW 539 10 0x0800690000004c58 4x DDR - CA 168 2 0x003048c9c9040002 ( 'r6i3sw2[IB=1]' - 'r6i3n15 mlx4_0' )
CA 168 2 0x003048c9c9040002 4x DDR - SW 539 10 0x0800690000004c58 ( 'r6i3n15 mlx4_0' - 'r6i3sw2[IB=1]' )
CA 8 1 0x003048c9c9040001 4x DDR - SW 651 4 0x0800690000004e8c ( 'r6i3n15 mlx4_0' - 'r6i3sw3[IB=0]' )
SW 651 4 0x0800690000004e8c 4x DDR - CA 8 1 0x003048c9c9040001 ( 'r6i3sw3[IB=0]' - 'r6i3n15 mlx4_0' )
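The symptom in the mapped runs above can be checked mechanically: for each endpoint on an 'ibnetdiscover -p' line, look up the GUID printed on that line in the map and compare the map's name with the name actually displayed. The sketch below does that for two of the lines above. The helper names (`endpoints`, `check`) and the status strings are hypothetical, and the line layout is inferred only from the output quoted in this report.

```python
import re

# Endpoint fields as they appear in the quoted output: type, LID, port, GUID.
ENDPOINT = re.compile(r"\b(CA|SW)\s+\d+\s+\d+\s+(0x[0-9a-fA-F]+)")
QUOTED = re.compile(r"'([^']*)'")


def endpoints(line):
    """Yield (node_type, guid, displayed_name) for each endpoint on a line."""
    head = line.split("'", 1)[0]      # fields before the first quoted name
    ends = ENDPOINT.findall(head)
    names = QUOTED.findall(line)      # names are listed in endpoint order
    for (node_type, guid), name in zip(ends, names):
        yield node_type, int(guid, 16), name


def check(lines, name_map):
    """Return {(node_type, guid): status} comparing displayed and mapped names."""
    report = {}
    for line in lines:
        for node_type, guid, shown in endpoints(line):
            wanted = name_map.get(guid)
            if wanted is None:
                status = "no map entry"
            elif shown == wanted:
                status = "mapped"
            else:
                status = "map entry ignored"
            report[(node_type, guid)] = status
    return report


# Map records and output lines copied from this report.
name_map = {
    0x003048C9C9040001: "r6i3n15[IB=0]",
    0x0800690000004E8C: "r6i3sw3[IB=0]",
}
lines = [
    "CA 8 1 0x003048c9c9040001 4x DDR - SW 651 4 0x0800690000004e8c ( 'r6i3n15 mlx4_0' - 'r6i3sw3[IB=0]' )",
    "SW 651 4 0x0800690000004e8c 4x DDR - CA 8 1 0x003048c9c9040001 ( 'r6i3sw3[IB=0]' - 'r6i3n15 mlx4_0' )",
]
for (node_type, guid), status in sorted(check(lines, name_map).items()):
    print(node_type, hex(guid), status)
```

A check like this only confirms the symptom; it does not explain it. One thing that may be worth ruling out (an assumption, not something this report establishes) is whether the map is matched against node GUIDs rather than the per-port GUIDs printed in this column, since the one switch record quoted above is explicitly a node-level GUID while the CA records are not.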
- SW-to-SW links do the same thing as previous cases -- without mapping, default names:
##
[root@r1lead ~]# ibnetdiscover -p |grep 0x0800690000004e8c
...snip-snip...
CA 112 1 0x003048c9baa80001 4x DDR - SW 651 2 0x0800690000004e8c ( 'r6i3n12 mlx4_0' - 'MT47396 Infiniscale-III Mellanox Technologies' )
CA 80 1 0x003048c9b28c0001 4x DDR - SW 651 1 0x0800690000004e8c ( 'r6i3n13 mlx4_0' - 'MT47396 Infiniscale-III Mellanox Technologies' )
SW 651 24 0x0800690000004e8c 4x SDR 'MT47396 Infiniscale-III Mellanox Technologies'
SW 651 23 0x0800690000004e8c 4x DDR - SW 690 23 0x0800690000004ea2 ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'MT47396 Infiniscale-III Mellanox Technologies' )
SW 651 22 0x0800690000004e8c 4x DDR - SW 690 22 0x0800690000004ea2 ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'MT47396 Infiniscale-III Mellanox Technologies' )
SW 651 21 0x0800690000004e8c 4x SDR 'MT47396 Infiniscale-III Mellanox Technologies'
SW 651 20 0x0800690000004e8c 4x SDR 'MT47396 Infiniscale-III Mellanox Technologies'
SW 651 19 0x0800690000004e8c 4x DDR - SW 378 19 0x0800690000004e1a ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'MT47396 Infiniscale-III Mellanox Technologies' )
SW 651 18 0x0800690000004e8c 4x DDR - SW 378 18 0x0800690000004e1a ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'MT47396 Infiniscale-III Mellanox Technologies' )
...snip-snip...
- and with mapping, mapped names, including single-sided states (ports 20, 21, and 24 have transceivers but
- the far ends are disconnected, they map correctly just like active links):
##
[root@r1lead ~]# ibnetdiscover -p --node-name-map node-name-map.txt |grep 0x0800690000004e8c
...snip-snip...
CA 112 1 0x003048c9baa80001 4x DDR - SW 651 2 0x0800690000004e8c ( 'r6i3n12 mlx4_0' - 'r6i3sw3[IB=0]' )
CA 80 1 0x003048c9b28c0001 4x DDR - SW 651 1 0x0800690000004e8c ( 'r6i3n13 mlx4_0' - 'r6i3sw3[IB=0]' )
SW 651 24 0x0800690000004e8c 4x SDR 'r6i3sw3[IB=0]'
SW 651 23 0x0800690000004e8c 4x DDR - SW 690 23 0x0800690000004ea2 ( 'r6i3sw3[IB=0]' - 'r2i3sw3[IB=0]' )
SW 651 22 0x0800690000004e8c 4x DDR - SW 690 22 0x0800690000004ea2 ( 'r6i3sw3[IB=0]' - 'r2i3sw3[IB=0]' )
SW 651 21 0x0800690000004e8c 4x SDR 'r6i3sw3[IB=0]'
SW 651 20 0x0800690000004e8c 4x SDR 'r6i3sw3[IB=0]'
SW 651 19 0x0800690000004e8c 4x DDR - SW 378 19 0x0800690000004e1a ( 'r6i3sw3[IB=0]' - 'r5i3sw3[IB=0]' )
SW 651 18 0x0800690000004e8c 4x DDR - SW 378 18 0x0800690000004e1a ( 'r6i3sw3[IB=0]' - 'r5i3sw3[IB=0]' )
...snip-snip...
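The with/without comparison above can also be tallied without reading the map at all, by pairing endpoints from the two runs and counting, per node type, how many displayed names changed. This is a rough sketch with hypothetical helper names (`names_by_port`, `renamed_by_type`); it assumes both captures come from the same fabric state, so that (type, GUID, port) identifies the same endpoint in each.

```python
import re
from collections import Counter

# Endpoint fields in the quoted output: type, LID, port number, GUID.
ENDPOINT = re.compile(r"\b(CA|SW)\s+\d+\s+(\d+)\s+(0x[0-9a-fA-F]+)")


def names_by_port(lines):
    """Map (node_type, guid, port) -> displayed name for each endpoint."""
    seen = {}
    for line in lines:
        head, _, tail = line.partition("'")
        ends = ENDPOINT.findall(head)
        # Restore the quote consumed by partition() before extracting names.
        names = re.findall(r"'([^']*)'", "'" + tail)
        for (node_type, port, guid), name in zip(ends, names):
            seen[(node_type, int(guid, 16), int(port))] = name
    return seen


def renamed_by_type(before, after):
    """Count endpoints per (node_type, name_changed) across the two runs."""
    old, new = names_by_port(before), names_by_port(after)
    tally = Counter()
    for key in old.keys() & new.keys():
        tally[(key[0], old[key] != new[key])] += 1
    return tally


# Lines copied from the unmapped and mapped runs in this report.
before = [
    "CA 112 1 0x003048c9baa80001 4x DDR - SW 651 2 0x0800690000004e8c ( 'r6i3n12 mlx4_0' - 'MT47396 Infiniscale-III Mellanox Technologies' )",
    "SW 651 24 0x0800690000004e8c 4x SDR 'MT47396 Infiniscale-III Mellanox Technologies'",
    "SW 651 23 0x0800690000004e8c 4x DDR - SW 690 23 0x0800690000004ea2 ( 'MT47396 Infiniscale-III Mellanox Technologies' - 'MT47396 Infiniscale-III Mellanox Technologies' )",
]
after = [
    "CA 112 1 0x003048c9baa80001 4x DDR - SW 651 2 0x0800690000004e8c ( 'r6i3n12 mlx4_0' - 'r6i3sw3[IB=0]' )",
    "SW 651 24 0x0800690000004e8c 4x SDR 'r6i3sw3[IB=0]'",
    "SW 651 23 0x0800690000004e8c 4x DDR - SW 690 23 0x0800690000004ea2 ( 'r6i3sw3[IB=0]' - 'r2i3sw3[IB=0]' )",
]
tally = renamed_by_type(before, after)
for (node_type, changed), count in sorted(tally.items()):
    print(node_type, "renamed" if changed else "unchanged", count)
```

On these three pairs of lines, every SW endpoint is renamed and the single CA endpoint is not, which matches the behaviour described in this report.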
- ib diags version for netdiscover
##
[root@r1lead ~]# rpm -q --whatprovides `which ibnetdiscover`
infiniband-diags-1.6.4-1.el6.x86_64
Updated by Justin Sherrill about 10 years ago
- Status changed from New to Rejected
Whoops! Mistaken clone from downstream; closing.
Updated by Eric Helms almost 9 years ago
- Release set to 166