I have just installed OFED 1.5.3 on the following machine:
SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11 PATCHLEVEL = 1
Our admin has installed a CX354A card (MCX354A-FCBT) in the machine.
$ lspci | grep -i mel
04:00.0 Network controller: Mellanox Technologies Device 1003
(Question 1: I am not sure whether this lspci output is correct, and if it is not, I don't know how to get the correct output.)
I installed the OFED package with the ./mlnxofedinstall script. Afterwards, the output of ofed_info | head -1 is:
MLNX_OFED_LINUX-1.5.3-3.1.0 (OFED-1.5.3-3.1.0):
The installation was successful, and openibd loaded all the required modules/drivers.
$ service openibd status
HCA driver loaded
Configured IPoIB devices: ib0 ib1
Currently active IPoIB devices:
The following OFED modules are loaded:
rdma_ucm ib_srp rdma_cm ib_addr ib_ipoib mlx4_core mlx4_ib mlx4_en ib_mthca ib_uverbs ib_umad ib_ucm ib_sa ib_cm ib_mad ib_core iw_cxgb3 iw_nes
Output of hca_self_test.ofed:
---- Performing Adapter Device Self Test ----
Number of CAs Detected ................. 1
PCI Device Check ....................... PASS
Kernel Arch ............................ x86_64
Host Driver Version .................... MLNX_OFED_LINUX-1.5.3-3.1.0 (OFED-1.5.3-3.1.0): 2.6.32.12-0.7-default
Host Driver RPM Check .................. PASS
Firmware on CA #0 VPI .................. v2.10.700
Firmware Check on CA #0 (VPI) .......... NA
    REASON: NO required fw version
Host Driver Initialization ............. PASS
Number of CA Ports Active .............. 0
Port State of Port #1 on CA #0 (VPI)..... DOWN (InfiniBand)
Port State of Port #2 on CA #0 (VPI)..... DOWN (InfiniBand)
Error Counter Check on CA #0 (VPI)...... PASS
Kernel Syslog Check .................... PASS
Node GUID on CA #0 (VPI) ............... 00:02:c9:03:00:f9:ed:e0
------------------ DONE ---------------------
Question 2: When I run ibstatus or ibv_devinfo, the port state is PORT_DOWN, while the docs say it should be in the INIT state. How do I get the ports into the INIT state?
$ ibv_devinfo
hca_id: mlx4_0
    transport: InfiniBand (0)
    fw_ver: 2.10.700
    node_guid: 0002:c903:00f9:ede0
    sys_image_guid: 0002:c903:00f9:ede3
    vendor_id: 0x02c9
    vendor_part_id: 4099
    hw_ver: 0x0
    board_id: MT_1090120019
    phys_port_cnt: 2
        port: 1
            state: PORT_DOWN (1)
            max_mtu: 2048 (4)
            active_mtu: 2048 (4)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: IB
        port: 2
            state: PORT_DOWN (1)
            max_mtu: 2048 (4)
            active_mtu: 2048 (4)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: IB
When I tried the following command, I got an error:
$ ibportstate -G 0x0002c90300f9ede0 1 query/enable/disable
ibwarn: [9318] mad_rpc_open_port: can't open UMAD port ((null):0)
ibportstate: iberror: failed: Failed to open '(null)' port '0'
Question 3: I don't understand why I got this error or how to get rid of it. Any help would be much appreciated.
Please let me know if you need more info.
Thanks
The port state in ibv_devinfo will stay PORT_DOWN until you connect this node to another node or to an InfiniBand switch with an InfiniBand cable. Judging by the loaded modules, you have everything necessary set up and are just missing another node to talk to. As soon as you connect one, you should see the state change to PORT_INIT and fields such as the link speed get populated.
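If you want to watch for that change without reading the whole listing each time, something like the following might help. It is only a rough sketch: port_states and any_port_down are helper names I made up (they are not part of OFED), and it assumes ibv_devinfo keeps printing "port:" and "state:" lines in the same form as in your post.

```shell
#!/bin/sh
# Hypothetical helper: read ibv_devinfo-style text on stdin and print
# one "<port> <state>" pair per port, e.g. "1 PORT_DOWN".
port_states() {
    awk '
        $1 == "port:"  { port = $2 }        # remember the current port number
        $1 == "state:" { print port, $2 }   # emit it together with its state
    '
}

# Hypothetical helper: exit status 0 if at least one port on stdin is
# still reported as PORT_DOWN, non-zero otherwise.
any_port_down() {
    port_states | grep -q 'PORT_DOWN'
}

# Usage on the node itself (not run here):
#   ibv_devinfo | port_states
#   ibv_devinfo | any_port_down && echo "at least one port has no link yet"
```

With the output you posted, the first usage line should print PORT_DOWN for both ports; once a cable is attached to a port, that port's entry should change.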