Thursday, June 20, 2019

Network interfaces disappear upon reboot when linked to IB switches?

Issue:

An InfiniBand Sun DCS 36p switch port is automatically disabled when its link exhibits suboptimal link speed or width.

Error in messages file:

ip: [ID 505317 kern.error] ibd: DL_ATTACH_REQ failed: DL_SYSERR (errno 22)
ip: [ID 590039 kern.error] ibd: DL_BIND_REQ failed: DL_OUTSTATE
ip: [ID 312130 kern.error] ibd: DL_UNBIND_REQ failed: DL_OUTSTATE
ip: [ID 505317 kern.error] ibd: DL_ATTACH_REQ failed: DL_SYSERR (errno 22)
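
To check how often these failures occur in a messages file, the lines can be counted by request type and status. The following Python is only a minimal sketch: the regular expression is inferred from the four sample lines above, and the function name summarize_ibd_errors is hypothetical, not part of any Oracle tool.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines above (an assumption, not an official format):
#   ip: [ID 505317 kern.error] ibd: DL_ATTACH_REQ failed: DL_SYSERR (errno 22)
IBD_ERROR = re.compile(
    r"ibd: (?P<request>DL_[A-Z_]+) failed: (?P<status>DL_[A-Z_]+)"
    r"(?: \(errno (?P<errno>\d+)\))?"
)


def summarize_ibd_errors(lines):
    """Count ibd DLPI failures, keyed by (request, status, errno or None)."""
    counts = Counter()
    for line in lines:
        match = IBD_ERROR.search(line)
        if match is None:
            continue  # not an ibd DLPI failure line
        errno = match.group("errno")
        key = (
            match.group("request"),
            match.group("status"),
            int(errno) if errno is not None else None,
        )
        counts[key] += 1
    return counts


sample = [
    "ip : [ID 505317 kern.error] ibd: DL_ATTACH_REQ failed: DL_SYSERR (errno 22)",
    "ip:  [ID 590039 kern.error] ibd: DL_BIND_REQ failed: DL_OUTSTATE ",
    "ip: [ID 312130 kern.error] ibd: DL_UNBIND_REQ failed: DL_OUTSTATE ",
    "ip: [ID 505317 kern.error] ibd: DL_ATTACH_REQ failed: DL_SYSERR (errno 22)",
]

summary = summarize_ibd_errors(sample)
for (request, status, errno), count in sorted(summary.items(), key=str):
    print(request, status, errno, count)
```

In practice the lines would be read from the host's messages file rather than from the hard-coded sample list.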

Below is a description of the autodisable functionality:
Switch chip ports and their connectors can be configured to be disabled automatically if their links exhibit high error rates or suboptimal link speed or width.

You use the autodisable command to add connectors to the autodisable list, which has two parts: one for connectors whose links fail from high error rates, and another for connectors whose links fail from suboptimal link speed or width. A connector can be configured for both parts.

The autodisable feature monitors the following to determine if a connector and its respective link are experiencing high error rates:
     SNMP traps
     Oracle ILOM event log
     Syslog
     Email alerts
The autodisable feature also monitors the link speed and width, and if any of the following combinations are discovered, the link is considered suboptimal:
     1x SDR
     1x DDR
     1x QDR
     4x SDR
     4x DDR
As a side note, this issue may also be caused by a bad link partner or a misconfiguration.
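
The suboptimal combinations listed above amount to a simple lookup, which leaves 4x QDR as the only combination of these widths and speeds that is not flagged. The Python below is a sketch of that rule only; the names SUBOPTIMAL_LINKS, parse_link and is_suboptimal are hypothetical, and the "4x QDR" string format is an assumption made for illustration, not the output format of any switch command.

```python
# (width, speed) pairs the autodisable feature treats as suboptimal,
# copied from the list above.
SUBOPTIMAL_LINKS = {
    (1, "SDR"),
    (1, "DDR"),
    (1, "QDR"),
    (4, "SDR"),
    (4, "DDR"),
}


def parse_link(description):
    """Split a string such as '4x QDR' into (4, 'QDR')."""
    width_text, speed = description.split()
    width = int(width_text.lower().rstrip("x"))
    return width, speed.upper()


def is_suboptimal(description):
    """Return True if the link width/speed pair is in the suboptimal list."""
    return parse_link(description) in SUBOPTIMAL_LINKS


for link in ("1x QDR", "4x DDR", "4x QDR"):
    print(link, "suboptimal" if is_suboptimal(link) else "not in the suboptimal list")
```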

Solution:

This feature is enabled as of firmware 2.X. When a port goes down with the reason AutomaticBadSpeedOrWidth, re-enable it with enableswitchport --automatic once you have confirmed that no faults are reported at the physical layer.
For more details, refer to Doc ID 1605955.1, which describes how to re-enable a switch port after it has been auto-disabled.