10. Troubleshooting

This section covers general troubleshooting and commonly reported problems.

10.1 General fabric troubleshooting

The ibdiagnet program can be used to troubleshoot potential issues with your InfiniBand fabric.

ibdiagnet -r

10.2 ib_query_gid() failed errors on mlx4 platforms

Symptom: ibstat or opensm hang, and the following kernel messages are printed:

kernel: [   78.170077] ib0: ib_query_gid() failed
kernel: [   89.272789] ib0: ib_query_port failed
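
To check whether a running system has already logged these messages, the kernel log can be filtered for them. The following is a minimal sketch: the function name has_query_errors is made up for this example, and it matches only the two messages shown above.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints any
# ib_query_gid() / ib_query_port failures it finds.
# Exit status is 0 if at least one matching line was found, non-zero otherwise.
has_query_errors() {
    grep -E 'ib_query_(gid\(\)|port) failed'
}
```

A typical invocation would be "dmesg | has_query_errors".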

Fix: Load the mlx4_core module with the msi_x=0 option.

cat > /etc/modprobe.d/mlx4_core.conf <<EOF
options mlx4_core msi_x=0
EOF

update-initramfs -u
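
Once the module has been reloaded (for example after a reboot), it can be useful to confirm that the option took effect. The sketch below assumes that mlx4_core exposes its msi_x parameter under /sys/module/mlx4_core/parameters, which should be verified on your kernel. The function name check_msi_x and its optional directory argument are hypothetical and exist only for this example.

```shell
# Hypothetical helper: report whether mlx4_core is running with msi_x=0.
# The parameter directory may be passed as $1; it defaults to the
# sysfs location assumed above.
check_msi_x() {
    param_file="${1:-/sys/module/mlx4_core/parameters}/msi_x"
    if [ ! -r "$param_file" ]; then
        echo "mlx4_core is not loaded or does not expose msi_x"
        return 2
    fi
    value=$(cat "$param_file")
    if [ "$value" = "0" ]; then
        echo "msi_x=0 is active"
    else
        echo "msi_x=$value, option not applied"
        return 1
    fi
}
```

Running "check_msi_x" with no arguments inspects the live module. A non-zero exit status means the option is not in effect.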

10.3 Missing XRC support

If you see error messages about missing XRC support, your kernel modules and userspace libraries are mismatched.

mlx4: There is a mismatch between the kernel and the userspace libraries: Kernel does not support XRC. Exiting.

Fix: Make sure that you build and install the OFED kernel modules as described in section X.

