Introduction
============

libibverbs is a library that allows programs to use InfiniBand "verbs"
for direct access to IB hardware from userspace.  For more information
on verbs, see the InfiniBand Architecture Specification vol. 1,
especially chapter 11.

Using libibverbs
================

Device nodes
------------

The verbs library expects special character device files named
/dev/infiniband/uverbsN to be created.  When you load the kernel
modules, including both the low-level driver for your IB hardware and
the ib_uverbs module, you should see one or more uverbsN entries in
/sys/class/infiniband_verbs in addition to the
/dev/infiniband/uverbsN character device files.

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL=="uverbs*", NAME="infiniband/%k"

can be used.  This will create device nodes named

    /dev/infiniband/uverbs0

and so on.  Since the InfiniBand userspace verbs should be safe for
use by non-privileged users, you may want to add an appropriate MODE
or GROUP to your udev rule.

Permissions
-----------

To use IB verbs from userspace, a process must be able to access the
appropriate /dev/infiniband/uverbsN special device file.  You can
check the permissions on this file with the command

    ls -l /dev/infiniband/uverbs*

Make sure that the permissions on these files are such that the
user/group that your verbs program runs as can access the device file.

To use IB verbs from userspace, a process must also have permission to
tell the kernel to lock sufficient memory for all of your registered
memory regions as well as the memory used internally by IB resources
such as queue pairs (QPs) and completion queues (CQs).  To check your
resource limits, use the command

    ulimit -l

(or "limit memorylocked" for csh-like shells).

If you see a small number such as 32 (the units are KB) then you will
need to increase this limit.  This is usually done for ordinary users
via the file /etc/security/limits.conf.
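
The same limit can also be checked from inside a program.  The
following is a minimal sketch and is not part of libibverbs: it
assumes a POSIX system that provides getrlimit(2) and RLIMIT_MEMLOCK,
and the helper names memlock_limit_allows and print_memlock_limit are
hypothetical.  Note that getrlimit reports the limit in bytes, while
"ulimit -l" reports KB.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: returns 1 if the soft RLIMIT_MEMLOCK limit
 * allows locking at least `needed` bytes, 0 if it does not, and -1
 * if getrlimit() itself fails.
 */
static int memlock_limit_allows(unsigned long long needed)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return (unsigned long long)rl.rlim_cur >= needed ? 1 : 0;
}

/*
 * Hypothetical helper: prints the soft limit in KB, the same unit
 * that "ulimit -l" uses.
 */
static void print_memlock_limit(FILE *out)
{
    struct rlimit rl;

    assert(out != NULL);
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        fprintf(out, "unlimited\n");
    else
        fprintf(out, "%llu KB\n", (unsigned long long)rl.rlim_cur / 1024);
}
```

A program might call memlock_limit_allows() with the total size of
the buffers it plans to register, and print a hint about
/etc/security/limits.conf when the result is 0.
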
More configuration may be necessary if you are logging in via OpenSSH
and your sshd is configured to use privilege separation.

Static linking
--------------

In almost all cases it is better to dynamically link libibverbs into
an application.  However, if you are forced to use static linking for
libibverbs, then you will also have to link a device-specific
userspace driver (such as libmthca, libipathverbs, libehca, etc.)
statically into your application.  This is because of limitations on
dynamically loading new modules into a static executable.  In
particular, a static application can only be linked against a single
device-specific driver, which means that the application will only
work with a single type of device.

This limitation will be removed in future libibverbs releases, but
this will require a change to the libibverbs ABI, so it cannot be done
as part of the libibverbs 1.0 release series.

Valgrind support
----------------

When running applications that use libibverbs under the Valgrind
memory-checking debugger, Valgrind will falsely report "read from
uninitialized" for memory that was initialized by the kernel drivers.
Specifically, Valgrind cannot see when kernel drivers write to
userspace memory, so when the process reads from that memory, Valgrind
incorrectly assumes that the memory contents are uninitialized, and
therefore raises a warning.

libibverbs can be built with specific support for the Valgrind
memory-checking debugger by specifying the --with-valgrind command
line argument to configure.  This flag enables code in libibverbs to
tell Valgrind "this memory may look uninitialized, but it's really
OK," which therefore suppresses the incorrect "read from
uninitialized" warnings.  This code adds trivial overhead to the
critical performance path, so it is disabled by default.
The intent is that production users can use a "normal" build of
libibverbs and developers can use the "valgrind debug" build by simply
switching their LD_LIBRARY_PATH and/or OPENIB_DRIVER_PATH environment
variables.

Libibverbs needs some header files from Valgrind in order to compile
this support; it is important to use the header files from the same
version of Valgrind that will be used at run time.  You may need to
specify the directory where Valgrind's header files are installed as
an argument to --with-valgrind.  For example

    ./configure --with-valgrind=/opt/valgrind

will make the libibverbs build look for Valgrind headers in
/opt/valgrind/include

Reporting bugs
==============

Bugs should be reported to the OpenIB mailing list
<openib-general@openib.org>.  In your bug report, please include:

 * Information about your system:
   - Linux distribution and version
   - Linux kernel and version
   - InfiniBand hardware and firmware version
   - ... any other relevant information

 * How to reproduce the bug.  Command line arguments for a libibverbs
   example program or source code that other developers can compile
   and run are most convenient.

 * If the bug is a crash, the exact output printed out when the crash
   occurred, including any kernel messages produced.

 * If a verbs call is mysteriously returning an error or failing, the
   output of "strace -ewrite -ewrite=all <command>".

Submitting patches
==================

Patches should also be submitted to the OpenIB mailing list
<openib-general@openib.org>.  Please use unified diff form (the -u
option to GNU diff), and include a good description of what your patch
does and why it should be applied.  If your patch fixes a bug, please
make sure to describe the bug and how your fix works.

Please include a change to the ChangeLog file (in standard GNU
changelog format) as part of your patch.

Make sure that your contribution can be licensed under the same
license as the original code you are patching, and that you have all
necessary permissions to release your work.

TODO
====

1.0 series
----------

 * Use the MADV_DONTFORK advice for madvise(2) to make applications
   that use fork(2) work better.

1.1 series
----------

The libibverbs API and ABI are frozen for all releases in the 1.0
series.  The following changes that break API or ABI are planned for
the 1.1 release:

 * Implement memory window (MW) support.  This will break the device
   driver ABI, because new methods will need to be added to struct
   ibv_context_ops.

 * Implement the reregister memory region (MR) verb.  We will add an
   extension to the IB spec to allow the application to indicate that
   the region is only being extended, and that operations in progress
   should _not_ fail (contrary to the IB spec, which states that
   reregister must be implemented so that it behaves equivalently to a
   deregister followed by a register).  This will break the device
   driver ABI, because a new method will need to be added to struct
   ibv_context_ops.

 * Eliminate the dependency on libsysfs by implementing the required
   sysfs handling directly.  This will break the API, because the dev
   and ibdev members of struct ibv_device will be removed.  It will
   also break the device driver ABI, because the signature of the
   driver initialization function will change.  The driver
   initialization function will be changed as part of this work; this
   has the added benefit of allowing us to choose a better name than
   "openib_driver_init."

Other possibilities
-------------------

There are no plans to implement the following features, which would be
needed for completeness but don't seem particularly useful.  However,
if there is demand from application developers or an implementation is
contributed, then the feature may be added.

 * Implement the query address handle (AH) verb.

 * Implement the query memory region (MR) verb.
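
As an illustration of the MADV_DONTFORK item in the 1.0 series TODO
list above, the following is a minimal sketch of how an application
might apply that advice to a buffer itself.  It is not libibverbs
code: it assumes Linux with a kernel and C library that define
MADV_DONTFORK, and the helper name alloc_dontfork is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory and mark
 * the range MADV_DONTFORK, so that a child created by fork(2) does
 * not inherit the mapping.  Returns NULL on failure.
 */
static void *alloc_dontfork(size_t len)
{
    void *buf;

    assert(len > 0);

    /* mmap returns page-aligned memory, which madvise requires. */
    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    if (madvise(buf, len, MADV_DONTFORK) != 0) {
        /* Advice not supported or rejected: release the mapping. */
        munmap(buf, len);
        return NULL;
    }
    return buf;
}
```

A buffer obtained this way is released with munmap(2).  Because the
child does not inherit the range, it must not touch the buffer after
fork(2).
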