Introduction
============

libibverbs is a library that allows programs to use InfiniBand "verbs"
for direct access to IB hardware from userspace.  For more information
on verbs, see the InfiniBand Architecture Specification vol. 1,
especially chapter 11.

Using libibverbs
================

Device nodes
------------

The verbs library expects special character device files named
/dev/infiniband/uverbsN to be created.  When you load the kernel
modules, including both the low-level driver for your IB hardware and
the ib_uverbs module, you should see one or more uverbsN
entries in /sys/class/infiniband_verbs in addition to the
/dev/infiniband/uverbsN character device files.
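
As a rough illustration, the uverbsN entries can also be listed from a
program.  The following C sketch assumes the sysfs layout described
above; the helper names is_uverbs_name and count_uverbs_entries are
hypothetical and are not part of libibverbs.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if name looks like "uverbsN" (N = one or more digits), else 0. */
int is_uverbs_name(const char *name)
{
    static const char prefix[] = "uverbs";
    size_t plen = sizeof(prefix) - 1;
    const char *p;

    if (strncmp(name, prefix, plen) != 0)
        return 0;
    if (name[plen] == '\0')
        return 0;               /* no device number after the prefix */
    for (p = name + plen; *p != '\0'; p++) {
        if (!isdigit((unsigned char) *p))
            return 0;
    }
    return 1;
}

/*
 * Print every uverbsN entry found in dir_path and return how many
 * there were, or -1 if the directory could not be opened.
 */
int count_uverbs_entries(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    struct dirent *ent;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        if (is_uverbs_name(ent->d_name)) {
            printf("%s/%s\n", dir_path, ent->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

Calling count_uverbs_entries("/sys/class/infiniband_verbs") prints each
matching entry.  A return value of -1 means the directory could not be
opened, which is what one would expect if ib_uverbs is not loaded.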

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL=="uverbs*", NAME="infiniband/%k"

can be used.  This will create device nodes named

    /dev/infiniband/uverbs0

and so on.  Since the InfiniBand userspace verbs should be safe for
use by non-privileged users, you may want to add an appropriate MODE
or GROUP to your udev rule.

Permissions
-----------

To use IB verbs from userspace, a process must be able to access the
appropriate /dev/infiniband/uverbsN special device file.  You can
check the permissions on this file with the command

	ls -l /dev/infiniband/uverbs*

Make sure that the permissions on these files are such that the
user/group that your verbs program runs as can access the device file.
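
The same check can be made from inside a program.  This is a minimal
sketch, assuming that a verbs program needs both read and write access
to the device file; the helper name can_access_device is hypothetical.

```c
#include <unistd.h>

/*
 * Return 1 if the calling process may read and write path, else 0.
 * Note that access(2) checks against the real (not effective) user
 * and group IDs of the process.
 */
int can_access_device(const char *path)
{
    return access(path, R_OK | W_OK) == 0;
}
```

For example, can_access_device("/dev/infiniband/uverbs0") returns 0 if
the file does not exist or if its permissions exclude the current
user.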

To use IB verbs from userspace, a process must also have permission to
tell the kernel to lock sufficient memory for all of your registered
memory regions as well as the memory used internally by IB resources
such as queue pairs (QPs) and completion queues (CQs).  To check your
resource limits, use the command

	ulimit -l

(or "limit memorylocked" for csh-like shells).

If you see a small number such as 32 (the units are KB) then you will
need to increase this limit.  This is usually done for ordinary users
via the file /etc/security/limits.conf.  More configuration may be
necessary if you are logging in via OpenSSH and your sshd is
configured to use privilege separation.
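
A program can also inspect its own limit at startup and warn the user
early.  The sketch below uses getrlimit(2) with RLIMIT_MEMLOCK, which
is the limit that "ulimit -l" reports; the helper names are
hypothetical, and what counts as "too small" is left to the caller.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Fetch the soft RLIMIT_MEMLOCK limit for the calling process.
 * Returns 0 on success and -1 if getrlimit() fails.  On success,
 * *unlimited is 1 if there is no limit; otherwise it is 0 and
 * *bytes holds the limit in bytes.
 */
int get_memlock_limit(unsigned long long *bytes, int *unlimited)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
        return -1;

    *unlimited = (lim.rlim_cur == RLIM_INFINITY);
    *bytes = *unlimited ? 0 : (unsigned long long) lim.rlim_cur;
    return 0;
}

/* Print the limit in KB, the unit that "ulimit -l" uses. */
void print_memlock_limit(void)
{
    unsigned long long bytes;
    int unlimited;

    if (get_memlock_limit(&bytes, &unlimited) != 0) {
        perror("getrlimit");
        return;
    }
    if (unlimited)
        printf("locked memory limit: unlimited\n");
    else
        printf("locked memory limit: %llu KB\n", bytes / 1024);
}
```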

Static linking
--------------

In almost all cases it is better to dynamically link libibverbs into
an application.  However, if you are forced to use static linking for
libibverbs, then you will also have to link a device-specific
userspace driver (such as libmthca, libipathverbs, libehca, etc.)
statically into your application.  This is because of limitations on
dynamically loading new modules into a static executable.

In particular, a static application can only be linked against a
single device-specific driver, which means that the application will
only work with a single type of device.  This limitation will be
removed in future libibverbs releases, but this will require a change
to the libibverbs ABI, so it cannot be done as part of the libibverbs
1.0 release series.

Valgrind support
----------------

When running applications that use libibverbs under the Valgrind
memory-checking debugger, Valgrind will falsely report "read from
uninitialized" for memory that was initialized by the kernel drivers.
Specifically, Valgrind cannot see when kernel drivers write to
userspace memory, so when the process reads from that memory, Valgrind
incorrectly assumes that the memory contents are uninitialized, and
therefore raises a warning.

libibverbs can be built with specific support for the Valgrind
memory-checking debugger by specifying the --with-valgrind command
line argument to configure.  This flag enables code in libibverbs to
tell Valgrind "this memory may look uninitialized, but it's really
OK," which therefore suppresses the incorrect "read from
uninitialized" warnings.  This code adds a small amount of overhead to
the critical performance path, so it is disabled by default.  The
intent is that production users can use a "normal" build of libibverbs
and developers can use the "valgrind debug" build by simply switching
their LD_LIBRARY_PATH and/or OPENIB_DRIVER_PATH environment variables.
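
The kind of annotation involved can be sketched with Memcheck's client
request macro VALGRIND_MAKE_MEM_DEFINED.  This is only an
illustration, not the code libibverbs actually contains: the helper
name mark_kernel_written is hypothetical, and the fallback definition
is an assumption made so that the sketch still compiles when
Valgrind's headers are not installed.

```c
#include <stddef.h>

/* Use Valgrind's header when the compiler can find it. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif

/* Assumed fallback: without the header, the annotation does nothing. */
#ifndef VALGRIND_MAKE_MEM_DEFINED
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len) \
        ((void) (addr), (void) (len), 0)
#endif

/*
 * Tell Valgrind that len bytes at buf hold valid data, for example
 * because a kernel driver filled them in.  The memory itself is not
 * modified.
 */
void mark_kernel_written(void *buf, size_t len)
{
    (void) VALGRIND_MAKE_MEM_DEFINED(buf, len);
}
```

Outside of Valgrind the client request does no useful work, but it
still costs a few instructions, which is consistent with keeping the
support out of the default build.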

Libibverbs needs some header files from Valgrind in order to compile
this support; it is important to use the header files from the same
version of Valgrind that will be used at run time.  You may need to
specify the directory where Valgrind's header files are installed as
an argument to --with-valgrind.  For example

	./configure --with-valgrind=/opt/valgrind

will make the libibverbs build look for Valgrind headers in
/opt/valgrind/include.

Reporting bugs
==============

Bugs should be reported to the OpenIB mailing list
<openib-general@openib.org>.  In your bug report, please include:

 * Information about your system:
   - Linux distribution and version
   - Linux kernel and version
   - InfiniBand hardware and firmware version
   - ... any other relevant information

 * How to reproduce the bug.  Command line arguments for a libibverbs
   example program, or source code that other developers can
   compile and run, are most convenient.

 * If the bug is a crash, the exact output printed out when the crash
   occurred, including any kernel messages produced.

 * If a verbs call is mysteriously returning an error or failing, the
   output of "strace -ewrite -ewrite=all <command>".

Submitting patches
==================

Patches should also be submitted to the OpenIB mailing list
<openib-general@openib.org>.  Please use unified diff form (the -u
option to GNU diff), and include a good description of what your patch
does and why it should be applied.  If your patch fixes a bug, please
make sure to describe the bug and how your fix works.

Please include a change to the ChangeLog file (in standard GNU
changelog format) as part of your patch.

Make sure that your contribution can be licensed under the same
license as the original code you are patching, and that you have all
necessary permissions to release your work.

TODO
====

1.0 series
----------

 * Use the MADV_DONTFORK advice for madvise(2) to make applications
   that use fork(2) work better.
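
For illustration only, this is roughly what applying MADV_DONTFORK to
a buffer looks like on Linux (2.6.16 or later).  It sketches the
madvise(2) call itself, not how libibverbs will use it; the helper
names are hypothetical.

```c
#define _DEFAULT_SOURCE   /* for madvise() and MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of private, zero-filled, page-aligned memory. */
void *alloc_anon_pages(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Ask the kernel not to make the range available to children created
 * by fork(2).  addr must be page-aligned.  Returns 0 on success, or
 * -1 with errno set.
 */
int exclude_from_fork(void *addr, size_t len)
{
    return madvise(addr, len, MADV_DONTFORK);
}
```

According to madvise(2), this prevents copy-on-write from changing the
physical location of a page when the parent writes to it after a
fork, which matters for hardware that performs DMA into that page.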

1.1 series
----------

The libibverbs API and ABI are frozen for all releases in the 1.0
series.  The following changes that break API or ABI are planned for
the 1.1 release:

 * Implement memory window (MW) support.  This will break the
   device driver ABI, because new methods will need to be added to
   struct ibv_context_ops.

 * Implement the reregister memory region (MR) verb.  We will add an
   extension to the IB spec to allow the application to indicate that
   the region is only being extended, and that operations in progress
   should _not_ fail (contrary to the IB spec, which states that
   reregister must be implemented so that it behaves equivalently to a
   deregister followed by a register).  This will break the device
   driver ABI, because a new method will need to be added to struct
   ibv_context_ops. 

 * Eliminate the dependency on libsysfs by implementing the required
   sysfs handling directly.  This will break the API, because the dev
   and ibdev members of struct ibv_device will be removed.  It will
   also break the device driver ABI, because the signature of the
   driver initialization function will change.  Changing that
   function has the added benefit of allowing us to choose a better
   name than "openib_driver_init".

Other possibilities
-------------------

There are no plans to implement the following features, which would be
needed for completeness but don't seem particularly useful.  However,
if there is demand from application developers or an implementation is
contributed, then the feature may be added.

 * Implement the query address handle (AH) verb.
 * Implement the query memory region (MR) verb.