sphinx.addnodesdocument)}( rawsourcechildren]( translations LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget&/translations/zh_CN/target/tcmu-designmodnameN classnameN refexplicitutagnamehhh ubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/zh_TW/target/tcmu-designmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/it_IT/target/tcmu-designmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/ja_JP/target/tcmu-designmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/ko_KR/target/tcmu-designmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/sp_SP/target/tcmu-designmodnameN classnameN refexplicituh1hhh ubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h hh _documenthsourceNlineNubhsection)}(hhh](htitle)}(hTCM Userspace Designh]hTCM Userspace Design}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhh@/var/lib/git/docbuild/linux/Documentation/target/tcmu-design.rsthKubhcomment)}(hXContents: 1) Design a) Background b) Benefits c) Design constraints d) Implementation overview i. Mailbox ii. Command ring iii. Data Area e) Device discovery f) Device events g) Other contingencies 2) Writing a user pass-through handler a) Discovering and configuring TCMU uio devices b) Waiting for events on the device(s) c) Managing the command ring 3) A final noteh]hXContents: 1) Design a) Background b) Benefits c) Design constraints d) Implementation overview i. Mailbox ii. Command ring iii. Data Area e) Device discovery f) Device events g) Other contingencies 2) Writing a user pass-through handler a) Discovering and configuring TCMU uio devices b) Waiting for events on the device(s) c) Managing the command ring 3) A final note}hhsbah}(h]h ]h"]h$]h&] xml:spacepreserveuh1hhhhhhhhKubh)}(hhh](h)}(hDesignh]hDesign}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhhhKubh paragraph)}(hTCM is another name for LIO, an in-kernel iSCSI target (server). Existing TCM targets run in the kernel. TCMU (TCM in Userspace) allows userspace programs to be written which act as iSCSI targets. This document describes the design.h]hTCM is another name for LIO, an in-kernel iSCSI target (server). Existing TCM targets run in the kernel. TCMU (TCM in Userspace) allows userspace programs to be written which act as iSCSI targets. This document describes the design.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh)}(hXKThe existing kernel provides modules for different SCSI transport protocols. TCM also modularizes the data storage. There are existing modules for file, block device, RAM or using another SCSI device as storage. These are called "backstores" or "storage engines". These built-in modules are implemented entirely as kernel code.h]hXSThe existing kernel provides modules for different SCSI transport protocols. TCM also modularizes the data storage. There are existing modules for file, block device, RAM or using another SCSI device as storage. These are called “backstores” or “storage engines”. These built-in modules are implemented entirely as kernel code.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK"hhhhubh)}(hhh](h)}(h Backgroundh]h Background}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhhhK)ubh)}(hXIn addition to modularizing the transport protocol used for carrying SCSI commands ("fabrics"), the Linux kernel target, LIO, also modularizes the actual data storage as well. These are referred to as "backstores" or "storage engines". The target comes with backstores that allow a file, a block device, RAM, or another SCSI device to be used for the local storage needed for the exported SCSI LUN. Like the rest of LIO, these are implemented entirely as kernel code.h]hXIn addition to modularizing the transport protocol used for carrying SCSI commands (“fabrics”), the Linux kernel target, LIO, also modularizes the actual data storage as well. These are referred to as “backstores” or “storage engines”. The target comes with backstores that allow a file, a block device, RAM, or another SCSI device to be used for the local storage needed for the exported SCSI LUN. Like the rest of LIO, these are implemented entirely as kernel code.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK+hhhhubh)}(hXThese backstores cover the most common use cases, but not all. One new use case that other non-kernel target solutions, such as tgt, are able to support is using Gluster's GLFS or Ceph's RBD as a backstore. The target then serves as a translator, allowing initiators to store data in these non-traditional networked storage systems, while still only using standard protocols themselves.h]hXThese backstores cover the most common use cases, but not all. One new use case that other non-kernel target solutions, such as tgt, are able to support is using Gluster’s GLFS or Ceph’s RBD as a backstore. The target then serves as a translator, allowing initiators to store data in these non-traditional networked storage systems, while still only using standard protocols themselves.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK3hhhhubh)}(hIf the target is a userspace process, supporting these is easy. tgt, for example, needs only a small adapter module for each, because the modules just use the available userspace libraries for RBD and GLFS.h]hIf the target is a userspace process, supporting these is easy. tgt, for example, needs only a small adapter module for each, because the modules just use the available userspace libraries for RBD and GLFS.}(hj%hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK:hhhhubh)}(hX'Adding support for these backstores in LIO is considerably more difficult, because LIO is entirely kernel code. Instead of undertaking the significant work to port the GLFS or RBD APIs and protocols to the kernel, another approach is to create a userspace pass-through backstore for LIO, "TCMU".h]hX+Adding support for these backstores in LIO is considerably more difficult, because LIO is entirely kernel code. Instead of undertaking the significant work to port the GLFS or RBD APIs and protocols to the kernel, another approach is to create a userspace pass-through backstore for LIO, “TCMU”.}(hj3hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK>hhhhubeh}(h] backgroundah ]h"] backgroundah$]h&]uh1hhhhhhhhK)ubh)}(hhh](h)}(hBenefitsh]hBenefits}(hjLhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjIhhhhhKFubh)}(hX2In addition to allowing relatively easy support for RBD and GLFS, TCMU will also allow easier development of new backstores. TCMU combines with the LIO loopback fabric to become something similar to FUSE (Filesystem in Userspace), but at the SCSI layer instead of the filesystem layer. A SUSE, if you will.h]hX2In addition to allowing relatively easy support for RBD and GLFS, TCMU will also allow easier development of new backstores. TCMU combines with the LIO loopback fabric to become something similar to FUSE (Filesystem in Userspace), but at the SCSI layer instead of the filesystem layer. A SUSE, if you will.}(hjZhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKHhjIhhubh)}(hThe disadvantage is there are more distinct components to configure, and potentially to malfunction. This is unavoidable, but hopefully not fatal if we're careful to keep things as simple as possible.h]hThe disadvantage is there are more distinct components to configure, and potentially to malfunction. This is unavoidable, but hopefully not fatal if we’re careful to keep things as simple as possible.}(hjhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKNhjIhhubeh}(h]benefitsah ]h"]benefitsah$]h&]uh1hhhhhhhhKFubh)}(hhh](h)}(hDesign constraintsh]hDesign constraints}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhj~hhhhhKSubh bullet_list)}(hhh](h list_item)}(h.Good performance: high throughput, low latencyh]h)}(hjh]h.Good performance: high throughput, low latency}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKUhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hSCleanly handle if userspace: 1) never attaches 2) hangs 3) dies 4) misbehaves h](h)}(hCleanly handle if userspace:h]hCleanly handle if userspace:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKVhjubh block_quote)}(h11) never attaches 2) hangs 3) dies 4) misbehaves h]henumerated_list)}(hhh](j)}(hnever attachesh]h)}(hjh]hnever attaches}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKXhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhangsh]h)}(hjh]hhangs}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKYhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hdiesh]h)}(hjh]hdies}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKZhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(h misbehaves h]h)}(h misbehavesh]h misbehaves}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK[hjubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]enumtypearabicprefixhsuffix)uh1jhjubah}(h]h ]h"]h$]h&]uh1jhhhKXhjubeh}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(h9Allow future flexibility in user & kernel implementationsh]h)}(hj@h]h9Allow future flexibility in user & kernel implementations}(hjBhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK]hj>ubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hBe reasonably memory-efficienth]h)}(hjWh]hBe reasonably memory-efficient}(hjYhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK^hjUubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hSimple to configure & runh]h)}(hjnh]hSimple to configure & run}(hjphhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK_hjlubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(h%Simple to write a userspace backend h]h)}(h#Simple to write a userspace backendh]h#Simple to write a userspace backend}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK`hjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubeh}(h]h ]h"]h$]h&]bullet-uh1jhhhKUhj~hhubeh}(h]design-constraintsah ]h"]design constraintsah$]h&]uh1hhhhhhhhKSubh)}(hhh](h)}(hImplementation overviewh]hImplementation overview}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKdubh)}(hXThe core of the TCMU interface is a memory region that is shared between kernel and userspace. Within this region is: a control area (mailbox); a lockless producer/consumer circular buffer for commands to be passed up, and status returned; and an in/out data buffer area.h]hXThe core of the TCMU interface is a memory region that is shared between kernel and userspace. Within this region is: a control area (mailbox); a lockless producer/consumer circular buffer for commands to be passed up, and status returned; and an in/out data buffer area.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKfhjhhubh)}(hXTCMU uses the pre-existing UIO subsystem. UIO allows device driver development in userspace, and this is conceptually very close to the TCMU use case, except instead of a physical device, TCMU implements a memory-mapped layout designed for SCSI commands. Using UIO also benefits TCMU by handling device introspection (e.g. a way for userspace to determine how large the shared region is) and signaling mechanisms in both directions.h]hXTCMU uses the pre-existing UIO subsystem. UIO allows device driver development in userspace, and this is conceptually very close to the TCMU use case, except instead of a physical device, TCMU implements a memory-mapped layout designed for SCSI commands. Using UIO also benefits TCMU by handling device introspection (e.g. a way for userspace to determine how large the shared region is) and signaling mechanisms in both directions.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKkhjhhubh)}(hXThere are no embedded pointers in the memory region. Everything is expressed as an offset from the region's starting address. This allows the ring to still work if the user process dies and is restarted with the region mapped at a different virtual address.h]hXThere are no embedded pointers in the memory region. Everything is expressed as an offset from the region’s starting address. This allows the ring to still work if the user process dies and is restarted with the region mapped at a different virtual address.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKshjhhubh)}(h2See target_core_user.h for the struct definitions.h]h2See target_core_user.h for the struct definitions.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKxhjhhubeh}(h]implementation-overviewah ]h"]implementation overviewah$]h&]uh1hhhhhhhhKdubh)}(hhh](h)}(h The Mailboxh]h The Mailbox}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhK{ubh)}(hX3The mailbox is always at the start of the shared memory region, and contains a version, details about the starting offset and size of the command ring, and head and tail pointers to be used by the kernel and userspace (respectively) to put commands on the ring, and indicate when the commands are completed.h]hX3The mailbox is always at the start of the shared memory region, and contains a version, details about the starting offset and size of the command ring, and head and tail pointers to be used by the kernel and userspace (respectively) to put commands on the ring, and indicate when the commands are completed.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK}hjhhubh)}(h1version - 1 (userspace should abort if otherwise)h]h1version - 1 (userspace should abort if otherwise)}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubhdefinition_list)}(hhh](hdefinition_list_item)}(hflags: - TCMU_MAILBOX_FLAG_CAP_OOOC: indicates out-of-order completion is supported. See "The Command Ring" for details. h](hterm)}(hflags:h]hflags:}(hj6hhhNhNubah}(h]h ]h"]h$]h&]uh1j4hhhKhj0ubh definition)}(hhh]j)}(hhh]j)}(htTCMU_MAILBOX_FLAG_CAP_OOOC: indicates out-of-order completion is supported. See "The Command Ring" for details. h]j*)}(hhh]j/)}(hpTCMU_MAILBOX_FLAG_CAP_OOOC: indicates out-of-order completion is supported. See "The Command Ring" for details. h](j5)}(hTCMU_MAILBOX_FLAG_CAP_OOOC:h]hTCMU_MAILBOX_FLAG_CAP_OOOC:}(hjWhhhNhNubah}(h]h ]h"]h$]h&]uh1j4hhhKhjSubjE)}(hhh]h)}(hSindicates out-of-order completion is supported. See "The Command Ring" for details.h]hWindicates out-of-order completion is supported. See “The Command Ring” for details.}(hjhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjeubah}(h]h ]h"]h$]h&]uh1jDhjSubeh}(h]h ]h"]h$]h&]uh1j.hhhKhjPubah}(h]h ]h"]h$]h&]uh1j)hjLubah}(h]h ]h"]h$]h&]uh1jhjIubah}(h]h ]h"]h$]h&]jjuh1jhhhKhjFubah}(h]h ]h"]h$]h&]uh1jDhj0ubeh}(h]h ]h"]h$]h&]uh1j.hhhKhj+ubj/)}(hzcmdr_off The offset of the start of the command ring from the start of the memory region, to account for the mailbox size.h](j5)}(hcmdr_offh]hcmdr_off}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1j4hhhKhjubjE)}(hhh]h)}(hqThe offset of the start of the command ring from the start of the memory region, to account for the mailbox size.h]hqThe offset of the start of the command ring from the start of the memory region, to account for the mailbox size.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jDhjubeh}(h]h ]h"]h$]h&]uh1j.hhhKhj+hhubj/)}(hRcmdr_size The size of the command ring. This does *not* need to be a power of two.h](j5)}(h cmdr_sizeh]h cmdr_size}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1j4hhhKhjubjE)}(hhh]h)}(hHThe size of the command ring. This does *not* need to be a power of two.h](h(The size of the command ring. This does }(hjhhhNhNubhemphasis)}(h*not*h]hnot}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh need to be a power of two.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jDhjubeh}(h]h ]h"]h$]h&]uh1j.hhhKhj+hhubj/)}(hWcmd_head Modified by the kernel to indicate when a command has been placed on the ring.h](j5)}(hcmd_headh]hcmd_head}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1j4hhhKhjubjE)}(hhh]h)}(hNModified by the kernel to indicate when a command has been placed on the ring.h]hNModified by the kernel to indicate when a command has been placed on the ring.}(hj'hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj$ubah}(h]h ]h"]h$]h&]uh1jDhjubeh}(h]h ]h"]h$]h&]uh1j.hhhKhj+hhubj/)}(hZcmd_tail Modified by userspace to indicate when it has completed processing of a command. h](j5)}(hcmd_tailh]hcmd_tail}(hjEhhhNhNubah}(h]h ]h"]h$]h&]uh1j4hhhKhjAubjE)}(hhh]h)}(hPModified by userspace to indicate when it has completed processing of a command.h]hPModified by userspace to indicate when it has completed processing of a command.}(hjVhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjSubah}(h]h ]h"]h$]h&]uh1jDhjAubeh}(h]h ]h"]h$]h&]uh1j.hhhKhj+hhubeh}(h]h ]h"]h$]h&]uh1j)hjhhhNhNubeh}(h] the-mailboxah ]h"] the mailboxah$]h&]uh1hhhhhhhhK{ubh)}(hhh](h)}(hThe Command Ringh]hThe Command Ring}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhj~hhhhhKubh)}(hXCommands are placed on the ring by the kernel incrementing mailbox.cmd_head by the size of the command, modulo cmdr_size, and then signaling userspace via uio_event_notify(). Once the command is completed, userspace updates mailbox.cmd_tail in the same way and signals the kernel via a 4-byte write(). When cmd_head equals cmd_tail, the ring is empty -- no commands are currently waiting to be processed by userspace.h]hXCommands are placed on the ring by the kernel incrementing mailbox.cmd_head by the size of the command, modulo cmdr_size, and then signaling userspace via uio_event_notify(). Once the command is completed, userspace updates mailbox.cmd_tail in the same way and signals the kernel via a 4-byte write(). When cmd_head equals cmd_tail, the ring is empty -- no commands are currently waiting to be processed by userspace.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hXTCMU commands are 8-byte aligned. They start with a common header containing "len_op", a 32-bit value that stores the length, as well as the opcode in the lowest unused bits. It also contains cmd_id and flags fields for setting by the kernel (kflags) and userspace (uflags).h]hXTCMU commands are 8-byte aligned. They start with a common header containing “len_op”, a 32-bit value that stores the length, as well as the opcode in the lowest unused bits. It also contains cmd_id and flags fields for setting by the kernel (kflags) and userspace (uflags).}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hDCurrently only two opcodes are defined, TCMU_OP_CMD and TCMU_OP_PAD.h]hDCurrently only two opcodes are defined, TCMU_OP_CMD and TCMU_OP_PAD.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hXWhen the opcode is CMD, the entry in the command ring is a struct tcmu_cmd_entry. Userspace finds the SCSI CDB (Command Data Block) via tcmu_cmd_entry.req.cdb_off. This is an offset from the start of the overall shared memory region, not the entry. The data in/out buffers are accessible via the req.iov[] array. iov_cnt contains the number of entries in iov[] needed to describe either the Data-In or Data-Out buffers. For bidirectional commands, iov_cnt specifies how many iovec entries cover the Data-Out area, and iov_bidi_cnt specifies how many iovec entries immediately after that in iov[] cover the Data-In area. Just like other fields, iov.iov_base is an offset from the start of the region.h]hXWhen the opcode is CMD, the entry in the command ring is a struct tcmu_cmd_entry. Userspace finds the SCSI CDB (Command Data Block) via tcmu_cmd_entry.req.cdb_off. This is an offset from the start of the overall shared memory region, not the entry. The data in/out buffers are accessible via the req.iov[] array. iov_cnt contains the number of entries in iov[] needed to describe either the Data-In or Data-Out buffers. For bidirectional commands, iov_cnt specifies how many iovec entries cover the Data-Out area, and iov_bidi_cnt specifies how many iovec entries immediately after that in iov[] cover the Data-In area. Just like other fields, iov.iov_base is an offset from the start of the region.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hWhen completing a command, userspace sets rsp.scsi_status, and rsp.sense_buffer if necessary. Userspace then increments mailbox.cmd_tail by entry.hdr.length (mod cmdr_size) and signals the kernel via the UIO method, a 4-byte write to the file descriptor.h]hWhen completing a command, userspace sets rsp.scsi_status, and rsp.sense_buffer if necessary. Userspace then increments mailbox.cmd_tail by entry.hdr.length (mod cmdr_size) and signals the kernel via the UIO method, a 4-byte write to the file descriptor.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hXIf TCMU_MAILBOX_FLAG_CAP_OOOC is set for mailbox->flags, kernel is capable of handling out-of-order completions. In this case, userspace can handle command in different order other than original. Since kernel would still process the commands in the same order it appeared in the command ring, userspace need to update the cmd->id when completing the command(a.k.a steal the original command's entry).h]hXIf TCMU_MAILBOX_FLAG_CAP_OOOC is set for mailbox->flags, kernel is capable of handling out-of-order completions. In this case, userspace can handle command in different order other than original. Since kernel would still process the commands in the same order it appeared in the command ring, userspace need to update the cmd->id when completing the command(a.k.a steal the original command’s entry).}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hWhen the opcode is PAD, userspace only updates cmd_tail as above -- it's a no-op. (The kernel inserts PAD entries to ensure each CMD entry is contiguous within the command ring.)h]hWhen the opcode is PAD, userspace only updates cmd_tail as above -- it’s a no-op. (The kernel inserts PAD entries to ensure each CMD entry is contiguous within the command ring.)}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubh)}(hMore opcodes may be added in the future. If userspace encounters an opcode it does not handle, it must set UNKNOWN_OP bit (bit 0) in hdr.uflags, update cmd_tail, and proceed with processing additional commands, if any.h]hMore opcodes may be added in the future. If userspace encounters an opcode it does not handle, it must set UNKNOWN_OP bit (bit 0) in hdr.uflags, update cmd_tail, and proceed with processing additional commands, if any.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj~hhubeh}(h]the-command-ringah ]h"]the command ringah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(h The Data Areah]h The Data Area}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(hThis is shared-memory space after the command ring. The organization of this area is not defined in the TCMU interface, and userspace should access only the parts referenced by pending iovs.h]hThis is shared-memory space after the command ring. The organization of this area is not defined in the TCMU interface, and userspace should access only the parts referenced by pending iovs.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubeh}(h] the-data-areaah ]h"] the data areaah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hDevice Discoveryh]hDevice Discovery}(hj1hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj.hhhhhKubh)}(hXOther devices may be using UIO besides TCMU. Unrelated user processes may also be handling different sets of TCMU devices. TCMU userspace processes must find their devices by scanning sysfs class/uio/uio*/name. For TCMU devices, these names will be of the format::h]hXOther devices may be using UIO besides TCMU. Unrelated user processes may also be handling different sets of TCMU devices. TCMU userspace processes must find their devices by scanning sysfs class/uio/uio*/name. For TCMU devices, these names will be of the format:}(hj?hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj.hhubh literal_block)}(h1tcm-user////h]h1tcm-user////}hjOsbah}(h]h ]h"]h$]h&]hhuh1jMhhhKhj.hhubh)}(hwhere "tcm-user" is common for all TCMU-backed UIO devices. and allow userspace to find the device's path in the kernel target's configfs tree. Assuming the usual mount point, it is found at::h]hwhere “tcm-user” is common for all TCMU-backed UIO devices. and allow userspace to find the device’s path in the kernel target’s configfs tree. Assuming the usual mount point, it is found at:}(hj]hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj.hhubjN)}(h;/sys/kernel/config/target/core/user_/h]h;/sys/kernel/config/target/core/user_/}hjksbah}(h]h ]h"]h$]h&]hhuh1jMhhhKhj.hhubh)}(hnThis location contains attributes such as "hw_block_size", that userspace needs to know for correct operation.h]hrThis location contains attributes such as “hw_block_size”, that userspace needs to know for correct operation.}(hjyhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj.hhubh)}(hX& will be a userspace-process-unique string to identify the TCMU device as expecting to be backed by a certain handler, and will be an additional handler-specific string for the user process to configure the device, if needed. The name cannot contain ':', due to LIO limitations.h]hX* will be a userspace-process-unique string to identify the TCMU device as expecting to be backed by a certain handler, and will be an additional handler-specific string for the user process to configure the device, if needed. The name cannot contain ‘:’, due to LIO limitations.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj.hhubh)}(hRFor all devices so discovered, the user handler opens /dev/uioX and calls mmap()::h]hQFor all devices so discovered, the user handler opens /dev/uioX and calls mmap():}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj.hhubjN)}(h9mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0)h]h9mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0)}hjsbah}(h]h ]h"]h$]h&]hhuh1jMhhhKhj.hhubh)}(hSwhere size must be equal to the value read from /sys/class/uio/uioX/maps/map0/size.h]hSwhere size must be equal to the value read from /sys/class/uio/uioX/maps/map0/size.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj.hhubeh}(h]device-discoveryah ]h"]device discoveryah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(h Device Eventsh]h Device Events}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(hXIf a new device is added or removed, a notification will be broadcast over netlink, using a generic netlink family name of "TCM-USER" and a multicast group named "config". This will include the UIO name as described in the previous section, as well as the UIO minor number. This should allow userspace to identify both the UIO device and the LIO device, so that after determining the device is supported (based on subtype) it can take the appropriate action.h]hXIf a new device is added or removed, a notification will be broadcast over netlink, using a generic netlink family name of “TCM-USER” and a multicast group named “config”. This will include the UIO name as described in the previous section, as well as the UIO minor number. This should allow userspace to identify both the UIO device and the LIO device, so that after determining the device is supported (based on subtype) it can take the appropriate action.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubeh}(h] device-eventsah ]h"] device eventsah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hOther contingenciesh]hOther contingencies}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMubh)}(h)Userspace handler process never attaches:h]h)Userspace handler process never attaches:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubj)}(hhh]j)}(hRTCMU will post commands, and then abort them after a timeout period (30 seconds.) h]h)}(hQTCMU will post commands, and then abort them after a timeout period (30 seconds.)h]hQTCMU will post commands, and then abort them after a timeout period (30 seconds.)}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjubah}(h]h ]h"]h$]h&]uh1jhj hhhhhNubah}(h]h ]h"]h$]h&]jjuh1jhhhMhjhhubh)}(h$Userspace handler process is killed:h]h$Userspace handler process is killed:}(hj.hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM hjhhubj)}(hhh]j)}(hIt is still possible to restart and re-connect to TCMU devices. Command ring is preserved. However, after the timeout period, the kernel will abort pending tasks. h]h)}(hIt is still possible to restart and re-connect to TCMU devices. Command ring is preserved. However, after the timeout period, the kernel will abort pending tasks.h]hIt is still possible to restart and re-connect to TCMU devices. Command ring is preserved. However, after the timeout period, the kernel will abort pending tasks.}(hjChhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM hj?ubah}(h]h ]h"]h$]h&]uh1jhj<hhhhhNubah}(h]h ]h"]h$]h&]jjuh1jhhhM hjhhubh)}(h Userspace handler process hangs:h]h Userspace handler process hangs:}(hj]hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubj)}(hhh]j)}(h int handle_device_events(int fd, void *map) { struct tcmu_mailbox *mb = map; struct tcmu_cmd_entry *ent = (void *) mb + mb->cmdr_off + mb->cmd_tail; int did_some_work = 0; /* Process events from cmd ring until we catch up with cmd_head */ while (ent != (void *)mb + mb->cmdr_off + mb->cmd_head) { if (tcmu_hdr_get_op(ent->hdr.len_op) == TCMU_OP_CMD) { uint8_t *cdb = (void *)mb + ent->req.cdb_off; bool success = true; /* Handle command here. */ printf("SCSI opcode: 0x%x\n", cdb[0]); /* Set response fields */ if (success) ent->rsp.scsi_status = SCSI_NO_SENSE; else { /* Also fill in rsp->sense_buffer here */ ent->rsp.scsi_status = SCSI_CHECK_CONDITION; } } else if (tcmu_hdr_get_op(ent->hdr.len_op) != TCMU_OP_PAD) { /* Tell the kernel we didn't handle unknown opcodes */ ent->hdr.uflags |= TCMU_UFLAG_UNKNOWN_OP; } else { /* Do nothing for PAD entries except update cmd_tail */ } /* update cmd_tail */ mb->cmd_tail = (mb->cmd_tail + tcmu_hdr_get_len(&ent->hdr)) % mb->cmdr_size; ent = (void *) mb + mb->cmdr_off + mb->cmd_tail; did_some_work = 1; } /* Notify the kernel that work has been finished */ if (did_some_work) { uint32_t buf = 0; write(fd, &buf, 4); } return 0; } h](h)}(hManaging the command ring::h]hManaging the command ring:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMZhjubjN)}(hXd#include int handle_device_events(int fd, void *map) { struct tcmu_mailbox *mb = map; struct tcmu_cmd_entry *ent = (void *) mb + mb->cmdr_off + mb->cmd_tail; int did_some_work = 0; /* Process events from cmd ring until we catch up with cmd_head */ while (ent != (void *)mb + mb->cmdr_off + mb->cmd_head) { if (tcmu_hdr_get_op(ent->hdr.len_op) == TCMU_OP_CMD) { uint8_t *cdb = (void *)mb + ent->req.cdb_off; bool success = true; /* Handle command here. */ printf("SCSI opcode: 0x%x\n", cdb[0]); /* Set response fields */ if (success) ent->rsp.scsi_status = SCSI_NO_SENSE; else { /* Also fill in rsp->sense_buffer here */ ent->rsp.scsi_status = SCSI_CHECK_CONDITION; } } else if (tcmu_hdr_get_op(ent->hdr.len_op) != TCMU_OP_PAD) { /* Tell the kernel we didn't handle unknown opcodes */ ent->hdr.uflags |= TCMU_UFLAG_UNKNOWN_OP; } else { /* Do nothing for PAD entries except update cmd_tail */ } /* update cmd_tail */ mb->cmd_tail = (mb->cmd_tail + tcmu_hdr_get_len(&ent->hdr)) % mb->cmdr_size; ent = (void *) mb + mb->cmdr_off + mb->cmd_tail; did_some_work = 1; } /* Notify the kernel that work has been finished */ if (did_some_work) { uint32_t buf = 0; write(fd, &buf, 4); } return 0; }h]hXd#include int handle_device_events(int fd, void *map) { struct tcmu_mailbox *mb = map; struct tcmu_cmd_entry *ent = (void *) mb + mb->cmdr_off + mb->cmd_tail; int did_some_work = 0; /* Process events from cmd ring until we catch up with cmd_head */ while (ent != (void *)mb + mb->cmdr_off + mb->cmd_head) { if (tcmu_hdr_get_op(ent->hdr.len_op) == TCMU_OP_CMD) { uint8_t *cdb = (void *)mb + ent->req.cdb_off; bool success = true; /* Handle command here. */ printf("SCSI opcode: 0x%x\n", cdb[0]); /* Set response fields */ if (success) ent->rsp.scsi_status = SCSI_NO_SENSE; else { /* Also fill in rsp->sense_buffer here */ ent->rsp.scsi_status = SCSI_CHECK_CONDITION; } } else if (tcmu_hdr_get_op(ent->hdr.len_op) != TCMU_OP_PAD) { /* Tell the kernel we didn't handle unknown opcodes */ ent->hdr.uflags |= TCMU_UFLAG_UNKNOWN_OP; } else { /* Do nothing for PAD entries except update cmd_tail */ } /* update cmd_tail */ mb->cmd_tail = (mb->cmd_tail + tcmu_hdr_get_len(&ent->hdr)) % mb->cmdr_size; ent = (void *) mb + mb->cmdr_off + mb->cmd_tail; did_some_work = 1; } /* Notify the kernel that work has been finished */ if (did_some_work) { uint32_t buf = 0; write(fd, &buf, 4); } return 0; }}hjsbah}(h]h ]h"]h$]h&]hhuh1jMhhhM\hjubeh}(h]h ]h"]h$]h&]uh1jhjhhhhhNubah}(h]h ]h"]h$]h&]j-j9j/hj0j1startKuh1jhjhhhhhMZubeh}(h]5writing-a-user-pass-through-handler-with-example-codeah ]h"]7writing a user pass-through handler (with example code)ah$]h&]uh1hhhhhhhhMubh)}(hhh](h)}(h A final noteh]h A final note}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMubh)}(hPlease be careful to return codes as defined by the SCSI specifications. These are different than some values defined in the scsi/scsi.h include file. For example, CHECK CONDITION's status code is 2, not 1.h]hPlease be careful to return codes as defined by the SCSI specifications. These are different than some values defined in the scsi/scsi.h include file. For example, CHECK CONDITION’s status code is 2, not 1.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubeh}(h] a-final-noteah ]h"] a final noteah$]h&]uh1hhhhhhhhMubeh}(h]tcm-userspace-designah ]h"]tcm userspace designah$]h&]uh1hhhhhhhhKubeh}(h]h ]h"]h$]h&]sourcehuh1hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksentryfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerjerror_encodingutf-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh _destinationN _config_files]7/var/lib/git/docbuild/linux/Documentation/docutils.confafile_insertion_enabled raw_enabledKline_length_limitM'pep_referencesN pep_base_urlhttps://peps.python.org/pep_file_url_templatepep-%04drfc_referencesN rfc_base_url&https://datatracker.ietf.org/doc/html/ tab_widthKtrim_footnote_reference_spacesyntax_highlightlong smart_quotessmartquotes_locales]character_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xform image_loadinglinkembed_stylesheetcloak_email_addressessection_self_linkenvNubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}nameids}(jjjjjFjCj{jxjjjjj{jxjjj+j(jjjjjjjjjju nametypes}(jjjFj{jjj{jj+jjjjjuh}(jhjhjChjxjIjj~jjjxjjj~j(jjj.jjjjjjjju footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startK id_counter collectionsCounter}Rparse_messages]hsystem_message)}(hhh]h)}(h:Enumerated list start value not ordinal-1: "c" (ordinal 3)h]h>Enumerated list start value not ordinal-1: “c” (ordinal 3)}(hjvhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjsubah}(h]h ]h"]h$]h&]levelKtypeINFOsourcehlineKuh1jqhjhhhhhMZubatransform_messages] transformerN include_log] decorationNhhub.