*sphinx.addnodesdocument)}( rawsourcechildren]( translations LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget%/translations/zh_CN/bpf/bpf_iteratorsmodnameN classnameN refexplicitutagnamehhh ubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget%/translations/zh_TW/bpf/bpf_iteratorsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget%/translations/it_IT/bpf/bpf_iteratorsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget%/translations/ja_JP/bpf/bpf_iteratorsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget%/translations/ko_KR/bpf/bpf_iteratorsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget%/translations/sp_SP/bpf/bpf_iteratorsmodnameN classnameN refexplicituh1hhh ubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h hh _documenthsourceNlineNubhsection)}(hhh](htitle)}(h BPF Iteratorsh]h BPF Iterators}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhh?/var/lib/git/docbuild/linux/Documentation/bpf/bpf_iterators.rsthKubh)}(hhh](h)}(hOverviewh]hOverview}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhhhKubh paragraph)}(hX?BPF supports two separate entities collectively known as "BPF iterators": BPF iterator *program type* and *open-coded* BPF iterators. The former is a stand-alone BPF program type which, when attached and activated by user, will be called once for each entity (task_struct, cgroup, etc) that is being iterated. The latter is a set of BPF-side APIs implementing iterator functionality and available across multiple BPF program types. Open-coded iterators provide similar functionality to BPF iterator programs, but gives more flexibility and control to all other BPF program types. BPF iterator programs, on the other hand, can be used to implement anonymous or BPF FS-mounted special files, whose contents are generated by attached BPF iterator program, backed by seq_file functionality. Both are useful depending on specific needs.h](h[BPF supports two separate entities collectively known as “BPF iterators”: BPF iterator }(hhhhhNhNubhemphasis)}(h*program type*h]h program type}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhubh and }(hhhhhNhNubh)}(h *open-coded*h]h open-coded}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhubhX BPF iterators. The former is a stand-alone BPF program type which, when attached and activated by user, will be called once for each entity (task_struct, cgroup, etc) that is being iterated. The latter is a set of BPF-side APIs implementing iterator functionality and available across multiple BPF program types. Open-coded iterators provide similar functionality to BPF iterator programs, but gives more flexibility and control to all other BPF program types. BPF iterator programs, on the other hand, can be used to implement anonymous or BPF FS-mounted special files, whose contents are generated by attached BPF iterator program, backed by seq_file functionality. Both are useful depending on specific needs.}(hhhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK hhhhubh)}(hXWhen adding a new BPF iterator program, it is expected that similar functionality will be added as open-coded iterator for maximum flexibility. It's also expected that iteration logic and code will be maximally shared and reused between two iterator API surfaces.h]hX When adding a new BPF iterator program, it is expected that similar functionality will be added as open-coded iterator for maximum flexibility. It’s also expected that iteration logic and code will be maximally shared and reused between two iterator API surfaces.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhhhubeh}(h]overviewah ]h"]overviewah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hOpen-coded BPF Iteratorsh]hOpen-coded BPF Iterators}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(hX;Open-coded BPF iterators are implemented as tightly-coupled trios of kfuncs (constructor, next element fetch, destructor) and iterator-specific type describing on-the-stack iterator state, which is guaranteed by the BPF verifier to not be tampered with outside of the corresponding constructor/destructor/next APIs.h]hX;Open-coded BPF iterators are implemented as tightly-coupled trios of kfuncs (constructor, next element fetch, destructor) and iterator-specific type describing on-the-stack iterator state, which is guaranteed by the BPF verifier to not be tampered with outside of the corresponding constructor/destructor/next APIs.}(hj%hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hXzEach kind of open-coded BPF iterator has its own associated struct bpf_iter_, where denotes a specific type of iterator. bpf_iter_ state needs to live on BPF program stack, so make sure it's small enough to fit on BPF stack. For performance reasons its best to avoid dynamic memory allocation for iterator state and size the state struct big enough to fit everything necessary. But if necessary, dynamic memory allocation is a way to bypass BPF stack limitations. Note, state struct size is part of iterator's user-visible API, so changing it will break backwards compatibility, so be deliberate about designing it.h]hX~Each kind of open-coded BPF iterator has its own associated struct bpf_iter_, where denotes a specific type of iterator. bpf_iter_ state needs to live on BPF program stack, so make sure it’s small enough to fit on BPF stack. For performance reasons its best to avoid dynamic memory allocation for iterator state and size the state struct big enough to fit everything necessary. But if necessary, dynamic memory allocation is a way to bypass BPF stack limitations. Note, state struct size is part of iterator’s user-visible API, so changing it will break backwards compatibility, so be deliberate about designing it.}(hj3hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK%hjhhubh)}(hXeAll kfuncs (constructor, next, destructor) have to be named consistently as bpf_iter__{new,next,destroy}(), respectively. represents iterator type, and iterator state should be represented as a matching `struct bpf_iter_` state type. Also, all iter kfuncs should have a pointer to this `struct bpf_iter_` as the very first argument.h](hAll kfuncs (constructor, next, destructor) have to be named consistently as bpf_iter__{new,next,destroy}(), respectively. represents iterator type, and iterator state should be represented as a matching }(hjAhhhNhNubhtitle_reference)}(h`struct bpf_iter_`h]hstruct bpf_iter_}(hjKhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjAubhA state type. Also, all iter kfuncs should have a pointer to this }(hjAhhhNhNubjJ)}(h`struct bpf_iter_`h]hstruct bpf_iter_}(hj]hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjAubh as the very first argument.}(hjAhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK/hjhhubhdefinition_list)}(hhh]hdefinition_list_item)}(hXTAdditionally: - Constructor, i.e., `bpf_iter__new()`, can have arbitrary extra number of arguments. Return type is not enforced either. - Next method, i.e., `bpf_iter__next()`, has to return a pointer type and should have exactly one argument: `struct bpf_iter_ *` (const/volatile/restrict and typedefs are ignored). - Destructor, i.e., `bpf_iter__destroy()`, should return void and should have exactly one argument, similar to the next method. - `struct bpf_iter_` size is enforced to be positive and a multiple of 8 bytes (to fit stack slots correctly). h](hterm)}(h Additionally:h]h Additionally:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhhhK>hj|ubh definition)}(hhh]h bullet_list)}(hhh](h list_item)}(h}Constructor, i.e., `bpf_iter__new()`, can have arbitrary extra number of arguments. Return type is not enforced either.h]h)}(h}Constructor, i.e., `bpf_iter__new()`, can have arbitrary extra number of arguments. Return type is not enforced either.h](hConstructor, i.e., }(hjhhhNhNubjJ)}(h`bpf_iter__new()`h]hbpf_iter__new()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubhS, can have arbitrary extra number of arguments. Return type is not enforced either.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK6hjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hNext method, i.e., `bpf_iter__next()`, has to return a pointer type and should have exactly one argument: `struct bpf_iter_ *` (const/volatile/restrict and typedefs are ignored).h]h)}(hNext method, i.e., `bpf_iter__next()`, has to return a pointer type and should have exactly one argument: `struct bpf_iter_ *` (const/volatile/restrict and typedefs are ignored).h](hNext method, i.e., }(hjhhhNhNubjJ)}(h`bpf_iter__next()`h]hbpf_iter__next()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubhE, has to return a pointer type and should have exactly one argument: }(hjhhhNhNubjJ)}(h`struct bpf_iter_ *`h]hstruct bpf_iter_ *}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh4 (const/volatile/restrict and typedefs are ignored).}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK8hjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hDestructor, i.e., `bpf_iter__destroy()`, should return void and should have exactly one argument, similar to the next method.h]h)}(hDestructor, i.e., `bpf_iter__destroy()`, should return void and should have exactly one argument, similar to the next method.h](hDestructor, i.e., }(hjhhhNhNubjJ)}(h`bpf_iter__destroy()`h]hbpf_iter__destroy()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubhV, should return void and should have exactly one argument, similar to the next method.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK;hjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hs`struct bpf_iter_` size is enforced to be positive and a multiple of 8 bytes (to fit stack slots correctly). h]h)}(hr`struct bpf_iter_` size is enforced to be positive and a multiple of 8 bytes (to fit stack slots correctly).h](jJ)}(h`struct bpf_iter_`h]hstruct bpf_iter_}(hj4hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj0ubhZ size is enforced to be positive and a multiple of 8 bytes (to fit stack slots correctly).}(hj0hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK=hj,ubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]bullet-uh1jhhhK6hjubah}(h]h ]h"]h$]h&]uh1jhj|ubeh}(h]h ]h"]h$]h&]uh1jzhhhK>hjwubah}(h]h ]h"]h$]h&]uh1juhjhhhNhNubh)}(hXSuch strictness and consistency allows to build generic helpers abstracting important, but boilerplate, details to be able to use open-coded iterators effectively and ergonomically (see libbpf's bpf_for_each() macro). This is enforced at kfunc registration point by the kernel.h]hXSuch strictness and consistency allows to build generic helpers abstracting important, but boilerplate, details to be able to use open-coded iterators effectively and ergonomically (see libbpf’s bpf_for_each() macro). This is enforced at kfunc registration point by the kernel.}(hjlhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK@hjhhubjv)}(hhh]j{)}(hXConstructor/next/destructor implementation contract is as follows: - constructor, `bpf_iter__new()`, always initializes iterator state on the stack. If any of the input arguments are invalid, constructor should make sure to still initialize it such that subsequent next() calls will return NULL. I.e., on error, *return error and construct empty iterator*. Constructor kfunc is marked with KF_ITER_NEW flag. - next method, `bpf_iter__next()`, accepts pointer to iterator state and produces an element. Next method should always return a pointer. The contract between BPF verifier is that next method *guarantees* that it will eventually return NULL when elements are exhausted. Once NULL is returned, subsequent next calls *should keep returning NULL*. Next method is marked with KF_ITER_NEXT (and should also have KF_RET_NULL as NULL-returning kfunc, of course). - destructor, `bpf_iter__destroy()`, is always called once. Even if constructor failed or next returned nothing. Destructor frees up any resources and marks stack space used by `struct bpf_iter_` as usable for something else. Destructor is marked with KF_ITER_DESTROY flag. h](j)}(hBConstructor/next/destructor implementation contract is as follows:h]hBConstructor/next/destructor implementation contract is as follows:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhhhKWhj}ubj)}(hhh]j)}(hhh](j)}(hXYconstructor, `bpf_iter__new()`, always initializes iterator state on the stack. If any of the input arguments are invalid, constructor should make sure to still initialize it such that subsequent next() calls will return NULL. I.e., on error, *return error and construct empty iterator*. Constructor kfunc is marked with KF_ITER_NEW flag. h]h)}(hXXconstructor, `bpf_iter__new()`, always initializes iterator state on the stack. If any of the input arguments are invalid, constructor should make sure to still initialize it such that subsequent next() calls will return NULL. I.e., on error, *return error and construct empty iterator*. Constructor kfunc is marked with KF_ITER_NEW flag.h](h constructor, }(hjhhhNhNubjJ)}(h`bpf_iter__new()`h]hbpf_iter__new()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh, always initializes iterator state on the stack. If any of the input arguments are invalid, constructor should make sure to still initialize it such that subsequent next() calls will return NULL. I.e., on error, }(hjhhhNhNubh)}(h+*return error and construct empty iterator*h]h)return error and construct empty iterator}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubh4. Constructor kfunc is marked with KF_ITER_NEW flag.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKFhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hXnext method, `bpf_iter__next()`, accepts pointer to iterator state and produces an element. Next method should always return a pointer. The contract between BPF verifier is that next method *guarantees* that it will eventually return NULL when elements are exhausted. Once NULL is returned, subsequent next calls *should keep returning NULL*. Next method is marked with KF_ITER_NEXT (and should also have KF_RET_NULL as NULL-returning kfunc, of course). h]h)}(hXnext method, `bpf_iter__next()`, accepts pointer to iterator state and produces an element. Next method should always return a pointer. The contract between BPF verifier is that next method *guarantees* that it will eventually return NULL when elements are exhausted. Once NULL is returned, subsequent next calls *should keep returning NULL*. Next method is marked with KF_ITER_NEXT (and should also have KF_RET_NULL as NULL-returning kfunc, of course).h](h next method, }(hjhhhNhNubjJ)}(h`bpf_iter__next()`h]hbpf_iter__next()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh, accepts pointer to iterator state and produces an element. Next method should always return a pointer. The contract between BPF verifier is that next method }(hjhhhNhNubh)}(h *guarantees*h]h guarantees}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubho that it will eventually return NULL when elements are exhausted. Once NULL is returned, subsequent next calls }(hjhhhNhNubh)}(h*should keep returning NULL*h]hshould keep returning NULL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubhp. Next method is marked with KF_ITER_NEXT (and should also have KF_RET_NULL as NULL-returning kfunc, of course).}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKLhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hXdestructor, `bpf_iter__destroy()`, is always called once. Even if constructor failed or next returned nothing. Destructor frees up any resources and marks stack space used by `struct bpf_iter_` as usable for something else. Destructor is marked with KF_ITER_DESTROY flag. h]h)}(hXdestructor, `bpf_iter__destroy()`, is always called once. Even if constructor failed or next returned nothing. Destructor frees up any resources and marks stack space used by `struct bpf_iter_` as usable for something else. Destructor is marked with KF_ITER_DESTROY flag.h](h destructor, }(hj#hhhNhNubjJ)}(h`bpf_iter__destroy()`h]hbpf_iter__destroy()}(hj+hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj#ubh, is always called once. Even if constructor failed or next returned nothing. Destructor frees up any resources and marks stack space used by }(hj#hhhNhNubjJ)}(h`struct bpf_iter_`h]hstruct bpf_iter_}(hj=hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj#ubhN as usable for something else. Destructor is marked with KF_ITER_DESTROY flag.}(hj#hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKThjubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]jXjYuh1jhhhKFhjubah}(h]h ]h"]h$]h&]uh1jhj}ubeh}(h]h ]h"]h$]h&]uh1jzhhhKWhjzubah}(h]h ]h"]h$]h&]uh1juhjhhhNhNubh)}(hX,Any open-coded BPF iterator implementation has to implement at least these three methods. It is enforced that for any given type of iterator only applicable constructor/destructor/next are callable. I.e., verifier ensures you can't pass number iterator state into, say, cgroup iterator's next method.h]hX0Any open-coded BPF iterator implementation has to implement at least these three methods. It is enforced that for any given type of iterator only applicable constructor/destructor/next are callable. I.e., verifier ensures you can’t pass number iterator state into, say, cgroup iterator’s next method.}(hjshhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKYhjhhubh)}(hX/From a 10,000-feet BPF verification point of view, next methods are the points of forking a verification state, which are conceptually similar to what verifier is doing when validating conditional jumps. Verifier is branching out `call bpf_iter__next` instruction and simulates two outcomes: NULL (iteration is done) and non-NULL (new element is returned). NULL is simulated first and is supposed to reach exit without looping. After that non-NULL case is validated and it either reaches exit (for trivial examples with no real loop), or reaches another `call bpf_iter__next` instruction with the state equivalent to already (partially) validated one. State equivalency at that point means we technically are going to be looping forever without "breaking out" out of established "state envelope" (i.e., subsequent iterations don't add any new knowledge or constraints to the verifier state, so running 1, 2, 10, or a million of them doesn't matter). But taking into account the contract stating that iterator next method *has to* return NULL eventually, we can conclude that loop body is safe and will eventually terminate. Given we validated logic outside of the loop (NULL case), and concluded that loop body is safe (though potentially looping many times), verifier can claim safety of the overall program logic.h](hFrom a 10,000-feet BPF verification point of view, next methods are the points of forking a verification state, which are conceptually similar to what verifier is doing when validating conditional jumps. Verifier is branching out }(hjhhhNhNubjJ)}(h`call bpf_iter__next`h]hcall bpf_iter__next}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubhX/ instruction and simulates two outcomes: NULL (iteration is done) and non-NULL (new element is returned). NULL is simulated first and is supposed to reach exit without looping. After that non-NULL case is validated and it either reaches exit (for trivial examples with no real loop), or reaches another }(hjhhhNhNubjJ)}(h`call bpf_iter__next`h]hcall bpf_iter__next}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubhX instruction with the state equivalent to already (partially) validated one. State equivalency at that point means we technically are going to be looping forever without “breaking out” out of established “state envelope” (i.e., subsequent iterations don’t add any new knowledge or constraints to the verifier state, so running 1, 2, 10, or a million of them doesn’t matter). But taking into account the contract stating that iterator next method }(hjhhhNhNubh)}(h*has to*h]hhas to}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubhX return NULL eventually, we can conclude that loop body is safe and will eventually terminate. Given we validated logic outside of the loop (NULL case), and concluded that loop body is safe (though potentially looping many times), verifier can claim safety of the overall program logic.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK^hjhhubeh}(h]open-coded-bpf-iteratorsah ]h"]open-coded bpf iteratorsah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hBPF Iterators Motivationh]hBPF Iterators Motivation}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKsubh)}(hXWThere are a few existing ways to dump kernel data into user space. The most popular one is the ``/proc`` system. For example, ``cat /proc/net/tcp6`` dumps all tcp6 sockets in the system, and ``cat /proc/net/netlink`` dumps all netlink sockets in the system. However, their output format tends to be fixed, and if users want more information about these sockets, they have to patch the kernel, which often takes time to publish upstream and release. The same is true for popular tools like `ss `_ where any additional information needs a kernel patch.h](h_There are a few existing ways to dump kernel data into user space. The most popular one is the }(hjhhhNhNubhliteral)}(h ``/proc``h]h/proc}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh system. For example, }(hjhhhNhNubj)}(h``cat /proc/net/tcp6``h]hcat /proc/net/tcp6}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh+ dumps all tcp6 sockets in the system, and }(hjhhhNhNubj)}(h``cat /proc/net/netlink``h]hcat /proc/net/netlink}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhX dumps all netlink sockets in the system. However, their output format tends to be fixed, and if users want more information about these sockets, they have to patch the kernel, which often takes time to publish upstream and release. The same is true for popular tools like }(hjhhhNhNubh reference)}(h7`ss `_h]hss}(hj hhhNhNubah}(h]h ]h"]h$]h&]namessrefuri/https://man7.org/linux/man-pages/man8/ss.8.htmluh1jhjubhtarget)}(h2 h]h}(h]ssah ]h"]ssah$]h&]refurij1uh1j2 referencedKhjubh7 where any additional information needs a kernel patch.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKuhjhhubh)}(hXTo solve this problem, the `drgn `_ tool is often used to dig out the kernel data with no kernel change. However, the main drawback for drgn is performance, as it cannot do pointer tracing inside the kernel. In addition, drgn cannot validate a pointer value and may read invalid data if the pointer becomes invalid inside the kernel.h](hTo solve this problem, the }(hjLhhhNhNubj)}(h>`drgn `_h]hdrgn}(hjThhhNhNubah}(h]h ]h"]h$]h&]namedrgnj04https://www.kernel.org/doc/html/latest/bpf/drgn.htmluh1jhjLubj3)}(h7 h]h}(h]drgnah ]h"]drgnah$]h&]refurijduh1j2jAKhjLubhX* tool is often used to dig out the kernel data with no kernel change. However, the main drawback for drgn is performance, as it cannot do pointer tracing inside the kernel. In addition, drgn cannot validate a pointer value and may read invalid data if the pointer becomes invalid inside the kernel.}(hjLhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK~hjhhubh)}(hThe BPF iterator solves the above problem by providing flexibility on what data (e.g., tasks, bpf_maps, etc.) to collect by calling BPF programs for each kernel data object.h]hThe BPF iterator solves the above problem by providing flexibility on what data (e.g., tasks, bpf_maps, etc.) to collect by calling BPF programs for each kernel data object.}(hj|hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubeh}(h]bpf-iterators-motivationah ]h"]bpf iterators motivationah$]h&]uh1hhhhhhhhKsubh)}(hhh](h)}(hHow BPF Iterators Workh]hHow BPF Iterators Work}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(hXxA BPF iterator is a type of BPF program that allows users to iterate over specific types of kernel objects. Unlike traditional BPF tracing programs that allow users to define callbacks that are invoked at particular points of execution in the kernel, BPF iterators allow users to define callbacks that should be executed for every entry in a variety of kernel data structures.h]hXxA BPF iterator is a type of BPF program that allows users to iterate over specific types of kernel objects. Unlike traditional BPF tracing programs that allow users to define callbacks that are invoked at particular points of execution in the kernel, BPF iterators allow users to define callbacks that should be executed for every entry in a variety of kernel data structures.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hX(For example, users can define a BPF iterator that iterates over every task on the system and dumps the total amount of CPU runtime currently used by each of them. Another BPF task iterator may instead dump the cgroup information for each task. Such flexibility is the core value of BPF iterators.h]hX(For example, users can define a BPF iterator that iterates over every task on the system and dumps the total amount of CPU runtime currently used by each of them. Another BPF task iterator may instead dump the cgroup information for each task. Such flexibility is the core value of BPF iterators.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hXA BPF program is always loaded into the kernel at the behest of a user space process. A user space process loads a BPF program by opening and initializing the program skeleton as required and then invoking a syscall to have the BPF program verified and loaded by the kernel.h]hXA BPF program is always loaded into the kernel at the behest of a user space process. A user space process loads a BPF program by opening and initializing the program skeleton as required and then invoking a syscall to have the BPF program verified and loaded by the kernel.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hXIn traditional tracing programs, a program is activated by having user space obtain a ``bpf_link`` to the program with ``bpf_program__attach()``. Once activated, the program callback will be invoked whenever the tracepoint is triggered in the main kernel. For BPF iterator programs, a ``bpf_link`` to the program is obtained using ``bpf_link_create()``, and the program callback is invoked by issuing system calls from user space.h](hVIn traditional tracing programs, a program is activated by having user space obtain a }(hjhhhNhNubj)}(h ``bpf_link``h]hbpf_link}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh to the program with }(hjhhhNhNubj)}(h``bpf_program__attach()``h]hbpf_program__attach()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh. Once activated, the program callback will be invoked whenever the tracepoint is triggered in the main kernel. For BPF iterator programs, a }(hjhhhNhNubj)}(h ``bpf_link``h]hbpf_link}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh" to the program is obtained using }(hjhhhNhNubj)}(h``bpf_link_create()``h]hbpf_link_create()}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhN, and the program callback is invoked by issuing system calls from user space.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hZNext, let us see how you can use the iterators to iterate on kernel objects and read data.h]hZNext, let us see how you can use the iterators to iterate on kernel objects and read data.}(hj#hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubeh}(h]how-bpf-iterators-workah ]h"]how bpf iterators workah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hHow to Use BPF iteratorsh]hHow to Use BPF iterators}(hj<hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj9hhhhhKubh)}(hXBPF selftests are a great resource to illustrate how to use the iterators. In this section, we’ll walk through a BPF selftest which shows how to load and use a BPF iterator program. To begin, we’ll look at `bpf_iter.c `_, which illustrates how to load and trigger BPF iterators on the user space side. Later, we’ll look at a BPF program that runs in kernel space.h](hBPF selftests are a great resource to illustrate how to use the iterators. In this section, we’ll walk through a BPF selftest which shows how to load and use a BPF iterator program. To begin, we’ll look at }(hjJhhhNhNubj)}(h`bpf_iter.c `_h]h bpf_iter.c}(hjRhhhNhNubah}(h]h ]h"]h$]h&]name bpf_iter.cj0whttps://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/prog_tests/bpf_iter.cuh1jhjJubj3)}(hz h]h}(h] bpf-iter-cah ]h"] bpf_iter.cah$]h&]refurijbuh1j2jAKhjJubh, which illustrates how to load and trigger BPF iterators on the user space side. Later, we’ll look at a BPF program that runs in kernel space.}(hjJhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubh)}(h\Loading a BPF iterator in the kernel from user space typically involves the following steps:h]h\Loading a BPF iterator in the kernel from user space typically involves the following steps:}(hjzhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubj)}(hhh](j)}(hThe BPF program is loaded into the kernel through ``libbpf``. Once the kernel has verified and loaded the program, it returns a file descriptor (fd) to user space.h]h)}(hThe BPF program is loaded into the kernel through ``libbpf``. Once the kernel has verified and loaded the program, it returns a file descriptor (fd) to user space.h](h2The BPF program is loaded into the kernel through }(hjhhhNhNubj)}(h ``libbpf``h]hlibbpf}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhg. Once the kernel has verified and loaded the program, it returns a file descriptor (fd) to user space.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hObtain a ``link_fd`` to the BPF program by calling the ``bpf_link_create()`` specified with the BPF program file descriptor received from the kernel.h]h)}(hObtain a ``link_fd`` to the BPF program by calling the ``bpf_link_create()`` specified with the BPF program file descriptor received from the kernel.h](h Obtain a }(hjhhhNhNubj)}(h ``link_fd``h]hlink_fd}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh# to the BPF program by calling the }(hjhhhNhNubj)}(h``bpf_link_create()``h]hbpf_link_create()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhI specified with the BPF program file descriptor received from the kernel.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hNext, obtain a BPF iterator file descriptor (``bpf_iter_fd``) by calling the ``bpf_iter_create()`` specified with the ``bpf_link`` received from Step 2.h]h)}(hNext, obtain a BPF iterator file descriptor (``bpf_iter_fd``) by calling the ``bpf_iter_create()`` specified with the ``bpf_link`` received from Step 2.h](h-Next, obtain a BPF iterator file descriptor (}(hjhhhNhNubj)}(h``bpf_iter_fd``h]h bpf_iter_fd}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh) by calling the }(hjhhhNhNubj)}(h``bpf_iter_create()``h]hbpf_iter_create()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh specified with the }(hjhhhNhNubj)}(h ``bpf_link``h]hbpf_link}(hj!hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh received from Step 2.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hRTrigger the iteration by calling ``read(bpf_iter_fd)`` until no data is available.h]h)}(hRTrigger the iteration by calling ``read(bpf_iter_fd)`` until no data is available.h](h!Trigger the iteration by calling }(hjChhhNhNubj)}(h``read(bpf_iter_fd)``h]hread(bpf_iter_fd)}(hjKhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjCubh until no data is available.}(hjChhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj?ubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(h3Close the iterator fd using ``close(bpf_iter_fd)``.h]h)}(hjkh](hClose the iterator fd using }(hjmhhhNhNubj)}(h``close(bpf_iter_fd)``h]hclose(bpf_iter_fd)}(hjthhhNhNubah}(h]h ]h"]h$]h&]uh1jhjmubh.}(hjmhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjiubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(hOIf needed to reread the data, get a new ``bpf_iter_fd`` and do the read again. h]h)}(hNIf needed to reread the data, get a new ``bpf_iter_fd`` and do the read again.h](h(If needed to reread the data, get a new }(hjhhhNhNubj)}(h``bpf_iter_fd``h]h bpf_iter_fd}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh and do the read again.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubeh}(h]h ]h"]h$]h&]jX*uh1jhhhKhj9hhubh)}(hCThe following are a few examples of selftest BPF iterator programs:h]hCThe following are a few examples of selftest BPF iterator programs:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubj)}(hhh](j)}(h`bpf_iter_tcp4.c `_h]h)}(hjh](j)}(hjh]hbpf_iter_tcp4.c}(hjhhhNhNubah}(h]h ]h"]h$]h&]namebpf_iter_tcp4.cj0whttps://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.cuh1jhjubj3)}(hz h]h}(h]bpf-iter-tcp4-cah ]h"]bpf_iter_tcp4.cah$]h&]refurijuh1j2jAKhjubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(h`bpf_iter_task_vmas.c `_h]h)}(hjh](j)}(hjh]hbpf_iter_task_vmas.c}(hj hhhNhNubah}(h]h ]h"]h$]h&]namebpf_iter_task_vmas.cj0|https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_vmas.cuh1jhjubj3)}(h h]h}(h]bpf-iter-task-vmas-cah ]h"]bpf_iter_task_vmas.cah$]h&]refurijuh1j2jAKhjubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubj)}(h`bpf_iter_task_file.c `_ h]h)}(h`bpf_iter_task_file.c `_h](j)}(hj:h]hbpf_iter_task_file.c}(hj<hhhNhNubah}(h]h ]h"]h$]h&]namebpf_iter_task_file.cj0|https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_file.cuh1jhj8ubj3)}(h h]h}(h]bpf-iter-task-file-cah ]h"]bpf_iter_task_file.cah$]h&]refurijKuh1j2jAKhj8ubeh}(h]h ]h"]h$]h&]uh1hhhhKhj4ubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubeh}(h]h ]h"]h$]h&]jXjuh1jhhhKhj9hhubh)}(hDLet us look at ``bpf_iter_task_file.c``, which runs in kernel space:h](hLet us look at }(hjkhhhNhNubj)}(h``bpf_iter_task_file.c``h]hbpf_iter_task_file.c}(hjshhhNhNubah}(h]h ]h"]h$]h&]uh1jhjkubh, which runs in kernel space:}(hjkhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubh)}(hX@Here is the definition of ``bpf_iter__task_file`` in `vmlinux.h `_. Any struct name in ``vmlinux.h`` in the format ``bpf_iter__`` represents a BPF iterator. The suffix ```` represents the type of iterator.h](hHere is the definition of }(hjhhhNhNubj)}(h``bpf_iter__task_file``h]hbpf_iter__task_file}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh in }(hjhhhNhNubj)}(hj`vmlinux.h `_h]h vmlinux.h}(hjhhhNhNubah}(h]h ]h"]h$]h&]name vmlinux.hj0[https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html#btfuh1jhjubj3)}(h^ h]h}(h] vmlinux-hah ]h"] vmlinux.hah$]h&]refurijuh1j2jAKhjubh. Any struct name in }(hjhhhNhNubj)}(h ``vmlinux.h``h]h vmlinux.h}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh in the format }(hjhhhNhNubj)}(h``bpf_iter__``h]hbpf_iter__}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh' represents a BPF iterator. The suffix }(hjhhhNhNubj)}(h````h]h }(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh! represents the type of iterator.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubh literal_block)}(hstruct bpf_iter__task_file { union { struct bpf_iter_meta *meta; }; union { struct task_struct *task; }; u32 fd; union { struct file *file; }; };h]hstruct bpf_iter__task_file { union { struct bpf_iter_meta *meta; }; union { struct task_struct *task; }; u32 fd; union { struct file *file; }; };}hjsbah}(h]h ]h"]h$]h&] xml:spacepreserveuh1jhhhKhj9hhubh)}(hXIn the above code, the field 'meta' contains the metadata, which is the same for all BPF iterator programs. The rest of the fields are specific to different iterators. For example, for task_file iterators, the kernel layer provides the 'task', 'fd' and 'file' field values. The 'task' and 'file' are `reference counted `_, so they won't go away when the BPF program runs.h](hXDIn the above code, the field ‘meta’ contains the metadata, which is the same for all BPF iterator programs. The rest of the fields are specific to different iterators. For example, for task_file iterators, the kernel layer provides the ‘task’, ‘fd’ and ‘file’ field values. The ‘task’ and ‘file’ are }(hjhhhNhNubj)}(h`reference counted `_h]hreference counted}(hjhhhNhNubah}(h]h ]h"]h$]h&]namereference countedj0uhttps://facebookmicrosites.github.io/bpf/blog/2018/08/31/object-lifetime.html#file-descriptors-and-reference-countersuh1jhjubj3)}(hx h]h}(h]reference-countedah ]h"]reference countedah$]h&]refurij-uh1j2jAKhjubh4, so they won’t go away when the BPF program runs.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubh)}(h:Here is a snippet from the ``bpf_iter_task_file.c`` file:h](hHere is a snippet from the }(hjEhhhNhNubj)}(h``bpf_iter_task_file.c``h]hbpf_iter_task_file.c}(hjMhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjEubh file:}(hjEhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj9hhubj)}(hXSEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { struct seq_file *seq = ctx->meta->seq; struct task_struct *task = ctx->task; struct file *file = ctx->file; __u32 fd = ctx->fd; if (task == NULL || file == NULL) return 0; if (ctx->meta->seq_num == 0) { count = 0; BPF_SEQ_PRINTF(seq, " tgid gid fd file\n"); } if (tgid == task->tgid && task->tgid != task->pid) count++; if (last_tgid != task->tgid) { last_tgid = task->tgid; unique_tgid_count++; } BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; }h]hXSEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { struct seq_file *seq = ctx->meta->seq; struct task_struct *task = ctx->task; struct file *file = ctx->file; __u32 fd = ctx->fd; if (task == NULL || file == NULL) return 0; if (ctx->meta->seq_num == 0) { count = 0; BPF_SEQ_PRINTF(seq, " tgid gid fd file\n"); } if (tgid == task->tgid && task->tgid != task->pid) count++; if (last_tgid != task->tgid) { last_tgid = task->tgid; unique_tgid_count++; } BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; }}hjesbah}(h]h ]h"]h$]h&]jjuh1jhhhKhj9hhubh)}(hIn the above example, the section name ``SEC(iter/task_file)``, indicates that the program is a BPF iterator program to iterate all files from all tasks. The context of the program is ``bpf_iter__task_file`` struct.h](h'In the above example, the section name }(hjshhhNhNubj)}(h``SEC(iter/task_file)``h]hSEC(iter/task_file)}(hj{hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjsubhz, indicates that the program is a BPF iterator program to iterate all files from all tasks. The context of the program is }(hjshhhNhNubj)}(h``bpf_iter__task_file``h]hbpf_iter__task_file}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjsubh struct.}(hjshhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj9hhubh)}(hXThe user space program invokes the BPF iterator program running in the kernel by issuing a ``read()`` syscall. Once invoked, the BPF program can export data to user space using a variety of BPF helper functions. You can use either ``bpf_seq_printf()`` (and BPF_SEQ_PRINTF helper macro) or ``bpf_seq_write()`` function based on whether you need formatted output or just binary data, respectively. For binary-encoded data, the user space applications can process the data from ``bpf_seq_write()`` as needed. For the formatted data, you can use ``cat `` to print the results similar to ``cat /proc/net/netlink`` after pinning the BPF iterator to the bpffs mount. Later, use ``rm -f `` to remove the pinned iterator.h](h[The user space program invokes the BPF iterator program running in the kernel by issuing a }(hjhhhNhNubj)}(h ``read()``h]hread()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh syscall. Once invoked, the BPF program can export data to user space using a variety of BPF helper functions. You can use either }(hjhhhNhNubj)}(h``bpf_seq_printf()``h]hbpf_seq_printf()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh& (and BPF_SEQ_PRINTF helper macro) or }(hjhhhNhNubj)}(h``bpf_seq_write()``h]hbpf_seq_write()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh function based on whether you need formatted output or just binary data, respectively. For binary-encoded data, the user space applications can process the data from }(hjhhhNhNubj)}(h``bpf_seq_write()``h]hbpf_seq_write()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh0 as needed. For the formatted data, you can use }(hjhhhNhNubj)}(h``cat ``h]h cat }(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh! to print the results similar to }(hjhhhNhNubj)}(h``cat /proc/net/netlink``h]hcat /proc/net/netlink}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh@ after pinning the BPF iterator to the bpffs mount. Later, use }(hjhhhNhNubj)}(h``rm -f ``h]h rm -f }(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh to remove the pinned iterator.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM hj9hhubh)}(hFor example, you can use the following command to create a BPF iterator from the ``bpf_iter_ipv6_route.o`` object file and pin it to the ``/sys/fs/bpf/my_route`` path:h](hQFor example, you can use the following command to create a BPF iterator from the }(hj1 hhhNhNubj)}(h``bpf_iter_ipv6_route.o``h]hbpf_iter_ipv6_route.o}(hj9 hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj1 ubh object file and pin it to the }(hj1 hhhNhNubj)}(h``/sys/fs/bpf/my_route``h]h/sys/fs/bpf/my_route}(hjK hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj1 ubh path:}(hj1 hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj9hhubj)}(h@$ bpftool iter pin ./bpf_iter_ipv6_route.o /sys/fs/bpf/my_routeh]h@$ bpftool iter pin ./bpf_iter_ipv6_route.o /sys/fs/bpf/my_route}hjc sbah}(h]h ]h"]h$]h&]jjuh1jhhhMhj9hhubh)}(h;And then print out the results using the following command:h]h;And then print out the results using the following command:}(hjq hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj9hhubj)}(h$ cat /sys/fs/bpf/my_routeh]h$ cat /sys/fs/bpf/my_route}hj sbah}(h]h ]h"]h$]h&]jjuh1jhhhM"hj9hhubeh}(h]how-to-use-bpf-iteratorsah ]h"]how to use bpf iteratorsah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(h7Implement Kernel Support for BPF Iterator Program Typesh]h7Implement Kernel Support for BPF Iterator Program Types}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhM'ubh)}(hTo implement a BPF iterator in the kernel, the developer must make a one-time change to the following key data structure defined in the `bpf.h `_ file.h](hTo implement a BPF iterator in the kernel, the developer must make a one-time change to the following key data structure defined in the }(hj hhhNhNubj)}(hd`bpf.h `_h]hbpf.h}(hj hhhNhNubah}(h]h ]h"]h$]h&]namebpf.hj0Yhttps://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/bpf.huh1jhj ubj3)}(h\ h]h}(h]bpf-hah ]h"]bpf.hah$]h&]refurij uh1j2jAKhj ubh file.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM)hj hhubj)}(hXstruct bpf_iter_reg { const char *target; bpf_iter_attach_target_t attach_target; bpf_iter_detach_target_t detach_target; bpf_iter_show_fdinfo_t show_fdinfo; bpf_iter_fill_link_info_t fill_link_info; bpf_iter_get_func_proto_t get_func_proto; u32 ctx_arg_info_size; u32 feature; struct bpf_ctx_arg_aux ctx_arg_info[BPF_ITER_CTX_ARG_MAX]; const struct bpf_iter_seq_info *seq_info; };h]hXstruct bpf_iter_reg { const char *target; bpf_iter_attach_target_t attach_target; bpf_iter_detach_target_t detach_target; bpf_iter_show_fdinfo_t show_fdinfo; bpf_iter_fill_link_info_t fill_link_info; bpf_iter_get_func_proto_t get_func_proto; u32 ctx_arg_info_size; u32 feature; struct bpf_ctx_arg_aux ctx_arg_info[BPF_ITER_CTX_ARG_MAX]; const struct bpf_iter_seq_info *seq_info; };}hj sbah}(h]h ]h"]h$]h&]jjuh1jhhhM0hj hhubh)}(hAfter filling the data structure fields, call ``bpf_iter_reg_target()`` to register the iterator to the main BPF iterator subsystem.h](h.After filling the data structure fields, call }(hj hhhNhNubj)}(h``bpf_iter_reg_target()``h]hbpf_iter_reg_target()}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh= to register the iterator to the main BPF iterator subsystem.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM=hj hhubh)}(hIThe following is the breakdown for each field in struct ``bpf_iter_reg``.h](h8The following is the breakdown for each field in struct }(hj hhhNhNubj)}(h``bpf_iter_reg``h]h bpf_iter_reg}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM@hj hhubhtable)}(hhh]htgroup)}(hhh](hcolspec)}(hhh]h}(h]h ]h"]h$]h&]colwidthKuh1j. hj+ ubj/ )}(hhh]h}(h]h ]h"]h$]h&]j9 K2uh1j. hj+ ubhthead)}(hhh]hrow)}(hhh](hentry)}(hhh]h)}(hFieldsh]hFields}(hjR hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMFhjO ubah}(h]h ]h"]h$]h&]uh1jM hjJ ubjN )}(hhh]h)}(h Descriptionh]h Description}(hji hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMGhjf ubah}(h]h ]h"]h$]h&]uh1jM hjJ ubeh}(h]h ]h"]h$]h&]uh1jH hjE ubah}(h]h ]h"]h$]h&]uh1jC hj+ ubhtbody)}(hhh](jI )}(hhh](jN )}(hhh]h)}(htargeth]htarget}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMHhj ubah}(h]h ]h"]h$]h&]uh1jM hj ubjN )}(hhh]h)}(hSpecifies the name of the BPF iterator. For example: ``bpf_map``, ``bpf_map_elem``. The name should be different from other ``bpf_iter`` target names in the kernel.h](h5Specifies the name of the BPF iterator. For example: }(hj hhhNhNubj)}(h ``bpf_map``h]hbpf_map}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh, }(hj hhhNhNubj)}(h``bpf_map_elem``h]h bpf_map_elem}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh*. The name should be different from other }(hj hhhNhNubj)}(h ``bpf_iter``h]hbpf_iter}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh target names in the kernel.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMIhj ubah}(h]h ]h"]h$]h&]uh1jM hj ubeh}(h]h ]h"]h$]h&]uh1jH hj ubjI )}(hhh](jN )}(hhh]h)}(hattach_target and detach_targeth]hattach_target and detach_target}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMKhj ubah}(h]h ]h"]h$]h&]uh1jM hj ubjN )}(hhh]h)}(hAllows for target specific ``link_create`` action since some targets may need special processing. Called during the user space link_create stage.h](hAllows for target specific }(hj hhhNhNubj)}(h``link_create``h]h link_create}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubhg action since some targets may need special processing. Called during the user space link_create stage.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMLhj ubah}(h]h ]h"]h$]h&]uh1jM hj ubeh}(h]h ]h"]h$]h&]uh1jH hj ubjI )}(hhh](jN )}(hhh]h)}(hshow_fdinfo and fill_link_infoh]hshow_fdinfo and fill_link_info}(hjJ hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMNhjG ubah}(h]h ]h"]h$]h&]uh1jM hjD ubjN )}(hhh]h)}(hiCalled to fill target specific information when user tries to get link info associated with the iterator.h]hiCalled to fill target specific information when user tries to get link info associated with the iterator.}(hja hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMOhj^ ubah}(h]h ]h"]h$]h&]uh1jM hjD ubeh}(h]h ]h"]h$]h&]uh1jH hj ubjI )}(hhh](jN )}(hhh]h)}(hget_func_protoh]hget_func_proto}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMQhj~ ubah}(h]h ]h"]h$]h&]uh1jM hj{ ubjN )}(hhh]h)}(hFPermits a BPF iterator to access BPF helpers specific to the iterator.h]hFPermits a BPF iterator to access BPF helpers specific to the iterator.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMRhj ubah}(h]h ]h"]h$]h&]uh1jM hj{ ubeh}(h]h ]h"]h$]h&]uh1jH hj ubjI )}(hhh](jN )}(hhh]h)}(h"ctx_arg_info_size and ctx_arg_infoh]h"ctx_arg_info_size and ctx_arg_info}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMShj ubah}(h]h ]h"]h$]h&]uh1jM hj ubjN )}(hhh]h)}(hYSpecifies the verifier states for BPF program arguments associated with the bpf iterator.h]hYSpecifies the verifier states for BPF program arguments associated with the bpf iterator.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMThj ubah}(h]h ]h"]h$]h&]uh1jM hj ubeh}(h]h ]h"]h$]h&]uh1jH hj ubjI )}(hhh](jN )}(hhh]h)}(hfeatureh]hfeature}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMVhj ubah}(h]h ]h"]h$]h&]uh1jM hj ubjN )}(hhh]h)}(hSpecifies certain action requests in the kernel BPF iterator infrastructure. Currently, only BPF_ITER_RESCHED is supported. This means that the kernel function cond_resched() is called to avoid other kernel subsystem (e.g., rcu) misbehaving.h]hSpecifies certain action requests in the kernel BPF iterator infrastructure. Currently, only BPF_ITER_RESCHED is supported. This means that the kernel function cond_resched() is called to avoid other kernel subsystem (e.g., rcu) misbehaving.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMWhj ubah}(h]h ]h"]h$]h&]uh1jM hj ubeh}(h]h ]h"]h$]h&]uh1jH hj ubjI )}(hhh](jN )}(hhh]h)}(hseq_infoh]hseq_info}(hj& hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM[hj# ubah}(h]h ]h"]h$]h&]uh1jM hj ubjN )}(hhh]h)}(hSpecifies the set of seq operations for the BPF iterator and helpers to initialize/free the private data for the corresponding ``seq_file``.h](hSpecifies the set of seq operations for the BPF iterator and helpers to initialize/free the private data for the corresponding }(hj= hhhNhNubj)}(h ``seq_file``h]hseq_file}(hjE hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj= ubh.}(hj= hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM\hj: ubah}(h]h ]h"]h$]h&]uh1jM hj ubeh}(h]h ]h"]h$]h&]uh1jH hj ubeh}(h]h ]h"]h$]h&]uh1j hj+ ubeh}(h]h ]h"]h$]h&]colsKuh1j) hj& ubah}(h]h ]colwidths-givenah"]h$]h&]uh1j$ hj hhhNhNubh)}(h`Click here `_ to see an implementation of the ``task_vma`` BPF iterator in the kernel.h](j)}(hY`Click here `_h]h Click here}(hj hhhNhNubah}(h]h ]h"]h$]h&]name Click herej0Ihttps://lore.kernel.org/bpf/20210212183107.50963-2-songliubraving@fb.com/uh1jhj} ubj3)}(hL h]h}(h] click-hereah ]h"] click hereah$]h&]refurij uh1j2jAKhj} ubh! to see an implementation of the }(hj} hhhNhNubj)}(h ``task_vma``h]htask_vma}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj} ubh BPF iterator in the kernel.}(hj} hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM_hj hhubeh}(h]7implement-kernel-support-for-bpf-iterator-program-typesah ]h"]7implement kernel support for bpf iterator program typesah$]h&]uh1hhhhhhhhM'ubh)}(hhh](h)}(h!Parameterizing BPF Task Iteratorsh]h!Parameterizing BPF Task Iterators}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMeubh)}(hXBy default, BPF iterators walk through all the objects of the specified types (processes, cgroups, maps, etc.) across the entire system to read relevant kernel data. But often, there are cases where we only care about a much smaller subset of iterable kernel objects, such as only iterating tasks within a specific process. Therefore, BPF iterator programs support filtering out objects from iteration by allowing user space to configure the iterator program when it is attached.h]hXBy default, BPF iterators walk through all the objects of the specified types (processes, cgroups, maps, etc.) across the entire system to read relevant kernel data. But often, there are cases where we only care about a much smaller subset of iterable kernel objects, such as only iterating tasks within a specific process. Therefore, BPF iterator programs support filtering out objects from iteration by allowing user space to configure the iterator program when it is attached.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMghj hhubeh}(h]!parameterizing-bpf-task-iteratorsah ]h"]!parameterizing bpf task iteratorsah$]h&]uh1hhhhhhhhMeubh)}(hhh](h)}(hBPF Task Iterator Programh]hBPF Task Iterator Program}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMqubh)}(hThe following code is a BPF iterator program to print files and task information through the ``seq_file`` of the iterator. It is a standard BPF iterator program that visits every file of an iterator. We will use this BPF program in our example later.h](h]The following code is a BPF iterator program to print files and task information through the }(hj hhhNhNubj)}(h ``seq_file``h]hseq_file}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh of the iterator. It is a standard BPF iterator program that visits every file of an iterator. We will use this BPF program in our example later.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMshj hhubj)}(hX~#include #include char _license[] SEC("license") = "GPL"; SEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { struct seq_file *seq = ctx->meta->seq; struct task_struct *task = ctx->task; struct file *file = ctx->file; __u32 fd = ctx->fd; if (task == NULL || file == NULL) return 0; if (ctx->meta->seq_num == 0) { BPF_SEQ_PRINTF(seq, " tgid pid fd file\n"); } BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; }Uh]hX~#include #include char _license[] SEC("license") = "GPL"; SEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { struct seq_file *seq = ctx->meta->seq; struct task_struct *task = ctx->task; struct file *file = ctx->file; __u32 fd = ctx->fd; if (task == NULL || file == NULL) return 0; if (ctx->meta->seq_num == 0) { BPF_SEQ_PRINTF(seq, " tgid pid fd file\n"); } BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; }}hj sbah}(h]h ]h"]h$]h&]jjuh1jhhhMzhj hhubeh}(h]bpf-task-iterator-programah ]h"]bpf task iterator programah$]h&]uh1hhhhhhhhMqubh)}(hhh](h)}(h(Creating a File Iterator with Parametersh]h(Creating a File Iterator with Parameters}(hj4 hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj1 hhhhhMubh)}(hTNow, let us look at how to create an iterator that includes only files of a process.h]hTNow, let us look at how to create an iterator that includes only files of a process.}(hjB hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubh)}(h@First, fill the ``bpf_iter_attach_opts`` struct as shown below:h](hFirst, fill the }(hjP hhhNhNubj)}(h``bpf_iter_attach_opts``h]hbpf_iter_attach_opts}(hjX hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjP ubh struct as shown below:}(hjP hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubj)}(hLIBBPF_OPTS(bpf_iter_attach_opts, opts); union bpf_iter_link_info linfo; memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo);h]hLIBBPF_OPTS(bpf_iter_attach_opts, opts); union bpf_iter_link_info linfo; memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo);}hjp sbah}(h]h ]h"]h$]h&]jjuh1jhhhMhj1 hhubh)}(hX``linfo.task.pid``, if it is non-zero, directs the kernel to create an iterator that only includes opened files for the process with the specified ``pid``. In this example, we will only be iterating files for our process. If ``linfo.task.pid`` is zero, the iterator will visit every opened file of every process. Similarly, ``linfo.task.tid`` directs the kernel to create an iterator that visits opened files of a specific thread, not a process. In this example, ``linfo.task.tid`` is different from ``linfo.task.pid`` only if the thread has a separate file descriptor table. In most circumstances, all process threads share a single file descriptor table.h](j)}(h``linfo.task.pid``h]hlinfo.task.pid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj~ ubh, if it is non-zero, directs the kernel to create an iterator that only includes opened files for the process with the specified }(hj~ hhhNhNubj)}(h``pid``h]hpid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj~ ubhG. In this example, we will only be iterating files for our process. If }(hj~ hhhNhNubj)}(h``linfo.task.pid``h]hlinfo.task.pid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj~ ubhQ is zero, the iterator will visit every opened file of every process. Similarly, }(hj~ hhhNhNubj)}(h``linfo.task.tid``h]hlinfo.task.tid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj~ ubhy directs the kernel to create an iterator that visits opened files of a specific thread, not a process. In this example, }(hj~ hhhNhNubj)}(h``linfo.task.tid``h]hlinfo.task.tid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj~ ubh is different from }(hj~ hhhNhNubj)}(h``linfo.task.pid``h]hlinfo.task.pid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj~ ubh only if the thread has a separate file descriptor table. In most circumstances, all process threads share a single file descriptor table.}(hj~ hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubh)}(h`Now, in the userspace program, pass the pointer of struct to the ``bpf_program__attach_iter()``.h](hANow, in the userspace program, pass the pointer of struct to the }(hj hhhNhNubj)}(h``bpf_program__attach_iter()``h]hbpf_program__attach_iter()}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubj)}(h\link = bpf_program__attach_iter(prog, &opts); iter_fd = bpf_iter_create(bpf_link__fd(link));h]h\link = bpf_program__attach_iter(prog, &opts); iter_fd = bpf_iter_create(bpf_link__fd(link));}hjsbah}(h]h ]h"]h$]h&]jjuh1jhhhMhj1 hhubh)}(hXIf both *tid* and *pid* are zero, an iterator created from this struct ``bpf_iter_attach_opts`` will include every opened file of every task in the system (in the namespace, actually.) It is the same as passing a NULL as the second argument to ``bpf_program__attach_iter()``.h](hIf both }(hj"hhhNhNubh)}(h*tid*h]htid}(hj*hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj"ubh and }(hj"hhhNhNubh)}(h*pid*h]hpid}(hj<hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj"ubh0 are zero, an iterator created from this struct }(hj"hhhNhNubj)}(h``bpf_iter_attach_opts``h]hbpf_iter_attach_opts}(hjNhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj"ubh will include every opened file of every task in the system (in the namespace, actually.) It is the same as passing a NULL as the second argument to }(hj"hhhNhNubj)}(h``bpf_program__attach_iter()``h]hbpf_program__attach_iter()}(hj`hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj"ubh.}(hj"hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubh)}(h0The whole program looks like the following code:h]h0The whole program looks like the following code:}(hjxhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubj)}(hXf#include #include #include #include #include "bpf_iter_task_ex.skel.h" static int do_read_opts(struct bpf_program *prog, struct bpf_iter_attach_opts *opts) { struct bpf_link *link; char buf[16] = {}; int iter_fd = -1, len; int ret = 0; link = bpf_program__attach_iter(prog, opts); if (!link) { fprintf(stderr, "bpf_program__attach_iter() fails\n"); return -1; } iter_fd = bpf_iter_create(bpf_link__fd(link)); if (iter_fd < 0) { fprintf(stderr, "bpf_iter_create() fails\n"); ret = -1; goto free_link; } /* not check contents, but ensure read() ends without error */ while ((len = read(iter_fd, buf, sizeof(buf) - 1)) > 0) { buf[len] = 0; printf("%s", buf); } printf("\n"); free_link: if (iter_fd >= 0) close(iter_fd); bpf_link__destroy(link); return 0; } static void test_task_file(void) { LIBBPF_OPTS(bpf_iter_attach_opts, opts); struct bpf_iter_task_ex *skel; union bpf_iter_link_info linfo; skel = bpf_iter_task_ex__open_and_load(); if (skel == NULL) return; memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); printf("PID %d\n", getpid()); do_read_opts(skel->progs.dump_task_file, &opts); bpf_iter_task_ex__destroy(skel); } int main(int argc, const char * const * argv) { test_task_file(); return 0; }h]hXf#include #include #include #include #include "bpf_iter_task_ex.skel.h" static int do_read_opts(struct bpf_program *prog, struct bpf_iter_attach_opts *opts) { struct bpf_link *link; char buf[16] = {}; int iter_fd = -1, len; int ret = 0; link = bpf_program__attach_iter(prog, opts); if (!link) { fprintf(stderr, "bpf_program__attach_iter() fails\n"); return -1; } iter_fd = bpf_iter_create(bpf_link__fd(link)); if (iter_fd < 0) { fprintf(stderr, "bpf_iter_create() fails\n"); ret = -1; goto free_link; } /* not check contents, but ensure read() ends without error */ while ((len = read(iter_fd, buf, sizeof(buf) - 1)) > 0) { buf[len] = 0; printf("%s", buf); } printf("\n"); free_link: if (iter_fd >= 0) close(iter_fd); bpf_link__destroy(link); return 0; } static void test_task_file(void) { LIBBPF_OPTS(bpf_iter_attach_opts, opts); struct bpf_iter_task_ex *skel; union bpf_iter_link_info linfo; skel = bpf_iter_task_ex__open_and_load(); if (skel == NULL) return; memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); printf("PID %d\n", getpid()); do_read_opts(skel->progs.dump_task_file, &opts); bpf_iter_task_ex__destroy(skel); } int main(int argc, const char * const * argv) { test_task_file(); return 0; }}hjsbah}(h]h ]h"]h$]h&]jjuh1jhhhMhj1 hhubh)}(h5The following lines are the output of the program. ::h]h2The following lines are the output of the program.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj1 hhubj)}(hXPID 1859 tgid pid fd file 1859 1859 0 ffffffff82270aa0 1859 1859 1 ffffffff82270aa0 1859 1859 2 ffffffff82270aa0 1859 1859 3 ffffffff82272980 1859 1859 4 ffffffff8225e120 1859 1859 5 ffffffff82255120 1859 1859 6 ffffffff82254f00 1859 1859 7 ffffffff82254d80 1859 1859 8 ffffffff8225abe0h]hXPID 1859 tgid pid fd file 1859 1859 0 ffffffff82270aa0 1859 1859 1 ffffffff82270aa0 1859 1859 2 ffffffff82270aa0 1859 1859 3 ffffffff82272980 1859 1859 4 ffffffff8225e120 1859 1859 5 ffffffff82255120 1859 1859 6 ffffffff82254f00 1859 1859 7 ffffffff82254d80 1859 1859 8 ffffffff8225abe0}hjsbah}(h]h ]h"]h$]h&]jjuh1jhhhMhj1 hhubeh}(h](creating-a-file-iterator-with-parametersah ]h"](creating a file iterator with parametersah$]h&]uh1hhhhhhhhMubh)}(hhh](h)}(hWithout Parametersh]hWithout Parameters}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhM ubh)}(hXeLet us look at how a BPF iterator without parameters skips files of other processes in the system. In this case, the BPF program has to check the pid or the tid of tasks, or it will receive every opened file in the system (in the current *pid* namespace, actually). So, we usually add a global variable in the BPF program to pass a *pid* to the BPF program.h](hLet us look at how a BPF iterator without parameters skips files of other processes in the system. In this case, the BPF program has to check the pid or the tid of tasks, or it will receive every opened file in the system (in the current }(hjhhhNhNubh)}(h*pid*h]hpid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubhY namespace, actually). So, we usually add a global variable in the BPF program to pass a }(hjhhhNhNubh)}(h*pid*h]hpid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubh to the BPF program.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM hjhhubh)}(h4The BPF program would look like the following block.h]h4The BPF program would look like the following block.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh block_quote)}(hX:: ...... int target_pid = 0; SEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { ...... if (task->tgid != target_pid) /* Check task->pid instead to check thread IDs */ return 0; BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; } h]j)}(hXg...... int target_pid = 0; SEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { ...... if (task->tgid != target_pid) /* Check task->pid instead to check thread IDs */ return 0; BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; }h]hXg...... int target_pid = 0; SEC("iter/task_file") int dump_task_file(struct bpf_iter__task_file *ctx) { ...... if (task->tgid != target_pid) /* Check task->pid instead to check thread IDs */ return 0; BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd, (long)file->f_op); return 0; }}hjsbah}(h]h ]h"]h$]h&]jjuh1jhhhMhj ubah}(h]h ]h"]h$]h&]uh1j hhhMhjhhubh)}(h;The user space program would look like the following block:h]h;The user space program would look like the following block:}(hj#hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM%hjhhubj )}(hX:: ...... static void test_task_file(void) { ...... skel = bpf_iter_task_ex__open_and_load(); if (skel == NULL) return; skel->bss->target_pid = getpid(); /* process ID. For thread id, use gettid() */ memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); ...... } h]j)}(hX...... static void test_task_file(void) { ...... skel = bpf_iter_task_ex__open_and_load(); if (skel == NULL) return; skel->bss->target_pid = getpid(); /* process ID. For thread id, use gettid() */ memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); ...... }h]hX...... static void test_task_file(void) { ...... skel = bpf_iter_task_ex__open_and_load(); if (skel == NULL) return; skel->bss->target_pid = getpid(); /* process ID. For thread id, use gettid() */ memset(&linfo, 0, sizeof(linfo)); linfo.task.pid = getpid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); ...... }}hj5sbah}(h]h ]h"]h$]h&]jjuh1jhhhM)hj1ubah}(h]h ]h"]h$]h&]uh1j hhhM'hjhhubh)}(hX5``target_pid`` is a global variable in the BPF program. The user space program should initialize the variable with a process ID to skip opened files of other processes in the BPF program. When you parametrize a BPF iterator, the iterator calls the BPF program fewer times which can save significant resources.h](j)}(h``target_pid``h]h target_pid}(hjMhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjIubhX' is a global variable in the BPF program. The user space program should initialize the variable with a process ID to skip opened files of other processes in the BPF program. When you parametrize a BPF iterator, the iterator calls the BPF program fewer times which can save significant resources.}(hjIhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM8hjhhubeh}(h]without-parametersah ]h"]without parametersah$]h&]uh1hhhhhhhhM ubh)}(hhh](h)}(hParametrizing VMA Iteratorsh]hParametrizing VMA Iterators}(hjphhhNhNubah}(h]h ]h"]h$]h&]uh1hhjmhhhhhM?ubh)}(hX#By default, a BPF VMA iterator includes every VMA in every process. However, you can still specify a process or a thread to include only its VMAs. Unlike files, a thread can not have a separate address space (since Linux 2.6.0-test6). Here, using *tid* makes no difference from using *pid*.h](hBy default, a BPF VMA iterator includes every VMA in every process. However, you can still specify a process or a thread to include only its VMAs. Unlike files, a thread can not have a separate address space (since Linux 2.6.0-test6). Here, using }(hj~hhhNhNubh)}(h*tid*h]htid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhj~ubh makes no difference from using }(hj~hhhNhNubh)}(h*pid*h]hpid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhj~ubh.}(hj~hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMAhjmhhubeh}(h]parametrizing-vma-iteratorsah ]h"]parametrizing vma iteratorsah$]h&]uh1hhhhhhhhM?ubh)}(hhh](h)}(hParametrizing Task Iteratorsh]hParametrizing Task Iterators}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMHubh)}(hA BPF task iterator with *pid* includes all tasks (threads) of a process. The BPF program receives these tasks one after another. You can specify a BPF task iterator with *tid* parameter to include only the tasks that match the given *tid*.h](hA BPF task iterator with }(hjhhhNhNubh)}(h*pid*h]hpid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubh includes all tasks (threads) of a process. The BPF program receives these tasks one after another. You can specify a BPF task iterator with }(hjhhhNhNubh)}(h*tid*h]htid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubh: parameter to include only the tasks that match the given }(hjhhhNhNubh)}(h*tid*h]htid}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubh.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMJhjhhubeh}(h]parametrizing-task-iteratorsah ]h"]parametrizing task iteratorsah$]h&]uh1hhhhhhhhMHubeh}(h] bpf-iteratorsah ]h"] bpf iteratorsah$]h&]uh1hhhhhhhhKubeh}(h]h ]h"]h$]h&]sourcehuh1hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksjM footnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerj?error_encodingutf-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh _destinationN _config_files]7/var/lib/git/docbuild/linux/Documentation/docutils.confafile_insertion_enabled raw_enabledKline_length_limitM'pep_referencesN pep_base_urlhttps://peps.python.org/pep_file_url_templatepep-%04drfc_referencesN rfc_base_url&https://datatracker.ietf.org/doc/html/ tab_widthKtrim_footnote_reference_spacesyntax_highlightlong smart_quotessmartquotes_locales]character_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xform image_loadinglinkembed_stylesheetcloak_email_addressessection_self_linkenvNubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}nameids}(jjjjjjjjj=j:jnjkj6j3j j jljijjj$j!jUjRjjj7j4j j j j j j j j j. j+ jjjjjgjjjju nametypes}(jjjjj=jnj6j jljj$jUjj7j j j j j. jjjjjuh}(jhjhjjjjj:j4jkjej3jj j9jijcjjj!jjRjLjjj4j.j j j j j j j j j+ j jj1 jgjjjmjju footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startK id_counter collectionsCounter}Rparse_messages]hsystem_message)}(hhh]h)}(hfPossible title underline, too short for the title. Treating it as ordinary text because it's so short.h]hhPossible title underline, too short for the title. Treating it as ordinary text because it’s so short.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjubah}(h]h ]h"]h$]h&]levelKtypeINFOlineMsourcehuh1jhj1 hhhhhMubatransform_messages] transformerN include_log] decorationNhhub.