߇sphinx.addnodesdocument)}( rawsourcechildren]( translations LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget&/translations/zh_CN/userspace-api/rseqmodnameN classnameN refexplicitutagnamehhh ubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/zh_TW/userspace-api/rseqmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/it_IT/userspace-api/rseqmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/ja_JP/userspace-api/rseqmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/ko_KR/userspace-api/rseqmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hPortuguese (Brazilian)}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/pt_BR/userspace-api/rseqmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget&/translations/sp_SP/userspace-api/rseqmodnameN classnameN refexplicituh1hhh ubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h hh _documenthsourceNlineNubhsection)}(hhh](htitle)}(hRestartable Sequencesh]hRestartable Sequences}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhh@/var/lib/git/docbuild/linux/Documentation/userspace-api/rseq.rsthKubh paragraph)}(hRestartable Sequences allow to register a per thread userspace memory area to be used as an ABI between kernel and userspace for three purposes:h]hRestartable Sequences allow to register a per thread userspace memory area to be used as an ABI between kernel and userspace for three purposes:}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh block_quote)}(h* userspace restartable sequences * quick access to read the current CPU number, node ID from userspace * scheduler time slice extensions h]h bullet_list)}(hhh](h list_item)}(h userspace restartable sequences h]h)}(huserspace restartable sequencesh]huserspace restartable sequences}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhubah}(h]h ]h"]h$]h&]uh1hhhubh)}(hDquick access to read the current CPU number, node ID from userspace h]h)}(hCquick access to read the current CPU number, node ID from userspaceh]hCquick access to read the current CPU number, node ID from userspace}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK hjubah}(h]h ]h"]h$]h&]uh1hhhubh)}(h scheduler time slice extensions h]h)}(hscheduler time slice extensionsh]hscheduler time slice extensions}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK hjubah}(h]h ]h"]h$]h&]uh1hhhubeh}(h]h ]h"]h$]h&]bullet*uh1hhhhKhhubah}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh)}(hhh](h)}(h'Restartable sequences (per-cpu atomics)h]h'Restartable sequences (per-cpu atomics)}(hjAhhhNhNubah}(h]h ]h"]h$]h&]uh1hhj>hhhhhKubh)}(hRestartable sequences allow userspace to perform update operations on per-cpu data without requiring heavyweight atomic operations. The actual ABI is unfortunately only available in the code and selftests.h]hRestartable sequences allow userspace to perform update operations on per-cpu data without requiring heavyweight atomic operations. The actual ABI is unfortunately only available in the code and selftests.}(hjOhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj>hhubeh}(h]%restartable-sequences-per-cpu-atomicsah ]h"]'restartable sequences (per-cpu atomics)ah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(h#Quick access to CPU number, node IDh]h#Quick access to CPU number, node ID}(hjhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjehhhhhKubh)}(hXAllows to implement per CPU data efficiently. Documentation is in code and selftests. :(h]hXAllows to implement per CPU data efficiently. Documentation is in code and selftests. :(}(hjvhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjehhubeh}(h]"quick-access-to-cpu-number-node-idah ]h"]#quick access to cpu number, node idah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hOptimized RSEQ V2h]hOptimized RSEQ V2}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(hOn architectures which utilize the generic entry code and generic TIF bits the kernel supports runtime optimizations for RSEQ, which also enable enhanced features like scheduler time slice extensions.h]hOn architectures which utilize the generic entry code and generic TIF bits the kernel supports runtime optimizations for RSEQ, which also enable enhanced features like scheduler time slice extensions.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(h}To enable them a task has to register the RSEQ region with at least the length advertised by getauxval(AT_RSEQ_FEATURE_SIZE).h]h}To enable them a task has to register the RSEQ region with at least the length advertised by getauxval(AT_RSEQ_FEATURE_SIZE).}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK"hjhhubh)}(hIf existing binaries register with RSEQ_ORIG_SIZE (32 bytes), the kernel keeps the legacy low performance mode enabled to fulfil the expectations of existing users regarding the original RSEQ implementation behaviour.h]hIf existing binaries register with RSEQ_ORIG_SIZE (32 bytes), the kernel keeps the legacy low performance mode enabled to fulfil the expectations of existing users regarding the original RSEQ implementation behaviour.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK%hjhhubh)}(hhThe following table documents the ABI and behavioral guarantees of the legacy and the optimized V2 mode.h]hhThe following table documents the ABI and behavioral guarantees of the legacy and the optimized V2 mode.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK)hjhhubhtable)}(hhh](h)}(h RSEQ modesh]h RSEQ modes}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK,hjubhtgroup)}(hhh](hcolspec)}(hhh]h}(h]h ]h"]h$]h&]colwidthKuh1jhjubj)}(hhh]h}(h]h ]h"]h$]h&]jKuh1jhjubj)}(hhh]h}(h]h ]h"]h$]h&]jKuh1jhjubj)}(hhh]h}(h]h ]h"]h$]h&]jKuh1jhjubhthead)}(hhh]hrow)}(hhh](hentry)}(hhh]h)}(hNrh]hNr}(hj#hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK/hj ubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(hWhath]hWhat}(hj:hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK0hj7ubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(hLegacyh]hLegacy}(hjQhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK2hjNubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(h Optimized V2h]h Optimized V2}(hjhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK3hjeubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjubah}(h]h ]h"]h$]h&]uh1jhjubhtbody)}(hhh](j)}(hhh](j)}(hhh]h)}(h1h]h1}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK5hjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(hSThe cpu_id_start, cpu_id, node_id and mm_cid fields (User mode read only) .. Legacyh]hSThe cpu_id_start, cpu_id, node_id and mm_cid fields (User mode read only) .. Legacy}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK6hjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(hjUpdated by the kernel unconditionally after each context switch and before signal delivery .. Optimized V2h]hjUpdated by the kernel unconditionally after each context switch and before signal delivery .. Optimized V2}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK9hjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(h`Updated by the kernel if and only if they change, i.e. if the task is migrated or mm_cid changesh]h`Updated by the kernel if and only if they change, i.e. if the task is migrated or mm_cid changes}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK_hj;ubah}(h]h ]h"]h$]h&]uh1jhj!ubj)}(hhh]h)}(hNot supported .. Optimized V2h]hNot supported .. Optimized V2}(hjUhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKahjRubah}(h]h ]h"]h$]h&]uh1jhj!ubj)}(hhh]h)}(h Supportedh]h Supported}(hjlhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKchjiubah}(h]h ]h"]h$]h&]uh1jhj!ubeh}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]colsKuh1jhjubeh}(h]id1ah ]h"]h$]h&]uh1jhjhhhNhNubh)}(hXThe legacy mode is obviously less performant as it does unconditional updates and critical section checks even if not strictly required by the ABI contract. That can't be changed anymore as some users depend on that observed behavior, which in turn enables them to violate the ABI and overwrite the cpu_id_start field for their own purposes. This is obviously discouraged as it renders RSEQ incompatible with the intended usage and breaks the expectation of other libraries in the same application.h]hXThe legacy mode is obviously less performant as it does unconditional updates and critical section checks even if not strictly required by the ABI contract. That can’t be changed anymore as some users depend on that observed behavior, which in turn enables them to violate the ABI and overwrite the cpu_id_start field for their own purposes. This is obviously discouraged as it renders RSEQ incompatible with the intended usage and breaks the expectation of other libraries in the same application.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKehjhhubh)}(hXThe ABI compliant optimized v2 mode, which respects the read only fields, does not require unconditional updates and therefore is way more performant. The kernel validates the read only fields for compliance. If user space modifies them, the process is killed. Compliant usage allows multiple libraries in the same application to benefit from the RSEQ functionality without disturbing each other. The ABI compliant optimized v2 mode also enables extended RSEQ features like time slice extensions.h]hXThe ABI compliant optimized v2 mode, which respects the read only fields, does not require unconditional updates and therefore is way more performant. The kernel validates the read only fields for compliance. If user space modifies them, the process is killed. Compliant usage allows multiple libraries in the same application to benefit from the RSEQ functionality without disturbing each other. The ABI compliant optimized v2 mode also enables extended RSEQ features like time slice extensions.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKmhjhhubeh}(h]optimized-rseq-v2ah ]h"]optimized rseq v2ah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(hScheduler time slice extensionsh]hScheduler time slice extensions}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKwubh)}(hThis allows a thread to request a time slice extension when it enters a critical section to avoid contention on a resource when the thread is scheduled out inside of the critical section.h]hThis allows a thread to request a time slice extension when it enters a critical section to avoid contention on a resource when the thread is scheduled out inside of the critical section.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKyhjhhubh)}(h-The prerequisites for this functionality are:h]h-The prerequisites for this functionality are:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK}hjhhubh)}(h* Enabled in Kconfig * Enabled at boot time (default is enabled) * A rseq userspace pointer has been registered for the thread in optimized V2 mode h]h)}(hhh](h)}(hEnabled in Kconfig h]h)}(hEnabled in Kconfigh]hEnabled in Kconfig}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1hhjubh)}(h*Enabled at boot time (default is enabled) h]h)}(h)Enabled at boot time (default is enabled)h]h)Enabled at boot time (default is enabled)}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj ubah}(h]h ]h"]h$]h&]uh1hhjubh)}(hQA rseq userspace pointer has been registered for the thread in optimized V2 mode h]h)}(hPA rseq userspace pointer has been registered for the thread in optimized V2 modeh]hPA rseq userspace pointer has been registered for the thread in optimized V2 mode}(hj&hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj"ubah}(h]h ]h"]h$]h&]uh1hhjubeh}(h]h ]h"]h$]h&]j6j7uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(h9The thread has to enable the functionality via prctl(2)::h]h8The thread has to enable the functionality via prctl(2):}(hjFhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh literal_block)}(hbprctl(PR_RSEQ_SLICE_EXTENSION, PR_RSEQ_SLICE_EXTENSION_SET, PR_RSEQ_SLICE_EXT_ENABLE, 0, 0);h]hbprctl(PR_RSEQ_SLICE_EXTENSION, PR_RSEQ_SLICE_EXTENSION_SET, PR_RSEQ_SLICE_EXT_ENABLE, 0, 0);}hjVsbah}(h]h ]h"]h$]h&] xml:spacepreserveuh1jThhhKhjhhubh)}(hIprctl() returns 0 on success or otherwise with the following error codes:h]hIprctl() returns 0 on success or otherwise with the following error codes:}(hjfhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubj)}(hhh]j)}(hhh](j)}(hhh]h}(h]h ]h"]h$]h&]colwidthK uh1jhjwubj)}(hhh]h}(h]h ]h"]h$]h&]colwidthK>uh1jhjwubj)}(hhh]j)}(hhh](j)}(hhh]h)}(h Errorcodeh]h Errorcode}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(hMeaningh]hMeaning}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjubah}(h]h ]h"]h$]h&]uh1jhjwubj)}(hhh](j)}(hhh](j)}(hhh]h)}(hEINVALh]hEINVAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(h[Functionality not available or invalid function arguments. Note: arg4 and arg5 must be zeroh]h[Functionality not available or invalid function arguments. Note: arg4 and arg5 must be zero}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh](j)}(hhh]h)}(hENOTSUPPh]hENOTSUPP}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj ubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(h5Functionality was disabled on the kernel command lineh]h5Functionality was disabled on the kernel command line}(hj%hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj"ubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh](j)}(hhh]h)}(hENXIOh]hENXIO}(hjEhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjBubah}(h]h ]h"]h$]h&]uh1jhj?ubj)}(hhh]h)}(h-Available, but no rseq user struct registeredh]h-Available, but no rseq user struct registered}(hj\hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjYubah}(h]h ]h"]h$]h&]uh1jhj?ubeh}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjwubeh}(h]h ]h"]h$]h&]colsKuh1jhjtubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubh)}(h,The state can be also queried via prctl(2)::h]h+The state can be also queried via prctl(2):}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubjU)}(hEprctl(PR_RSEQ_SLICE_EXTENSION, PR_RSEQ_SLICE_EXTENSION_GET, 0, 0, 0);h]hEprctl(PR_RSEQ_SLICE_EXTENSION, PR_RSEQ_SLICE_EXTENSION_GET, 0, 0, 0);}hjsbah}(h]h ]h"]h$]h&]jdjeuh1jThhhKhjhhubh)}(hprctl() returns ``PR_RSEQ_SLICE_EXT_ENABLE`` when it is enabled or 0 if disabled. Otherwise it returns with the following error codes:h](hprctl() returns }(hjhhhNhNubhliteral)}(h``PR_RSEQ_SLICE_EXT_ENABLE``h]hPR_RSEQ_SLICE_EXT_ENABLE}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhZ when it is enabled or 0 if disabled. Otherwise it returns with the following error codes:}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjhhubj)}(hhh]j)}(hhh](j)}(hhh]h}(h]h ]h"]h$]h&]colwidthK uh1jhjubj)}(hhh]h}(h]h ]h"]h$]h&]colwidthK>uh1jhjubj)}(hhh]j)}(hhh](j)}(hhh]h)}(h Errorcodeh]h Errorcode}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]h)}(hMeaningh]hMeaning}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]uh1jhjubah}(h]h ]h"]h$]h&]uh1jhjubj)}(hhh]j)}(hhh](j)}(hhh]h)}(hEINVALh]hEINVAL}(hj*hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj'ubah}(h]h ]h"]h$]h&]uh1jhj$ubj)}(hhh]h)}(hdFunctionality not available or invalid function arguments. Note: arg3 and arg4 and arg5 must be zeroh]hdFunctionality not available or invalid function arguments. Note: arg3 and arg4 and arg5 must be zero}(hjAhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj>ubah}(h]h ]h"]h$]h&]uh1jhj$ubeh}(h]h ]h"]h$]h&]uh1jhj!ubah}(h]h ]h"]h$]h&]uh1jhjubeh}(h]h ]h"]h$]h&]colsKuh1jhjubah}(h]h ]h"]h$]h&]uh1jhjhhhhhNubh)}(hThe availability and status is also exposed via the rseq ABI struct flags field via the ``RSEQ_CS_FLAG_SLICE_EXT_AVAILABLE_BIT`` and the ``RSEQ_CS_FLAG_SLICE_EXT_ENABLED_BIT``. These bits are read-only for user space and only for informational purposes.h](hXThe availability and status is also exposed via the rseq ABI struct flags field via the }(hjnhhhNhNubj)}(h(``RSEQ_CS_FLAG_SLICE_EXT_AVAILABLE_BIT``h]h$RSEQ_CS_FLAG_SLICE_EXT_AVAILABLE_BIT}(hjvhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjnubh and the }(hjnhhhNhNubj)}(h&``RSEQ_CS_FLAG_SLICE_EXT_ENABLED_BIT``h]h"RSEQ_CS_FLAG_SLICE_EXT_ENABLED_BIT}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjnubhN. These bits are read-only for user space and only for informational purposes.}(hjnhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hX+If the mechanism was enabled via prctl(), the thread can request a time slice extension by setting rseq::slice_ctrl::request to 1. If the thread is interrupted and the interrupt results in a reschedule request in the kernel, then the kernel can grant a time slice extension and return to userspace instead of scheduling out. The length of the extension is determined by debugfs:rseq/slice_ext_nsec. The default value is 5 usec; which is the minimum value. It can be incremented to 50 usecs, however doing so can/will affect the minimum scheduling latency.h]hX+If the mechanism was enabled via prctl(), the thread can request a time slice extension by setting rseq::slice_ctrl::request to 1. If the thread is interrupted and the interrupt results in a reschedule request in the kernel, then the kernel can grant a time slice extension and return to userspace instead of scheduling out. The length of the extension is determined by debugfs:rseq/slice_ext_nsec. The default value is 5 usec; which is the minimum value. It can be incremented to 50 usecs, however doing so can/will affect the minimum scheduling latency.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hAny proposed changes to this default will have to come with a selftest and rseq-slice-hist.py output that shows the new value has merrit.h]hAny proposed changes to this default will have to come with a selftest and rseq-slice-hist.py output that shows the new value has merrit.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hThe kernel indicates the grant by clearing rseq::slice_ctrl::request and setting rseq::slice_ctrl::granted to 1. If there is a reschedule of the thread after granting the extension, the kernel clears the granted bit to indicate that to userspace.h]hThe kernel indicates the grant by clearing rseq::slice_ctrl::request and setting rseq::slice_ctrl::granted to 1. If there is a reschedule of the thread after granting the extension, the kernel clears the granted bit to indicate that to userspace.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hkIf the request bit is still set when the leaving the critical section, userspace can clear it and continue.h]hkIf the request bit is still set when the leaving the critical section, userspace can clear it and continue.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hIf the granted bit is set, then userspace invokes rseq_slice_yield(2) when leaving the critical section to relinquish the CPU. The kernel enforces this by arming a timer to prevent misbehaving userspace from abusing this mechanism.h]hIf the granted bit is set, then userspace invokes rseq_slice_yield(2) when leaving the critical section to relinquish the CPU. The kernel enforces this by arming a timer to prevent misbehaving userspace from abusing this mechanism.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hIf both the request bit and the granted bit are false when leaving the critical section, then this indicates that a grant was revoked and no further action is required by userspace.h]hIf both the request bit and the granted bit are false when leaving the critical section, then this indicates that a grant was revoked and no further action is required by userspace.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(h&The required code flow is as follows::h]h%The required code flow is as follows:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubjU)}(hrseq->slice_ctrl.request = 1; barrier(); // Prevent compiler reordering critical_section(); barrier(); // Prevent compiler reordering rseq->slice_ctrl.request = 0; if (rseq->slice_ctrl.granted) rseq_slice_yield();h]hrseq->slice_ctrl.request = 1; barrier(); // Prevent compiler reordering critical_section(); barrier(); // Prevent compiler reordering rseq->slice_ctrl.request = 0; if (rseq->slice_ctrl.granted) rseq_slice_yield();}hjsbah}(h]h ]h"]h$]h&]jdjeuh1jThhhKhjhhubh)}(hAs all of this is strictly CPU local, there are no atomicity requirements. Checking the granted state is racy, but that cannot be avoided at all::h]hAs all of this is strictly CPU local, there are no atomicity requirements. Checking the granted state is racy, but that cannot be avoided at all:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubjU)}(hmif (rseq->slice_ctrl.granted) -> Interrupt results in schedule and grant revocation rseq_slice_yield();h]hmif (rseq->slice_ctrl.granted) -> Interrupt results in schedule and grant revocation rseq_slice_yield();}hjsbah}(h]h ]h"]h$]h&]jdjeuh1jThhhKhjhhubh)}(hTSo there is no point in pretending that this might be solved by an atomic operation.h]hTSo there is no point in pretending that this might be solved by an atomic operation.}(hj,hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hXIf the thread issues a syscall other than rseq_slice_yield(2) within the granted timeslice extension, the grant is also revoked and the CPU is relinquished immediately when entering the kernel. This is required as syscalls might consume arbitrary CPU time until they reach a scheduling point when the preemption model is either NONE or VOLUNTARY and therefore might exceed the grant by far.h]hXIf the thread issues a syscall other than rseq_slice_yield(2) within the granted timeslice extension, the grant is also revoked and the CPU is relinquished immediately when entering the kernel. This is required as syscalls might consume arbitrary CPU time until they reach a scheduling point when the preemption model is either NONE or VOLUNTARY and therefore might exceed the grant by far.}(hj:hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hXOThe preferred solution for user space is to use rseq_slice_yield(2) which is side effect free. The support for arbitrary syscalls is required to support onion layer architectured applications, where the code handling the critical section and requesting the time slice extension has no control over the code within the critical section.h]hXOThe preferred solution for user space is to use rseq_slice_yield(2) which is side effect free. The support for arbitrary syscalls is required to support onion layer architectured applications, where the code handling the critical section and requesting the time slice extension has no control over the code within the critical section.}(hjHhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubh)}(hfThe kernel enforces flag consistency and terminates the thread with SIGSEGV if it detects a violation.h]hfThe kernel enforces flag consistency and terminates the thread with SIGSEGV if it detects a violation.}(hjVhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubeh}(h]scheduler-time-slice-extensionsah ]h"]scheduler time slice extensionsah$]h&]uh1hhhhhhhhKwubeh}(h]restartable-sequencesah ]h"]restartable sequencesah$]h&]uh1hhhhhhhhKubeh}(h]h ]h"]h$]h&]sourcehuh1hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksjfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerjerror_encodingutf-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourcehʌ _destinationN _config_files]7/var/lib/git/docbuild/linux/Documentation/docutils.confafile_insertion_enabled raw_enabledKline_length_limitM'pep_referencesN pep_base_urlhttps://peps.python.org/pep_file_url_templatepep-%04drfc_referencesN rfc_base_url&https://datatracker.ietf.org/doc/html/ tab_widthKtrim_footnote_reference_spacesyntax_highlightlong smart_quotessmartquotes_locales]character_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xform image_loadinglinkembed_stylesheetcloak_email_addressessection_self_linkenvNubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}nameids}(jqjnjbj_jjjjjijfu nametypes}(jqjbjjjiuh}(jnhj_j>jjejjjfjjju footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startK id_counter collectionsCounter}jKsRparse_messages]transform_messages] transformerN include_log] decorationNhhub.