^sphinx.addnodesdocument)}( rawsourcechildren]( translations LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget'/translations/zh_CN/scheduler/sched-extmodnameN classnameN refexplicitutagnamehhh ubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget'/translations/zh_TW/scheduler/sched-extmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget'/translations/it_IT/scheduler/sched-extmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget'/translations/ja_JP/scheduler/sched-extmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget'/translations/ko_KR/scheduler/sched-extmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget'/translations/sp_SP/scheduler/sched-extmodnameN classnameN refexplicituh1hhh ubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h hh _documenthsourceNlineNubhtarget)}(h.. _sched-ext:h]h}(h]h ]h"]h$]h&]refid sched-extuh1hhKhhhhhA/var/lib/git/docbuild/linux/Documentation/scheduler/sched-ext.rstubhsection)}(hhh](htitle)}(hExtensible Scheduler Classh]hExtensible Scheduler Class}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhhhKubh paragraph)}(hjsched_ext is a scheduler class whose behavior can be defined by a set of BPF programs - the BPF scheduler.h]hjsched_ext is a scheduler class whose behavior can be defined by a set of BPF programs - the BPF scheduler.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh bullet_list)}(hhh](h list_item)}(hjsched_ext exports a full scheduling interface so that any scheduling algorithm can be implemented on top. h]h)}(hisched_ext exports a full scheduling interface so that any scheduling algorithm can be implemented on top.h]hisched_ext exports a full scheduling interface so that any scheduling algorithm can be implemented on top.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK hhubah}(h]h ]h"]h$]h&]uh1hhhhhhhhNubh)}(hThe BPF scheduler can group CPUs however it sees fit and schedule them together, as tasks aren't tied to specific CPUs at the time of wakeup. h]h)}(hThe BPF scheduler can group CPUs however it sees fit and schedule them together, as tasks aren't tied to specific CPUs at the time of wakeup.h]hThe BPF scheduler can group CPUs however it sees fit and schedule them together, as tasks aren’t tied to specific CPUs at the time of wakeup.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK hhubah}(h]h ]h"]h$]h&]uh1hhhhhhhhNubh)}(h@The BPF scheduler can be turned on and off dynamically anytime. h]h)}(h?The BPF scheduler can be turned on and off dynamically anytime.h]h?The BPF scheduler can be turned on and off dynamically anytime.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj ubah}(h]h ]h"]h$]h&]uh1hhhhhhhhNubh)}(hThe system integrity is maintained no matter what the BPF scheduler does. The default scheduling behavior is restored anytime an error is detected, a runnable task stalls, or on invoking the SysRq key sequence `SysRq-S`. h]h)}(hThe system integrity is maintained no matter what the BPF scheduler does. The default scheduling behavior is restored anytime an error is detected, a runnable task stalls, or on invoking the SysRq key sequence `SysRq-S`.h](hThe system integrity is maintained no matter what the BPF scheduler does. The default scheduling behavior is restored anytime an error is detected, a runnable task stalls, or on invoking the SysRq key sequence }(hj(hhhNhNubhtitle_reference)}(h `SysRq-S`h]hSysRq-S}(hj2hhhNhNubah}(h]h ]h"]h$]h&]uh1j0hj(ubh.}(hj(hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj$ubah}(h]h ]h"]h$]h&]uh1hhhhhhhhNubh)}(hXxWhen the BPF scheduler triggers an error, debug information is dumped to aid debugging. The debug dump is passed to and printed out by the scheduler binary. The debug dump can also be accessed through the `sched_ext_dump` tracepoint. The SysRq key sequence `SysRq-D` triggers a debug dump. This doesn't terminate the BPF scheduler and can only be read through the tracepoint. h]h)}(hXwWhen the BPF scheduler triggers an error, debug information is dumped to aid debugging. The debug dump is passed to and printed out by the scheduler binary. The debug dump can also be accessed through the `sched_ext_dump` tracepoint. The SysRq key sequence `SysRq-D` triggers a debug dump. This doesn't terminate the BPF scheduler and can only be read through the tracepoint.h](hWhen the BPF scheduler triggers an error, debug information is dumped to aid debugging. The debug dump is passed to and printed out by the scheduler binary. The debug dump can also be accessed through the }(hjThhhNhNubj1)}(h`sched_ext_dump`h]hsched_ext_dump}(hj\hhhNhNubah}(h]h ]h"]h$]h&]uh1j0hjTubh$ tracepoint. The SysRq key sequence }(hjThhhNhNubj1)}(h `SysRq-D`h]hSysRq-D}(hjnhhhNhNubah}(h]h ]h"]h$]h&]uh1j0hjTubho triggers a debug dump. This doesn’t terminate the BPF scheduler and can only be read through the tracepoint.}(hjThhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjPubah}(h]h ]h"]h$]h&]uh1hhhhhhhhNubeh}(h]h ]h"]h$]h&]bullet*uh1hhhhK hhhhubh)}(hhh](h)}(hSwitching to and from sched_exth]hSwitching to and from sched_ext}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(h``CONFIG_SCHED_CLASS_EXT`` is the config option to enable sched_ext and ``tools/sched_ext`` contains the example schedulers. The following config options should be enabled to use sched_ext:h](hliteral)}(h``CONFIG_SCHED_CLASS_EXT``h]hCONFIG_SCHED_CLASS_EXT}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh. is the config option to enable sched_ext and }(hjhhhNhNubj)}(h``tools/sched_ext``h]htools/sched_ext}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhb contains the example schedulers. The following config options should be enabled to use sched_ext:}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK!hjhhubh literal_block)}(hCONFIG_BPF=y CONFIG_SCHED_CLASS_EXT=y CONFIG_BPF_SYSCALL=y CONFIG_BPF_JIT=y CONFIG_DEBUG_INFO_BTF=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_BPF_JIT_DEFAULT_ON=y CONFIG_PAHOLE_HAS_SPLIT_BTF=y CONFIG_PAHOLE_HAS_BTF_TAG=yh]hCONFIG_BPF=y CONFIG_SCHED_CLASS_EXT=y CONFIG_BPF_SYSCALL=y CONFIG_BPF_JIT=y CONFIG_DEBUG_INFO_BTF=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_BPF_JIT_DEFAULT_ON=y CONFIG_PAHOLE_HAS_SPLIT_BTF=y CONFIG_PAHOLE_HAS_BTF_TAG=y}hjsbah}(h]h ]h"]h$]h&] xml:spacepreserveforcelanguagenonehighlight_args}uh1jhhhK%hjhhubh)}(hDsched_ext is used only when the BPF scheduler is loaded and running.h]hDsched_ext is used only when the BPF scheduler is loaded and running.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK1hjhhubh)}(hIf a task explicitly sets its scheduling policy to ``SCHED_EXT``, it will be treated as ``SCHED_NORMAL`` and scheduled by the fair-class scheduler until the BPF scheduler is loaded.h](h3If a task explicitly sets its scheduling policy to }(hjhhhNhNubj)}(h ``SCHED_EXT``h]h SCHED_EXT}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh, it will be treated as }(hjhhhNhNubj)}(h``SCHED_NORMAL``h]h SCHED_NORMAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhM and scheduled by the fair-class scheduler until the BPF scheduler is loaded.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK3hjhhubh)}(hWhen the BPF scheduler is loaded and ``SCX_OPS_SWITCH_PARTIAL`` is not set in ``ops->flags``, all ``SCHED_NORMAL``, ``SCHED_BATCH``, ``SCHED_IDLE``, and ``SCHED_EXT`` tasks are scheduled by sched_ext.h](h%When the BPF scheduler is loaded and }(hj,hhhNhNubj)}(h``SCX_OPS_SWITCH_PARTIAL``h]hSCX_OPS_SWITCH_PARTIAL}(hj4hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj,ubh is not set in }(hj,hhhNhNubj)}(h``ops->flags``h]h ops->flags}(hjFhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj,ubh, all }(hj,hhhNhNubj)}(h``SCHED_NORMAL``h]h SCHED_NORMAL}(hjXhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj,ubh, }(hj,hhhNhNubj)}(h``SCHED_BATCH``h]h SCHED_BATCH}(hjjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj,ubh, }hj,sbj)}(h``SCHED_IDLE``h]h SCHED_IDLE}(hj|hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj,ubh, and }(hj,hhhNhNubj)}(h ``SCHED_EXT``h]h SCHED_EXT}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj,ubh" tasks are scheduled by sched_ext.}(hj,hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK7hjhhubh)}(hX!However, when the BPF scheduler is loaded and ``SCX_OPS_SWITCH_PARTIAL`` is set in ``ops->flags``, only tasks with the ``SCHED_EXT`` policy are scheduled by sched_ext, while tasks with ``SCHED_NORMAL``, ``SCHED_BATCH`` and ``SCHED_IDLE`` policies are scheduled by the fair-class scheduler.h](h.However, when the BPF scheduler is loaded and }(hjhhhNhNubj)}(h``SCX_OPS_SWITCH_PARTIAL``h]hSCX_OPS_SWITCH_PARTIAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh is set in }(hjhhhNhNubj)}(h``ops->flags``h]h ops->flags}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh, only tasks with the }(hjhhhNhNubj)}(h ``SCHED_EXT``h]h SCHED_EXT}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh5 policy are scheduled by sched_ext, while tasks with }(hjhhhNhNubj)}(h``SCHED_NORMAL``h]h SCHED_NORMAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh, }(hjhhhNhNubj)}(h``SCHED_BATCH``h]h SCHED_BATCH}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh and }(hjhhhNhNubj)}(h``SCHED_IDLE``h]h SCHED_IDLE}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh4 policies are scheduled by the fair-class scheduler.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK;hjhhubh)}(hTerminating the sched_ext scheduler program, triggering `SysRq-S`, or detection of any internal error including stalled runnable tasks aborts the BPF scheduler and reverts all tasks back to the fair-class scheduler.h](h8Terminating the sched_ext scheduler program, triggering }(hj hhhNhNubj1)}(h `SysRq-S`h]hSysRq-S}(hj(hhhNhNubah}(h]h ]h"]h$]h&]uh1j0hj ubh, or detection of any internal error including stalled runnable tasks aborts the BPF scheduler and reverts all tasks back to the fair-class scheduler.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK@hjhhubj)}(h# make -j16 -C tools/sched_ext # tools/sched_ext/build/bin/scx_simple local=0 global=3 local=5 global=24 local=9 global=44 local=13 global=56 local=17 global=72 ^CEXIT: BPF scheduler unregisteredh]h# make -j16 -C tools/sched_ext # tools/sched_ext/build/bin/scx_simple local=0 global=3 local=5 global=24 local=9 global=44 local=13 global=56 local=17 global=72 ^CEXIT: BPF scheduler unregistered}hj@sbah}(h]h ]h"]h$]h&]jjjjnonej}uh1jhhhKDhjhhubh)}(hEThe current status of the BPF scheduler can be determined as follows:h]hEThe current status of the BPF scheduler can be determined as follows:}(hjPhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKOhjhhubj)}(hU# cat /sys/kernel/sched_ext/state enabled # cat /sys/kernel/sched_ext/root/ops simpleh]hU# cat /sys/kernel/sched_ext/state enabled # cat /sys/kernel/sched_ext/root/ops simple}hj^sbah}(h]h ]h"]h$]h&]jjjjnonej}uh1jhhhKQhjhhubh)}(hYou can check if any BPF scheduler has ever been loaded since boot by examining this monotonically incrementing counter (a value of zero indicates that no BPF scheduler has been loaded):h]hYou can check if any BPF scheduler has ever been loaded since boot by examining this monotonically incrementing counter (a value of zero indicates that no BPF scheduler has been loaded):}(hjnhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKXhjhhubj)}(h(# cat /sys/kernel/sched_ext/enable_seq 1h]h(# cat /sys/kernel/sched_ext/enable_seq 1}hj|sbah}(h]h ]h"]h$]h&]jjjjnonej}uh1jhhhK\hjhhubh)}(h]``tools/sched_ext/scx_show_state.py`` is a drgn script which shows more detailed information:h](j)}(h%``tools/sched_ext/scx_show_state.py``h]h!tools/sched_ext/scx_show_state.py}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh8 is a drgn script which shows more detailed information:}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKahjhhubj)}(h# tools/sched_ext/scx_show_state.py ops : simple enabled : 1 switching_all : 1 switched_all : 1 enable_state : enabled (2) bypass_depth : 0 nr_rejected : 0 enable_seq : 1h]h# tools/sched_ext/scx_show_state.py ops : simple enabled : 1 switching_all : 1 switched_all : 1 enable_state : enabled (2) bypass_depth : 0 nr_rejected : 0 enable_seq : 1}hjsbah}(h]h ]h"]h$]h&]jjjjnonej}uh1jhhhKdhjhhubh)}(hBWhether a given task is on sched_ext can be determined as follows:h]hBWhether a given task is on sched_ext can be determined as follows:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKphjhhubj)}(h_# grep ext /proc/self/sched ext.enabled : 1h]h_# grep ext /proc/self/sched ext.enabled : 1}hjsbah}(h]h ]h"]h$]h&]jjjjnonej}uh1jhhhKrhjhhubeh}(h]switching-to-and-from-sched-extah ]h"]switching to and from sched_extah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(h The Basicsh]h The Basics}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKxubh)}(hX^Userspace can implement an arbitrary BPF scheduler by loading a set of BPF programs that implement ``struct sched_ext_ops``. The only mandatory field is ``ops.name`` which must be a valid BPF object name. All operations are optional. The following modified excerpt is from ``tools/sched_ext/scx_simple.bpf.c`` showing a minimal global FIFO scheduler.h](hcUserspace can implement an arbitrary BPF scheduler by loading a set of BPF programs that implement }(hjhhhNhNubj)}(h``struct sched_ext_ops``h]hstruct sched_ext_ops}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh. The only mandatory field is }(hjhhhNhNubj)}(h ``ops.name``h]hops.name}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhl which must be a valid BPF object name. All operations are optional. The following modified excerpt is from }(hjhhhNhNubj)}(h$``tools/sched_ext/scx_simple.bpf.c``h]h tools/sched_ext/scx_simple.bpf.c}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh) showing a minimal global FIFO scheduler.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKzhjhhubj)}(hX/* * Decide which CPU a task should be migrated to before being * enqueued (either at wakeup, fork time, or exec time). If an * idle core is found by the default ops.select_cpu() implementation, * then insert the task directly into SCX_DSQ_LOCAL and skip the * ops.enqueue() callback. * * Note that this implementation has exactly the same behavior as the * default ops.select_cpu implementation. The behavior of the scheduler * would be exactly same if the implementation just didn't define the * simple_select_cpu() struct_ops prog. */ s32 BPF_STRUCT_OPS(simple_select_cpu, struct task_struct *p, s32 prev_cpu, u64 wake_flags) { s32 cpu; /* Need to initialize or the BPF verifier will reject the program */ bool direct = false; cpu = scx_bpf_select_cpu_dfl(p, prev_cpu, wake_flags, &direct); if (direct) scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0); return cpu; } /* * Do a direct insertion of a task to the global DSQ. This ops.enqueue() * callback will only be invoked if we failed to find a core to insert * into in ops.select_cpu() above. * * Note that this implementation has exactly the same behavior as the * default ops.enqueue implementation, which just dispatches the task * to SCX_DSQ_GLOBAL. The behavior of the scheduler would be exactly same * if the implementation just didn't define the simple_enqueue struct_ops * prog. */ void BPF_STRUCT_OPS(simple_enqueue, struct task_struct *p, u64 enq_flags) { scx_bpf_dsq_insert(p, SCX_DSQ_GLOBAL, SCX_SLICE_DFL, enq_flags); } s32 BPF_STRUCT_OPS_SLEEPABLE(simple_init) { /* * By default, all SCHED_EXT, SCHED_OTHER, SCHED_IDLE, and * SCHED_BATCH tasks should use sched_ext. */ return 0; } void BPF_STRUCT_OPS(simple_exit, struct scx_exit_info *ei) { exit_type = ei->type; } SEC(".struct_ops") struct sched_ext_ops simple_ops = { .select_cpu = (void *)simple_select_cpu, .enqueue = (void *)simple_enqueue, .init = (void *)simple_init, .exit = (void *)simple_exit, .name = "simple", };h]hX/* * Decide which CPU a task should be migrated to before being * enqueued (either at wakeup, fork time, or exec time). If an * idle core is found by the default ops.select_cpu() implementation, * then insert the task directly into SCX_DSQ_LOCAL and skip the * ops.enqueue() callback. * * Note that this implementation has exactly the same behavior as the * default ops.select_cpu implementation. The behavior of the scheduler * would be exactly same if the implementation just didn't define the * simple_select_cpu() struct_ops prog. */ s32 BPF_STRUCT_OPS(simple_select_cpu, struct task_struct *p, s32 prev_cpu, u64 wake_flags) { s32 cpu; /* Need to initialize or the BPF verifier will reject the program */ bool direct = false; cpu = scx_bpf_select_cpu_dfl(p, prev_cpu, wake_flags, &direct); if (direct) scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0); return cpu; } /* * Do a direct insertion of a task to the global DSQ. This ops.enqueue() * callback will only be invoked if we failed to find a core to insert * into in ops.select_cpu() above. * * Note that this implementation has exactly the same behavior as the * default ops.enqueue implementation, which just dispatches the task * to SCX_DSQ_GLOBAL. The behavior of the scheduler would be exactly same * if the implementation just didn't define the simple_enqueue struct_ops * prog. */ void BPF_STRUCT_OPS(simple_enqueue, struct task_struct *p, u64 enq_flags) { scx_bpf_dsq_insert(p, SCX_DSQ_GLOBAL, SCX_SLICE_DFL, enq_flags); } s32 BPF_STRUCT_OPS_SLEEPABLE(simple_init) { /* * By default, all SCHED_EXT, SCHED_OTHER, SCHED_IDLE, and * SCHED_BATCH tasks should use sched_ext. */ return 0; } void BPF_STRUCT_OPS(simple_exit, struct scx_exit_info *ei) { exit_type = ei->type; } SEC(".struct_ops") struct sched_ext_ops simple_ops = { .select_cpu = (void *)simple_select_cpu, .enqueue = (void *)simple_enqueue, .init = (void *)simple_init, .exit = (void *)simple_exit, .name = "simple", };}hj3sbah}(h]h ]h"]h$]h&]jjjjcj}uh1jhhhKhjhhubh)}(hhh](h)}(hDispatch Queuesh]hDispatch Queues}(hjFhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjChhhhhKubh)}(hXTo match the impedance between the scheduler core and the BPF scheduler, sched_ext uses DSQs (dispatch queues) which can operate as both a FIFO and a priority queue. By default, there is one global FIFO (``SCX_DSQ_GLOBAL``), and one local DSQ per CPU (``SCX_DSQ_LOCAL``). The BPF scheduler can manage an arbitrary number of DSQs using ``scx_bpf_create_dsq()`` and ``scx_bpf_destroy_dsq()``.h](hTo match the impedance between the scheduler core and the BPF scheduler, sched_ext uses DSQs (dispatch queues) which can operate as both a FIFO and a priority queue. By default, there is one global FIFO (}(hjThhhNhNubj)}(h``SCX_DSQ_GLOBAL``h]hSCX_DSQ_GLOBAL}(hj\hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjTubh), and one local DSQ per CPU (}(hjThhhNhNubj)}(h``SCX_DSQ_LOCAL``h]h SCX_DSQ_LOCAL}(hjnhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjTubhB). The BPF scheduler can manage an arbitrary number of DSQs using }(hjThhhNhNubj)}(h``scx_bpf_create_dsq()``h]hscx_bpf_create_dsq()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjTubh and }(hjThhhNhNubj)}(h``scx_bpf_destroy_dsq()``h]hscx_bpf_destroy_dsq()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjTubh.}(hjThhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjChhubh)}(hA CPU always executes a task from its local DSQ. A task is "inserted" into a DSQ. A task in a non-local DSQ is "move"d into the target CPU's local DSQ.h]hA CPU always executes a task from its local DSQ. A task is “inserted” into a DSQ. A task in a non-local DSQ is “move”d into the target CPU’s local DSQ.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjChhubh)}(hWhen a CPU is looking for the next task to run, if the local DSQ is not empty, the first task is picked. Otherwise, the CPU tries to move a task from the global DSQ. If that doesn't yield a runnable task either, ``ops.dispatch()`` is invoked.h](hWhen a CPU is looking for the next task to run, if the local DSQ is not empty, the first task is picked. Otherwise, the CPU tries to move a task from the global DSQ. If that doesn’t yield a runnable task either, }(hjhhhNhNubj)}(h``ops.dispatch()``h]hops.dispatch()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh is invoked.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjChhubeh}(h]dispatch-queuesah ]h"]dispatch queuesah$]h&]uh1hhjhhhhhKubh)}(hhh](h)}(hScheduling Cycleh]hScheduling Cycle}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhKubh)}(hHThe following briefly shows how a waking task is scheduled and executed.h]hHThe following briefly shows how a waking task is scheduled and executed.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjhhubhenumerated_list)}(hhh](h)}(hXWhen a task is waking up, ``ops.select_cpu()`` is the first operation invoked. This serves two purposes. First, CPU selection optimization hint. Second, waking up the selected CPU if idle. The CPU selected by ``ops.select_cpu()`` is an optimization hint and not binding. The actual decision is made at the last step of scheduling. However, there is a small performance gain if the CPU ``ops.select_cpu()`` returns matches the CPU the task eventually runs on. A side-effect of selecting a CPU is waking it up from idle. While a BPF scheduler can wake up any cpu using the ``scx_bpf_kick_cpu()`` helper, using ``ops.select_cpu()`` judiciously can be simpler and more efficient. A task can be immediately inserted into a DSQ from ``ops.select_cpu()`` by calling ``scx_bpf_dsq_insert()``. If the task is inserted into ``SCX_DSQ_LOCAL`` from ``ops.select_cpu()``, it will be inserted into the local DSQ of whichever CPU is returned from ``ops.select_cpu()``. Additionally, inserting directly from ``ops.select_cpu()`` will cause the ``ops.enqueue()`` callback to be skipped. Note that the scheduler core will ignore an invalid CPU selection, for example, if it's outside the allowed cpumask of the task. h](h)}(hWhen a task is waking up, ``ops.select_cpu()`` is the first operation invoked. This serves two purposes. First, CPU selection optimization hint. Second, waking up the selected CPU if idle.h](hWhen a task is waking up, }(hjhhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh is the first operation invoked. This serves two purposes. First, CPU selection optimization hint. Second, waking up the selected CPU if idle.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubh)}(hX The CPU selected by ``ops.select_cpu()`` is an optimization hint and not binding. The actual decision is made at the last step of scheduling. However, there is a small performance gain if the CPU ``ops.select_cpu()`` returns matches the CPU the task eventually runs on.h](hThe CPU selected by }(hj(hhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hj0hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj(ubh is an optimization hint and not binding. The actual decision is made at the last step of scheduling. However, there is a small performance gain if the CPU }(hj(hhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjBhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj(ubh5 returns matches the CPU the task eventually runs on.}(hj(hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubh)}(hA side-effect of selecting a CPU is waking it up from idle. While a BPF scheduler can wake up any cpu using the ``scx_bpf_kick_cpu()`` helper, using ``ops.select_cpu()`` judiciously can be simpler and more efficient.h](hpA side-effect of selecting a CPU is waking it up from idle. While a BPF scheduler can wake up any cpu using the }(hjZhhhNhNubj)}(h``scx_bpf_kick_cpu()``h]hscx_bpf_kick_cpu()}(hjbhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjZubh helper, using }(hjZhhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjthhhNhNubah}(h]h ]h"]h$]h&]uh1jhjZubh/ judiciously can be simpler and more efficient.}(hjZhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubh)}(hXA task can be immediately inserted into a DSQ from ``ops.select_cpu()`` by calling ``scx_bpf_dsq_insert()``. If the task is inserted into ``SCX_DSQ_LOCAL`` from ``ops.select_cpu()``, it will be inserted into the local DSQ of whichever CPU is returned from ``ops.select_cpu()``. Additionally, inserting directly from ``ops.select_cpu()`` will cause the ``ops.enqueue()`` callback to be skipped.h](h3A task can be immediately inserted into a DSQ from }(hjhhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh by calling }(hjhhhNhNubj)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh. If the task is inserted into }(hjhhhNhNubj)}(h``SCX_DSQ_LOCAL``h]h SCX_DSQ_LOCAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh from }(hjhhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhK, it will be inserted into the local DSQ of whichever CPU is returned from }(hjhhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh(. Additionally, inserting directly from }(hjhhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh will cause the }(hjhhhNhNubj)}(h``ops.enqueue()``h]h ops.enqueue()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh callback to be skipped.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubh)}(hNote that the scheduler core will ignore an invalid CPU selection, for example, if it's outside the allowed cpumask of the task.h]hNote that the scheduler core will ignore an invalid CPU selection, for example, if it’s outside the allowed cpumask of the task.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubeh}(h]h ]h"]h$]h&]uh1hhjhhhhhNubh)}(hX6Once the target CPU is selected, ``ops.enqueue()`` is invoked (unless the task was inserted directly from ``ops.select_cpu()``). ``ops.enqueue()`` can make one of the following decisions: * Immediately insert the task into either the global or a local DSQ by calling ``scx_bpf_dsq_insert()`` with one of the following options: ``SCX_DSQ_GLOBAL``, ``SCX_DSQ_LOCAL``, or ``SCX_DSQ_LOCAL_ON | cpu``. * Immediately insert the task into a custom DSQ by calling ``scx_bpf_dsq_insert()`` with a DSQ ID which is smaller than 2^63. * Queue the task on the BPF side. h](h)}(hOnce the target CPU is selected, ``ops.enqueue()`` is invoked (unless the task was inserted directly from ``ops.select_cpu()``). ``ops.enqueue()`` can make one of the following decisions:h](h!Once the target CPU is selected, }(hj0hhhNhNubj)}(h``ops.enqueue()``h]h ops.enqueue()}(hj8hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj0ubh8 is invoked (unless the task was inserted directly from }(hj0hhhNhNubj)}(h``ops.select_cpu()``h]hops.select_cpu()}(hjJhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj0ubh). }(hj0hhhNhNubj)}(h``ops.enqueue()``h]h ops.enqueue()}(hj\hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj0ubh) can make one of the following decisions:}(hj0hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj,ubh)}(hhh](h)}(hImmediately insert the task into either the global or a local DSQ by calling ``scx_bpf_dsq_insert()`` with one of the following options: ``SCX_DSQ_GLOBAL``, ``SCX_DSQ_LOCAL``, or ``SCX_DSQ_LOCAL_ON | cpu``. h]h)}(hImmediately insert the task into either the global or a local DSQ by calling ``scx_bpf_dsq_insert()`` with one of the following options: ``SCX_DSQ_GLOBAL``, ``SCX_DSQ_LOCAL``, or ``SCX_DSQ_LOCAL_ON | cpu``.h](hMImmediately insert the task into either the global or a local DSQ by calling }(hj{hhhNhNubj)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj{ubh$ with one of the following options: }(hj{hhhNhNubj)}(h``SCX_DSQ_GLOBAL``h]hSCX_DSQ_GLOBAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj{ubh, }(hj{hhhNhNubj)}(h``SCX_DSQ_LOCAL``h]h SCX_DSQ_LOCAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj{ubh, or }(hj{hhhNhNubj)}(h``SCX_DSQ_LOCAL_ON | cpu``h]hSCX_DSQ_LOCAL_ON | cpu}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhj{ubh.}(hj{hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjwubah}(h]h ]h"]h$]h&]uh1hhjtubh)}(h|Immediately insert the task into a custom DSQ by calling ``scx_bpf_dsq_insert()`` with a DSQ ID which is smaller than 2^63. h]h)}(h{Immediately insert the task into a custom DSQ by calling ``scx_bpf_dsq_insert()`` with a DSQ ID which is smaller than 2^63.h](h9Immediately insert the task into a custom DSQ by calling }(hjhhhNhNubj)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh* with a DSQ ID which is smaller than 2^63.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1hhjtubh)}(h Queue the task on the BPF side. h]h)}(hQueue the task on the BPF side.h]hQueue the task on the BPF side.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1hhjtubeh}(h]h ]h"]h$]h&]jjuh1hhhhKhj,ubeh}(h]h ]h"]h$]h&]uh1hhjhhhNhNubh)}(hXWhen a CPU is ready to schedule, it first looks at its local DSQ. If empty, it then looks at the global DSQ. If there still isn't a task to run, ``ops.dispatch()`` is invoked which can use the following two functions to populate the local DSQ. * ``scx_bpf_dsq_insert()`` inserts a task to a DSQ. Any target DSQ can be used - ``SCX_DSQ_LOCAL``, ``SCX_DSQ_LOCAL_ON | cpu``, ``SCX_DSQ_GLOBAL`` or a custom DSQ. While ``scx_bpf_dsq_insert()`` currently can't be called with BPF locks held, this is being worked on and will be supported. ``scx_bpf_dsq_insert()`` schedules insertion rather than performing them immediately. There can be up to ``ops.dispatch_max_batch`` pending tasks. * ``scx_bpf_move_to_local()`` moves a task from the specified non-local DSQ to the dispatching DSQ. This function cannot be called with any BPF locks held. ``scx_bpf_move_to_local()`` flushes the pending insertions tasks before trying to move from the specified DSQ. h](h)}(hWhen a CPU is ready to schedule, it first looks at its local DSQ. If empty, it then looks at the global DSQ. If there still isn't a task to run, ``ops.dispatch()`` is invoked which can use the following two functions to populate the local DSQ.h](hWhen a CPU is ready to schedule, it first looks at its local DSQ. If empty, it then looks at the global DSQ. If there still isn’t a task to run, }(hj)hhhNhNubj)}(h``ops.dispatch()``h]hops.dispatch()}(hj1hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj)ubhP is invoked which can use the following two functions to populate the local DSQ.}(hj)hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj%ubh)}(hhh](h)}(hX``scx_bpf_dsq_insert()`` inserts a task to a DSQ. Any target DSQ can be used - ``SCX_DSQ_LOCAL``, ``SCX_DSQ_LOCAL_ON | cpu``, ``SCX_DSQ_GLOBAL`` or a custom DSQ. While ``scx_bpf_dsq_insert()`` currently can't be called with BPF locks held, this is being worked on and will be supported. ``scx_bpf_dsq_insert()`` schedules insertion rather than performing them immediately. There can be up to ``ops.dispatch_max_batch`` pending tasks. h]h)}(hX``scx_bpf_dsq_insert()`` inserts a task to a DSQ. Any target DSQ can be used - ``SCX_DSQ_LOCAL``, ``SCX_DSQ_LOCAL_ON | cpu``, ``SCX_DSQ_GLOBAL`` or a custom DSQ. While ``scx_bpf_dsq_insert()`` currently can't be called with BPF locks held, this is being worked on and will be supported. ``scx_bpf_dsq_insert()`` schedules insertion rather than performing them immediately. There can be up to ``ops.dispatch_max_batch`` pending tasks.h](j)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjThhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubh7 inserts a task to a DSQ. Any target DSQ can be used - }(hjPhhhNhNubj)}(h``SCX_DSQ_LOCAL``h]h SCX_DSQ_LOCAL}(hjfhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubh, }(hjPhhhNhNubj)}(h``SCX_DSQ_LOCAL_ON | cpu``h]hSCX_DSQ_LOCAL_ON | cpu}(hjxhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubh, }(hjPhhhNhNubj)}(h``SCX_DSQ_GLOBAL``h]hSCX_DSQ_GLOBAL}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubh or a custom DSQ. While }(hjPhhhNhNubj)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubha currently can’t be called with BPF locks held, this is being worked on and will be supported. }(hjPhhhNhNubj)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubhQ schedules insertion rather than performing them immediately. There can be up to }(hjPhhhNhNubj)}(h``ops.dispatch_max_batch``h]hops.dispatch_max_batch}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjPubh pending tasks.}(hjPhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjLubah}(h]h ]h"]h$]h&]uh1hhjIubh)}(hX ``scx_bpf_move_to_local()`` moves a task from the specified non-local DSQ to the dispatching DSQ. This function cannot be called with any BPF locks held. ``scx_bpf_move_to_local()`` flushes the pending insertions tasks before trying to move from the specified DSQ. h]h)}(hX``scx_bpf_move_to_local()`` moves a task from the specified non-local DSQ to the dispatching DSQ. This function cannot be called with any BPF locks held. ``scx_bpf_move_to_local()`` flushes the pending insertions tasks before trying to move from the specified DSQ.h](j)}(h``scx_bpf_move_to_local()``h]hscx_bpf_move_to_local()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh moves a task from the specified non-local DSQ to the dispatching DSQ. This function cannot be called with any BPF locks held. }(hjhhhNhNubj)}(h``scx_bpf_move_to_local()``h]hscx_bpf_move_to_local()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhS flushes the pending insertions tasks before trying to move from the specified DSQ.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM hjubah}(h]h ]h"]h$]h&]uh1hhjIubeh}(h]h ]h"]h$]h&]jjuh1hhhhMhj%ubeh}(h]h ]h"]h$]h&]uh1hhjhhhNhNubh)}(hXAfter ``ops.dispatch()`` returns, if there are tasks in the local DSQ, the CPU runs the first one. If empty, the following steps are taken: * Try to move from the global DSQ. If successful, run the task. * If ``ops.dispatch()`` has dispatched any tasks, retry #3. * If the previous task is an SCX task and still runnable, keep executing it (see ``SCX_OPS_ENQ_LAST``). * Go idle. h](h)}(hAfter ``ops.dispatch()`` returns, if there are tasks in the local DSQ, the CPU runs the first one. If empty, the following steps are taken:h](hAfter }(hj&hhhNhNubj)}(h``ops.dispatch()``h]hops.dispatch()}(hj.hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj&ubhs returns, if there are tasks in the local DSQ, the CPU runs the first one. If empty, the following steps are taken:}(hj&hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj"ubh)}(hhh](h)}(h>Try to move from the global DSQ. If successful, run the task. h]h)}(h=Try to move from the global DSQ. If successful, run the task.h]h=Try to move from the global DSQ. If successful, run the task.}(hjMhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjIubah}(h]h ]h"]h$]h&]uh1hhjFubh)}(h:If ``ops.dispatch()`` has dispatched any tasks, retry #3. h]h)}(h9If ``ops.dispatch()`` has dispatched any tasks, retry #3.h](hIf }(hjehhhNhNubj)}(h``ops.dispatch()``h]hops.dispatch()}(hjmhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjeubh$ has dispatched any tasks, retry #3.}(hjehhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjaubah}(h]h ]h"]h$]h&]uh1hhjFubh)}(hfIf the previous task is an SCX task and still runnable, keep executing it (see ``SCX_OPS_ENQ_LAST``). h]h)}(heIf the previous task is an SCX task and still runnable, keep executing it (see ``SCX_OPS_ENQ_LAST``).h](hOIf the previous task is an SCX task and still runnable, keep executing it (see }(hjhhhNhNubj)}(h``SCX_OPS_ENQ_LAST``h]hSCX_OPS_ENQ_LAST}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh).}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjubah}(h]h ]h"]h$]h&]uh1hhjFubh)}(h Go idle. h]h)}(hGo idle.h]hGo idle.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjubah}(h]h ]h"]h$]h&]uh1hhjFubeh}(h]h ]h"]h$]h&]jjuh1hhhhMhj"ubeh}(h]h ]h"]h$]h&]uh1hhjhhhNhNubeh}(h]h ]h"]h$]h&]enumtypearabicprefixhsuffix.uh1jhjhhhhhKubh)}(hXONote that the BPF scheduler can always choose to dispatch tasks immediately in ``ops.enqueue()`` as illustrated in the above simple example. If only the built-in DSQs are used, there is no need to implement ``ops.dispatch()`` as a task is never queued on the BPF scheduler and both the local and global DSQs are executed automatically.h](hONote that the BPF scheduler can always choose to dispatch tasks immediately in }(hjhhhNhNubj)}(h``ops.enqueue()``h]h ops.enqueue()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubho as illustrated in the above simple example. If only the built-in DSQs are used, there is no need to implement }(hjhhhNhNubj)}(h``ops.dispatch()``h]hops.dispatch()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhn as a task is never queued on the BPF scheduler and both the local and global DSQs are executed automatically.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hX``scx_bpf_dsq_insert()`` inserts the task on the FIFO of the target DSQ. Use ``scx_bpf_dsq_insert_vtime()`` for the priority queue. Internal DSQs such as ``SCX_DSQ_LOCAL`` and ``SCX_DSQ_GLOBAL`` do not support priority-queue dispatching, and must be dispatched to with ``scx_bpf_dsq_insert()``. See the function documentation and usage in ``tools/sched_ext/scx_simple.bpf.c`` for more information.h](j)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh5 inserts the task on the FIFO of the target DSQ. Use }(hj hhhNhNubj)}(h``scx_bpf_dsq_insert_vtime()``h]hscx_bpf_dsq_insert_vtime()}(hj, hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh/ for the priority queue. Internal DSQs such as }(hj hhhNhNubj)}(h``SCX_DSQ_LOCAL``h]h SCX_DSQ_LOCAL}(hj> hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh and }(hj hhhNhNubj)}(h``SCX_DSQ_GLOBAL``h]hSCX_DSQ_GLOBAL}(hjP hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubhK do not support priority-queue dispatching, and must be dispatched to with }(hj hhhNhNubj)}(h``scx_bpf_dsq_insert()``h]hscx_bpf_dsq_insert()}(hjb hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh.. See the function documentation and usage in }(hj hhhNhNubj)}(h$``tools/sched_ext/scx_simple.bpf.c``h]h tools/sched_ext/scx_simple.bpf.c}(hjt hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh for more information.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM#hjhhubeh}(h]scheduling-cycleah ]h"]scheduling cycleah$]h&]uh1hhjhhhhhKubh)}(hhh](h)}(hTask Lifecycleh]hTask Lifecycle}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhM+ubh)}(heThe following pseudo-code summarizes the entire lifecycle of a task managed by a sched_ext scheduler:h]heThe following pseudo-code summarizes the entire lifecycle of a task managed by a sched_ext scheduler:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM-hj hhubj)}(hXops.init_task(); /* A new task is created */ ops.enable(); /* Enable BPF scheduling for the task */ while (task in SCHED_EXT) { if (task can migrate) ops.select_cpu(); /* Called on wakeup (optimization) */ ops.runnable(); /* Task becomes ready to run */ while (task is runnable) { if (task is not in a DSQ) { ops.enqueue(); /* Task can be added to a DSQ */ /* A CPU becomes available */ ops.dispatch(); /* Task is moved to a local DSQ */ } ops.running(); /* Task starts running on its assigned CPU */ ops.tick(); /* Called every 1/HZ seconds */ ops.stopping(); /* Task stops running (time slice expires or wait) */ } ops.quiescent(); /* Task releases its assigned CPU (wait) */ } ops.disable(); /* Disable BPF scheduling for the task */ ops.exit_task(); /* Task is destroyed */h]hXops.init_task(); /* A new task is created */ ops.enable(); /* Enable BPF scheduling for the task */ while (task in SCHED_EXT) { if (task can migrate) ops.select_cpu(); /* Called on wakeup (optimization) */ ops.runnable(); /* Task becomes ready to run */ while (task is runnable) { if (task is not in a DSQ) { ops.enqueue(); /* Task can be added to a DSQ */ /* A CPU becomes available */ ops.dispatch(); /* Task is moved to a local DSQ */ } ops.running(); /* Task starts running on its assigned CPU */ ops.tick(); /* Called every 1/HZ seconds */ ops.stopping(); /* Task stops running (time slice expires or wait) */ } ops.quiescent(); /* Task releases its assigned CPU (wait) */ } ops.disable(); /* Disable BPF scheduling for the task */ ops.exit_task(); /* Task is destroyed */}hj sbah}(h]h ]h"]h$]h&]jjjjjAj}uh1jhhhM0hj hhubeh}(h]task-lifecycleah ]h"]task lifecycleah$]h&]uh1hhjhhhhhM+ubeh}(h] the-basicsah ]h"] the basicsah$]h&]uh1hhhhhhhhKxubh)}(hhh](h)}(h Where to Lookh]h Where to Look}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMOubh)}(hhh](h)}(hY``include/linux/sched/ext.h`` defines the core data structures, ops table and constants. h]h)}(hX``include/linux/sched/ext.h`` defines the core data structures, ops table and constants.h](j)}(h``include/linux/sched/ext.h``h]hinclude/linux/sched/ext.h}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh; defines the core data structures, ops table and constants.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMQhj ubah}(h]h ]h"]h$]h&]uh1hhj hhhhhNubh)}(h``kernel/sched/ext.c`` contains sched_ext core implementation and helpers. The functions prefixed with ``scx_bpf_`` can be called from the BPF scheduler. h]h)}(h``kernel/sched/ext.c`` contains sched_ext core implementation and helpers. The functions prefixed with ``scx_bpf_`` can be called from the BPF scheduler.h](j)}(h``kernel/sched/ext.c``h]hkernel/sched/ext.c}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubhQ contains sched_ext core implementation and helpers. The functions prefixed with }(hj hhhNhNubj)}(h ``scx_bpf_``h]hscx_bpf_}(hj& hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh& can be called from the BPF scheduler.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMThj ubah}(h]h ]h"]h$]h&]uh1hhj hhhhhNubh)}(hX``tools/sched_ext/`` hosts example BPF scheduler implementations. * ``scx_simple[.bpf].c``: Minimal global FIFO scheduler example using a custom DSQ. * ``scx_qmap[.bpf].c``: A multi-level FIFO scheduler supporting five levels of priority implemented with ``BPF_MAP_TYPE_QUEUE``. h](h)}(hA``tools/sched_ext/`` hosts example BPF scheduler implementations.h](j)}(h``tools/sched_ext/``h]htools/sched_ext/}(hjL hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjH ubh- hosts example BPF scheduler implementations.}(hjH hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMXhjD ubh)}(hhh](h)}(hR``scx_simple[.bpf].c``: Minimal global FIFO scheduler example using a custom DSQ. h]h)}(hQ``scx_simple[.bpf].c``: Minimal global FIFO scheduler example using a custom DSQ.h](j)}(h``scx_simple[.bpf].c``h]hscx_simple[.bpf].c}(hjo hhhNhNubah}(h]h ]h"]h$]h&]uh1jhjk ubh;: Minimal global FIFO scheduler example using a custom DSQ.}(hjk hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMZhjg ubah}(h]h ]h"]h$]h&]uh1hhjd ubh)}(h``scx_qmap[.bpf].c``: A multi-level FIFO scheduler supporting five levels of priority implemented with ``BPF_MAP_TYPE_QUEUE``. h]h)}(h~``scx_qmap[.bpf].c``: A multi-level FIFO scheduler supporting five levels of priority implemented with ``BPF_MAP_TYPE_QUEUE``.h](j)}(h``scx_qmap[.bpf].c``h]hscx_qmap[.bpf].c}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubhS: A multi-level FIFO scheduler supporting five levels of priority implemented with }(hj hhhNhNubj)}(h``BPF_MAP_TYPE_QUEUE``h]hBPF_MAP_TYPE_QUEUE}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM]hj ubah}(h]h ]h"]h$]h&]uh1hhjd ubeh}(h]h ]h"]h$]h&]jjuh1hhhhMZhjD ubeh}(h]h ]h"]h$]h&]uh1hhj hhhNhNubeh}(h]h ]h"]h$]h&]jjuh1hhhhMQhj hhubeh}(h] where-to-lookah ]h"] where to lookah$]h&]uh1hhhhhhhhMOubh)}(hhh](h)}(hABI Instabilityh]hABI Instability}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMaubh)}(hThe APIs provided by sched_ext to BPF schedulers programs have no stability guarantees. This includes the ops table callbacks and constants defined in ``include/linux/sched/ext.h``, as well as the ``scx_bpf_`` kfuncs defined in ``kernel/sched/ext.c``.h](hThe APIs provided by sched_ext to BPF schedulers programs have no stability guarantees. This includes the ops table callbacks and constants defined in }(hj hhhNhNubj)}(h``include/linux/sched/ext.h``h]hinclude/linux/sched/ext.h}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh, as well as the }(hj hhhNhNubj)}(h ``scx_bpf_``h]hscx_bpf_}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh kfuncs defined in }(hj hhhNhNubj)}(h``kernel/sched/ext.c``h]hkernel/sched/ext.c}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jhj ubh.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMchj hhubh)}(hWhile we will attempt to provide a relatively stable API surface when possible, they are subject to change without warning between kernel versions.h]hWhile we will attempt to provide a relatively stable API surface when possible, they are subject to change without warning between kernel versions.}(hj4 hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhhj hhubeh}(h]abi-instabilityah ]h"]abi instabilityah$]h&]uh1hhhhhhhhMaubeh}(h](extensible-scheduler-classheh ]h"](extensible scheduler class sched-exteh$]h&]uh1hhhhhhhhKexpect_referenced_by_name}jP hsexpect_referenced_by_id}hhsubeh}(h]h ]h"]h$]h&]sourcehuh1hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksentryfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerjz error_encodingutf-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh _destinationN _config_files]7/var/lib/git/docbuild/linux/Documentation/docutils.confafile_insertion_enabled raw_enabledKline_length_limitM'pep_referencesN pep_base_urlhttps://peps.python.org/pep_file_url_templatepep-%04drfc_referencesN rfc_base_url&https://datatracker.ietf.org/doc/html/ tab_widthKtrim_footnote_reference_spacesyntax_highlightlong smart_quotessmartquotes_locales]character_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xform image_loadinglinkembed_stylesheetcloak_email_addressessection_self_linkenvNubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}h]hasnameids}(jP hjO jL jjj j jjj j j j j j jG jD u nametypes}(jP jO jj jj j j jG uh}(hhjL hjjj jjjCj jj j j j jD j u footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startK id_counter collectionsCounter}Rparse_messages]transform_messages]hsystem_message)}(hhh]h)}(hhh]h/Hyperlink target "sched-ext" is not referenced.}hj sbah}(h]h ]h"]h$]h&]uh1hhj ubah}(h]h ]h"]h$]h&]levelKtypeINFOsourcehlineKuh1j uba transformerN include_log] decorationNhhub.