=========
Schedutil
=========

.. note::

   All this assumes a linear relation between frequency and work capacity;
   we know this is flawed, but it is the best workable approximation.


PELT (Per Entity Load Tracking)
===============================

With PELT we track some metrics across the various scheduler entities, from
individual tasks to task-group slices to CPU runqueues. As the basis for this
we use an Exponentially Weighted Moving Average (EWMA), each period (1024us)
is decayed such that y^32 = 0.5. That is, the most recent 32ms contribute
half, while the rest of history contributes the other half.

Specifically:
  ewma_sum(u) := u_0 + u_1*y + u_2*y^2 + ...

  ewma(u) = ewma_sum(u) / ewma_sum(1)

Since this is essentially a progression of an infinite geometric series, the
results are composable, that is ewma(A) + ewma(B) = ewma(A+B). This property
is key, since it gives the ability to recompose the averages when tasks move
around.

Note that blocked tasks still contribute to the aggregates (task-group slices
and CPU runqueues), which reflects their expected contribution when they
resume running.

Using this we track 2 key metrics: 'running' and 'runnable'. 'Running'
reflects the time an entity spends on the CPU, while 'runnable' reflects the
time an entity spends on the runqueue. When there is only a single task these
two metrics are the same, but once there is contention for the CPU 'running'
will decrease to reflect the fraction of time each task spends on the CPU
while 'runnable' will increase to reflect the amount of contention.
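As a rough illustration of the arithmetic, the decay and the composability
property can be sketched in plain C. This is floating point purely for
clarity; the kernel's actual implementation in kernel/sched/pelt.c uses
fixed-point arithmetic with precomputed tables, and none of the names below
are kernel APIs::

  /*
   * Illustrative sketch only: a floating-point EWMA with y^32 = 0.5 over
   * a finite history window. u[0] is the most recent 1024us period.
   */
  #include <assert.h>
  #include <math.h>
  #include <stdio.h>

  #define PERIODS 128			/* 1024us periods of history kept */

  /* ewma_sum(u) := u_0 + u_1*y + u_2*y^2 + ... */
  static double ewma_sum(const double *u, int n)
  {
  	double y = pow(0.5, 1.0 / 32.0);	/* y^32 = 0.5 */
  	double sum = 0.0, w = 1.0;

  	for (int i = 0; i < n; i++) {
  		sum += u[i] * w;
  		w *= y;
  	}
  	return sum;
  }

  /* ewma(u) = ewma_sum(u) / ewma_sum(1) */
  static double ewma(const double *u, int n)
  {
  	double ones[PERIODS];

  	for (int i = 0; i < n; i++)
  		ones[i] = 1.0;
  	return ewma_sum(u, n) / ewma_sum(ones, n);
  }

  int main(void)
  {
  	double a[PERIODS], b[PERIODS], ab[PERIODS];

  	for (int i = 0; i < PERIODS; i++) {
  		a[i] = (i % 2) ? 1.0 : 0.0;	/* A runs every other period */
  		b[i] = 0.25;			/* B runs 25% of each period */
  		ab[i] = a[i] + b[i];		/* both on one runqueue */
  	}

  	/* Composability: ewma(A) + ewma(B) == ewma(A+B) */
  	assert(fabs(ewma(a, PERIODS) + ewma(b, PERIODS) -
  		    ewma(ab, PERIODS)) < 1e-9);

  	printf("ewma(A) = %.3f, ewma(B) = %.3f\n",
  	       ewma(a, PERIODS), ewma(b, PERIODS));
  	return 0;
  }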
For more detail see: kernel/sched/pelt.c


Frequency / CPU Invariance
==========================

Because consuming the CPU for 50% at 1GHz is not the same as consuming the
CPU for 50% at 2GHz, nor is running 50% on a LITTLE CPU the same as running
50% on a big CPU, we allow architectures to scale the time delta with two
ratios, one Dynamic Voltage and Frequency Scaling (DVFS) ratio and one
microarch ratio.

For simple DVFS architectures (where software is in full control) we trivially
compute the ratio as::

              f_cur
  r_dvfs := -------
              f_max
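A minimal sketch of how such ratios might be applied to a time delta, using
the kernel's SCHED_CAPACITY_SCALE fixed-point convention (1024 == full
capacity). The helper names here are illustrative, not the kernel's actual
functions::

  #include <assert.h>
  #include <stdio.h>

  #define SCHED_CAPACITY_SCALE 1024

  /* r_dvfs := f_cur / f_max, expressed against SCHED_CAPACITY_SCALE */
  static unsigned long dvfs_ratio(unsigned long f_cur, unsigned long f_max)
  {
  	return f_cur * SCHED_CAPACITY_SCALE / f_max;
  }

  /* Scale a raw time delta by the DVFS and microarch ratios */
  static unsigned long scale_delta(unsigned long delta,
  				 unsigned long r_dvfs, unsigned long r_uarch)
  {
  	delta = delta * r_dvfs / SCHED_CAPACITY_SCALE;
  	delta = delta * r_uarch / SCHED_CAPACITY_SCALE;
  	return delta;
  }

  int main(void)
  {
  	/*
  	 * 1024us of wall time on a LITTLE CPU (uarch capacity 512/1024)
  	 * running at half its max frequency only accounts for a quarter of
  	 * the work the same time buys on a big CPU at full speed.
  	 */
  	unsigned long delta = 1024;	/* us */
  	unsigned long r = dvfs_ratio(1000000, 2000000);	/* 1GHz of 2GHz */

  	assert(r == 512);
  	assert(scale_delta(delta, r, 512) == 256);
  	assert(scale_delta(delta, SCHED_CAPACITY_SCALE,
  			   SCHED_CAPACITY_SCALE) == 1024);

  	printf("scaled delta = %lu us\n", scale_delta(delta, r, 512));
  	return 0;
  }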
For more dynamic systems where the hardware is in control of DVFS we use
hardware counters (Intel APERF/MPERF, ARMv8.4-AMU) to provide us this ratio.
For Intel specifically, we use::