€•úÚŒsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ,/translations/zh_CN/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ,/translations/zh_TW/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ,/translations/it_IT/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ,/translations/ja_JP/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ,/translations/ko_KR/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒPortuguese (Brazilian)”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ,/translations/pt_BR/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh–sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ,/translations/sp_SP/scheduler/sched-capacity”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒCapacity Aware Scheduling”h]”hŒCapacity Aware Scheduling”…””}”(hh¼h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhh·h²hh³ŒF/var/lib/git/docbuild/linux/Documentation/scheduler/sched-capacity.rst”h´Kubh¶)”}”(hhh]”(h»)”}”(hŒ1. CPU Capacity”h]”hŒ1. CPU Capacity”…””}”(hhÎh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhhËh²hh³hÊh´Kubh¶)”}”(hhh]”(h»)”}”(hŒ1.1 Introduction”h]”hŒ1.1 Introduction”…””}”(hhßh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhhÜh²hh³hÊh´K ubhŒ paragraph”“”)”}”(hŒóConventional, homogeneous SMP platforms are composed of purely identical CPUs. Heterogeneous platforms on the other hand are composed of CPUs with different performance characteristics - on such platforms, not all CPUs can be considered equal.”h]”hŒóConventional, homogeneous SMP platforms are composed of purely identical CPUs. Heterogeneous platforms on the other hand are composed of CPUs with different performance characteristics - on such platforms, not all CPUs can be considered equal.”…””}”(hhïh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K hhÜh²hubhî)”}”(hŒìCPU capacity is a measure of the performance a CPU can reach, normalized against the most performant CPU in the system. Heterogeneous systems are also called asymmetric CPU capacity systems, as they contain CPUs of different capacities.”h]”hŒìCPU capacity is a measure of the performance a CPU can reach, normalized against the most performant CPU in the system. Heterogeneous systems are also called asymmetric CPU capacity systems, as they contain CPUs of different capacities.”…””}”(hhýh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KhhÜh²hubhî)”}”(hŒaDisparity in maximum attainable performance (IOW in maximum CPU capacity) stems from two factors:”h]”hŒaDisparity in maximum attainable performance (IOW in maximum CPU capacity) stems from two factors:”…””}”(hj h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KhhÜh²hubhŒ bullet_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒ:not all CPUs may have the same microarchitecture (µarch).”h]”hî)”}”(hj"h]”hŒ:not all CPUs may have the same microarchitecture (µarch).”…””}”(hj$h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Khj ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjh²hh³hÊh´Nubj)”}”(hŒwith Dynamic Voltage and Frequency Scaling (DVFS), not all CPUs may be physically able to attain the higher Operating Performance Points (OPP). ”h]”hî)”}”(hŒwith Dynamic Voltage and Frequency Scaling (DVFS), not all CPUs may be physically able to attain the higher Operating Performance Points (OPP).”h]”hŒwith Dynamic Voltage and Frequency Scaling (DVFS), not all CPUs may be physically able to attain the higher Operating Performance Points (OPP).”…””}”(hj;h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Khj7ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1jh³hÊh´KhhÜh²hubhî)”}”(hŒêArm big.LITTLE systems are an example of both. The big CPUs are more performance-oriented than the LITTLE ones (more pipeline stages, bigger caches, smarter predictors, etc), and can usually reach higher OPPs than the LITTLE ones can.”h]”hŒêArm big.LITTLE systems are an example of both. The big CPUs are more performance-oriented than the LITTLE ones (more pipeline stages, bigger caches, smarter predictors, etc), and can usually reach higher OPPs than the LITTLE ones can.”…””}”(hjWh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KhhÜh²hubhî)”}”(hŒ±CPU performance is usually expressed in Millions of Instructions Per Second (MIPS), which can also be expressed as a given amount of instructions attainable per Hz, leading to::”h]”hŒ°CPU performance is usually expressed in Millions of Instructions Per Second (MIPS), which can also be expressed as a given amount of instructions attainable per Hz, leading to:”…””}”(hjeh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K hhÜh²hubhŒ literal_block”“”)”}”(hŒ0capacity(cpu) = work_per_hz(cpu) * max_freq(cpu)”h]”hŒ0capacity(cpu) = work_per_hz(cpu) * max_freq(cpu)”…””}”hjusbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1jsh³hÊh´K$hhÜh²hubeh}”(h]”Œ introduction”ah ]”h"]”Œ1.1 introduction”ah$]”h&]”uh1hµhhËh²hh³hÊh´K ubh¶)”}”(hhh]”(h»)”}”(hŒ1.2 Scheduler terms”h]”hŒ1.2 Scheduler terms”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjh²hh³hÊh´K'ubhî)”}”(hXˆTwo different capacity values are used within the scheduler. A CPU's ``original capacity`` is its maximum attainable capacity, i.e. its maximum attainable performance level. This original capacity is returned by the function arch_scale_cpu_capacity(). A CPU's ``capacity`` is its ``original capacity`` to which some loss of available performance (e.g. time spent handling IRQs) is subtracted.”h]”(hŒGTwo different capacity values are used within the scheduler. A CPU’s ”…””}”(hjžh²hh³Nh´NubhŒliteral”“”)”}”(hŒ``original capacity``”h]”hŒoriginal capacity”…””}”(hj¨h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjžubhŒ¬ is its maximum attainable capacity, i.e. its maximum attainable performance level. This original capacity is returned by the function arch_scale_cpu_capacity(). A CPU’s ”…””}”(hjžh²hh³Nh´Nubj§)”}”(hŒ ``capacity``”h]”hŒcapacity”…””}”(hjºh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjžubhŒ is its ”…””}”(hjžh²hh³Nh´Nubj§)”}”(hŒ``original capacity``”h]”hŒoriginal capacity”…””}”(hjÌh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjžubhŒ[ to which some loss of available performance (e.g. time spent handling IRQs) is subtracted.”…””}”(hjžh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K)hjh²hubhî)”}”(hŒúNote that a CPU's ``capacity`` is solely intended to be used by the CFS class, while ``original capacity`` is class-agnostic. The rest of this document will use the term ``capacity`` interchangeably with ``original capacity`` for the sake of brevity.”h]”(hŒNote that a CPU’s ”…””}”(hjäh²hh³Nh´Nubj§)”}”(hŒ ``capacity``”h]”hŒcapacity”…””}”(hjìh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjäubhŒ7 is solely intended to be used by the CFS class, while ”…””}”(hjäh²hh³Nh´Nubj§)”}”(hŒ``original capacity``”h]”hŒoriginal capacity”…””}”(hjþh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjäubhŒ@ is class-agnostic. The rest of this document will use the term ”…””}”(hjäh²hh³Nh´Nubj§)”}”(hŒ ``capacity``”h]”hŒcapacity”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjäubhŒ interchangeably with ”…””}”(hjäh²hh³Nh´Nubj§)”}”(hŒ``original capacity``”h]”hŒoriginal capacity”…””}”(hj"h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjäubhŒ for the sake of brevity.”…””}”(hjäh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K0hjh²hubeh}”(h]”Œscheduler-terms”ah ]”h"]”Œ1.2 scheduler terms”ah$]”h&]”uh1hµhhËh²hh³hÊh´K'ubh¶)”}”(hhh]”(h»)”}”(hŒ1.3 Platform examples”h]”hŒ1.3 Platform examples”…””}”(hjEh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjBh²hh³hÊh´K6ubh¶)”}”(hhh]”(h»)”}”(hŒ1.3.1 Identical OPPs”h]”hŒ1.3.1 Identical OPPs”…””}”(hjVh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjSh²hh³hÊh´K9ubhî)”}”(hŒGConsider an hypothetical dual-core asymmetric CPU capacity system where”h]”hŒGConsider an hypothetical dual-core asymmetric CPU capacity system where”…””}”(hjdh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K;hjSh²hubj)”}”(hhh]”(j)”}”(hŒwork_per_hz(CPU0) = W”h]”hî)”}”(hjwh]”hŒwork_per_hz(CPU0) = W”…””}”(hjyh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K=hjuubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjrh²hh³hÊh´Nubj)”}”(hŒwork_per_hz(CPU1) = W/2”h]”hî)”}”(hjŽh]”hŒwork_per_hz(CPU1) = W/2”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K>hjŒubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjrh²hh³hÊh´Nubj)”}”(hŒ1all CPUs are running at the same fixed frequency ”h]”hî)”}”(hŒ0all CPUs are running at the same fixed frequency”h]”hŒ0all CPUs are running at the same fixed frequency”…””}”(hj§h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K?hj£ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjrh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´K=hjSh²hubhî)”}”(hŒ$By the above definition of capacity:”h]”hŒ$By the above definition of capacity:”…””}”(hjÁh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KAhjSh²hubj)”}”(hhh]”(j)”}”(hŒcapacity(CPU0) = C”h]”hî)”}”(hjÔh]”hŒcapacity(CPU0) = C”…””}”(hjÖh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KChjÒubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjÏh²hh³hÊh´Nubj)”}”(hŒcapacity(CPU1) = C/2 ”h]”hî)”}”(hŒcapacity(CPU1) = C/2”h]”hŒcapacity(CPU1) = C/2”…””}”(hjíh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KDhjéubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjÏh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´KChjSh²hubhî)”}”(hŒ[To draw the parallel with Arm big.LITTLE, CPU0 would be a big while CPU1 would be a LITTLE.”h]”hŒ[To draw the parallel with Arm big.LITTLE, CPU0 would be a big while CPU1 would be a LITTLE.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KFhjSh²hubhî)”}”(hŒhWith a workload that periodically does a fixed amount of work, you will get an execution trace like so::”h]”hŒgWith a workload that periodically does a fixed amount of work, you will get an execution trace like so:”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KIhjSh²hubjt)”}”(hX“CPU0 work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time CPU1 work ^ | _________ _________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”h]”hX“CPU0 work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time CPU1 work ^ | _________ _________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”…””}”hj#sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´KLhjSh²hubhî)”}”(hŒÄCPU0 has the highest capacity in the system (C), and completes a fixed amount of work W in T units of time. On the other hand, CPU1 has half the capacity of CPU0, and thus only completes W/2 in T.”h]”hŒÄCPU0 has the highest capacity in the system (C), and completes a fixed amount of work W in T units of time. On the other hand, CPU1 has half the capacity of CPU0, and thus only completes W/2 in T.”…””}”(hj1h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KVhjSh²hubeh}”(h]”Œidentical-opps”ah ]”h"]”Œ1.3.1 identical opps”ah$]”h&]”uh1hµhjBh²hh³hÊh´K9ubh¶)”}”(hhh]”(h»)”}”(hŒ1.3.2 Different max OPPs”h]”hŒ1.3.2 Different max OPPs”…””}”(hjJh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjGh²hh³hÊh´K[ubhî)”}”(hŒŒUsually, CPUs of different capacity values also have different maximum OPPs. Consider the same CPUs as above (i.e. same work_per_hz()) with:”h]”hŒŒUsually, CPUs of different capacity values also have different maximum OPPs. Consider the same CPUs as above (i.e. same work_per_hz()) with:”…””}”(hjXh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K]hjGh²hubj)”}”(hhh]”(j)”}”(hŒmax_freq(CPU0) = F”h]”hî)”}”(hjkh]”hŒmax_freq(CPU0) = F”…””}”(hjmh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K`hjiubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjfh²hh³hÊh´Nubj)”}”(hŒmax_freq(CPU1) = 2/3 * F ”h]”hî)”}”(hŒmax_freq(CPU1) = 2/3 * F”h]”hŒmax_freq(CPU1) = 2/3 * F”…””}”(hj„h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kahj€ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjfh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´K`hjGh²hubhî)”}”(hŒ This yields:”h]”hŒ This yields:”…””}”(hjžh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KchjGh²hubj)”}”(hhh]”(j)”}”(hŒcapacity(CPU0) = C”h]”hî)”}”(hj±h]”hŒcapacity(CPU0) = C”…””}”(hj³h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kehj¯ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj¬h²hh³hÊh´Nubj)”}”(hŒcapacity(CPU1) = C/3 ”h]”hî)”}”(hŒcapacity(CPU1) = C/3”h]”hŒcapacity(CPU1) = C/3”…””}”(hjÊh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KfhjÆubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj¬h²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´KehjGh²hubhî)”}”(hŒoExecuting the same workload as described in 1.3.1, which each CPU running at its maximum frequency results in::”h]”hŒnExecuting the same workload as described in 1.3.1, which each CPU running at its maximum frequency results in:”…””}”(hjäh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KhhjGh²hubjt)”}”(hX¿CPU0 work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time workload on CPU1 CPU1 work ^ | ______________ ______________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”h]”hX¿CPU0 work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time workload on CPU1 CPU1 work ^ | ______________ ______________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”…””}”hjòsbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´KkhjGh²hubeh}”(h]”Œdifferent-max-opps”ah ]”h"]”Œ1.3.2 different max opps”ah$]”h&]”uh1hµhjBh²hh³hÊh´K[ubeh}”(h]”Œplatform-examples”ah ]”h"]”Œ1.3 platform examples”ah$]”h&]”uh1hµhhËh²hh³hÊh´K6ubh¶)”}”(hhh]”(h»)”}”(hŒ1.4 Representation caveat”h]”hŒ1.4 Representation caveat”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjh²hh³hÊh´Kwubhî)”}”(hXjIt should be noted that having a *single* value to represent differences in CPU performance is somewhat of a contentious point. The relative performance difference between two different µarchs could be X% on integer operations, Y% on floating point operations, Z% on branches, and so on. Still, results using this simple approach have been satisfactory for now.”h]”(hŒ!It should be noted that having a ”…””}”(hj!h²hh³Nh´NubhŒemphasis”“”)”}”(hŒ*single*”h]”hŒsingle”…””}”(hj+h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j)hj!ubhXA value to represent differences in CPU performance is somewhat of a contentious point. The relative performance difference between two different µarchs could be X% on integer operations, Y% on floating point operations, Z% on branches, and so on. Still, results using this simple approach have been satisfactory for now.”…””}”(hj!h²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kyhjh²hubeh}”(h]”Œrepresentation-caveat”ah ]”h"]”Œ1.4 representation caveat”ah$]”h&]”uh1hµhhËh²hh³hÊh´Kwubeh}”(h]”Œ cpu-capacity”ah ]”h"]”Œ1. cpu capacity”ah$]”h&]”uh1hµhh·h²hh³hÊh´Kubh¶)”}”(hhh]”(h»)”}”(hŒ2. Task utilization”h]”hŒ2. Task utilization”…””}”(hjVh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjSh²hh³hÊh´K€ubh¶)”}”(hhh]”(h»)”}”(hŒ2.1 Introduction”h]”hŒ2.1 Introduction”…””}”(hjgh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjdh²hh³hÊh´Kƒubhî)”}”(hXCapacity aware scheduling requires an expression of a task's requirements with regards to CPU capacity. Each scheduler class can express this differently, and while task utilization is specific to CFS, it is convenient to describe it here in order to introduce more generic concepts.”h]”hXCapacity aware scheduling requires an expression of a task’s requirements with regards to CPU capacity. Each scheduler class can express this differently, and while task utilization is specific to CFS, it is convenient to describe it here in order to introduce more generic concepts.”…””}”(hjuh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K…hjdh²hubhî)”}”(hŒ˜Task utilization is a percentage meant to represent the throughput requirements of a task. A simple approximation of it is the task's duty cycle, i.e.::”h]”hŒ™Task utilization is a percentage meant to represent the throughput requirements of a task. A simple approximation of it is the task’s duty cycle, i.e.:”…””}”(hjƒh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KŠhjdh²hubjt)”}”(hŒtask_util(p) = duty_cycle(p)”h]”hŒtask_util(p) = duty_cycle(p)”…””}”hj‘sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´Khjdh²hubhî)”}”(hXFOn an SMP system with fixed frequencies, 100% utilization suggests the task is a busy loop. Conversely, 10% utilization hints it is a small periodic task that spends more time sleeping than executing. Variable CPU frequencies and asymmetric CPU capacities complexify this somewhat; the following sections will expand on these.”h]”hXFOn an SMP system with fixed frequencies, 100% utilization suggests the task is a busy loop. Conversely, 10% utilization hints it is a small periodic task that spends more time sleeping than executing. Variable CPU frequencies and asymmetric CPU capacities complexify this somewhat; the following sections will expand on these.”…””}”(hjŸh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Khjdh²hubeh}”(h]”Œid1”ah ]”h"]”Œ2.1 introduction”ah$]”h&]”uh1hµhjSh²hh³hÊh´Kƒubh¶)”}”(hhh]”(h»)”}”(hŒ2.2 Frequency invariance”h]”hŒ2.2 Frequency invariance”…””}”(hj¸h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjµh²hh³hÊh´K–ubhî)”}”(hŒÊOne issue that needs to be taken into account is that a workload's duty cycle is directly impacted by the current OPP the CPU is running at. Consider running a periodic workload at a given frequency F::”h]”hŒËOne issue that needs to be taken into account is that a workload’s duty cycle is directly impacted by the current OPP the CPU is running at. Consider running a periodic workload at a given frequency F:”…””}”(hjÆh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K˜hjµh²hubjt)”}”(hŒÇCPU work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”h]”hŒÇCPU work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”…””}”hjÔsbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´Kœhjµh²hubhî)”}”(hŒ!This yields duty_cycle(p) == 25%.”h]”hŒ!This yields duty_cycle(p) == 25%.”…””}”(hjâh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K¡hjµh²hubhî)”}”(hŒ time”h]”hŒÂCPU work ^ | _________ _________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”…””}”hjsbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´K¥hjµh²hubhî)”}”(hŒThis yields duty_cycle(p) == 50%, despite the task having the exact same behaviour (i.e. executing the same amount of work) in both executions.”h]”hŒThis yields duty_cycle(p) == 50%, despite the task having the exact same behaviour (i.e. executing the same amount of work) in both executions.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kªhjµh²hubhî)”}”(hŒYThe task utilization signal can be made frequency invariant using the following formula::”h]”hŒXThe task utilization signal can be made frequency invariant using the following formula:”…””}”(hj,h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K­hjµh²hubjt)”}”(hŒRtask_util_freq_inv(p) = duty_cycle(p) * (curr_frequency(cpu) / max_frequency(cpu))”h]”hŒRtask_util_freq_inv(p) = duty_cycle(p) * (curr_frequency(cpu) / max_frequency(cpu))”…””}”hj:sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´K°hjµh²hubhî)”}”(hŒeApplying this formula to the two examples above yields a frequency invariant task utilization of 25%.”h]”hŒeApplying this formula to the two examples above yields a frequency invariant task utilization of 25%.”…””}”(hjHh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K²hjµh²hubeh}”(h]”Œfrequency-invariance”ah ]”h"]”Œ2.2 frequency invariance”ah$]”h&]”uh1hµhjSh²hh³hÊh´K–ubh¶)”}”(hhh]”(h»)”}”(hŒ2.3 CPU invariance”h]”hŒ2.3 CPU invariance”…””}”(hjah²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj^h²hh³hÊh´K¶ubhî)”}”(hŒ¢CPU capacity has a similar effect on task utilization in that running an identical workload on CPUs of different capacity values will yield different duty cycles.”h]”hŒ¢CPU capacity has a similar effect on task utilization in that running an identical workload on CPUs of different capacity values will yield different duty cycles.”…””}”(hjoh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K¸hj^h²hubhî)”}”(hŒ/Consider the system described in 1.3.2., i.e.::”h]”hŒ.Consider the system described in 1.3.2., i.e.:”…””}”(hj}h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´K¼hj^h²hubjt)”}”(hŒ+- capacity(CPU0) = C - capacity(CPU1) = C/3”h]”hŒ+- capacity(CPU0) = C - capacity(CPU1) = C/3”…””}”hj‹sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´K¾hj^h²hubhî)”}”(hŒ\Executing a given periodic workload on each CPU at their maximum frequency would result in::”h]”hŒ[Executing a given periodic workload on each CPU at their maximum frequency would result in:”…””}”(hj™h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KÁhj^h²hubjt)”}”(hX“CPU0 work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time CPU1 work ^ | ______________ ______________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”h]”hX“CPU0 work ^ | ____ ____ ____ | | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time CPU1 work ^ | ______________ ______________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time”…””}”hj§sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´KÄhj^h²hubhî)”}”(hŒIOW,”h]”hŒIOW,”…””}”(hjµh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KÎhj^h²hubj)”}”(hhh]”(j)”}”(hŒ?duty_cycle(p) == 25% if p runs on CPU0 at its maximum frequency”h]”hî)”}”(hjÈh]”hŒ?duty_cycle(p) == 25% if p runs on CPU0 at its maximum frequency”…””}”(hjÊh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KÐhjÆubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjÃh²hh³hÊh´Nubj)”}”(hŒ@duty_cycle(p) == 75% if p runs on CPU1 at its maximum frequency ”h]”hî)”}”(hŒ?duty_cycle(p) == 75% if p runs on CPU1 at its maximum frequency”h]”hŒ?duty_cycle(p) == 75% if p runs on CPU1 at its maximum frequency”…””}”(hjáh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KÑhjÝubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjÃh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´KÐhj^h²hubhî)”}”(hŒSThe task utilization signal can be made CPU invariant using the following formula::”h]”hŒRThe task utilization signal can be made CPU invariant using the following formula:”…””}”(hjûh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KÓhj^h²hubjt)”}”(hŒEtask_util_cpu_inv(p) = duty_cycle(p) * (capacity(cpu) / max_capacity)”h]”hŒEtask_util_cpu_inv(p) = duty_cycle(p) * (capacity(cpu) / max_capacity)”…””}”hj sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´KÖhj^h²hubhî)”}”(hŒªwith ``max_capacity`` being the highest CPU capacity value in the system. Applying this formula to the above example above yields a CPU invariant task utilization of 25%.”h]”(hŒwith ”…””}”(hjh²hh³Nh´Nubj§)”}”(hŒ``max_capacity``”h]”hŒ max_capacity”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjubhŒ• being the highest CPU capacity value in the system. Applying this formula to the above example above yields a CPU invariant task utilization of 25%.”…””}”(hjh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´KØhj^h²hubeh}”(h]”Œcpu-invariance”ah ]”h"]”Œ2.3 cpu invariance”ah$]”h&]”uh1hµhjSh²hh³hÊh´K¶ubh¶)”}”(hhh]”(h»)”}”(hŒ2.4 Invariant task utilization”h]”hŒ2.4 Invariant task utilization”…””}”(hjBh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj?h²hh³hÊh´KÝubhî)”}”(hŒæBoth frequency and CPU invariance need to be applied to task utilization in order to obtain a truly invariant signal. The pseudo-formula for a task utilization that is both CPU and frequency invariant is thus, for a given task p::”h]”hŒåBoth frequency and CPU invariance need to be applied to task utilization in order to obtain a truly invariant signal. The pseudo-formula for a task utilization that is both CPU and frequency invariant is thus, for a given task p:”…””}”(hjPh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kßhj?h²hubjt)”}”(hŒÓ curr_frequency(cpu) capacity(cpu) task_util_inv(p) = duty_cycle(p) * ------------------- * ------------- max_frequency(cpu) max_capacity”h]”hŒÓ curr_frequency(cpu) capacity(cpu) task_util_inv(p) = duty_cycle(p) * ------------------- * ------------- max_frequency(cpu) max_capacity”…””}”hj^sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´Kähj?h²hubhî)”}”(hŒ¯In other words, invariant task utilization describes the behaviour of a task as if it were running on the highest-capacity CPU in the system, running at its maximum frequency.”h]”hŒ¯In other words, invariant task utilization describes the behaviour of a task as if it were running on the highest-capacity CPU in the system, running at its maximum frequency.”…””}”(hjlh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kèhj?h²hubhî)”}”(hŒXAny mention of task utilization in the following sections will imply its invariant form.”h]”hŒXAny mention of task utilization in the following sections will imply its invariant form.”…””}”(hjzh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kìhj?h²hubeh}”(h]”Œinvariant-task-utilization”ah ]”h"]”Œ2.4 invariant task utilization”ah$]”h&]”uh1hµhjSh²hh³hÊh´KÝubh¶)”}”(hhh]”(h»)”}”(hŒ2.5 Utilization estimation”h]”hŒ2.5 Utilization estimation”…””}”(hj“h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjh²hh³hÊh´Kðubhî)”}”(hXKWithout a crystal ball, task behaviour (and thus task utilization) cannot accurately be predicted the moment a task first becomes runnable. The CFS class maintains a handful of CPU and task signals based on the Per-Entity Load Tracking (PELT) mechanism, one of those yielding an *average* utilization (as opposed to instantaneous).”h]”(hXWithout a crystal ball, task behaviour (and thus task utilization) cannot accurately be predicted the moment a task first becomes runnable. The CFS class maintains a handful of CPU and task signals based on the Per-Entity Load Tracking (PELT) mechanism, one of those yielding an ”…””}”(hj¡h²hh³Nh´Nubj*)”}”(hŒ *average*”h]”hŒaverage”…””}”(hj©h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j)hj¡ubhŒ+ utilization (as opposed to instantaneous).”…””}”(hj¡h²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Kòhjh²hubhî)”}”(hŒÑThis means that while the capacity aware scheduling criteria will be written considering a "true" task utilization (using a crystal ball), the implementation will only ever be able to use an estimator thereof.”h]”hŒÕThis means that while the capacity aware scheduling criteria will be written considering a “true†task utilization (using a crystal ball), the implementation will only ever be able to use an estimator thereof.”…””}”(hjÁh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Køhjh²hubeh}”(h]”Œutilization-estimation”ah ]”h"]”Œ2.5 utilization estimation”ah$]”h&]”uh1hµhjSh²hh³hÊh´Kðubeh}”(h]”Œtask-utilization”ah ]”h"]”Œ2. task utilization”ah$]”h&]”uh1hµhh·h²hh³hÊh´K€ubh¶)”}”(hhh]”(h»)”}”(hŒ)3. Capacity aware scheduling requirements”h]”hŒ)3. Capacity aware scheduling requirements”…””}”(hjâh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjßh²hh³hÊh´Kýubh¶)”}”(hhh]”(h»)”}”(hŒ3.1 CPU capacity”h]”hŒ3.1 CPU capacity”…””}”(hjóh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjðh²hh³hÊh´Mubhî)”}”(hŒ°Linux cannot currently figure out CPU capacity on its own, this information thus needs to be handed to it. Architectures must define arch_scale_cpu_capacity() for that purpose.”h]”hŒ°Linux cannot currently figure out CPU capacity on its own, this information thus needs to be handed to it. Architectures must define arch_scale_cpu_capacity() for that purpose.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mhjðh²hubhî)”}”(hŒÞThe arm, arm64, and RISC-V architectures directly map this to the arch_topology driver CPU scaling data, which is derived from the capacity-dmips-mhz CPU binding; see Documentation/devicetree/bindings/cpu/cpu-capacity.txt.”h]”hŒÞThe arm, arm64, and RISC-V architectures directly map this to the arch_topology driver CPU scaling data, which is derived from the capacity-dmips-mhz CPU binding; see Documentation/devicetree/bindings/cpu/cpu-capacity.txt.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mhjðh²hubeh}”(h]”Œid2”ah ]”h"]”Œ3.1 cpu capacity”ah$]”h&]”uh1hµhjßh²hh³hÊh´Mubh¶)”}”(hhh]”(h»)”}”(hŒ3.2 Frequency invariance”h]”hŒ3.2 Frequency invariance”…””}”(hj(h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj%h²hh³hÊh´M ubhî)”}”(hŒ¦As stated in 2.2, capacity-aware scheduling requires a frequency-invariant task utilization. Architectures must define arch_scale_freq_capacity(cpu) for that purpose.”h]”hŒ¦As stated in 2.2, capacity-aware scheduling requires a frequency-invariant task utilization. Architectures must define arch_scale_freq_capacity(cpu) for that purpose.”…””}”(hj6h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M hj%h²hubhî)”}”(hXImplementing this function requires figuring out at which frequency each CPU have been running at. One way to implement this is to leverage hardware counters whose increment rate scale with a CPU's current frequency (APERF/MPERF on x86, AMU on arm64). Another is to directly hook into cpufreq frequency transitions, when the kernel is aware of the switched-to frequency (also employed by arm/arm64).”h]”hX‘Implementing this function requires figuring out at which frequency each CPU have been running at. One way to implement this is to leverage hardware counters whose increment rate scale with a CPU’s current frequency (APERF/MPERF on x86, AMU on arm64). Another is to directly hook into cpufreq frequency transitions, when the kernel is aware of the switched-to frequency (also employed by arm/arm64).”…””}”(hjDh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mhj%h²hubeh}”(h]”Œid3”ah ]”h"]”Œ3.2 frequency invariance”ah$]”h&]”uh1hµhjßh²hh³hÊh´M ubeh}”(h]”Œ&capacity-aware-scheduling-requirements”ah ]”h"]”Œ)3. capacity aware scheduling requirements”ah$]”h&]”uh1hµhh·h²hh³hÊh´Kýubh¶)”}”(hhh]”(h»)”}”(hŒ4. Scheduler topology”h]”hŒ4. Scheduler topology”…””}”(hjeh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjbh²hh³hÊh´Mubhî)”}”(hŒ›During the construction of the sched domains, the scheduler will figure out whether the system exhibits asymmetric CPU capacities. Should that be the case:”h]”hŒ›During the construction of the sched domains, the scheduler will figure out whether the system exhibits asymmetric CPU capacities. Should that be the case:”…””}”(hjsh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mhjbh²hubj)”}”(hhh]”(j)”}”(hŒ6The sched_asym_cpucapacity static key will be enabled.”h]”hî)”}”(hj†h]”hŒ6The sched_asym_cpucapacity static key will be enabled.”…””}”(hjˆh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mhj„ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjh²hh³hÊh´Nubj)”}”(hŒyThe SD_ASYM_CPUCAPACITY_FULL flag will be set at the lowest sched_domain level that spans all unique CPU capacity values.”h]”hî)”}”(hŒyThe SD_ASYM_CPUCAPACITY_FULL flag will be set at the lowest sched_domain level that spans all unique CPU capacity values.”h]”hŒyThe SD_ASYM_CPUCAPACITY_FULL flag will be set at the lowest sched_domain level that spans all unique CPU capacity values.”…””}”(hjŸh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M hj›ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjh²hh³hÊh´Nubj)”}”(hŒkThe SD_ASYM_CPUCAPACITY flag will be set for any sched_domain that spans CPUs with any range of asymmetry. ”h]”hî)”}”(hŒjThe SD_ASYM_CPUCAPACITY flag will be set for any sched_domain that spans CPUs with any range of asymmetry.”h]”hŒjThe SD_ASYM_CPUCAPACITY flag will be set for any sched_domain that spans CPUs with any range of asymmetry.”…””}”(hj·h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M"hj³ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´Mhjbh²hubhî)”}”(hŒÖThe sched_asym_cpucapacity static key is intended to guard sections of code that cater to asymmetric CPU capacity systems. Do note however that said key is *system-wide*. Imagine the following setup using cpusets::”h]”(hŒœThe sched_asym_cpucapacity static key is intended to guard sections of code that cater to asymmetric CPU capacity systems. Do note however that said key is ”…””}”(hjÑh²hh³Nh´Nubj*)”}”(hŒ *system-wide*”h]”hŒ system-wide”…””}”(hjÙh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j)hjÑubhŒ,. Imagine the following setup using cpusets:”…””}”(hjÑh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M%hjbh²hubjt)”}”(hŒ³capacity C/2 C ________ ________ / \ / \ CPUs 0 1 2 3 4 5 6 7 \__/ \______________/ cpusets cs0 cs1”h]”hŒ³capacity C/2 C ________ ________ / \ / \ CPUs 0 1 2 3 4 5 6 7 \__/ \______________/ cpusets cs0 cs1”…””}”hjñsbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´M)hjbh²hubhî)”}”(hŒWhich could be created via:”h]”hŒWhich could be created via:”…””}”(hjÿh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M0hjbh²hubjt)”}”(hX:mkdir /sys/fs/cgroup/cpuset/cs0 echo 0-1 > /sys/fs/cgroup/cpuset/cs0/cpuset.cpus echo 0 > /sys/fs/cgroup/cpuset/cs0/cpuset.mems mkdir /sys/fs/cgroup/cpuset/cs1 echo 2-7 > /sys/fs/cgroup/cpuset/cs1/cpuset.cpus echo 0 > /sys/fs/cgroup/cpuset/cs1/cpuset.mems echo 0 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance”h]”hX:mkdir /sys/fs/cgroup/cpuset/cs0 echo 0-1 > /sys/fs/cgroup/cpuset/cs0/cpuset.cpus echo 0 > /sys/fs/cgroup/cpuset/cs0/cpuset.mems mkdir /sys/fs/cgroup/cpuset/cs1 echo 2-7 > /sys/fs/cgroup/cpuset/cs1/cpuset.cpus echo 0 > /sys/fs/cgroup/cpuset/cs1/cpuset.mems echo 0 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance”…””}”hj sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„Œforce”‰Œlanguage”Œsh”Œhighlight_args”}”uh1jsh³hÊh´M2hjbh²hubhî)”}”(hX'Since there *is* CPU capacity asymmetry in the system, the sched_asym_cpucapacity static key will be enabled. However, the sched_domain hierarchy of CPUs 0-1 spans a single capacity value: SD_ASYM_CPUCAPACITY isn't set in that hierarchy, it describes an SMP island and should be treated as such.”h]”(hŒ Since there ”…””}”(hj h²hh³Nh´Nubj*)”}”(hŒ*is*”h]”hŒis”…””}”(hj(h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j)hj ubhX CPU capacity asymmetry in the system, the sched_asym_cpucapacity static key will be enabled. However, the sched_domain hierarchy of CPUs 0-1 spans a single capacity value: SD_ASYM_CPUCAPACITY isn’t set in that hierarchy, it describes an SMP island and should be treated as such.”…””}”(hj h²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M>hjbh²hubhî)”}”(hŒjTherefore, the 'canonical' pattern for protecting codepaths that cater to asymmetric CPU capacities is to:”h]”hŒnTherefore, the ‘canonical’ pattern for protecting codepaths that cater to asymmetric CPU capacities is to:”…””}”(hj@h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´MChjbh²hubj)”}”(hhh]”(j)”}”(hŒ+Check the sched_asym_cpucapacity static key”h]”hî)”}”(hjSh]”hŒ+Check the sched_asym_cpucapacity static key”…””}”(hjUh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´MFhjQubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjNh²hh³hÊh´Nubj)”}”(hŒ±If it is enabled, then also check for the presence of SD_ASYM_CPUCAPACITY in the sched_domain hierarchy (if relevant, i.e. the codepath targets a specific CPU or group thereof) ”h]”hî)”}”(hŒ°If it is enabled, then also check for the presence of SD_ASYM_CPUCAPACITY in the sched_domain hierarchy (if relevant, i.e. the codepath targets a specific CPU or group thereof)”h]”hŒ°If it is enabled, then also check for the presence of SD_ASYM_CPUCAPACITY in the sched_domain hierarchy (if relevant, i.e. the codepath targets a specific CPU or group thereof)”…””}”(hjlh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´MGhjhubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjNh²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´MFhjbh²hubeh}”(h]”Œscheduler-topology”ah ]”h"]”Œ4. scheduler topology”ah$]”h&]”uh1hµhh·h²hh³hÊh´Mubh¶)”}”(hhh]”(h»)”}”(hŒ+5. Capacity aware scheduling implementation”h]”hŒ+5. Capacity aware scheduling implementation”…””}”(hj‘h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjŽh²hh³hÊh´MLubh¶)”}”(hhh]”(h»)”}”(hŒ5.1 CFS”h]”hŒ5.1 CFS”…””}”(hj¢h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjŸh²hh³hÊh´MOubh¶)”}”(hhh]”(h»)”}”(hŒ5.1.1 Capacity fitness”h]”hŒ5.1.1 Capacity fitness”…””}”(hj³h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj°h²hh³hÊh´MRubhî)”}”(hŒ2The main capacity scheduling criterion of CFS is::”h]”hŒ1The main capacity scheduling criterion of CFS is:”…””}”(hjÁh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´MThj°h²hubjt)”}”(hŒ$task_util(p) < capacity(task_cpu(p))”h]”hŒ$task_util(p) < capacity(task_cpu(p))”…””}”hjÏsbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´MVhj°h²hubhî)”}”(hŒÖThis is commonly called the capacity fitness criterion, i.e. CFS must ensure a task "fits" on its CPU. If it is violated, the task will need to achieve more work than what its CPU can provide: it will be CPU-bound.”h]”hŒÚThis is commonly called the capacity fitness criterion, i.e. CFS must ensure a task “fits†on its CPU. If it is violated, the task will need to achieve more work than what its CPU can provide: it will be CPU-bound.”…””}”(hjÝh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´MXhj°h²hubhî)”}”(hXFurthermore, uclamp lets userspace specify a minimum and a maximum utilization value for a task, either via sched_setattr() or via the cgroup interface (see Documentation/admin-guide/cgroup-v2.rst). As its name imply, this can be used to clamp task_util() in the previous criterion.”h]”hXFurthermore, uclamp lets userspace specify a minimum and a maximum utilization value for a task, either via sched_setattr() or via the cgroup interface (see Documentation/admin-guide/cgroup-v2.rst). As its name imply, this can be used to clamp task_util() in the previous criterion.”…””}”(hjëh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M\hj°h²hubeh}”(h]”Œcapacity-fitness”ah ]”h"]”Œ5.1.1 capacity fitness”ah$]”h&]”uh1hµhjŸh²hh³hÊh´MRubh¶)”}”(hhh]”(h»)”}”(hŒ5.1.2 Wakeup CPU selection”h]”hŒ5.1.2 Wakeup CPU selection”…””}”(hj h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj h²hh³hÊh´Mbubhî)”}”(hX)CFS task wakeup CPU selection follows the capacity fitness criterion described above. On top of that, uclamp is used to clamp the task utilization values, which lets userspace have more leverage over the CPU selection of CFS tasks. IOW, CFS wakeup CPU selection searches for a CPU that satisfies::”h]”hX(CFS task wakeup CPU selection follows the capacity fitness criterion described above. On top of that, uclamp is used to clamp the task utilization values, which lets userspace have more leverage over the CPU selection of CFS tasks. IOW, CFS wakeup CPU selection searches for a CPU that satisfies:”…””}”(hj h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mdhj h²hubjt)”}”(hŒKclamp(task_util(p), task_uclamp_min(p), task_uclamp_max(p)) < capacity(cpu)”h]”hŒKclamp(task_util(p), task_uclamp_min(p), task_uclamp_max(p)) < capacity(cpu)”…””}”hj sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´Mihj h²hubhî)”}”(hXBy using uclamp, userspace can e.g. allow a busy loop (100% utilization) to run on any CPU by giving it a low uclamp.max value. Conversely, it can force a small periodic task (e.g. 10% utilization) to run on the highest-performance CPUs by giving it a high uclamp.min value.”h]”hXBy using uclamp, userspace can e.g. allow a busy loop (100% utilization) to run on any CPU by giving it a low uclamp.max value. Conversely, it can force a small periodic task (e.g. 10% utilization) to run on the highest-performance CPUs by giving it a high uclamp.min value.”…””}”(hj. h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mkhj h²hubhŒnote”“”)”}”(hŒWakeup CPU selection in CFS can be eclipsed by Energy Aware Scheduling (EAS), which is described in Documentation/scheduler/sched-energy.rst.”h]”hî)”}”(hŒWakeup CPU selection in CFS can be eclipsed by Energy Aware Scheduling (EAS), which is described in Documentation/scheduler/sched-energy.rst.”h]”hŒWakeup CPU selection in CFS can be eclipsed by Energy Aware Scheduling (EAS), which is described in Documentation/scheduler/sched-energy.rst.”…””}”(hjB h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mrhj> ubah}”(h]”h ]”h"]”h$]”h&]”uh1j< hj h²hh³hÊh´Nubeh}”(h]”Œwakeup-cpu-selection”ah ]”h"]”Œ5.1.2 wakeup cpu selection”ah$]”h&]”uh1hµhjŸh²hh³hÊh´Mbubh¶)”}”(hhh]”(h»)”}”(hŒ5.1.3 Load balancing”h]”hŒ5.1.3 Load balancing”…””}”(hja h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj^ h²hh³hÊh´Mvubhî)”}”(hŒŒA pathological case in the wakeup CPU selection occurs when a task rarely sleeps, if at all - it thus rarely wakes up, if at all. Consider::”h]”hŒ‹A pathological case in the wakeup CPU selection occurs when a task rarely sleeps, if at all - it thus rarely wakes up, if at all. Consider:”…””}”(hjo h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mxhj^ h²hubjt)”}”(hX3w == wakeup event capacity(CPU0) = C capacity(CPU1) = C / 3 workload on CPU0 CPU work ^ | _________ _________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time w w w workload on CPU1 CPU work ^ | ____________________________________________ | | +----+----+----+----+----+----+----+----+----+----+-> w”h]”hX3w == wakeup event capacity(CPU0) = C capacity(CPU1) = C / 3 workload on CPU0 CPU work ^ | _________ _________ ____ | | | | | | +----+----+----+----+----+----+----+----+----+----+-> time w w w workload on CPU1 CPU work ^ | ____________________________________________ | | +----+----+----+----+----+----+----+----+----+----+-> w”…””}”hj} sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´M{hj^ h²hubhî)”}”(hŒ9This workload should run on CPU0, but if the task either:”h]”hŒ9This workload should run on CPU0, but if the task either:”…””}”(hj‹ h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´MŽhj^ h²hubj)”}”(hhh]”(j)”}”(hŒSwas improperly scheduled from the start (inaccurate initial utilization estimation)”h]”hî)”}”(hŒSwas improperly scheduled from the start (inaccurate initial utilization estimation)”h]”hŒSwas improperly scheduled from the start (inaccurate initial utilization estimation)”…””}”(hj  h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mhjœ ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj™ h²hh³hÊh´Nubj)”}”(hŒPwas properly scheduled from the start, but suddenly needs more processing power ”h]”hî)”}”(hŒOwas properly scheduled from the start, but suddenly needs more processing power”h]”hŒOwas properly scheduled from the start, but suddenly needs more processing power”…””}”(hj¸ h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M’hj´ ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj™ h²hh³hÊh´Nubeh}”(h]”h ]”h"]”h$]”h&]”jUjVuh1jh³hÊh´Mhj^ h²hubhî)”}”(hŒÔthen it might become CPU-bound, IOW ``task_util(p) > capacity(task_cpu(p))``; the CPU capacity scheduling criterion is violated, and there may not be any more wakeup event to fix this up via wakeup CPU selection.”h]”(hŒ$then it might become CPU-bound, IOW ”…””}”(hjÒ h²hh³Nh´Nubj§)”}”(hŒ(``task_util(p) > capacity(task_cpu(p))``”h]”hŒ$task_util(p) > capacity(task_cpu(p))”…””}”(hjÚ h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hjÒ ubhŒˆ; the CPU capacity scheduling criterion is violated, and there may not be any more wakeup event to fix this up via wakeup CPU selection.”…””}”(hjÒ h²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M•hj^ h²hubhî)”}”(hX·Tasks that are in this situation are dubbed "misfit" tasks, and the mechanism put in place to handle this shares the same name. Misfit task migration leverages the CFS load balancer, more specifically the active load balance part (which caters to migrating currently running tasks). When load balance happens, a misfit active load balance will be triggered if a misfit task can be migrated to a CPU with more capacity than its current one.”h]”hX»Tasks that are in this situation are dubbed “misfit†tasks, and the mechanism put in place to handle this shares the same name. Misfit task migration leverages the CFS load balancer, more specifically the active load balance part (which caters to migrating currently running tasks). When load balance happens, a misfit active load balance will be triggered if a misfit task can be migrated to a CPU with more capacity than its current one.”…””}”(hjò h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M™hj^ h²hubeh}”(h]”Œload-balancing”ah ]”h"]”Œ5.1.3 load balancing”ah$]”h&]”uh1hµhjŸh²hh³hÊh´Mvubeh}”(h]”Œcfs”ah ]”h"]”Œ5.1 cfs”ah$]”h&]”uh1hµhjŽh²hh³hÊh´MOubh¶)”}”(hhh]”(h»)”}”(hŒ5.2 RT”h]”hŒ5.2 RT”…””}”(hj h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj h²hh³hÊh´M¡ubh¶)”}”(hhh]”(h»)”}”(hŒ5.2.1 Wakeup CPU selection”h]”hŒ5.2.1 Wakeup CPU selection”…””}”(hj$ h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj! h²hh³hÊh´M¤ubhî)”}”(hŒ@RT task wakeup CPU selection searches for a CPU that satisfies::”h]”hŒ?RT task wakeup CPU selection searches for a CPU that satisfies:”…””}”(hj2 h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M¦hj! h²hubjt)”}”(hŒ-task_uclamp_min(p) <= capacity(task_cpu(cpu))”h]”hŒ-task_uclamp_min(p) <= capacity(task_cpu(cpu))”…””}”hj@ sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´M¨hj! h²hubhî)”}”(hŒÊwhile still following the usual priority constraints. If none of the candidate CPUs can satisfy this capacity criterion, then strict priority based scheduling is followed and CPU capacities are ignored.”h]”hŒÊwhile still following the usual priority constraints. If none of the candidate CPUs can satisfy this capacity criterion, then strict priority based scheduling is followed and CPU capacities are ignored.”…””}”(hjN h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´Mªhj! h²hubeh}”(h]”Œid4”ah ]”h"]”Œ5.2.1 wakeup cpu selection”ah$]”h&]”uh1hµhj h²hh³hÊh´M¤ubeh}”(h]”Œrt”ah ]”h"]”Œ5.2 rt”ah$]”h&]”uh1hµhjŽh²hh³hÊh´M¡ubh¶)”}”(hhh]”(h»)”}”(hŒ5.3 DL”h]”hŒ5.3 DL”…””}”(hjo h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjl h²hh³hÊh´M¯ubh¶)”}”(hhh]”(h»)”}”(hŒ5.3.1 Wakeup CPU selection”h]”hŒ5.3.1 Wakeup CPU selection”…””}”(hj€ h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj} h²hh³hÊh´M²ubhî)”}”(hŒ@DL task wakeup CPU selection searches for a CPU that satisfies::”h]”hŒ?DL task wakeup CPU selection searches for a CPU that satisfies:”…””}”(hjŽ h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M´hj} h²hubjt)”}”(hŒ)task_bandwidth(p) < capacity(task_cpu(p))”h]”hŒ)task_bandwidth(p) < capacity(task_cpu(p))”…””}”hjœ sbah}”(h]”h ]”h"]”h$]”h&]”jƒj„uh1jsh³hÊh´M¶hj} h²hubhî)”}”(hŒµwhile still respecting the usual bandwidth and deadline constraints. If none of the candidate CPUs can satisfy this capacity criterion, then the task will remain on its current CPU.”h]”hŒµwhile still respecting the usual bandwidth and deadline constraints. If none of the candidate CPUs can satisfy this capacity criterion, then the task will remain on its current CPU.”…””}”(hjª h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1híh³hÊh´M¸hj} h²hubeh}”(h]”Œid5”ah ]”h"]”Œ5.3.1 wakeup cpu selection”ah$]”h&]”uh1hµhjl h²hh³hÊh´M²ubeh}”(h]”Œdl”ah ]”h"]”Œ5.3 dl”ah$]”h&]”uh1hµhjŽh²hh³hÊh´M¯ubeh}”(h]”Œ(capacity-aware-scheduling-implementation”ah ]”h"]”Œ+5. capacity aware scheduling implementation”ah$]”h&]”uh1hµhh·h²hh³hÊh´MLubeh}”(h]”Œcapacity-aware-scheduling”ah ]”h"]”Œcapacity aware scheduling”ah$]”h&]”uh1hµhhh²hh³hÊh´Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”hÊuh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(hºNŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”jû Œerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”hÊŒ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”(jÕ jÒ jPjMjŠj‡j?j<j j jDjAjjjHjEjÜjÙj²j¯j[jXj<j9jjŠjÔjÑj_j\j"jjWjTj‹jˆjÍ jÊ j j jþjûj[ jX j j ji jf ja j^ jÅ j j½ jº uŒ nametypes”}”(jÕ ‰jP‰jЉj?‰j ‰jD‰j‰jH‰j܉j²‰j[‰j<‰j‰jÔ‰j_‰j"‰jW‰j‹‰jÍ ‰j ‰jþ‰j[ ‰j ‰ji ‰ja ‰jÅ ‰j½ ‰uh}”(jÒ h·jMhËj‡hÜj<jj jBjAjSjjGjEjjÙjSj¯jdjXjµj9j^jŠj?jÑjj\jßjjðjTj%jˆjbjÊ jŽj jŸjûj°jX j j j^ jf j j^ j! j jl jº j} uŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”j Ks…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nh²hub.