=======================
Energy Aware Scheduling
=======================

1. Introduction
---------------

Energy Aware Scheduling (or EAS) gives the scheduler the ability to predict
the impact of its decisions on the energy consumed by CPUs. EAS relies on an
Energy Model (EM) of the CPUs to select an energy efficient CPU for each task,
with a minimal impact on throughput. This document aims at providing an
introduction on how EAS works, what are the main design decisions behind it,
and details what is needed to get it to run.

Before going any further, please note that at the time of writing::

   /!\ EAS does not support platforms with symmetric CPU topologies /!\

EAS operates only on heterogeneous CPU topologies (such as Arm big.LITTLE)
because this is where the potential for saving energy through scheduling is
the highest.
The actual EM used by EAS is _not_ maintained by the scheduler, but by a
dedicated framework. For details about this framework and what it provides,
please refer to its documentation (see Documentation/power/energy-model.rst).


2. Background and Terminology
-----------------------------

To make it clear from the start:
 - energy = [joule] (resource like a battery on powered devices)
 - power = energy/time = [joule/second] = [watt]

The goal of EAS is to minimize energy, while still getting the job done. That
is, we want to maximize::

	performance [inst/s]
	--------------------
	    power [W]

which is equivalent to minimizing::

	  energy [J]
	-----------
	instruction

while still getting 'good' performance. It is essentially an alternative
optimization objective to the current performance-only objective for the
scheduler. This alternative considers two objectives: energy-efficiency and
performance.

The idea behind introducing an EM is to allow the scheduler to evaluate the
implications of its decisions rather than blindly applying energy-saving
techniques that may have positive effects only on some platforms. At the same
time, the EM must be as simple as possible to minimize the scheduler latency
impact.
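The equivalence between the two ratios above follows directly from the
definitions of energy and power; as a one-line sketch:

```latex
\frac{\text{performance}}{\text{power}}
  = \frac{\text{inst}/\text{time}}{\text{energy}/\text{time}}
  = \frac{\text{inst}}{\text{energy}}
  = \left(\frac{\text{energy}}{\text{inst}}\right)^{-1}
```

so maximizing performance-per-watt and minimizing energy-per-instruction are
the same objective, just expressed in inverse form.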
In short, EAS changes the way CFS tasks are assigned to CPUs. When it is time
for the scheduler to decide where a task should run (during wake-up), the EM
is used to break the tie between several good CPU candidates and pick the one
that is predicted to yield the best energy consumption without harming the
system's throughput. The predictions made by EAS rely on specific elements of
knowledge about the platform's topology, which include the 'capacity' of CPUs,
and their respective energy costs.


3. Topology information
-----------------------

EAS (as well as the rest of the scheduler) uses the notion of 'capacity' to
differentiate CPUs with different computing throughput. The 'capacity' of a
CPU represents the amount of work it can absorb when running at its highest
frequency compared to the most capable CPU of the system. Capacity values are
normalized in a 1024 range, and are comparable with the utilization signals of
tasks and CPUs computed by the Per-Entity Load Tracking (PELT) mechanism.
Thanks to capacity and utilization values, EAS is able to estimate how
big/busy a task/CPU is, and to take this into consideration when evaluating
performance vs energy trade-offs. The capacity of CPUs is provided via
arch-specific code through the arch_scale_cpu_capacity() callback.

The rest of platform knowledge used by EAS is directly read from the Energy
Model (EM) framework. The EM of a platform is composed of a power cost table
per 'performance domain' in the system (see Documentation/power/energy-model.rst
for further details about performance domains).
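The capacity/utilization arithmetic described above can be sketched in a few
lines; this is an illustrative model only (the capacity values below are
hypothetical, chosen to mimic a big.LITTLE system), not the kernel's actual
implementation:

```python
# Capacities are normalized to a 1024 range, matching the scheduler's
# convention; utilization signals live in the same range, so they can
# be compared directly.
CAPACITY_SCALE = 1024

# Hypothetical per-CPU capacities: two little CPUs, two big CPUs.
cpu_capacity = {0: 512, 1: 512, 2: 1024, 3: 1024}

def spare_capacity(cpu, util):
    """Spare capacity = CPU capacity - CPU utilization (clamped at 0)."""
    return max(cpu_capacity[cpu] - util, 0)

# A CPU-bound task with util_avg = 600 saturates a little CPU...
print(spare_capacity(0, 600))   # 0
# ...but leaves headroom on a big one.
print(spare_capacity(2, 600))   # 424
```

Because both signals share the 1024 scale, "does this task fit on that CPU"
reduces to a plain comparison, which is what keeps the model cheap enough for
the wake-up path.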
The scheduler manages references to the EM objects in the topology code when
the scheduling domains are built, or re-built. For each root domain (rd), the
scheduler maintains a singly linked list of all performance domains
intersecting the current rd->span. Each node in the list contains a pointer
to a struct em_perf_domain as provided by the EM framework.

The lists are attached to the root domains in order to cope with exclusive
cpuset configurations. Since the boundaries of exclusive cpusets do not
necessarily match those of performance domains, the lists of different root
domains can contain duplicate elements.

Example 1.
    Let us consider a platform with 12 CPUs, split in 3 performance domains
    (pd0, pd4 and pd8), organized as follows::

	      CPUs:   0 1 2 3 4 5 6 7 8 9 10 11
	      PDs:    |--pd0--|--pd4--|---pd8---|
	      RDs:    |----rd1----|-----rd2-----|

    Now, consider that userspace decided to split the system with two
    exclusive cpusets, hence creating two independent root domains, each
    containing 6 CPUs. The two root domains are denoted rd1 and rd2 in the
    above figure. Since pd4 intersects with both rd1 and rd2, it will be
    present in the linked list '->pd' attached to each of them:

	* rd1->pd: pd0 -> pd4
	* rd2->pd: pd4 -> pd8

    Please note that the scheduler will create two duplicate list nodes for
    pd4 (one for each list). However, both just hold a pointer to the same
    shared data structure of the EM framework.

Since the access to these lists can happen concurrently with hotplug and other
things, they are protected by RCU, like the rest of topology structures
manipulated by the scheduler.
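The intersection logic of Example 1 can be sketched as follows. This is
illustrative Python, not the kernel's topology code (which links
struct em_perf_domain nodes under RCU); the dictionaries simply model CPU
spans as sets:

```python
# Performance domains and root domains as CPU sets, using Example 1's
# layout: pd0 = CPUs 0-3, pd4 = CPUs 4-7, pd8 = CPUs 8-11, and two
# exclusive cpusets of 6 CPUs each.
perf_domains = {"pd0": set(range(0, 4)),
                "pd4": set(range(4, 8)),
                "pd8": set(range(8, 12))}
root_domains = {"rd1": set(range(0, 6)),
                "rd2": set(range(6, 12))}

def build_pd_lists(rds, pds):
    """For each root domain, list every performance domain whose CPUs
    intersect the rd's span. A pd straddling a cpuset boundary appears
    in several lists, like pd4 in Example 1."""
    return {rd: [pd for pd, cpus in pds.items() if cpus & span]
            for rd, span in rds.items()}

print(build_pd_lists(root_domains, perf_domains))
# {'rd1': ['pd0', 'pd4'], 'rd2': ['pd4', 'pd8']}
```

Note that the duplication is only of list nodes: in the kernel, both entries
for pd4 would point at the same shared EM data.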
EAS also maintains a static key (sched_energy_present) which is enabled when
at least one root domain meets all conditions for EAS to start. Those
conditions are summarized in Section 6.


4. Energy-Aware task placement
------------------------------

EAS overrides the CFS task wake-up balancing code. It uses the EM of the
platform and the PELT signals to choose an energy-efficient target CPU during
wake-up balance. When EAS is enabled, select_task_rq_fair() calls
find_energy_efficient_cpu() to do the placement decision. This function looks
for the CPU with the highest spare capacity (CPU capacity - CPU utilization)
in each performance domain since it is the one which will allow us to keep
the frequency the lowest. Then, the function checks if placing the task there
could save energy compared to leaving it on prev_cpu, i.e. the CPU where the
task ran in its previous activation.

find_energy_efficient_cpu() uses compute_energy() to estimate what will be
the energy consumed by the system if the waking task was migrated.
compute_energy() looks at the current utilization landscape of the CPUs and
adjusts it to 'simulate' the task migration. The EM framework provides the
em_pd_energy() API which computes the expected energy consumption of each
performance domain for the given utilization landscape.

An example of energy-optimized task placement decision is detailed below.
Example 2.
    Let us consider a (fake) platform with 2 independent performance domains
    composed of two CPUs each. CPU0 and CPU1 are little CPUs; CPU2 and CPU3
    are big.

    The scheduler must decide where to place a task P whose util_avg = 200
    and prev_cpu = 0.

    The current utilization landscape of the CPUs is depicted on the graph
    below. CPUs 0-3 have a util_avg of 400, 100, 600 and 500 respectively.
    Each performance domain has three Operating Performance Points (OPPs).
    The CPU capacity and power cost associated with each OPP is listed in
    the Energy Model table. The util_avg of P is shown on the figures
    below as 'PP'::

     CPU util.
      1024                 - - - - - - -              Energy Model
                                                +-----------+-------------+
                                                |  Little   |     Big     |
       768                 =============        +-----+-----+------+------+
                                                | Cap | Pwr | Cap  | Pwr  |
                                                +-----+-----+------+------+
       512  ===========    - ##- - - - -        | 170 |  50 | 512  | 400  |
                             ##     ##          | 341 | 150 | 768  | 800  |
       341  -PP - - - -      ##     ##          | 512 | 300 | 1024 | 1700 |
             PP              ##     ##          +-----+-----+------+------+
       170  -## - - - -      ##     ##
             ##     ##       ##     ##
           ------------    -------------
           CPU0   CPU1     CPU2   CPU3

      Current OPP: =====       Other OPP: - - -     util_avg (100 each): ##

    find_energy_efficient_cpu() will first look for the CPUs with the
    maximum spare capacity in the two performance domains. In this example,
    CPU1 and CPU3. Then it will estimate the energy of the system if P was
    placed on either of them, and check if that would save some energy
    compared to leaving P on CPU0. EAS assumes that OPPs follow utilization
    (which is coherent with the behaviour of the schedutil CPUFreq governor,
    see Section 6. for more details on this topic).

    **Case 1. P is migrated to CPU1**::

      1024                 - - - - - - -
                                              Energy calculation:
       768                 =============       * CPU0: 200 / 341 * 150 = 88
                                               * CPU1: 300 / 341 * 150 = 131
                                               * CPU2: 600 / 768 * 800 = 625
       512  - - - - - -    - ##- - - - -       * CPU3: 500 / 768 * 800 = 520
                             ##     ##            => total_energy = 1364
       341  ===========      ##     ##
                    PP       ##     ##
       170  -## - - PP-      ##     ##
             ##     ##       ##     ##
           ------------    -------------
           CPU0   CPU1     CPU2   CPU3

    **Case 2. P is migrated to CPU3**::

      1024                 - - - - - - -
                                              Energy calculation:
       768                 =============       * CPU0: 200 / 341 * 150 = 88
                                               * CPU1: 100 / 341 * 150 = 43
                                    PP         * CPU2: 600 / 768 * 800 = 625
       512  - - - - - -    - ##- - -PP -       * CPU3: 700 / 768 * 800 = 729
                             ##     ##            => total_energy = 1485
       341  ===========      ##     ##
                             ##     ##
       170  -## - - - -      ##     ##
             ##     ##       ##     ##
           ------------    -------------
           CPU0   CPU1     CPU2   CPU3

    **Case 3. P stays on prev_cpu / CPU 0**::

      1024                 - - - - - - -
                                              Energy calculation:
       768                 =============       * CPU0: 400 / 512 * 300 = 234
                                               * CPU1: 100 / 512 * 300 = 58
                                               * CPU2: 600 / 768 * 800 = 625
       512  ===========    - ##- - - - -       * CPU3: 500 / 768 * 800 = 520
                             ##     ##            => total_energy = 1437
       341  -PP - - - -      ##     ##
             PP              ##     ##
       170  -## - - - -      ##     ##
             ##     ##       ##     ##
           ------------    -------------
           CPU0   CPU1     CPU2   CPU3

    From these calculations, the Case 1 has the lowest total energy. So CPU 1
    is the best candidate from an energy-efficiency standpoint.
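The three case totals of Example 2 can be re-derived mechanically. The
sketch below follows the document's cost model (energy per CPU = util /
opp_capacity * opp_power, with the OPP chosen as the lowest one covering the
busiest CPU of the domain, since OPPs follow utilization); it is an
illustration, not the kernel's em_pd_energy() implementation, and the small
differences from the figures' totals come from the document's integer
rounding:

```python
# Energy Model table from Example 2: (capacity, power) per OPP.
LITTLE_OPPS = [(170, 50), (341, 150), (512, 300)]
BIG_OPPS    = [(512, 400), (768, 800), (1024, 1700)]

def pd_energy(opps, utils):
    """Estimated energy of one performance domain for a given
    utilization landscape."""
    peak = max(utils)
    # Lowest OPP able to serve the busiest CPU of the domain.
    cap, pwr = next((c, p) for c, p in opps if c >= peak)
    return sum(u / cap * pwr for u in utils)

def total_energy(littles, bigs):
    return pd_energy(LITTLE_OPPS, littles) + pd_energy(BIG_OPPS, bigs)

# Task P (util_avg = 200) added to each candidate CPU:
case1 = total_energy([200, 300], [600, 500])  # P on CPU1 -> ~1364
case2 = total_energy([200, 100], [600, 700])  # P on CPU3 -> ~1485
case3 = total_energy([400, 100], [600, 500])  # P on CPU0 -> ~1437

assert case1 < case3 < case2  # Case 1 wins, as in the document
```

Note how Case 3 is penalized even though nothing moves: keeping P on CPU0
forces the whole little domain up to the 512-capacity OPP, raising the cost
of CPU1's task as well.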
Big CPUs are generally more power hungry than the little ones and are thus
used mainly when a task doesn't fit the littles. However, little CPUs aren't
always necessarily more energy-efficient than big CPUs. For some systems, the
high OPPs of the little CPUs can be less energy-efficient than the lowest
OPPs of the bigs, for example. So, if the little CPUs happen to have enough
utilization at a specific point in time, a small task waking up at that
moment could be better off executing on the big side in order to save energy,
even though it would fit on the little side.

And even in the case where all OPPs of the big CPUs are less energy-efficient
than those of the little, using the big CPUs for a small task might still,
under specific conditions, save energy. Indeed, placing a task on a little
CPU can result in raising the OPP of the entire performance domain, and that
will increase the cost of the tasks already running there. If the waking task
is placed on a big CPU, its own execution cost might be higher than if it was
running on a little, but it won't impact the other tasks of the little CPUs
which will keep running at a lower OPP. So, when considering the total energy
consumed by CPUs, the extra cost of running that one task on a big core can
be smaller than the cost of raising the OPP on the little CPUs for all the
other tasks.

The examples above would be nearly impossible to get right in a generic way,
and for all platforms, without knowing the cost of running at different OPPs
on all CPUs of the system. Thanks to its EM-based design, EAS should cope
with them correctly without too many troubles. However, in order to ensure a
minimal impact on throughput for high-utilization scenarios, EAS also
implements another mechanism called 'over-utilization'.


5. Over-utilization
-------------------
Whenever long CPU-bound tasks are being run, they will require all of the available CPU capacity, and there isn't much that can be done by the scheduler to save energy without severely harming throughput. In order to avoid hurting performance with EAS, CPUs are flagged as 'over-utilized' as soon as they are used at more than 80% of their compute capacity. As long as no CPUs are over-utilized in a root domain, load balancing is disabled and EAS overrides the wake-up balancing code. EAS is likely to load the most energy efficient CPUs of the system more than the others if that can be done without harming throughput. So, the load-balancer is disabled to prevent it from breaking the energy-efficient task placement found by EAS. It is safe to do so when the system isn't overutilized since being below the 80% tipping point implies that:”h]”hXËFrom a general standpoint, the use-cases where EAS can help the most are those involving a light/medium CPU utilization. Whenever long CPU-bound tasks are being run, they will require all of the available CPU capacity, and there isn’t much that can be done by the scheduler to save energy without severely harming throughput. In order to avoid hurting performance with EAS, CPUs are flagged as ‘over-utilized’ as soon as they are used at more than 80% of their compute capacity. As long as no CPUs are over-utilized in a root domain, load balancing is disabled and EAS overrides the wake-up balancing code. EAS is likely to load the most energy efficient CPUs of the system more than the others if that can be done without harming throughput. So, the load-balancer is disabled to prevent it from breaking the energy-efficient task placement found by EAS. It is safe to do so when the system isn’t overutilized since being below the 80% tipping point implies that:”…””}”(hj«h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´Mhjšh²hubj§)”}”(hX‡a. 
there is some idle time on all CPUs, so the utilization signals used by EAS are likely to accurately represent the 'size' of the various tasks in the system; b. all tasks should already be provided with enough CPU capacity, regardless of their nice values; c. since there is spare capacity all tasks must be blocking/sleeping regularly and balancing at wake-up is sufficient. ”h]”hŒenumerated_list”“”)”}”(hhh]”(jg)”}”(hŒthere is some idle time on all CPUs, so the utilization signals used by EAS are likely to accurately represent the 'size' of the various tasks in the system;”h]”hÝ)”}”(hŒthere is some idle time on all CPUs, so the utilization signals used by EAS are likely to accurately represent the 'size' of the various tasks in the system;”h]”hŒ¡there is some idle time on all CPUs, so the utilization signals used by EAS are likely to accurately represent the ‘size’ of the various tasks in the system;”…””}”(hjÆh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M'hjÂubah}”(h]”h ]”h"]”h$]”h&]”uh1jfhj¿ubjg)”}”(hŒ_all tasks should already be provided with enough CPU capacity, regardless of their nice values;”h]”hÝ)”}”(hŒ_all tasks should already be provided with enough CPU capacity, regardless of their nice values;”h]”hŒ_all tasks should already be provided with enough CPU capacity, regardless of their nice values;”…””}”(hjÞh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M*hjÚubah}”(h]”h ]”h"]”h$]”h&]”uh1jfhj¿ubjg)”}”(hŒtsince there is spare capacity all tasks must be blocking/sleeping regularly and balancing at wake-up is sufficient. 
”h]”hÝ)”}”(hŒssince there is spare capacity all tasks must be blocking/sleeping regularly and balancing at wake-up is sufficient.”h]”hŒssince there is spare capacity all tasks must be blocking/sleeping regularly and balancing at wake-up is sufficient.”…””}”(hjöh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M,hjòubah}”(h]”h ]”h"]”h$]”h&]”uh1jfhj¿ubeh}”(h]”h ]”h"]”h$]”h&]”Œenumtype”Œ loweralpha”Œprefix”hŒsuffix”Œ.”uh1j½hj¹ubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦h³hÊh´M'hjšh²hubhÝ)”}”(hX¯As soon as one CPU goes above the 80% tipping point, at least one of the three assumptions above becomes incorrect. In this scenario, the 'overutilized' flag is raised for the entire root domain, EAS is disabled, and the load-balancer is re-enabled. By doing so, the scheduler falls back onto load-based algorithms for wake-up and load balance under CPU-bound conditions. This provides a better respect of the nice values of tasks.”h]”hX³As soon as one CPU goes above the 80% tipping point, at least one of the three assumptions above becomes incorrect. In this scenario, the ‘overutilized’ flag is raised for the entire root domain, EAS is disabled, and the load-balancer is re-enabled. By doing so, the scheduler falls back onto load-based algorithms for wake-up and load balance under CPU-bound conditions. This provides a better respect of the nice values of tasks.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M/hjšh²hubhÝ)”}”(hXvSince the notion of overutilization largely relies on detecting whether or not there is some idle time in the system, the CPU capacity 'stolen' by higher (than CFS) scheduling classes (as well as IRQ) must be taken into account. 
As such, the detection of overutilization accounts for the capacity used not only by CFS tasks, but also by the other scheduling classes and IRQ.”h]”hXzSince the notion of overutilization largely relies on detecting whether or not there is some idle time in the system, the CPU capacity ‘stolen’ by higher (than CFS) scheduling classes (as well as IRQ) must be taken into account. As such, the detection of overutilization accounts for the capacity used not only by CFS tasks, but also by the other scheduling classes and IRQ.”…””}”(hj)h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M6hjšh²hubeh}”(h]”Œover-utilization”ah ]”h"]”Œ5. over-utilization”ah$]”h&]”uh1hµhh·h²hh³hÊh´Mubh¶)”}”(hhh]”(h»)”}”(hŒ(6. Dependencies and requirements for EAS”h]”hŒ(6. Dependencies and requirements for EAS”…””}”(hjBh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj?h²hh³hÊh´M>ubhÝ)”}”(hŒäEnergy Aware Scheduling depends on the CPUs of the system having specific hardware properties and on other features of the kernel being enabled. This section lists these dependencies and provides hints as to how they can be met.”h]”hŒäEnergy Aware Scheduling depends on the CPUs of the system having specific hardware properties and on other features of the kernel being enabled. This section lists these dependencies and provides hints as to how they can be met.”…””}”(hjPh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M@hj?h²hubh¶)”}”(hhh]”(h»)”}”(hŒ6.1 - Asymmetric CPU topology”h]”hŒ6.1 - Asymmetric CPU topology”…””}”(hjah²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj^h²hh³hÊh´MFubhÝ)”}”(hŒüAs mentioned in the introduction, EAS is only supported on platforms with asymmetric CPU topologies for now. This requirement is checked at run-time by looking for the presence of the SD_ASYM_CPUCAPACITY_FULL flag when the scheduling domains are built.”h]”hŒüAs mentioned in the introduction, EAS is only supported on platforms with asymmetric CPU topologies for now. 
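The over-utilization test described in this section can be sketched as follows. The ~80% margin mirrors the kernel's `fits_capacity()` check (`util * 1280 < capacity * 1024`); the per-class utilization fields and function names here are simplified stand-ins, not the kernel's actual interfaces.

```python
def fits_capacity(util, capacity):
    # True if util leaves roughly 20% headroom on the CPU
    # (the '80% tipping point' described above).
    return util * 1280 < capacity * 1024

def cpu_overutilized(cfs_util, rt_util, dl_util, irq_util, capacity):
    # Capacity 'stolen' by scheduling classes higher than CFS, and by
    # IRQ, counts against the headroom too.
    total = cfs_util + rt_util + dl_util + irq_util
    return not fits_capacity(total, capacity)

def root_domain_overutilized(cpus):
    # A single over-utilized CPU raises the flag for the whole root
    # domain, disabling EAS and re-enabling the load-balancer.
    return any(cpu_overutilized(*cpu) for cpu in cpus)

# (cfs, rt, dl, irq, capacity) per CPU:
cpus = [(300, 50, 0, 20, 1024),    # plenty of idle time
        (600, 100, 50, 80, 1024)]  # 830 > 819.2, over the tipping point
root_domain_overutilized(cpus)
```

Note how the second CPU trips the flag only once its total utilization, including RT/DL and IRQ pressure, exceeds 80% of its capacity.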
This requirement is checked at run-time by looking for the presence of the SD_ASYM_CPUCAPACITY_FULL flag when the scheduling domains are built.”…””}”(hjoh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´MIhj^h²hubhÝ)”}”(hŒ€See Documentation/scheduler/sched-capacity.rst for requirements to be met for this flag to be set in the sched_domain hierarchy.”h]”hŒ€See Documentation/scheduler/sched-capacity.rst for requirements to be met for this flag to be set in the sched_domain hierarchy.”…””}”(hj}h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´MNhj^h²hubhÝ)”}”(hŒÉPlease note that EAS is not fundamentally incompatible with SMP, but no significant savings on SMP platforms have been observed yet. This restriction could be amended in the future if proven otherwise.”h]”hŒÉPlease note that EAS is not fundamentally incompatible with SMP, but no significant savings on SMP platforms have been observed yet. This restriction could be amended in the future if proven otherwise.”…””}”(hj‹h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´MQhj^h²hubeh}”(h]”Œasymmetric-cpu-topology”ah ]”h"]”Œ6.1 - asymmetric cpu topology”ah$]”h&]”uh1hµhj?h²hh³hÊh´MFubh¶)”}”(hhh]”(h»)”}”(hŒ6.2 - Energy Model presence”h]”hŒ6.2 - Energy Model presence”…””}”(hj¤h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj¡h²hh³hÊh´MWubhÝ)”}”(hX)EAS uses the EM of a platform to estimate the impact of scheduling decisions on energy. So, your platform must provide power cost tables to the EM framework in order to make EAS start. To do so, please refer to documentation of the independent EM framework in Documentation/power/energy-model.rst.”h]”hX)EAS uses the EM of a platform to estimate the impact of scheduling decisions on energy. So, your platform must provide power cost tables to the EM framework in order to make EAS start. 
To do so, please refer to documentation of the independent EM framework in Documentation/power/energy-model.rst.”…””}”(hj²h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´MYhj¡h²hubhÝ)”}”(hŒxPlease also note that the scheduling domains need to be re-built after the EM has been registered in order to start EAS.”h]”hŒxPlease also note that the scheduling domains need to be re-built after the EM has been registered in order to start EAS.”…””}”(hjÀh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M^hj¡h²hubhÝ)”}”(hX EAS uses the EM to make a forecasting decision on energy usage and thus it is more focused on the difference when checking possible options for task placement. For EAS it doesn't matter whether the EM power values are expressed in milli-Watts or in an 'abstract scale'.”h]”hXEAS uses the EM to make a forecasting decision on energy usage and thus it is more focused on the difference when checking possible options for task placement. For EAS it doesn’t matter whether the EM power values are expressed in milli-Watts or in an ‘abstract scale’.”…””}”(hjÎh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´Mahj¡h²hubeh}”(h]”Œenergy-model-presence”ah ]”h"]”Œ6.2 - energy model presence”ah$]”h&]”uh1hµhj?h²hh³hÊh´MWubh¶)”}”(hhh]”(h»)”}”(hŒ6.3 - Energy Model complexity”h]”hŒ6.3 - Energy Model complexity”…””}”(hjçh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjäh²hh³hÊh´MhubhÝ)”}”(hŒ®EAS does not impose any complexity limit on the number of PDs/OPPs/CPUs but restricts the number of CPUs to EM_MAX_NUM_CPUS to prevent overflows during the energy estimation.”h]”hŒ®EAS does not impose any complexity limit on the number of PDs/OPPs/CPUs but restricts the number of CPUs to EM_MAX_NUM_CPUS to prevent overflows during the energy estimation.”…””}”(hjõh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´Mjhjäh²hubeh}”(h]”Œenergy-model-complexity”ah ]”h"]”Œ6.3 - energy model complexity”ah$]”h&]”uh1hµhj?h²hh³hÊh´Mhubh¶)”}”(hhh]”(h»)”}”(hŒ6.4 - Schedutil governor”h]”hŒ6.4 - Schedutil 
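The shape of the data a platform hands to the EM framework can be sketched like this. Field and table names below are simplified illustrations (the kernel's real structures are `struct em_perf_domain` and `struct em_perf_state`, registered via the API described in Documentation/power/energy-model.rst).

```python
# One entry per performance domain: the CPUs it spans and its
# performance states as (frequency_kHz, power) pairs, lowest first.
perf_domains = {
    "pd_little": {"cpus": [0, 1, 2, 3],
                  "states": [(500_000, 50), (1_000_000, 130), (1_500_000, 300)]},
    "pd_big":    {"cpus": [4, 5],
                  "states": [(800_000, 200), (1_400_000, 450), (2_000_000, 1100)]},
}

def cost(state, max_freq):
    # The EM pre-computes a per-state 'cost' (power * max_freq / freq)
    # so energy estimates at wake-up avoid divisions in the hot path.
    freq, power = state
    return power * max_freq // freq

# Cost of the littles' lowest OPP, relative to their max frequency:
cost((500_000, 50), 1_500_000)
```

Whether the `power` column is in milli-Watts or an abstract scale does not matter to EAS, as noted further below: only the relative costs of the candidate placements are compared.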
governor”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhj h²hh³hÊh´MpubhÝ)”}”(hŒÁEAS tries to predict at which OPP will the CPUs be running in the close future in order to estimate their energy consumption. To do so, it is assumed that OPPs of CPUs follow their utilization.”h]”hŒÁEAS tries to predict at which OPP will the CPUs be running in the close future in order to estimate their energy consumption. To do so, it is assumed that OPPs of CPUs follow their utilization.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´Mrhj h²hubhÝ)”}”(hXïAlthough it is very difficult to provide hard guarantees regarding the accuracy of this assumption in practice (because the hardware might not do what it is told to do, for example), schedutil as opposed to other CPUFreq governors at least _requests_ frequencies calculated using the utilization signals. Consequently, the only sane governor to use together with EAS is schedutil, because it is the only one providing some degree of consistency between frequency requests and energy predictions.”h]”hXïAlthough it is very difficult to provide hard guarantees regarding the accuracy of this assumption in practice (because the hardware might not do what it is told to do, for example), schedutil as opposed to other CPUFreq governors at least _requests_ frequencies calculated using the utilization signals. 
Consequently, the only sane governor to use together with EAS is schedutil, because it is the only one providing some degree of consistency between frequency requests and energy predictions.”…””}”(hj*h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´Mvhj h²hubhÝ)”}”(hŒBUsing EAS with any other governor than schedutil is not supported.”h]”hŒBUsing EAS with any other governor than schedutil is not supported.”…””}”(hj8h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M~hj h²hubeh}”(h]”Œschedutil-governor”ah ]”h"]”Œ6.4 - schedutil governor”ah$]”h&]”uh1hµhj?h²hh³hÊh´Mpubh¶)”}”(hhh]”(h»)”}”(hŒ'6.5 Scale-invariant utilization signals”h]”hŒ'6.5 Scale-invariant utilization signals”…””}”(hjQh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjNh²hh³hÊh´M‚ubhÝ)”}”(hŒïIn order to make accurate predictions across CPUs and for all performance states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can be obtained using the architecture-defined arch_scale{cpu,freq}_capacity() callbacks.”h]”hŒïIn order to make accurate predictions across CPUs and for all performance states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can be obtained using the architecture-defined arch_scale{cpu,freq}_capacity() callbacks.”…””}”(hj_h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M„hjNh²hubhÝ)”}”(hŒTUsing EAS on a platform that doesn't implement these two callbacks is not supported.”h]”hŒVUsing EAS on a platform that doesn’t implement these two callbacks is not supported.”…””}”(hjmh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M‰hjNh²hubeh}”(h]”Œ#scale-invariant-utilization-signals”ah ]”h"]”Œ'6.5 scale-invariant utilization signals”ah$]”h&]”uh1hµhj?h²hh³hÊh´M‚ubh¶)”}”(hhh]”(h»)”}”(hŒ6.6 Multithreading (SMT)”h]”hŒ6.6 Multithreading (SMT)”…””}”(hj†h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hºhjƒh²hh³hÊh´MŽubhÝ)”}”(hŒÞEAS in its current form is SMT unaware and is not able to leverage multithreaded hardware to save energy. 
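The utilization-to-frequency mapping that makes schedutil consistent with EAS's predictions can be sketched as below (cf. `get_next_freq()`/`map_util_freq()` in the kernel): the requested frequency tracks utilization with a ~25% headroom margin. This is a simplification; the real code also clamps the request to the frequencies the CPUFreq driver actually supports.

```python
def map_util_freq(util, max_freq, capacity):
    # next_freq = 1.25 * max_freq * util / capacity, in integer
    # arithmetic: util + util/4 gives the 25% margin.
    return (util + (util >> 2)) * max_freq // capacity

# A CPU of capacity 1024 and max frequency 2 GHz (in kHz) that is 50%
# utilized gets a ~1.25 GHz request, not a 1 GHz one.
map_util_freq(512, 2_000_000, 1024)
```

Because EAS assumes OPPs follow utilization, a governor that requested frequencies from anything other than these same utilization signals would make the EM-based energy estimates meaningless, which is why only schedutil is supported.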
EAS considers threads as independent CPUs, which can actually be counter-productive for both performance and energy.”h]”hŒÞEAS in its current form is SMT unaware and is not able to leverage multithreaded hardware to save energy. EAS considers threads as independent CPUs, which can actually be counter-productive for both performance and energy.”…””}”(hj”h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´Mhjƒh²hubhÝ)”}”(hŒEAS on SMT is not supported.”h]”hŒEAS on SMT is not supported.”…””}”(hj¢h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÜh³hÊh´M”hjƒh²hubeh}”(h]”Œmultithreading-smt”ah ]”h"]”Œ6.6 multithreading (smt)”ah$]”h&]”uh1hµhj?h²hh³hÊh´MŽubeh}”(h]”Œ%dependencies-and-requirements-for-eas”ah ]”h"]”Œ(6. dependencies and requirements for eas”ah$]”h&]”uh1hµhh·h²hh³hÊh´M>ubeh}”(h]”Œenergy-aware-scheduling”ah ]”h"]”Œenergy aware scheduling”ah$]”h&]”uh1hµhhh²hh³hÊh´Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”hÊuh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(hºNŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”jëŒerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”hÊŒ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ 
tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”(jÅjÂj-j*jjj,j)j—j”j<j9j½jºjžj›jájÞjjjKjHj€j}jµj²uŒ nametypes”}”(jʼnj-‰j‰j,‰j—‰j<‰j½‰jž‰já‰j‰jK‰j€‰jµ‰uh}”(jÂh·j*hËjj0j)jj”j/j9jšjºj?j›j^jÞj¡jjäjHj j}jNj²jƒuŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nh²hub.