aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/sched
diff options
context:
space:
mode:
authorRicardo Neri <ricardo.neri-calderon@linux.intel.com>2023-07-07 15:57:03 -0700
committerPeter Zijlstra <peterz@infradead.org>2023-07-13 15:21:52 +0200
commitb1bfeab9b00283f521d2100afb9f5af84ccdae13 (patch)
tree574e3a022ce5064e431c38a4cde6514bec9ff5b1 /kernel/sched
parent7ff1693236f5d97a939dbeb660c07671a2d57071 (diff)
downloadlinux-b1bfeab9b00283f521d2100afb9f5af84ccdae13.tar.gz
sched/fair: Consider the idle state of the whole core for load balance
should_we_balance() traverses the group_balance_mask (AND'ed with lb_env:: cpus) starting from lower numbered CPUs looking for the first idle CPU. In hybrid x86 systems, the siblings of SMT cores get CPU numbers, before non-SMT cores: [0, 1] [2, 3] [4, 5] 6 7 8 9 b i b i b i b i i i In the figure above, CPUs in brackets are siblings of an SMT core. The rest are non-SMT cores. 'b' indicates a busy CPU, 'i' indicates an idle CPU. We should let a CPU on a fully idle core get the first chance to idle load balance as it has more CPU capacity than a CPU on an idle SMT CPU with busy sibling. So for the figure above, if we are running should_we_balance() to CPU 1, we should return false to let CPU 7 on idle core to have a chance first to idle load balance. A partially busy (i.e., of type group_has_spare) local group with SMT  cores will often have only one SMT sibling busy. If the destination CPU is a non-SMT core, partially busy, lower-numbered, SMT cores should not be considered when finding the first idle CPU.  However, in should_we_balance(), when we encounter idle SMT first in partially busy core, we prematurely break the search for the first idle CPU. Higher-numbered, non-SMT cores is not given the chance to have idle balance done on their behalf. Those CPUs will only be considered for idle balancing by chance via CPU_NEWLY_IDLE. Instead, consider the idle state of the whole SMT core. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/807bdd05331378ea3bf5956bda87ded1036ba769.1688770494.git.tim.c.chen@linux.intel.com
Diffstat (limited to 'kernel/sched')
-rw-r--r--kernel/sched/fair.c16
1 files changed, 15 insertions, 1 deletions
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c6246fbcd74f48..a87988327f889c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10902,7 +10902,7 @@ static int active_load_balance_cpu_stop(void *data);
static int should_we_balance(struct lb_env *env)
{
struct sched_group *sg = env->sd->groups;
- int cpu;
+ int cpu, idle_smt = -1;
/*
* Ensure the balancing environment is consistent; can happen
@@ -10929,10 +10929,24 @@ static int should_we_balance(struct lb_env *env)
if (!idle_cpu(cpu))
continue;
+ /*
+ * Don't balance to idle SMT in busy core right away when
+ * balancing cores, but remember the first idle SMT CPU for
+ * later consideration. Find CPU on an idle core first.
+ */
+ if (!(env->sd->flags & SD_SHARE_CPUCAPACITY) && !is_core_idle(cpu)) {
+ if (idle_smt == -1)
+ idle_smt = cpu;
+ continue;
+ }
+
/* Are we the first idle CPU? */
return cpu == env->dst_cpu;
}
+ if (idle_smt == env->dst_cpu)
+ return true;
+
/* Are we the first CPU of this group ? */
return group_balance_cpu(sg) == env->dst_cpu;
}