author | Rusty Russell <rusty@rustcorp.com.au> | 2004-12-01 01:10:19 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2004-12-01 01:10:19 -0800 |
commit | a041464f31d6f0ad289c0dfbe8b43cfb1ba3cf43 (patch) | |
tree | 13a874333a69685801d4e675d6949a2f69b9604b /kernel | |
parent | 4e73e8ed5e43a405ff62173b8fc0903d383dc82a (diff) | |
[PATCH] Fix occasional stop_machine() lockup with > 2 CPUs
Stephen Rothwell noted a case where one CPU was sitting in userspace and
another was in stop_machine(), waiting for everyone to enter stopmachine().
This can happen if migration occurs at exactly the wrong time with more
than 2 CPUs.
Say we have 4 CPUs:
1) stop_machine() on CPU 0 creates stopmachine() threads for CPUs 1, 2
and 3, and yields waiting for them to migrate to their CPUs and
ack.
2) stopmachine(2) gets rebalanced (probably on exec) to CPU 1.
3) stopmachine(2) calls set_cpus_allowed() on CPU 1 and sleeps, awaiting
the migration thread.
4) stopmachine(1) calls set_cpus_allowed() on CPU 0, moves onto CPU 1 and
starts spinning.
Now the migration thread never runs, and we deadlock. The simplest
solution is for each stopmachine() thread to yield until all of them are
in place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'kernel')
-rw-r--r-- | kernel/stop_machine.c | 7 |
1 files changed, 6 insertions, 1 deletions
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 9610403ce2cf22..2ceea25f67f674 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -52,7 +52,12 @@ static int stopmachine(void *cpu)
 			mb(); /* Must read state first. */
 			atomic_inc(&stopmachine_thread_ack);
 		}
-		cpu_relax();
+		/* Yield in first stage: migration threads need to
+		 * help our sisters onto their CPUs. */
+		if (!prepared && !irqs_disabled)
+			yield();
+		else
+			cpu_relax();
 	}
 
 	/* Ack: we are exiting. */
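The effect of the change can be sketched with a toy user-space model. This is not kernel code. The names `choose_wait` and `migration_first_runs` are hypothetical, and the model assumes a single CPU on which the running task keeps the processor until it yields voluntarily. Only the condition `!prepared && !irqs_disabled` is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not part of the kernel. */
enum wait_action { WAIT_YIELD, WAIT_SPIN };

/* Same condition as the patched loop: yield only in the first stage,
 * i.e. before the thread is prepared and before IRQs are disabled. */
static enum wait_action choose_wait(bool prepared, bool irqs_disabled)
{
	return (!prepared && !irqs_disabled) ? WAIT_YIELD : WAIT_SPIN;
}

/*
 * Toy model of one CPU with two runnable tasks: a waiting stopmachine()
 * thread (currently running) and the migration thread (queued).
 * Assumption: the running task keeps the CPU until it yields.
 * Returns the tick on which the migration thread first runs, or -1 if
 * it is starved for all max_ticks.
 */
static int migration_first_runs(bool use_patch, int max_ticks)
{
	int tick;

	for (tick = 0; tick < max_ticks; tick++) {
		/* First stage: not prepared, interrupts still enabled. */
		enum wait_action a = use_patch ? choose_wait(false, false)
					       : WAIT_SPIN;

		if (a == WAIT_YIELD)
			return tick + 1; /* migration thread gets the next tick */
		/* WAIT_SPIN stands in for cpu_relax(): the CPU is never released. */
	}
	return -1;
}
```

In this model the unpatched loop starves the migration thread for any number of ticks, while the patched loop hands over the CPU on the first iteration. Once `prepared` or `irqs_disabled` is set, the loop spins as before.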