From: Roland McGrath

Klaus Dittrich observed this bug and posted a test case for it.

This patch fixes both that failure mode and some others possible.  What
Klaus saw was a false negative (i.e. ECHILD when there was a child) when
the group leader was a zombie but delayed because other children live; in
the test program this happens in a race between the two threads dying on a
signal.

The change to the TASK_TRACED case avoids a potential false positive
(blocking, or WNOHANG returning 0, when there are really no children left),
in the race condition where my_ptrace_child returns zero.

Signed-off-by: Roland McGrath
Signed-off-by: Andrew Morton
---

 25-akpm/kernel/exit.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff -puN kernel/exit.c~fix-bogus-echild-return-from-wait-with-zombie-group-leader kernel/exit.c
--- 25/kernel/exit.c~fix-bogus-echild-return-from-wait-with-zombie-group-leader	2004-12-06 13:55:07.985260472 -0800
+++ 25-akpm/kernel/exit.c	2004-12-06 13:55:07.989259864 -0800
@@ -1322,6 +1322,10 @@ static long do_wait(pid_t pid, int optio
 
 	add_wait_queue(&current->wait_chldexit,&wait);
 repeat:
+	/*
+	 * We will set this flag if we see any child that might later
+	 * match our criteria, even if we are not able to reap it yet.
+	 */
 	flag = 0;
 	current->state = TASK_INTERRUPTIBLE;
 	read_lock(&tasklist_lock);
@@ -1340,11 +1344,14 @@ repeat:
 
 			switch (p->state) {
 			case TASK_TRACED:
-				flag = 1;
 				if (!my_ptrace_child(p))
 					continue;
 				/*FALLTHROUGH*/
 			case TASK_STOPPED:
+				/*
+				 * It's stopped now, so it might later
+				 * continue, exit, or stop again.
+				 */
 				flag = 1;
 				if (!(options & WUNTRACED) &&
 				    !my_ptrace_child(p))
@@ -1380,8 +1387,12 @@ repeat:
 						goto end;
 					break;
 				}
-				flag = 1;
 check_continued:
+				/*
+				 * It's running now, so it might later
+				 * exit, stop, or stop and then continue.
+				 */
+				flag = 1;
 				if (!unlikely(options & WCONTINUED))
 					continue;
 				retval = wait_task_continued(
_
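
For reference, the user-visible return values the description talks about
can be sketched from userspace.  This is only an illustration of the
documented waitpid() contract (0 from WNOHANG while children exist but none
is ready, ECHILD once no children remain); it is not Klaus's test case and
it does not reproduce the thread-group race.  The names poll_children() and
wait_demo() are made up for this sketch.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for one non-blocking poll of all children. */
enum poll_result {
	POLL_ERROR = -1,	/* waitpid() failed for some other reason */
	HAVE_CHILDREN = 0,	/* children exist, none has changed state */
	NO_CHILDREN = 1,	/* waitpid() failed with ECHILD */
	REAPED = 2		/* a child was reaped by this call */
};

/* Poll for any child without blocking and classify the outcome. */
static int poll_children(void)
{
	int status;
	pid_t pid = waitpid(-1, &status, WNOHANG);

	if (pid == 0)
		return HAVE_CHILDREN;
	if (pid > 0)
		return REAPED;
	return errno == ECHILD ? NO_CHILDREN : POLL_ERROR;
}

/*
 * Returns 0 if the observed sequence matches the documented contract,
 * or a small nonzero code naming the step that disagreed.
 */
int wait_demo(void)
{
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0) {
		/* Child: block until the parent kills us. */
		for (;;)
			pause();
	}

	/* A live child exists: WNOHANG must return 0, not ECHILD. */
	if (poll_children() != HAVE_CHILDREN)
		return 1;

	/* Kill the child and reap it with a blocking wait. */
	if (kill(child, SIGKILL) != 0)
		return 2;
	if (waitpid(child, NULL, 0) != child)
		return 3;

	/* No children remain: now ECHILD is the correct answer. */
	if (poll_children() != NO_CHILDREN)
		return 4;

	return 0;
}
```

In these terms, the false negative is step 1 reporting NO_CHILDREN while a
child still exists, and the false positive is the last step reporting
HAVE_CHILDREN (or a blocking wait hanging) when nothing is left to wait for.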