From: Andrea Arcangeli

I debugged a problem with CLONE_THREAD under strace generating zombies
that cannot be reaped by init.

Basically what's going on is that release_task is never called on the
clones, and in turn the parent thread will remain a zombie forever
because thread_group_empty == 0 (it never notifies init). The group can
become empty only after release_task has been called on all the clones.

If the clone happens to be under strace by the time it exits, its state
will not be set to TASK_DEAD, and nobody will ever call wait4 on the
clone because the parent is being killed at the same time. But the
parent cannot go away until the clone goes away too.

I believe triggering this also takes a small race in strace, where it
has SIGCHLD disabled, but what I'm discussing here is still a kernel bug
generating zombie threads.

I think I could have fixed this with a stricter patch (adding an
exit_signal == -1 check just to cover that case), but I believe it makes
no sense to leave ptrace enabled on a clone that is being killed. It
happens to be safe without a thread group only because init will always
be able to call wait4->release_task on it, which calls ptrace_unlink
later in release_task. The same goes for the "leader" of the thread
group, which can likewise be detached from ptrace via release_task.

Signed-off-by: Andrew Morton
---

 25-akpm/kernel/exit.c |    7 +++++++
 1 files changed, 7 insertions(+)

diff -puN kernel/exit.c~zombie-with-clone_thread kernel/exit.c
--- 25/kernel/exit.c~zombie-with-clone_thread	2004-06-29 23:04:07.787147704 -0700
+++ 25-akpm/kernel/exit.c	2004-06-29 23:04:07.791147096 -0700
@@ -730,6 +730,13 @@ static void exit_notify(struct task_stru
 		do_notify_parent(tsk, SIGCHLD);
 	}
 
+	/*
+	 * To allow the group leader of a thread group to be released
+	 * we must really go away synchronously if exit_signal == -1.
+	 */
+	if (unlikely(tsk->ptrace) && tsk != tsk->group_leader)
+		__ptrace_unlink(tsk);
+
 	state = TASK_ZOMBIE;
 	if (tsk->exit_signal == -1 && tsk->ptrace == 0)
 		state = TASK_DEAD;
_
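
For illustration only, the state decision touched by the hunk above can be modelled in user space. This is a rough sketch, not kernel code: struct model_task, final_state() and final_state_patched() are hypothetical names, and clearing the ptrace field stands in for __ptrace_unlink(tsk), on the assumption that the real helper leaves tsk->ptrace at zero.

```c
/*
 * Toy user-space model of the final state choice in exit_notify().
 * All names here are hypothetical; only the two conditions are taken
 * from the patch.
 */
#include <assert.h>
#include <stddef.h>

enum model_state { MODEL_ZOMBIE, MODEL_DEAD };

struct model_task {
	int exit_signal;                 /* -1: nobody is expected to wait4() on us */
	unsigned long ptrace;            /* non-zero while a tracer is attached */
	struct model_task *group_leader; /* points to itself for the leader */
};

/* Unpatched rule: self-reap only if exit_signal == -1 and untraced. */
static enum model_state final_state(const struct model_task *t)
{
	if (t->exit_signal == -1 && t->ptrace == 0)
		return MODEL_DEAD;
	return MODEL_ZOMBIE;
}

/* Patched flow: detach the tracer from non-leader threads first. */
static enum model_state final_state_patched(struct model_task *t)
{
	assert(t->group_leader != NULL);
	if (t->ptrace && t != t->group_leader)
		t->ptrace = 0;	/* stand-in for __ptrace_unlink(tsk) */
	return final_state(t);
}
```

In this model a traced clone with exit_signal == -1 stays a zombie under final_state() but becomes dead under final_state_patched(), while a traced group leader is left alone in both cases.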