From: George Anzinger

This patch adds a notify_die() call to die_nmi() to announce that the system
is about to be taken down.  If a notifier handles the event and returns
NOTIFY_STOP, the system is given a new lease on life.

We also change the NMI watchdog to carry on if die_nmi() returns.

This gives debug code a chance to a) catch watchdog timeouts and b) possibly
allow the system to continue, recognizing that the timeout may be due to
debugger activity such as single stepping, which is usually done with the
"other" CPUs held.

Signed-off-by: George Anzinger
Cc: Keith Owens
Signed-off-by: Andrew Morton
---

 arch/i386/kernel/nmi.c   |    5 ++++-
 arch/i386/kernel/traps.c |    4 ++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff -puN arch/i386/kernel/nmi.c~x86-nmi-better-support-for-debuggers arch/i386/kernel/nmi.c
--- devel/arch/i386/kernel/nmi.c~x86-nmi-better-support-for-debuggers	2005-08-06 15:34:38.000000000 -0700
+++ devel-akpm/arch/i386/kernel/nmi.c	2005-08-06 15:34:38.000000000 -0700
@@ -501,8 +501,11 @@ void nmi_watchdog_tick (struct pt_regs *
 		 */
 		alert_counter[cpu]++;
 		if (alert_counter[cpu] == 5*nmi_hz)
+			/*
+			 * die_nmi will return ONLY if NOTIFY_STOP happens..
+			 */
 			die_nmi(regs, "NMI Watchdog detected LOCKUP");
-	} else {
+
 		last_irq_sums[cpu] = sum;
 		alert_counter[cpu] = 0;
 	}
diff -puN arch/i386/kernel/traps.c~x86-nmi-better-support-for-debuggers arch/i386/kernel/traps.c
--- devel/arch/i386/kernel/traps.c~x86-nmi-better-support-for-debuggers	2005-08-06 15:34:38.000000000 -0700
+++ devel-akpm/arch/i386/kernel/traps.c	2005-08-06 15:34:38.000000000 -0700
@@ -565,6 +565,10 @@ static DEFINE_SPINLOCK(nmi_print_lock);
 
 void die_nmi (struct pt_regs *regs, const char *msg)
 {
+	if (notify_die(DIE_NMIWATCHDOG, msg, regs, 0, 0, SIGINT) ==
+	    NOTIFY_STOP)
+		return;
+
 	spin_lock(&nmi_print_lock);
 	/*
 	* We are in trouble anyway, lets at least try _