From: Oleg Nesterov I think it is supposed that timers are serialized wrt to itself, but I can't find any documentation about it. If CPU_0 does mod_timer(jiffies+1) while the timer is currently running on CPU 1, it is quite possible that local interrupt on CPU_0 will start that timer before it finished on CPU_1. With this patch __mod_timer() locks old_base even if timer is not pending, like del_timer_sync() does. If the timer is still running, and old_base != new_base, __mod_timer() retries. Now we have: 1. the timer's handler can't run on different cpus at the same time. 2. the timer's base can't be changed during execution of the timer's handler. Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton --- 25-akpm/kernel/timer.c | 17 ++++++++++------- 1 files changed, 10 insertions(+), 7 deletions(-) diff -puN kernel/timer.c~timers-serialize-timers kernel/timer.c --- 25/kernel/timer.c~timers-serialize-timers 2005-03-19 14:19:54.000000000 -0800 +++ 25-akpm/kernel/timer.c 2005-03-19 14:19:54.000000000 -0800 @@ -193,7 +193,7 @@ int __mod_timer(struct timer_list *timer spin_lock_irqsave(&timer->lock, flags); new_base = &__get_cpu_var(tvec_bases); repeat: - old_base = __get_base(timer); + old_base = timer_base(timer); /* * Prevent deadlocks via ordering by old_base < new_base. @@ -208,29 +208,32 @@ repeat: } /* * The timer base might have been cancelled while we were - * trying to take the lock(s): + * trying to take the lock(s), or can still be running on + * old_base's CPU. */ - if (__get_base(timer) != old_base) { + if (timer_base(timer) != old_base + || old_base->running_timer == timer) { spin_unlock(&new_base->lock); spin_unlock(&old_base->lock); goto repeat; } } else { spin_lock(&new_base->lock); - if (__get_base(timer) != old_base) { + if (timer_base(timer) != old_base) { spin_unlock(&new_base->lock); goto repeat; } } /* - * Delete the previous timeout (if there was any), and install - * the new one: + * Delete the previous timeout (if there was any). + * We hold timer->lock, no need to check old_base != 0. */ - if (old_base) { + if (timer_pending(timer)) { list_del(&timer->entry); ret = 1; } + timer->expires = expires; internal_add_timer(new_base, timer); __set_base(timer, new_base, 1); _