From: Steven Rostedt

According to the comments in include/linux/sched.h

/*
 * Priority of a process goes from 0..MAX_PRIO-1, valid RT
 * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL tasks are
 * in the range MAX_RT_PRIO..MAX_PRIO-1. Priority values
 * are inverted: lower p->prio value means higher priority.
 *
 * The MAX_USER_RT_PRIO value allows the actual maximum
 * RT priority to be separate from the value exported to
 * user-space. This allows kernel threads to set their
 * priority to a value higher than any user task. Note:
 * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
 */

This makes it look like the priority goes as follows:

	prio:  0 .. MAX_RT_PRIO .. MAX_USER_RT_PRIO .. MAX_PRIO

where 0 is of highest priority, but in reality we have:

	prio:  0 .. MAX_USER_RT_PRIO .. MAX_RT_PRIO .. MAX_PRIO

The comments say that MAX_RT_PRIO must not be smaller than
MAX_USER_RT_PRIO, but if it is bigger (thinking bigger means greater
than) then the system will crash on an SMP machine.

Here's how it works.  The migration_thread sets the priority of its
thread to MAX_RT_PRIO-1 via:

	__setscheduler(p, SCHED_FIFO, MAX_RT_PRIO-1);

Now looking at __setscheduler:

static void __setscheduler(struct task_struct *p, int policy, int prio)
{
	BUG_ON(p->array);
	p->policy = policy;
	p->rt_priority = prio;
	if (policy != SCHED_NORMAL)
		p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
	else
		p->prio = p->static_prio;
}

If we have MAX_USER_RT_PRIO = 99 and MAX_RT_PRIO = 100, then
p->rt_priority is MAX_RT_PRIO-1 = 99 and we would get

	p->prio = (99-1) - (100-1) = -1;

This would be very bad when it comes time to schedule.

Not to mention that kstop_machine uses MAX_RT_PRIO and then calls
sys_sched_setscheduler, which would fail if MAX_RT_PRIO >
MAX_USER_RT_PRIO.

Below is a patch that makes MAX_RT_PRIO work if it is greater than
MAX_USER_RT_PRIO on an SMP machine.  The p->mm test (kernel threads have
no mm) is there to allow kstop_machine and any other kernel threads to
work.

I tested the patch on an SMP machine where MAX_RT_PRIO = 100 and
MAX_USER_RT_PRIO = 99.
Without the patch, the system crashes with a reboot.

Funny, back in July 2002, this was noticed by Anton Wilson, and his
report was just lost in the noise!

http://seclists.org/lists/linux-kernel/2002/Jul/1695.html

Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
---

 arch/ia64/sn/kernel/xpc_main.c |    2 +-
 kernel/sched.c                 |    5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff -puN arch/ia64/sn/kernel/xpc_main.c~max_user_rt_prio-and-max_rt_prio-are-wrong arch/ia64/sn/kernel/xpc_main.c
--- 25/arch/ia64/sn/kernel/xpc_main.c~max_user_rt_prio-and-max_rt_prio-are-wrong	2005-06-25 01:17:13.000000000 -0700
+++ 25-akpm/arch/ia64/sn/kernel/xpc_main.c	2005-06-25 01:17:13.000000000 -0700
@@ -420,7 +420,7 @@ xpc_activating(void *__partid)
 	partid_t partid = (u64) __partid;
 	struct xpc_partition *part = &xpc_partitions[partid];
 	unsigned long irq_flags;
-	struct sched_param param = { sched_priority: MAX_USER_RT_PRIO - 1 };
+	struct sched_param param = { sched_priority: MAX_RT_PRIO - 1 };
 	int ret;
 
 
diff -puN kernel/sched.c~max_user_rt_prio-and-max_rt_prio-are-wrong kernel/sched.c
--- 25/kernel/sched.c~max_user_rt_prio-and-max_rt_prio-are-wrong	2005-06-25 01:17:13.000000000 -0700
+++ 25-akpm/kernel/sched.c	2005-06-25 01:17:13.000000000 -0700
@@ -3527,7 +3527,7 @@ static void __setscheduler(struct task_s
 	p->policy = policy;
 	p->rt_priority = prio;
 	if (policy != SCHED_NORMAL)
-		p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
+		p->prio = MAX_RT_PRIO-1 - p->rt_priority;
 	else
 		p->prio = p->static_prio;
 }
@@ -3559,7 +3559,8 @@ recheck:
 	 * 1..MAX_USER_RT_PRIO-1, valid priority for SCHED_NORMAL is 0.
 	 */
 	if (param->sched_priority < 0 ||
-	    param->sched_priority > MAX_USER_RT_PRIO-1)
+	    (p->mm && param->sched_priority > MAX_USER_RT_PRIO-1) ||
+	    (!p->mm && param->sched_priority > MAX_RT_PRIO-1))
 		return -EINVAL;
 	if ((policy == SCHED_NORMAL) != (param->sched_priority == 0))
 		return -EINVAL;
_
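
For anyone who wants to check the arithmetic from the description
outside the kernel, here is a minimal userspace sketch.  It is not kernel
code: the helper names prio_before_patch(), prio_after_patch() and
sched_priority_valid() are made up for illustration, has_mm stands in for
"p->mm != NULL", and the two constants are simply the values from the
test configuration mentioned above.  The validity helper models only the
range test touched by the last hunk, not the policy checks that follow
it.

```c
#include <assert.h>
#include <stdbool.h>

/* Values from the test configuration in the description (assumed). */
#define MAX_USER_RT_PRIO 99
#define MAX_RT_PRIO      100

/* Hypothetical helper: p->prio for an RT task as computed before the patch. */
static int prio_before_patch(int rt_priority)
{
	assert(rt_priority >= 0);
	return MAX_USER_RT_PRIO - 1 - rt_priority;
}

/* Hypothetical helper: p->prio for an RT task as computed after the patch. */
static int prio_after_patch(int rt_priority)
{
	assert(rt_priority >= 0);
	return MAX_RT_PRIO - 1 - rt_priority;
}

/*
 * Hypothetical helper mirroring the patched range test: tasks with an mm
 * are limited to MAX_USER_RT_PRIO-1, tasks without one (kernel threads)
 * may go up to MAX_RT_PRIO-1.
 */
static bool sched_priority_valid(bool has_mm, int sched_priority)
{
	if (sched_priority < 0)
		return false;
	if (has_mm)
		return sched_priority <= MAX_USER_RT_PRIO - 1;
	return sched_priority <= MAX_RT_PRIO - 1;
}
```

With these values, prio_before_patch(MAX_RT_PRIO - 1) gives the negative
prio described above, while prio_after_patch(MAX_RT_PRIO - 1) gives 0,
the highest priority.  A request for priority 99 passes the range test
only when has_mm is false.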