diff options
author | Robert Love <rml@tech9.net> | 2002-02-09 20:59:32 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@home.transmeta.com> | 2002-02-09 20:59:32 -0800 |
commit | 987562211dd19bb1ee87cfc740da2bb535b20292 (patch) | |
tree | a9d5e334917da293d61c528f3eb07bb2abbc1f55 | |
parent | 01a5072396a453bf12e0286af64135fd34040b2e (diff) | |
download | history-987562211dd19bb1ee87cfc740da2bb535b20292.tar.gz |
[PATCH] preemptible kernel documentation, etcv2.5.4-pre6
Linus,
The attached patch adds a Documentation/preempt-locking.txt file which
describes the new locking rules wrt preemptive kernels (ie, watch
per-CPU data, etc). It also updates a CREDITS entry and adds some
comments.
Patch is against 2.5.4-pre5, please apply.
Robert Love
-rw-r--r-- | CREDITS | 2 | ||||
-rw-r--r-- | Documentation/preempt-locking.txt | 104 | ||||
-rw-r--r-- | MAINTAINERS | 8 | ||||
-rw-r--r-- | mm/slab.c | 3 |
4 files changed, 115 insertions, 2 deletions
@@ -990,8 +990,8 @@ S: Brazil N: Nigel Gamble E: nigel@nrg.org -E: nigel@sgi.com D: Interrupt-driven printer driver +D: Preemptible kernel S: 120 Alley Way S: Mountain View, California 94040 S: USA diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt new file mode 100644 index 0000000000000..08e2b4719a237 --- /dev/null +++ b/Documentation/preempt-locking.txt @@ -0,0 +1,104 @@ + Proper Locking Under a Preemptible Kernel: + Keeping Kernel Code Preempt-Safe + Robert Love <rml@tech9.net> + Last Updated: 22 Jan 2002 + + +INTRODUCTION + + +A preemptible kernel creates new locking issues. The issues are the same as +those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible +kernel model leverages existing SMP locking mechanisms. Thus, the kernel +requires explicit additional locking for very few additional situations. + +This document is for all kernel hackers. Developing code in the kernel +requires protecting these situations. + + +RULE #1: Per-CPU data structures need explicit protection + + +Two similar problems arise. An example code snippet: + + struct this_needs_locking tux[NR_CPUS]; + tux[smp_processor_id()] = some_value; + /* task is preempted here... */ + something = tux[smp_processor_id()]; + +First, since the data is per-CPU, it may not have explicit SMP locking, but +require it otherwise. Second, when a preempted task is finally rescheduled, +the previous value of smp_processor_id may not equal the current. You must +protect these situations by disabling preemption around them. + + +RULE #2: CPU state must be protected. + + +Under preemption, the state of the CPU must be protected. This is arch- +dependent, but includes CPU structures and state not preserved over a context +switch. For example, on x86, entering and exiting FPU mode is now a critical +section that must occur while preemption is disabled. Think what would happen +if the kernel is executing a floating-point instruction and is then preempted. +Remember, the kernel does not save FPU state except for user tasks. Therefore, +upon preemption, the FPU registers will be sold to the lowest bidder. Thus, +preemption must be disabled around such regions. + +Note, some FPU functions are already explicitly preempt safe. For example, +kernel_fpu_begin and kernel_fpu_end will disable and enable preemption. +However, math_state_restore must be called with preemption disabled. + + +RULE #3: Lock acquire and release must be performed by same task + + +A lock acquired in one task must be released by the same task. This +means you can't do oddball things like acquire a lock and go off to +play while another task releases it. If you want to do something +like this, acquire and release the task in the same code path and +have the caller wait on an event by the other task. + + +SOLUTION + + +Data protection under preemption is achieved by disabling preemption for the +duration of the critical region. + +preempt_enable() decrement the preempt counter +preempt_disable() increment the preempt counter +preempt_enable_no_resched() decrement, but do not immediately preempt +preempt_get_count() return the preempt counter + +The functions are nestable. In other words, you can call preempt_disable +n-times in a code path, and preemption will not be reenabled until the n-th +call to preempt_enable. The preempt statements define to nothing if +preemption is not enabled. + +Note that you do not need to explicitly prevent preemption if you are holding +any locks or interrupts are disabled, since preemption is implicitly disabled +in those cases. + +Example: + + cpucache_t *cc; /* this is per-CPU */ + preempt_disable(); + cc = cc_data(searchp); + if (cc && cc->avail) { + __free_block(searchp, cc_entry(cc), cc->avail); + cc->avail = 0; + } + preempt_enable(); + return 0; + +Notice how the preemption statements must encompass every reference of the +critical variables. Another example: + + int buf[NR_CPUS]; + set_cpu_val(buf); + if (buf[smp_processor_id()] == -1) printf(KERN_INFO "wee!\n"); + spin_lock(&buf_lock); + /* ... */ + +This code is not preempt-safe, but see how easily we can fix it by simply +moving the spin_lock up two lines. diff --git a/MAINTAINERS b/MAINTAINERS index aa204450d0677..8c1c254ef4268 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1239,6 +1239,14 @@ P: Michal Ostrowski M: mostrows@styx.uwaterloo.ca S: Maintained +PREEMPTIBLE KERNEL +P: Robert Love +M: rml@tech9.net +L: linux-kernel@vger.kernel.org +L: kpreempt-tech@lists.sourceforge.net +W: ftp://ftp.kernel.org/pub/linux/kernel/people/rml/preempt-kernel +S: Supported + PROMISE DC4030 CACHING DISK CONTROLLER DRIVER P: Peter Denison M: promise@pnd-pc.demon.co.uk diff --git a/mm/slab.c b/mm/slab.c index 26c7eb5aac015..dfa8401d556da 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -49,7 +49,8 @@ * constructors and destructors are called without any locking. * Several members in kmem_cache_t and slab_t never change, they * are accessed without any locking. - * The per-cpu arrays are never accessed from the wrong cpu, no locking. + * The per-cpu arrays are never accessed from the wrong cpu, no locking, + * and local interrupts are disabled so slab code is preempt-safe. * The non-constant members are protected with a per-cache irq spinlock. * * Further notes from the original documentation: |