author     Robert Love <rml@tech9.net>                    2002-02-09 20:59:32 -0800
committer  Linus Torvalds <torvalds@home.transmeta.com>   2002-02-09 20:59:32 -0800
commit     987562211dd19bb1ee87cfc740da2bb535b20292 (patch)
tree       a9d5e334917da293d61c528f3eb07bb2abbc1f55
parent     01a5072396a453bf12e0286af64135fd34040b2e (diff)
[PATCH] preemptible kernel documentation, etc (tag: v2.5.4-pre6)
Linus,

The attached patch adds a Documentation/preempt-locking.txt file which
describes the new locking rules wrt preemptive kernels (ie, watch per-CPU
data, etc). It also updates a CREDITS entry and adds some comments. Patch
is against 2.5.4-pre5, please apply.

	Robert Love
-rw-r--r--  CREDITS                            |   2
-rw-r--r--  Documentation/preempt-locking.txt  | 104
-rw-r--r--  MAINTAINERS                        |   8
-rw-r--r--  mm/slab.c                          |   3
4 files changed, 115 insertions(+), 2 deletions(-)
diff --git a/CREDITS b/CREDITS
index b36c8b68f2cb2..bc44a4ccb1ee5 100644
--- a/CREDITS
+++ b/CREDITS
@@ -990,8 +990,8 @@ S: Brazil
N: Nigel Gamble
E: nigel@nrg.org
-E: nigel@sgi.com
D: Interrupt-driven printer driver
+D: Preemptible kernel
S: 120 Alley Way
S: Mountain View, California 94040
S: USA
diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt
new file mode 100644
index 0000000000000..08e2b4719a237
--- /dev/null
+++ b/Documentation/preempt-locking.txt
@@ -0,0 +1,104 @@
+ Proper Locking Under a Preemptible Kernel:
+ Keeping Kernel Code Preempt-Safe
+ Robert Love <rml@tech9.net>
+ Last Updated: 22 Jan 2002
+
+
+INTRODUCTION
+
+
+A preemptible kernel creates new locking issues. The issues are the same as
+those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible
+kernel model leverages existing SMP locking mechanisms. Thus, the kernel
+requires explicit additional locking in only a few new situations.
+
+This document is for all kernel hackers: developing kernel code requires
+knowing how to protect against these situations.
+
+
+RULE #1: Per-CPU data structures need explicit protection
+
+
+Two similar problems arise. An example code snippet:
+
+ struct this_needs_locking tux[NR_CPUS];
+ tux[smp_processor_id()] = some_value;
+ /* task is preempted here... */
+ something = tux[smp_processor_id()];
+
+First, since the data is per-CPU, it may not be protected by explicit SMP
+locking, but would otherwise require it. Second, when a preempted task is
+finally rescheduled, the previous value of smp_processor_id may not equal
+the current one. You must protect these situations by disabling preemption
+around them.
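+
+One way to make the snippet above preempt-safe is to disable preemption
+around both accesses. A minimal sketch:
+
+ preempt_disable();
+ tux[smp_processor_id()] = some_value;
+ /* preemption is off: the task cannot migrate to another CPU here */
+ something = tux[smp_processor_id()];
+ preempt_enable();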
+
+
+RULE #2: CPU state must be protected
+
+
+Under preemption, the state of the CPU must be protected. This is arch-
+dependent, but includes CPU structures and state not preserved over a context
+switch. For example, on x86, entering and exiting FPU mode is now a critical
+section that must occur while preemption is disabled. Think what would happen
+if the kernel is executing a floating-point instruction and is then preempted.
+Remember, the kernel does not save FPU state except for user tasks. Therefore,
+upon preemption, the FPU registers will be sold to the lowest bidder. Thus,
+preemption must be disabled around such regions.
+
+Note, some FPU functions are already explicitly preempt-safe. For example,
+kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
+However, math_state_restore must be called with preemption disabled.
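+
+For example, a preempt-safe FPU section might look like the sketch below
+(the actual FPU work is elided):
+
+ kernel_fpu_begin();   /* disables preemption, per the above */
+ /* ... use FPU registers ... */
+ kernel_fpu_end();     /* reenables preemption */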
+
+
+RULE #3: Lock acquire and release must be performed by the same task
+
+
+A lock acquired in one task must be released by the same task. This
+means you can't do oddball things like acquire a lock and go off to
+play while another task releases it. If you want to do something
+like this, acquire and release the lock in the same code path and
+have the caller wait on an event signaled by the other task.
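+
+A sketch of this structure, using a wait queue as the event (the names
+the_lock, the_waitq, and done are illustrative only):
+
+ /* task A: acquire and release in the same code path */
+ spin_lock(&the_lock);
+ /* ... critical work ... */
+ spin_unlock(&the_lock);         /* released by the same task */
+ wait_event(the_waitq, done);    /* then wait for task B's event */
+
+ /* task B: signals the event; it never touches the_lock */
+ done = 1;
+ wake_up(&the_waitq);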
+
+
+SOLUTION
+
+
+Data protection under preemption is achieved by disabling preemption for the
+duration of the critical region.
+
+preempt_enable()             decrement the preempt counter
+preempt_disable()            increment the preempt counter
+preempt_enable_no_resched()  decrement, but do not immediately preempt
+preempt_get_count()          return the preempt counter
+
+The functions are nestable. In other words, you can call preempt_disable
+n times in a code path, and preemption will not be reenabled until the n-th
+call to preempt_enable. The preempt statements compile away to nothing if
+preemption is not enabled.
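+
+For example, preemption is not possible again until the outermost enable
+(a trivial sketch of the counter's behavior):
+
+ preempt_disable();    /* counter: 1 */
+ preempt_disable();    /* counter: 2 */
+ /* ... */
+ preempt_enable();     /* counter: 1, preemption still disabled */
+ preempt_enable();     /* counter: 0, preemption possible again */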
+
+Note that you do not need to explicitly prevent preemption if you are holding
+any locks or interrupts are disabled, since preemption is implicitly disabled
+in those cases.
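+
+For instance, a sketch of an implicitly protected region (no explicit
+preempt_disable is needed while the spinlock is held; some_lock is an
+illustrative name):
+
+ spin_lock(&some_lock);    /* preemption implicitly disabled */
+ /* ... touch per-CPU data safely ... */
+ spin_unlock(&some_lock);  /* preemption possible again */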
+
+Example:
+
+ cpucache_t *cc; /* this is per-CPU */
+ preempt_disable();
+ cc = cc_data(searchp);
+ if (cc && cc->avail) {
+ __free_block(searchp, cc_entry(cc), cc->avail);
+ cc->avail = 0;
+ }
+ preempt_enable();
+ return 0;
+
+Notice how the preemption statements must encompass every reference to the
+critical variables. Another example:
+
+ int buf[NR_CPUS];
+ set_cpu_val(buf);
+ if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
+ spin_lock(&buf_lock);
+ /* ... */
+
+This code is not preempt-safe, but see how easily we can fix it by simply
+moving the spin_lock up two lines.
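+
+For reference, the preempt-safe version of the snippet simply takes the
+lock before the per-CPU data is first touched:
+
+ int buf[NR_CPUS];
+ spin_lock(&buf_lock);
+ set_cpu_val(buf);
+ if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
+ /* ... */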
diff --git a/MAINTAINERS b/MAINTAINERS
index aa204450d0677..8c1c254ef4268 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1239,6 +1239,14 @@ P: Michal Ostrowski
M: mostrows@styx.uwaterloo.ca
S: Maintained
+PREEMPTIBLE KERNEL
+P: Robert Love
+M: rml@tech9.net
+L: linux-kernel@vger.kernel.org
+L: kpreempt-tech@lists.sourceforge.net
+W: ftp://ftp.kernel.org/pub/linux/kernel/people/rml/preempt-kernel
+S: Supported
+
PROMISE DC4030 CACHING DISK CONTROLLER DRIVER
P: Peter Denison
M: promise@pnd-pc.demon.co.uk
diff --git a/mm/slab.c b/mm/slab.c
index 26c7eb5aac015..dfa8401d556da 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -49,7 +49,8 @@
* constructors and destructors are called without any locking.
* Several members in kmem_cache_t and slab_t never change, they
* are accessed without any locking.
- * The per-cpu arrays are never accessed from the wrong cpu, no locking.
+ * The per-cpu arrays are never accessed from the wrong cpu, no locking,
+ * and local interrupts are disabled so slab code is preempt-safe.
* The non-constant members are protected with a per-cache irq spinlock.
*
* Further notes from the original documentation: