Per-CPU Data

Another fairly widely used technique for avoiding locking is to duplicate information for each CPU. For example, if you wanted to keep a count of a common condition, you could use a spin lock and a single counter. Nice and simple.

If that was too slow (it's usually not, but if you've got a really big machine to test on and can show that it is), you could instead use a counter for each CPU, so that none of them needs an exclusive lock. See DEFINE_PER_CPU(), get_cpu_var() and put_cpu_var() (include/linux/percpu.h). get_cpu_var() disables preemption while you touch this CPU's copy, and put_cpu_var() re-enables it.
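
To make the layout concrete, here is a minimal userspace sketch of the idea, not kernel code. It assumes each thread of execution owns exactly one slot and only ever updates that slot; in the kernel, DEFINE_PER_CPU() and get_cpu_var() take care of that for you. The names NR_SLOTS, CACHE_LINE, struct slot_counter, hits_inc() and hits_sum() are made up for this example, and the 64-byte cache line is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS   4    /* hypothetical stand-in for the number of CPUs */
#define CACHE_LINE 64   /* assumed cache line size, in bytes */

/* One counter per slot, padded out to CACHE_LINE bytes per slot. */
struct slot_counter {
    long count;
    char pad[CACHE_LINE - sizeof(long)];
};

static struct slot_counter hits[NR_SLOTS];

/* Update only the caller's own slot; no lock is taken. */
static void hits_inc(size_t slot)
{
    assert(slot < NR_SLOTS);
    hits[slot].count++;
}

/* Add up every slot. Exact only while no slot is being updated. */
static long hits_sum(void)
{
    long total = 0;

    for (size_t i = 0; i < NR_SLOTS; i++)
        total += hits[i].count;
    return total;
}
```

The padding is there so that two slots do not end up in the same cache line, which would bring back much of the contention the per-slot layout is meant to avoid.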

Of particular use for simple per-CPU counters are the local_t type and cpu_local_inc() and related functions, which on some architectures are more efficient than the plain C equivalent (include/asm/local.h).

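Note that there is no simple, reliable way of getting an exact value of such a counter without introducing more locks: adding up the per-CPU values takes time, and any of them can change while you do it. For some uses, such as statistics, this is not a problem.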
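
To see why the sum is inexact, the following userspace sketch replays one unlucky interleaving by hand in a single thread. It is an illustration only: parts[], true_total() and racy_read() are hypothetical names, and the "CPUs" are simulated by ordinary statements placed between the reader's two loads.

```c
#include <assert.h>

#define NR_PARTS 2   /* hypothetical: two per-CPU slots */

static long parts[NR_PARTS] = { 5, 5 };   /* true total starts at 10 */

/* The real total right now (meaningful here only because nothing runs concurrently). */
static long true_total(void)
{
    return parts[0] + parts[1];
}

/*
 * The reader samples slot 0, two updates land on other "CPUs",
 * then the reader samples slot 1.
 */
static long racy_read(void)
{
    long seen = parts[0];   /* reader sees slot 0 == 5 */

    parts[0]--;             /* "CPU 0" decrements: true total is now 9 */
    parts[1]++;             /* "CPU 1" increments: true total is 10 again */

    seen += parts[1];       /* reader sees slot 1 == 6 */
    return seen;            /* 11, a value the total never had */
}
```

The true total was only ever 10 or 9, yet the reader comes away with 11. If the counter only ever goes up, the reader's answer at least lies between the total when it started and the total when it finished, which is often good enough.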