8.5. Per-CPU Variables
Per-CPU variables are an interesting 2.6 kernel feature. When you
create a per-CPU variable, each processor on the system gets its own
copy of that variable. This may seem like a strange thing to want to
do, but it has its advantages. Access to per-CPU variables requires
(almost) no locking, because each processor works with its own copy.
Per-CPU variables can also remain in their respective
processors' caches, which leads to significantly
better performance for frequently updated quantities.
A good example of per-CPU variable use can be found in the networking
subsystem. The kernel maintains no end of counters tracking how many
of each type of packet were received; these counters can be updated
thousands of times per second. Rather than deal with the caching and
locking issues, the networking developers put the statistics counters
into per-CPU variables. Updates are now lockless and fast. On the
rare occasion that user space requests to see the values of the
counters, it is a simple matter to add up each
processor's version and return the total.
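The counting scheme just described can be modeled in ordinary user-space C. This is only a sketch of the idea, not the networking code or the kernel's per-CPU machinery: the names model_packets, model_count_packet, and model_total_packets, the CPU count of 4, and the 64-byte cache-line size are all assumptions made for illustration. Each processor increments its own padded slot, and a reader adds up the slots on demand.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4        /* hypothetical CPU count for this model */
#define MODEL_CACHE_LINE 64    /* assumed cache-line size in bytes */

/*
 * One counter slot per CPU. The padding places consecutive counters
 * MODEL_CACHE_LINE bytes apart, so two counters never fall within the
 * same (assumed) cache line.
 */
struct model_counter_slot {
    unsigned long count;
    char pad[MODEL_CACHE_LINE - sizeof(unsigned long)];
};

static struct model_counter_slot model_packets[MODEL_NR_CPUS];

/* Fast path: code running on `cpu` touches only its own slot. */
void model_count_packet(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    model_packets[cpu].count++;
}

/* Slow path: add up every processor's slot to report a total. */
unsigned long model_total_packets(void)
{
    unsigned long total = 0;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        total += model_packets[cpu].count;
    return total;
}
```

In this model the caller passes the CPU number explicitly; in the kernel, the per-CPU macros select the current processor's copy for you.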
The declarations for per-CPU variables can be found in
<linux/percpu.h>. To create a per-CPU
variable at compile time, use this macro:
DEFINE_PER_CPU(type, name);
If the variable (to be called name) is an array,
include the dimension information with the type.
Thus, a per-CPU array of three integers would be created with:
DEFINE_PER_CPU(int[3], my_percpu_array);
Per-CPU variables can be manipulated without explicit
locking—almost. Remember that the 2.6 kernel is preemptible; it
would not do for a processor to be preempted in the middle of a
critical section that modifies a per-CPU variable. It also would not
be good if your process were to be moved to another processor in the
middle of a per-CPU variable access. For this reason, you must
explicitly use the get_cpu_var macro to access
the current processor's copy of a given variable,
and call put_cpu_var when you are done. The call
to get_cpu_var returns an lvalue for the current
processor's version of the variable and disables
preemption. Since an lvalue is returned, it can be assigned to or
operated on directly. For example, one counter in the networking code
is incremented with these two statements:
get_cpu_var(sockets_in_use)++;
put_cpu_var(sockets_in_use);
You can access another processor's copy of the
variable with:
per_cpu(variable, int cpu_id);
If you write code that involves processors reaching into each
other's per-CPU variables, you, of course, have to
implement a locking scheme that makes that access safe.
Dynamically allocated per-CPU variables are also possible. These
variables can be allocated with:
void *alloc_percpu(type);
void *__alloc_percpu(size_t size, size_t align);
In most cases, alloc_percpu does the job; you
can call __alloc_percpu in cases where a
particular alignment is required. In either case, a per-CPU variable
can be returned to the system with free_percpu.
Access to a dynamically allocated per-CPU variable is done via
per_cpu_ptr:
per_cpu_ptr(void *per_cpu_var, int cpu_id);
This macro returns a pointer to the version of
per_cpu_var corresponding to the given
cpu_id. If you are simply reading another
CPU's version of the variable, you can dereference
that pointer and be done with it. If, however, you are manipulating
the current processor's version, you probably need
to ensure that you cannot be moved out of that processor first. If
the entirety of your access to the per-CPU variable happens with a
spinlock held, all is well. Usually, however, you need to use
get_cpu to block preemption while working with
the variable. Thus, code using dynamic per-CPU variables tends to
look like this:
int cpu;
cpu = get_cpu( );
ptr = per_cpu_ptr(per_cpu_var, cpu);
/* work with ptr */
put_cpu( );
When using compile-time per-CPU variables, the
get_cpu_var and put_cpu_var
macros take care of these details. Dynamic per-CPU variables require
more explicit protection.
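To make the notion of "one copy per processor, selected by CPU number" more concrete, here is a user-space model of a dynamic per-CPU allocation. It is a sketch under stated assumptions and does not reflect how the kernel actually lays out per-CPU memory: struct model_percpu, model_alloc_percpu, model_per_cpu_ptr, and model_free_percpu are hypothetical names, and the CPU count is fixed at 4. The model also says nothing about preemption, which in the kernel you still have to handle with get_cpu and put_cpu.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_NR_CPUS 4    /* hypothetical CPU count for this model */

/* Handle for one modeled per-CPU allocation. */
struct model_percpu {
    unsigned char *base;   /* start of the block holding every copy */
    size_t stride;         /* bytes between consecutive copies */
};

/*
 * Allocate MODEL_NR_CPUS zeroed copies of an object of `size` bytes.
 * `align` must be a power of two that aligned_alloc() accepts.
 * Returns 0 on success, -1 if the allocation fails.
 */
int model_alloc_percpu(struct model_percpu *pc, size_t size, size_t align)
{
    assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

    /* Round the object size up so every copy starts on an aligned address. */
    pc->stride = (size + align - 1) & ~(align - 1);
    pc->base = aligned_alloc(align, pc->stride * MODEL_NR_CPUS);
    if (!pc->base)
        return -1;
    memset(pc->base, 0, pc->stride * MODEL_NR_CPUS);
    return 0;
}

/* Return a pointer to the copy belonging to `cpu`. */
void *model_per_cpu_ptr(const struct model_percpu *pc, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return pc->base + pc->stride * (size_t)cpu;
}

/* Release every copy at once. */
void model_free_percpu(struct model_percpu *pc)
{
    free(pc->base);
    pc->base = NULL;
}
```

A caller would allocate once, fetch the pointer for a given CPU number, work through that pointer, and eventually free the whole allocation, which mirrors the shape of the alloc_percpu, per_cpu_ptr, and free_percpu sequence described above.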
Per-CPU variables can be exported to modules, but you must use a
special version of the macros:
EXPORT_PER_CPU_SYMBOL(per_cpu_var);
EXPORT_PER_CPU_SYMBOL_GPL(per_cpu_var);
To access such a variable within a module, declare it with:
DECLARE_PER_CPU(type, name);
The use of DECLARE_PER_CPU (instead of
DEFINE_PER_CPU) tells the compiler that an
external reference is being made.
If you want to use per-CPU variables to create a simple integer
counter, take a look at the canned implementation in
<linux/percpu_counter.h>. Finally, note
that some architectures have a limited amount of address space
available for per-CPU variables. If you create per-CPU variables in your
code, you should try to keep them small.