The Linux Kernel API¶
List Management Functions¶
-
void INIT_LIST_HEAD(struct list_head *list)¶
Initialize a list_head structure
Parameters
struct list_head *list
list_head structure to be initialized.
Description
Initializes the list_head to point to itself. If it is a list header, the result is an empty list.
-
void list_add(struct list_head *new, struct list_head *head)¶
add a new entry
Parameters
struct list_head *new
new entry to be added
struct list_head *head
list head to add it after
Description
Insert a new entry after the specified head. This is good for implementing stacks.
-
void list_add_tail(struct list_head *new, struct list_head *head)¶
add a new entry
Parameters
struct list_head *new
new entry to be added
struct list_head *head
list head to add it before
Description
Insert a new entry before the specified head. This is useful for implementing queues.
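The stack and queue behaviour of list_add() and list_add_tail() can be seen in a small userspace sketch. The struct list_head and helper definitions below are simplified stand-ins written for this illustration (the kernel's own versions live in its headers and include extra checks); struct demo_item and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Link new_entry between two entries that are known to be adjacent. */
static void demo_insert(struct list_head *new_entry,
			struct list_head *prev, struct list_head *next)
{
	next->prev = new_entry;
	new_entry->next = next;
	new_entry->prev = prev;
	prev->next = new_entry;
}

/* Insert right after head: the newest entry is found first (stack). */
static void list_add(struct list_head *new_entry, struct list_head *head)
{
	demo_insert(new_entry, head, head->next);
}

/* Insert right before head: the oldest entry is found first (queue). */
static void list_add_tail(struct list_head *new_entry, struct list_head *head)
{
	demo_insert(new_entry, head->prev, head);
}

/* Hypothetical payload type used only in this illustration. */
struct demo_item {
	int value;
	struct list_head node;
};

/* Add values 1, 2, 3 and return the value of the first entry after head. */
int demo_first_value(int use_tail)
{
	struct list_head head;
	struct demo_item items[3];
	struct demo_item *first;

	INIT_LIST_HEAD(&head);
	for (int i = 0; i < 3; i++) {
		items[i].value = i + 1;
		if (use_tail)
			list_add_tail(&items[i].node, &head);
		else
			list_add(&items[i].node, &head);
	}
	first = (struct demo_item *)
		((char *)head.next - offsetof(struct demo_item, node));
	return first->value;
}
```

With list_add() the first entry is the last one inserted; with list_add_tail() it is the first one inserted.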
-
void list_del(struct list_head *entry)¶
deletes entry from list.
Parameters
struct list_head *entry
the element to delete from the list.
Note
list_empty() on entry does not return true after this; the entry is in an undefined state.
-
void list_replace(struct list_head *old, struct list_head *new)¶
replace old entry by new one
Parameters
struct list_head *old
the element to be replaced
struct list_head *new
the new element to insert
Description
If old was empty, it will be overwritten.
-
void list_replace_init(struct list_head *old, struct list_head *new)¶
replace old entry by new one and initialize the old one
Parameters
struct list_head *old
the element to be replaced
struct list_head *new
the new element to insert
Description
If old was empty, it will be overwritten.
-
void list_swap(struct list_head *entry1, struct list_head *entry2)¶
replace entry1 with entry2 and re-add entry1 at entry2's position
Parameters
struct list_head *entry1
the location to place entry2
struct list_head *entry2
the location to place entry1
-
void list_del_init(struct list_head *entry)¶
deletes entry from list and reinitializes it.
Parameters
struct list_head *entry
the element to delete from the list.
-
void list_move(struct list_head *list, struct list_head *head)¶
delete from one list and add as another's head
Parameters
struct list_head *list
the entry to move
struct list_head *head
the head that will precede our entry
-
void list_move_tail(struct list_head *list, struct list_head *head)¶
delete from one list and add as another's tail
Parameters
struct list_head *list
the entry to move
struct list_head *head
the head that will follow our entry
-
void list_bulk_move_tail(struct list_head *head, struct list_head *first, struct list_head *last)¶
move a subsection of a list to its tail
Parameters
struct list_head *head
the head that will follow our entry
struct list_head *first
first entry to move
struct list_head *last
last entry to move, can be the same as first
Description
Move all entries between first and including last before head. All three entries must belong to the same linked list.
-
int list_is_first(const struct list_head *list, const struct list_head *head)¶
tests whether list is the first entry in list head
Parameters
const struct list_head *list
the entry to test
const struct list_head *head
the head of the list
-
int list_is_last(const struct list_head *list, const struct list_head *head)¶
tests whether list is the last entry in list head
Parameters
const struct list_head *list
the entry to test
const struct list_head *head
the head of the list
-
int list_is_head(const struct list_head *list, const struct list_head *head)¶
tests whether list is the list head
Parameters
const struct list_head *list
the entry to test
const struct list_head *head
the head of the list
-
int list_empty(const struct list_head *head)¶
tests whether a list is empty
Parameters
const struct list_head *head
the list to test.
-
void list_del_init_careful(struct list_head *entry)¶
deletes entry from list and reinitializes it.
Parameters
struct list_head *entry
the element to delete from the list.
Description
This is the same as list_del_init(), except designed to be used together with list_empty_careful() in a way to guarantee ordering of other memory operations.
Any memory operations done before a list_del_init_careful() are guaranteed to be visible after a list_empty_careful() test.
-
int list_empty_careful(const struct list_head *head)¶
tests whether a list is empty and not being modified
Parameters
const struct list_head *head
the list to test
Description
Tests whether a list is empty _and_ checks that no other CPU might be in the process of modifying either member (next or prev).
NOTE
Using list_empty_careful() without synchronization can only be safe if the only activity that can happen to the list entry is list_del_init(). E.g. it cannot be used if another CPU could re-list_add() it.
-
void list_rotate_left(struct list_head *head)¶
rotate the list to the left
Parameters
struct list_head *head
the head of the list
-
void list_rotate_to_front(struct list_head *list, struct list_head *head)¶
Rotate list to specific item.
Parameters
struct list_head *list
The desired new front of the list.
struct list_head *head
The head of the list.
Description
Rotates list so that list becomes the new front of the list.
-
int list_is_singular(const struct list_head *head)¶
tests whether a list has just one entry.
Parameters
const struct list_head *head
the list to test.
-
void list_cut_position(struct list_head *list, struct list_head *head, struct list_head *entry)¶
cut a list into two
Parameters
struct list_head *list
a new list to add all removed entries
struct list_head *head
a list with entries
struct list_head *entry
an entry within head, could be the head itself and if so we won't cut the list
Description
This helper moves the initial part of head, up to and including entry, from head to list. You should pass in entry an element you know is on head. list should be an empty list or a list whose data you do not mind losing.
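A rough userspace sketch of the cut described above follows. It assumes head is non-empty and entry is on head, and it omits the extra sanity checks of the real helper; the struct list_head stand-in and the demo_* names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static void list_add_tail(struct list_head *new_entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new_entry->next = head;
	new_entry->prev = prev;
	prev->next = new_entry;
	head->prev = new_entry;
}

/* Sketch only: assumes head is non-empty and entry is on head. */
static void demo_cut_position(struct list_head *list,
			      struct list_head *head,
			      struct list_head *entry)
{
	struct list_head *new_first;

	if (entry == head) {		/* entry is the head: nothing is cut */
		INIT_LIST_HEAD(list);
		return;
	}
	new_first = entry->next;
	list->next = head->next;	/* list takes over the initial part */
	list->next->prev = list;
	list->prev = entry;
	entry->next = list;
	head->next = new_first;		/* head keeps what follows entry */
	new_first->prev = head;
}

static size_t demo_count(const struct list_head *head)
{
	size_t n = 0;

	for (const struct list_head *pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/* Build a five-entry list, cut at the second entry, report both lengths. */
void demo_cut(size_t *moved, size_t *remaining)
{
	struct list_head head, list, nodes[5];

	INIT_LIST_HEAD(&head);
	INIT_LIST_HEAD(&list);
	for (int i = 0; i < 5; i++)
		list_add_tail(&nodes[i], &head);
	demo_cut_position(&list, &head, &nodes[1]);
	*moved = demo_count(&list);
	*remaining = demo_count(&head);
}
```

Cutting at the second of five entries moves two entries to list and leaves three on head.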
-
void list_cut_before(struct list_head *list, struct list_head *head, struct list_head *entry)¶
cut a list into two, before given entry
Parameters
struct list_head *list
a new list to add all removed entries
struct list_head *head
a list with entries
struct list_head *entry
an entry within head, could be the head itself
Description
This helper moves the initial part of head, up to but excluding entry, from head to list. You should pass in entry an element you know is on head. list should be an empty list or a list whose data you do not mind losing. If entry == head, all entries on head are moved to list.
-
void list_splice(const struct list_head *list, struct list_head *head)¶
join two lists, this is designed for stacks
Parameters
const struct list_head *list
the new list to add.
struct list_head *head
the place to add it in the first list.
-
void list_splice_tail(struct list_head *list, struct list_head *head)¶
join two lists, each list being a queue
Parameters
struct list_head *list
the new list to add.
struct list_head *head
the place to add it in the first list.
-
void list_splice_init(struct list_head *list, struct list_head *head)¶
join two lists and reinitialise the emptied list.
Parameters
struct list_head *list
the new list to add.
struct list_head *head
the place to add it in the first list.
Description
The list at list is reinitialised.
-
void list_splice_tail_init(struct list_head *list, struct list_head *head)¶
join two lists and reinitialise the emptied list
Parameters
struct list_head *list
the new list to add.
struct list_head *head
the place to add it in the first list.
Description
Each of the lists is a queue. The list at list is reinitialised.
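As an illustration of joining two queues, the userspace sketch below appends one list to the tail of another and reinitialises the emptied source. The list_head stand-in, struct demo_item and the demo_* functions are assumptions made for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static void list_add_tail(struct list_head *new_entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new_entry->next = head;
	new_entry->prev = prev;
	prev->next = new_entry;
	head->prev = new_entry;
}

/* Sketch: append everything on list to the tail of head, then empty list. */
static void demo_splice_tail_init(struct list_head *list, struct list_head *head)
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;
	struct list_head *tail = head->prev;

	if (first == list)	/* source is empty, nothing to move */
		return;
	first->prev = tail;
	tail->next = first;
	last->next = head;
	head->prev = last;
	INIT_LIST_HEAD(list);
}

/* Hypothetical payload type used only in this illustration. */
struct demo_item {
	int value;
	struct list_head node;
};

/*
 * Queue a holds 1, 2 and queue b holds 3, 4. After the splice the values
 * on a are packed into a decimal number in traversal order.
 */
int demo_splice_order(int *src_empty)
{
	struct list_head a, b, *pos;
	struct demo_item items[4];
	int order = 0;

	INIT_LIST_HEAD(&a);
	INIT_LIST_HEAD(&b);
	for (int i = 0; i < 4; i++) {
		items[i].value = i + 1;
		list_add_tail(&items[i].node, i < 2 ? &a : &b);
	}
	demo_splice_tail_init(&b, &a);
	for (pos = a.next; pos != &a; pos = pos->next) {
		struct demo_item *item = (struct demo_item *)
			((char *)pos - offsetof(struct demo_item, node));
		order = order * 10 + item->value;
	}
	*src_empty = (b.next == &b && b.prev == &b);
	return order;
}
```

The entries of the source queue keep their order and follow the entries already on the destination.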
-
list_entry¶
list_entry (ptr, type, member)
get the struct for this entry
Parameters
ptr
the struct list_head pointer.
type
the type of the struct this is embedded in.
member
the name of the list_head within the struct.
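The idea behind list_entry() is pointer arithmetic from the embedded list_head back to the enclosing structure. A minimal userspace sketch, assuming simplified macros (demo_container_of and demo_list_entry are hypothetical names, and the kernel versions add type checking), might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Step back from a member pointer to the start of the enclosing struct. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define demo_list_entry(ptr, type, member) \
	demo_container_of(ptr, type, member)

/* Hypothetical structure whose list_head is not the first member. */
struct demo_task {
	int id;
	struct list_head link;
};

/* Given only a pointer to the embedded link, recover the task id. */
int demo_id_from_link(struct list_head *link)
{
	return demo_list_entry(link, struct demo_task, link)->id;
}
```

Here ptr is the address of the embedded link, type is struct demo_task, and member is link.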
-
list_first_entry¶
list_first_entry (ptr, type, member)
get the first element from a list
Parameters
ptr
the list head to take the element from.
type
the type of the struct this is embedded in.
member
the name of the list_head within the struct.
Description
Note that the list is expected to be non-empty.
-
list_last_entry¶
list_last_entry (ptr, type, member)
get the last element from a list
Parameters
ptr
the list head to take the element from.
type
the type of the struct this is embedded in.
member
the name of the list_head within the struct.
Description
Note that the list is expected to be non-empty.
-
list_first_entry_or_null¶
list_first_entry_or_null (ptr, type, member)
get the first element from a list
Parameters
ptr
the list head to take the element from.
type
the type of the struct this is embedded in.
member
the name of the list_head within the struct.
Description
Note that if the list is empty, it returns NULL.
-
list_next_entry¶
list_next_entry (pos, member)
get the next element in list
Parameters
pos
the type * to cursor
member
the name of the list_head within the struct.
-
list_next_entry_circular¶
list_next_entry_circular (pos, head, member)
get the next element in list
Parameters
pos
the type * to cursor.
head
the list head to take the element from.
member
the name of the list_head within the struct.
Description
Wraps around if pos is the last element (returns the first element). Note that the list is expected to be non-empty.
-
list_prev_entry¶
list_prev_entry (pos, member)
get the prev element in list
Parameters
pos
the type * to cursor
member
the name of the list_head within the struct.
-
list_prev_entry_circular¶
list_prev_entry_circular (pos, head, member)
get the prev element in list
Parameters
pos
the type * to cursor.
head
the list head to take the element from.
member
the name of the list_head within the struct.
Description
Wraps around if pos is the first element (returns the last element). Note that the list is expected to be non-empty.
-
list_for_each¶
list_for_each (pos, head)
iterate over a list
Parameters
pos
the struct list_head to use as a loop cursor.
head
the head for your list.
-
list_for_each_rcu¶
list_for_each_rcu (pos, head)
Iterate over a list in an RCU-safe fashion
Parameters
pos
the struct list_head to use as a loop cursor.
head
the head for your list.
-
list_for_each_continue¶
list_for_each_continue (pos, head)
continue iteration over a list
Parameters
pos
the struct list_head to use as a loop cursor.
head
the head for your list.
Description
Continue to iterate over a list, continuing after the current position.
-
list_for_each_prev¶
list_for_each_prev (pos, head)
iterate over a list backwards
Parameters
pos
the struct list_head to use as a loop cursor.
head
the head for your list.
-
list_for_each_safe¶
list_for_each_safe (pos, n, head)
iterate over a list safe against removal of list entry
Parameters
pos
the struct list_head to use as a loop cursor.
n
another struct list_head to use as temporary storage
head
the head for your list.
-
list_for_each_prev_safe¶
list_for_each_prev_safe (pos, n, head)
iterate over a list backwards safe against removal of list entry
Parameters
pos
the struct list_head to use as a loop cursor.
n
another struct list_head to use as temporary storage
head
the head for your list.
-
size_t list_count_nodes(struct list_head *head)¶
count nodes in the list
Parameters
struct list_head *head
the head for your list.
-
list_entry_is_head¶
list_entry_is_head (pos, head, member)
test if the entry points to the head of the list
Parameters
pos
the type * to cursor
head
the head for your list.
member
the name of the list_head within the struct.
-
list_for_each_entry¶
list_for_each_entry (pos, head, member)
iterate over list of given type
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the list_head within the struct.
-
list_for_each_entry_reverse¶
list_for_each_entry_reverse (pos, head, member)
iterate backwards over list of given type.
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the list_head within the struct.
-
list_prepare_entry¶
list_prepare_entry (pos, head, member)
prepare a pos entry for use in list_for_each_entry_continue()
Parameters
pos
the type * to use as a start point
head
the head of the list
member
the name of the list_head within the struct.
Description
Prepares a pos entry for use as a start point in list_for_each_entry_continue().
-
list_for_each_entry_continue¶
list_for_each_entry_continue (pos, head, member)
continue iteration over list of given type
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the list_head within the struct.
Description
Continue to iterate over list of given type, continuing after the current position.
-
list_for_each_entry_continue_reverse¶
list_for_each_entry_continue_reverse (pos, head, member)
iterate backwards from the given point
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the list_head within the struct.
Description
Start to iterate over list of given type backwards, continuing after the current position.
-
list_for_each_entry_from¶
list_for_each_entry_from (pos, head, member)
iterate over list of given type from the current point
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the list_head within the struct.
Description
Iterate over list of given type, continuing from current position.
-
list_for_each_entry_from_reverse¶
list_for_each_entry_from_reverse (pos, head, member)
iterate backwards over list of given type from the current point
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the list_head within the struct.
Description
Iterate backwards over list of given type, continuing from current position.
-
list_for_each_entry_safe¶
list_for_each_entry_safe (pos, n, head, member)
iterate over list of given type safe against removal of list entry
Parameters
pos
the type * to use as a loop cursor.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_head within the struct.
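Why the extra cursor n makes removal safe can be sketched in userspace. The macro below is a simplified approximation written for this example (demo_for_each_entry_safe, demo_entry and struct demo_item are hypothetical names); it relies on the `__typeof__` extension of GCC and Clang, and the list_del() stand-in clears the pointers where the kernel stores poison values.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static void list_add_tail(struct list_head *new_entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new_entry->next = head;
	new_entry->prev = prev;
	prev->next = new_entry;
	head->prev = new_entry;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry->prev = NULL;	/* the kernel uses poison values */
}

#define demo_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * n always holds the entry after pos, so the loop never reads the links
 * of pos after the body has run and pos may be unlinked inside the body.
 */
#define demo_for_each_entry_safe(pos, n, head, member)			\
	for (pos = demo_entry((head)->next, __typeof__(*pos), member),	\
	     n = demo_entry(pos->member.next, __typeof__(*pos), member); \
	     &pos->member != (head);					\
	     pos = n,							\
	     n = demo_entry(n->member.next, __typeof__(*n), member))

/* Hypothetical payload type used only in this illustration. */
struct demo_item {
	int value;
	struct list_head node;
};

/* Remove the even values from a list holding 1..6; return the sum left. */
int demo_remove_even(void)
{
	struct list_head head;
	struct demo_item items[6], *pos, *n;
	int sum = 0;

	INIT_LIST_HEAD(&head);
	for (int i = 0; i < 6; i++) {
		items[i].value = i + 1;
		list_add_tail(&items[i].node, &head);
	}
	demo_for_each_entry_safe(pos, n, &head, node) {
		if (pos->value % 2 == 0)
			list_del(&pos->node);
	}
	demo_for_each_entry_safe(pos, n, &head, node)
		sum += pos->value;
	return sum;
}
```

A plain iterator would read pos->node.next after the body had already cleared it; the saved n avoids that.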
-
list_for_each_entry_safe_continue¶
list_for_each_entry_safe_continue (pos, n, head, member)
continue list iteration safe against removal
Parameters
pos
the type * to use as a loop cursor.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_head within the struct.
Description
Iterate over list of given type, continuing after current point, safe against removal of list entry.
-
list_for_each_entry_safe_from¶
list_for_each_entry_safe_from (pos, n, head, member)
iterate over list from current point safe against removal
Parameters
pos
the type * to use as a loop cursor.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_head within the struct.
Description
Iterate over list of given type from current point, safe against removal of list entry.
-
list_for_each_entry_safe_reverse¶
list_for_each_entry_safe_reverse (pos, n, head, member)
iterate backwards over list safe against removal
Parameters
pos
the type * to use as a loop cursor.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_head within the struct.
Description
Iterate backwards over list of given type, safe against removal of list entry.
-
list_safe_reset_next¶
list_safe_reset_next (pos, n, member)
reset a stale list_for_each_entry_safe loop
Parameters
pos
the loop cursor used in the list_for_each_entry_safe loop
n
temporary storage used in list_for_each_entry_safe
member
the name of the list_head within the struct.
Description
list_safe_reset_next is not safe to use in general if the list may be modified concurrently (e.g. the lock is dropped in the loop body). An exception to this is if the cursor element (pos) is pinned in the list, and list_safe_reset_next is called after re-taking the lock and before completing the current iteration of the loop body.
-
int hlist_unhashed(const struct hlist_node *h)¶
Has node been removed from list and reinitialized?
Parameters
const struct hlist_node *h
Node to be checked
Description
Note that not all removal functions will leave a node in unhashed state. For example, hlist_nulls_del_init_rcu() does leave the node in unhashed state, but hlist_nulls_del() does not.
-
int hlist_unhashed_lockless(const struct hlist_node *h)¶
Version of hlist_unhashed for lockless use
Parameters
const struct hlist_node *h
Node to be checked
Description
This variant of hlist_unhashed() must be used in lockless contexts to avoid potential load-tearing. The READ_ONCE() is paired with the various WRITE_ONCE() in hlist helpers that are defined below.
-
int hlist_empty(const struct hlist_head *h)¶
Is the specified hlist_head structure an empty hlist?
Parameters
const struct hlist_head *h
Structure to check.
-
void hlist_del(struct hlist_node *n)¶
Delete the specified hlist_node from its list
Parameters
struct hlist_node *n
Node to delete.
Description
Note that this function leaves the node in hashed state. Use hlist_del_init() or similar instead to unhash n.
-
void hlist_del_init(struct hlist_node *n)¶
Delete the specified hlist_node from its list and initialize
Parameters
struct hlist_node *n
Node to delete.
Description
Note that this function leaves the node in unhashed state.
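The hashed and unhashed states can be illustrated with a small userspace sketch. It assumes the usual hlist layout, where pprev points at whichever pointer currently refers to the node, and treats a NULL pprev as "unhashed". The definitions are simplified stand-ins for this example, and demo_hlist_unlink and demo_del_init_unhashes are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

static void INIT_HLIST_NODE(struct hlist_node *h)
{
	h->next = NULL;
	h->pprev = NULL;
}

/* A node with no back-pointer is treated as not being on any list. */
static int hlist_unhashed(const struct hlist_node *h)
{
	return !h->pprev;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n by rewriting whatever pointer currently points at it. */
static void demo_hlist_unlink(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;
}

static void hlist_del_init(struct hlist_node *n)
{
	if (!hlist_unhashed(n)) {
		demo_hlist_unlink(n);
		INIT_HLIST_NODE(n);	/* this step is what unhashes the node */
	}
}

/* Returns 1 if the node is hashed while linked and unhashed afterwards. */
int demo_del_init_unhashes(void)
{
	struct hlist_head head = { .first = NULL };
	struct hlist_node node;

	INIT_HLIST_NODE(&node);
	hlist_add_head(&node, &head);
	if (hlist_unhashed(&node))
		return 0;	/* a linked node must read as hashed */
	hlist_del_init(&node);
	return hlist_unhashed(&node) && head.first == NULL;
}
```

A plain delete that skips the reinitialisation step would leave pprev non-NULL, which is why such a node still reads as hashed.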
-
void hlist_add_head(struct hlist_node *n, struct hlist_head *h)¶
add a new entry at the beginning of the hlist
Parameters
struct hlist_node *n
new entry to be added
struct hlist_head *h
hlist head to add it after
Description
Insert a new entry after the specified head. This is good for implementing stacks.
-
void hlist_add_before(struct hlist_node *n, struct hlist_node *next)¶
add a new entry before the one specified
Parameters
struct hlist_node *n
new entry to be added
struct hlist_node *next
hlist node to add it before, which must be non-NULL
-
void hlist_add_behind(struct hlist_node *n, struct hlist_node *prev)¶
add a new entry after the one specified
Parameters
struct hlist_node *n
new entry to be added
struct hlist_node *prev
hlist node to add it after, which must be non-NULL
-
void hlist_add_fake(struct hlist_node *n)¶
create a fake hlist consisting of a single headless node
Parameters
struct hlist_node *n
Node to make a fake list out of
Description
This makes n appear to be its own predecessor on a headless hlist. The point of this is to allow things like hlist_del() to work correctly in cases where there is no list.
-
bool hlist_fake(struct hlist_node *h)¶
Is this node a fake hlist?
Parameters
struct hlist_node *h
Node to check for being a self-referential fake hlist.
-
bool hlist_is_singular_node(struct hlist_node *n, struct hlist_head *h)¶
is node the only element of the specified hlist?
Parameters
struct hlist_node *n
Node to check for singularity.
struct hlist_head *h
Header for potentially singular list.
Description
Check whether the node is the only node of the head without accessing head, thus avoiding unnecessary cache misses.
-
void hlist_move_list(struct hlist_head *old, struct hlist_head *new)¶
Move an hlist
Parameters
struct hlist_head *old
hlist_head for old list.
struct hlist_head *new
hlist_head for new list.
Description
Move a list from one list head to another. Fixup the pprev reference of the first entry if it exists.
-
hlist_for_each_entry¶
hlist_for_each_entry (pos, head, member)
iterate over list of given type
Parameters
pos
the type * to use as a loop cursor.
head
the head for your list.
member
the name of the hlist_node within the struct.
-
hlist_for_each_entry_continue¶
hlist_for_each_entry_continue (pos, member)
iterate over a hlist continuing after current point
Parameters
pos
the type * to use as a loop cursor.
member
the name of the hlist_node within the struct.
-
hlist_for_each_entry_from¶
hlist_for_each_entry_from (pos, member)
iterate over a hlist continuing from current point
Parameters
pos
the type * to use as a loop cursor.
member
the name of the hlist_node within the struct.
-
hlist_for_each_entry_safe¶
hlist_for_each_entry_safe (pos, n, head, member)
iterate over list of given type safe against removal of list entry
Parameters
pos
the type * to use as a loop cursor.
n
a struct hlist_node to use as temporary storage
head
the head for your list.
member
the name of the hlist_node within the struct.
Basic C Library Functions¶
When writing drivers, you cannot in general use routines which are from the C Library. Some of the functions have been found generally useful and they are listed below. The behaviour of these functions may vary slightly from those defined by ANSI, and these deviations are noted in the text.
String Conversions¶
-
unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base)¶
convert a string to an unsigned long long
Parameters
const char *cp
The start of the string
char **endp
A pointer to the end of the parsed string will be placed here
unsigned int base
The number base to use
Description
This function has caveats. Please use kstrtoull instead.
-
unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base)¶
convert a string to an unsigned long
Parameters
const char *cp
The start of the string
char **endp
A pointer to the end of the parsed string will be placed here
unsigned int base
The number base to use
Description
This function has caveats. Please use kstrtoul instead.
-
long simple_strtol(const char *cp, char **endp, unsigned int base)¶
convert a string to a signed long
Parameters
const char *cp
The start of the string
char **endp
A pointer to the end of the parsed string will be placed here
unsigned int base
The number base to use
Description
This function has caveats. Please use kstrtol instead.
-
long long simple_strtoll(const char *cp, char **endp, unsigned int base)¶
convert a string to a signed long long
Parameters
const char *cp
The start of the string
char **endp
A pointer to the end of the parsed string will be placed here
unsigned int base
The number base to use
Description
This function has caveats. Please use kstrtoll instead.
-
int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)¶
Format a string and place it in a buffer
Parameters
char *buf
The buffer to place the result into
size_t size
The size of the buffer, including the trailing null space
const char *fmt
The format string to use
va_list args
Arguments for the format string
Description
This function generally follows C99 vsnprintf, but has some extensions and a few limitations:
- %n is unsupported
- %p* is handled by pointer()
See pointer() or How to get printk format specifiers right for more extensive description.
Please update the documentation in both places when making changes.
The return value is the number of characters which would be generated for the given input, excluding the trailing '\0', as per ISO C99. If you want to have the exact number of characters written into buf as return value (not including the trailing '\0'), use vscnprintf(). If the return is greater than or equal to size, the resulting string is truncated.
If you're not already dealing with a va_list consider using snprintf().
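The C99 return-value rule can be tried out in userspace with the standard C library snprintf(), which follows the same rule (the kernel-specific format extensions are not available there). demo_truncated_len is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the 11-character string "hello world" into buf and return what
 * snprintf() reports: the length the full output would have had.
 */
int demo_truncated_len(char *buf, size_t size)
{
	return snprintf(buf, size, "%s", "hello world");
}
```

With an 8-byte buffer the call returns 11, which is greater than or equal to size, so the caller can tell that only "hello w" plus the terminator was stored.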
-
int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)¶
Format a string and place it in a buffer
Parameters
char *buf
The buffer to place the result into
size_t size
The size of the buffer, including the trailing null space
const char *fmt
The format string to use
va_list args
Arguments for the format string
Description
The return value is the number of characters which have been written into buf, not including the trailing '\0'. If size is == 0 the function returns 0.
If you're not already dealing with a va_list consider using scnprintf().
See the vsnprintf() documentation for format string extensions over C99.
-
int snprintf(char *buf, size_t size, const char *fmt, ...)¶
Format a string and place it in a buffer
Parameters
char *buf
The buffer to place the result into
size_t size
The size of the buffer, including the trailing null space
const char *fmt
The format string to use
...
Arguments for the format string
Description
The return value is the number of characters which would be generated for the given input, excluding the trailing null, as per ISO C99. If the return is greater than or equal to size, the resulting string is truncated.
See the vsnprintf() documentation for format string extensions over C99.
-
int scnprintf(char *buf, size_t size, const char *fmt, ...)¶
Format a string and place it in a buffer
Parameters
char *buf
The buffer to place the result into
size_t size
The size of the buffer, including the trailing null space
const char *fmt
The format string to use
...
Arguments for the format string
Description
The return value is the number of characters written into buf, not including the trailing '\0'. If size is == 0 the function returns 0.
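The "characters actually written" rule can be approximated in userspace by clamping the result of the standard vsnprintf(). The sketch below is not the kernel's implementation; demo_scnprintf is a hypothetical name, and returning 0 on a C library encoding error is an assumption of this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Userspace approximation: report what was stored, not what was wanted. */
int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int i;

	if (size == 0)
		return 0;
	va_start(args, fmt);
	i = vsnprintf(buf, size, fmt, args);
	va_end(args);
	if (i < 0)
		return 0;	/* assumption: treat a libc error as nothing written */
	if ((size_t)i < size)
		return i;	/* everything fit */
	return (int)(size - 1);	/* truncated: size - 1 characters plus '\0' */
}
```

Because the result never exceeds size - 1, it can be accumulated safely when appending several pieces into one buffer, for example `len += demo_scnprintf(buf + len, size - len, ...)`.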
-
int vsprintf(char *buf, const char *fmt, va_list args)¶
Format a string and place it in a buffer
Parameters
char *buf
The buffer to place the result into
const char *fmt
The format string to use
va_list args
Arguments for the format string
Description
The function returns the number of characters written into buf. Use vsnprintf() or vscnprintf() in order to avoid buffer overflows.
If you're not already dealing with a va_list consider using sprintf().
See the vsnprintf() documentation for format string extensions over C99.
-
int sprintf(char *buf, const char *fmt, ...)¶
Format a string and place it in a buffer
Parameters
char *buf
The buffer to place the result into
const char *fmt
The format string to use
...
Arguments for the format string
Description
The function returns the number of characters written into buf. Use snprintf() or scnprintf() in order to avoid buffer overflows.
See the vsnprintf() documentation for format string extensions over C99.
-
int vbin_printf(u32 *bin_buf, size_t size, const char *fmt, va_list args)¶
Parse a format string and place args' binary value in a buffer
Parameters
u32 *bin_buf
The buffer to place args' binary value
size_t size
The size of the buffer (in 32-bit words, not characters)
const char *fmt
The format string to use
va_list args
Arguments for the format string
Description
The format follows C99 vsnprintf, except %n is ignored, and its argument is skipped.
The return value is the number of words (32 bits) which would be generated for the given input.
NOTE
If the return value is greater than size, the resulting bin_buf is NOT valid for bstr_printf().
-
int bstr_printf(char *buf, size_t size, const char *fmt, const u32 *bin_buf)¶
Format a string from binary arguments and place it in a buffer
Parameters
char *buf
The buffer to place the result into
size_t size
The size of the buffer, including the trailing null space
const char *fmt
The format string to use
const u32 *bin_buf
Binary arguments for the format string
Description
This function is like C99 vsnprintf, but the difference is that vsnprintf gets arguments from the stack, and bstr_printf gets arguments from bin_buf, which is a binary buffer that was generated by vbin_printf.
The format follows C99 vsnprintf, but has some extensions: see the vsnprintf comment for details.
The return value is the number of characters which would be generated for the given input, excluding the trailing '\0', as per ISO C99. If you want to have the exact number of characters written into buf as return value (not including the trailing '\0'), use vscnprintf(). If the return is greater than or equal to size, the resulting string is truncated.
-
int bprintf(u32 *bin_buf, size_t size, const char *fmt, ...)¶
Parse a format string and place args' binary value in a buffer
Parameters
u32 *bin_buf
The buffer to place args' binary value
size_t size
The size of the buffer (in 32-bit words, not characters)
const char *fmt
The format string to use
...
Arguments for the format string
Description
The function returns the number of words (u32) written into bin_buf.
-
int vsscanf(const char *buf, const char *fmt, va_list args)¶
Unformat a buffer into a list of arguments
Parameters
const char *buf
input buffer
const char *fmt
format of buffer
va_list args
arguments
-
int sscanf(const char *buf, const char *fmt, ...)¶
Unformat a buffer into a list of arguments
Parameters
const char *buf
input buffer
const char *fmt
formatting of buffer
...
resulting arguments
-
int kstrtoul(const char *s, unsigned int base, unsigned long *res)¶
convert a string to an unsigned long
Parameters
const char *s
The start of the string. The string must be null-terminated, and may also include a single newline before its terminating null. The first character may also be a plus sign, but not a minus sign.
unsigned int base
The number base to use. The maximum supported base is 16. If base is given as 0, then the base of the string is automatically detected with the conventional semantics - If it begins with 0x the number will be parsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal number. Otherwise it will be parsed as a decimal.
unsigned long *res
Where to write the result of the conversion on success.
Description
Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
Preferred over simple_strtoul(). Return code must be checked.
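The contract described above (optional plus sign, no minus sign, one optional trailing newline, 0 or a negative error code as the return value) can be approximated in userspace on top of strtoul(). This is a sketch for illustration, not the kernel's parser: demo_kstrtoul is a hypothetical name, and rejecting bases other than 0 or 2 to 16 up front is an assumption of this example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace approximation of the documented kstrtoul() contract. */
int demo_kstrtoul(const char *s, unsigned int base, unsigned long *res)
{
	char *end;
	unsigned long val;

	/* Assumption of this sketch: only base 0 or 2..16 is accepted. */
	if (base == 1 || base > 16)
		return -EINVAL;
	if (*s == '+')
		s++;
	/* strtoul() would skip whitespace and accept '-'; refuse both here. */
	if (!isxdigit((unsigned char)*s))
		return -EINVAL;
	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (errno == ERANGE)
		return -ERANGE;		/* value does not fit */
	if (end == s)
		return -EINVAL;		/* no digits valid in this base */
	if (*end == '\n')
		end++;			/* a single trailing newline is allowed */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	*res = val;			/* written only on success */
	return 0;
}
```

A caller checks the return code before touching the result, for example `if (demo_kstrtoul(buf, 0, &val)) return -EINVAL;`.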
-
int kstrtol(const char *s, unsigned int base, long *res)¶
convert a string to a long
Parameters
const char *s
The start of the string. The string must be null-terminated, and may also include a single newline before its terminating null. The first character may also be a plus sign or a minus sign.
unsigned int base
The number base to use. The maximum supported base is 16. If base is given as 0, then the base of the string is automatically detected with the conventional semantics - If it begins with 0x the number will be parsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal number. Otherwise it will be parsed as a decimal.
long *res
Where to write the result of the conversion on success.
Description
Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
Preferred over simple_strtol(). Return code must be checked.
-
int kstrtoull(const char *s, unsigned int base, unsigned long long *res)¶
convert a string to an unsigned long long
Parameters
const char *s
The start of the string. The string must be null-terminated, and may also include a single newline before its terminating null. The first character may also be a plus sign, but not a minus sign.
unsigned int base
The number base to use. The maximum supported base is 16. If base is given as 0, then the base of the string is automatically detected with the conventional semantics - If it begins with 0x the number will be parsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal number. Otherwise it will be parsed as a decimal.
unsigned long long *res
Where to write the result of the conversion on success.
Description
Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
Preferred over simple_strtoull(). Return code must be checked.
-
int kstrtoll(const char *s, unsigned int base, long long *res)¶
convert a string to a long long
Parameters
const char *s
The start of the string. The string must be null-terminated, and may also include a single newline before its terminating null. The first character may also be a plus sign or a minus sign.
unsigned int base
The number base to use. The maximum supported base is 16. If base is given as 0, then the base of the string is automatically detected with the conventional semantics - If it begins with 0x the number will be parsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal number. Otherwise it will be parsed as a decimal.
long long *res
Where to write the result of the conversion on success.
Description
Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
Preferred over simple_strtoll(). Return code must be checked.
-
int kstrtouint(const char *s, unsigned int base, unsigned int *res)¶
convert a string to an unsigned int
Parameters
const char *s
The start of the string. The string must be null-terminated, and may also include a single newline before its terminating null. The first character may also be a plus sign, but not a minus sign.
unsigned int base
The number base to use. The maximum supported base is 16. If base is given as 0, then the base of the string is automatically detected with the conventional semantics - If it begins with 0x the number will be parsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal number. Otherwise it will be parsed as a decimal.
unsigned int *res
Where to write the result of the conversion on success.
Description
Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
Preferred over simple_strtoul(). Return code must be checked.
-
int kstrtoint(const char *s, unsigned int base, int *res)¶
convert a string to an int
Parameters
const char *s
The start of the string. The string must be null-terminated, and may also include a single newline before its terminating null. The first character may also be a plus sign or a minus sign.
unsigned int base
The number base to use. The maximum supported base is 16. If base is given as 0, then the base of the string is automatically detected with the conventional semantics - If it begins with 0x the number will be parsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal number. Otherwise it will be parsed as a decimal.
int *res
Where to write the result of the conversion on success.
Description
Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
Preferred over simple_strtol(). Return code must be checked.
-
int kstrtobool(const char *s, bool *res)¶
convert common user inputs into boolean values
Parameters
const char *s
input string
bool *res
result
Description
This routine returns 0 iff the first character is one of 'YyTt1NnFf0', or [oO][NnFf] for "on" and "off". Otherwise it will return -EINVAL. Value pointed to by res is updated upon finding a match.
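The matching rule above can be illustrated with a small userspace sketch. sketch_kstrtobool() is a hypothetical name; rejecting a NULL input with -EINVAL is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace sketch of the documented kstrtobool() matching. */
int sketch_kstrtobool(const char *s, bool *res)
{
    assert(res != NULL);

    if (s == NULL)               /* assumption: NULL input is rejected */
        return -EINVAL;

    switch (s[0]) {
    case 'y': case 'Y': case 't': case 'T': case '1':
        *res = true;
        return 0;
    case 'n': case 'N': case 'f': case 'F': case '0':
        *res = false;
        return 0;
    case 'o': case 'O':
        /* "on" and "off" are told apart by the second character. */
        switch (s[1]) {
        case 'n': case 'N':
            *res = true;
            return 0;
        case 'f': case 'F':
            *res = false;
            return 0;
        default:
            break;
        }
        break;
    default:
        break;
    }
    return -EINVAL;              /* *res is left untouched on failure */
}
```

Because only the first one or two characters are inspected, inputs such as "yes", "Y\n" and "1" all parse as true in this sketch.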
-
void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, char *buf, int len)¶
get the size in the specified units
Parameters
u64 size
The size to be converted in blocks
u64 blk_size
Size of the block (use 1 for size in bytes)
const enum string_size_units units
units to use (powers of 1000 or 1024)
char *buf
buffer to format to
int len
length of buffer
Description
This function returns a string formatted to 3 significant figures giving the size in the required units. buf should have room for at least 9 bytes and will always be zero terminated.
-
int parse_int_array_user(const char __user *from, size_t count, int **array)¶
Split string into a sequence of integers
Parameters
const char __user *from
The user space buffer to read from
size_t count
The maximum number of bytes to read
int **array
Returned pointer to sequence of integers
Description
On success, array is allocated and initialized with the sequence of integers extracted from from, preceded by an additional element that begins the sequence and specifies the number of integers.
Caller takes responsibility for freeing array when it is no longer needed.
-
int string_unescape(char *src, char *dst, size_t size, unsigned int flags)¶
unquote characters in the given string
Parameters
char *src
source buffer (escaped)
char *dst
destination buffer (unescaped)
size_t size
size of the destination buffer (0 for no limit)
unsigned int flags
combination of the flags.
Description
The function unquotes characters in the given string.
Because the size of the output will be the same as or less than the size of the input, the transformation may be performed in place.
Caller must provide valid source and destination pointers. Be aware that the destination buffer will always be NUL-terminated. The source string must be NUL-terminated as well. The supported flags are:
UNESCAPE_SPACE:
'\f' - form feed
'\n' - new line
'\r' - carriage return
'\t' - horizontal tab
'\v' - vertical tab
UNESCAPE_OCTAL:
'\NNN' - byte with octal value NNN (1 to 3 digits)
UNESCAPE_HEX:
'\xHH' - byte with hexadecimal value HH (1 to 2 digits)
UNESCAPE_SPECIAL:
'\"' - double quote
'\\' - backslash
'\a' - alert (BEL)
'\e' - escape
UNESCAPE_ANY:
all previous together
Return
The number of characters written to the destination buffer, excluding the trailing '\0', is returned.
-
int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz, unsigned int flags, const char *only)¶
quote characters in the given memory buffer
Parameters
const char *src
source buffer (unescaped)
size_t isz
source buffer size
char *dst
destination buffer (escaped)
size_t osz
destination buffer size
unsigned int flags
combination of the flags
const char *only
NUL-terminated string containing characters used to limit the selected escape class. If characters are included in only that would not normally be escaped by the classes selected in flags, they will be copied to dst unescaped.
Description
The process of escaping byte buffer includes several parts. They are applied in the following sequence.
The character is not matched to the one from only string and thus must go as-is to the output.
The character is matched to the printable and ASCII classes, if asked, and in case of match it passes through to the output.
The character is matched to the printable or ASCII class, if asked, and in case of match it passes through to the output.
The character is checked if it falls into the class given by flags.
ESCAPE_OCTAL and ESCAPE_HEX are going last since they cover any character. Note that they actually can't go together, otherwise ESCAPE_HEX will be ignored.
Caller must provide valid source and destination pointers. Be aware that the destination buffer will not be NUL-terminated, so the caller has to append the terminator if needed. The supported flags are:
ESCAPE_SPACE: (special white space, not space itself)
'\f' - form feed
'\n' - new line
'\r' - carriage return
'\t' - horizontal tab
'\v' - vertical tab
ESCAPE_SPECIAL:
'\"' - double quote
'\\' - backslash
'\a' - alert (BEL)
'\e' - escape
ESCAPE_NULL:
'\0' - null
ESCAPE_OCTAL:
'\NNN' - byte with octal value NNN (3 digits)
ESCAPE_ANY:
all previous together
ESCAPE_NP:
escape only non-printable characters, checked by isprint()
ESCAPE_ANY_NP:
all previous together
ESCAPE_HEX:
'\xHH' - byte with hexadecimal value HH (2 digits)
ESCAPE_NA:
escape only non-ascii characters, checked by isascii()
ESCAPE_NAP:
escape only non-printable or non-ascii characters
ESCAPE_APPEND:
append characters from only to be escaped by the given classes
ESCAPE_APPEND helps to pass additional characters to be escaped when one of ESCAPE_NP, ESCAPE_NA, or ESCAPE_NAP is provided.
One notable caveat: ESCAPE_NAP, ESCAPE_NP and ESCAPE_NA have higher priority than the rest of the flags (ESCAPE_NAP is the highest). It doesn't make much sense to use any of them without ESCAPE_OCTAL or ESCAPE_HEX, because they cover most of the other character classes. ESCAPE_NAP can utilize ESCAPE_SPACE or ESCAPE_SPECIAL in addition to the above.
Return
The total size of the escaped output that would be generated for the given input and flags. To check whether the output was truncated, compare the return value to osz. There is room left in dst for a '\0' terminator if and only if ret < osz.
-
char **kasprintf_strarray(gfp_t gfp, const char *prefix, size_t n)¶
allocate and fill array of sequential strings
Parameters
gfp_t gfp
flags for the slab allocator
const char *prefix
prefix to be used
size_t n
amount of lines to be allocated and filled
Description
Allocates and fills n strings using pattern "%s-%zu", where prefix is provided by caller. The caller is responsible to free them with kfree_strarray() after use.
Returns array of strings or NULL when memory can't be allocated.
-
void kfree_strarray(char **array, size_t n)¶
free a number of dynamically allocated strings contained in an array and the array itself
Parameters
char **array
Dynamically allocated array of strings to free.
size_t n
Number of strings (starting from the beginning of the array) to free.
Description
Passing a non-NULL array and n == 0 as well as NULL array are valid use-cases. If array is NULL, the function does nothing.
-
ssize_t strscpy_pad(char *dest, const char *src, size_t count)¶
Copy a C-string into a sized buffer
Parameters
char *dest
Where to copy the string to
const char *src
Where to copy the string from
size_t count
Size of destination buffer
Description
Copy the string, or as much of it as fits, into the dest buffer. The behavior is undefined if the string buffers overlap. The destination buffer is always NUL terminated, unless it's zero-sized.
If the source string is shorter than the destination buffer, zeros the tail of the destination buffer.
For full explanation of why you may want to consider using the 'strscpy' functions please see the function docstring for strscpy().
Return
The number of characters copied (not including the trailing NUL), or -E2BIG if count is 0 or src was truncated.
-
char *skip_spaces(const char *str)¶
Removes leading whitespace from str.
Parameters
const char *str
The string to be stripped.
Description
Returns a pointer to the first non-whitespace character in str.
-
char *strim(char *s)¶
Removes leading and trailing whitespace from s.
Parameters
char *s
The string to be stripped.
Description
Note that the first trailing whitespace is replaced with a NUL-terminator
in the given string s. Returns a pointer to the first non-whitespace
character in s.
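The in-place behaviour of strim() is easy to misread, so here is a hedged userspace sketch. sketch_skip_spaces() and sketch_strim() are hypothetical names that follow the descriptions above; they use isspace() from the C library, which may classify characters differently from the kernel's own helper.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the first non-whitespace character in str. */
char *sketch_skip_spaces(const char *str)
{
    assert(str != NULL);
    while (isspace((unsigned char)*str))
        str++;
    return (char *)str;
}

/* Cuts trailing whitespace in place, then skips leading whitespace. */
char *sketch_strim(char *s)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);

    /* Walk back over trailing whitespace. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        len--;
    /* The first trailing whitespace character becomes the terminator. */
    s[len] = '\0';

    /* Leading whitespace is skipped, not removed from the buffer. */
    return sketch_skip_spaces(s);
}
```

Note that the returned pointer may differ from s, so a buffer obtained from an allocator must still be freed through the original pointer.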
-
bool sysfs_streq(const char *s1, const char *s2)¶
return true if strings are equal, modulo trailing newline
Parameters
const char *s1
one string
const char *s2
another string
Description
This routine returns true iff two strings are equal, treating both NUL and newline-then-NUL as equivalent string terminations. It's geared for use with sysfs input strings, which generally terminate with newlines but are compared against values without newlines.
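The terminator equivalence described above can be sketched as follows. sketch_sysfs_streq() is a hypothetical userspace name used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Equal if the strings match, treating "\n" + NUL the same as NUL. */
bool sketch_sysfs_streq(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);

    /* Skip the common prefix. */
    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }

    if (*s1 == *s2)                            /* both ended together */
        return true;
    if (!*s1 && *s2 == '\n' && !s2[1])         /* s2 has one extra newline */
        return true;
    if (*s1 == '\n' && !s1[1] && !*s2)         /* s1 has one extra newline */
        return true;
    return false;
}
```

With this rule, a store handler can compare the raw buffer written by `echo on > attr` against the literal "on" without stripping the newline first.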
-
int match_string(const char *const *array, size_t n, const char *string)¶
matches given string in an array
Parameters
const char * const *array
array of strings
size_t n
number of strings in the array or -1 for NULL terminated arrays
const char *string
string to match with
Description
This routine will look for a string in an array of strings up to the n-th element in the array or until the first NULL element.
Historically, the value of -1 for n was used to search in arrays that are NULL terminated. However, the function does not make a distinction when finishing the search: either n elements have been compared OR the first NULL element was found.
Return
index of a string in the array if matches, or -EINVAL otherwise.
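The two stop conditions (n elements compared, or the first NULL element reached) can be sketched in userspace. sketch_match_string() is a hypothetical name; it uses strcmp() for the comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of string in array, or -EINVAL if it is not found. */
int sketch_match_string(const char *const *array, size_t n, const char *string)
{
    size_t i;

    assert(array != NULL && string != NULL);

    for (i = 0; i < n; i++) {
        const char *item = array[i];

        if (item == NULL)        /* a NULL element always ends the search */
            break;
        if (strcmp(item, string) == 0)
            return (int)i;
    }
    return -EINVAL;
}
```

Passing (size_t)-1 as n makes the loop bound effectively unlimited, so the search relies on the NULL element alone; passing a smaller n stops early even if later entries would match.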
-
int __sysfs_match_string(const char *const *array, size_t n, const char *str)¶
matches given string in an array
Parameters
const char * const *array
array of strings
size_t n
number of strings in the array or -1 for NULL terminated arrays
const char *str
string to match with
Description
Returns index of str in the array or -EINVAL, just like match_string(). Uses sysfs_streq instead of strcmp for matching.
This routine will look for a string in an array of strings up to the n-th element in the array or until the first NULL element.
Historically, the value of -1 for n was used to search in arrays that are NULL terminated. However, the function does not make a distinction when finishing the search: either n elements have been compared OR the first NULL element was found.
-
char *strreplace(char *str, char old, char new)¶
Replace all occurrences of character in string.
Parameters
char *str
The string to operate on.
char old
The character being replaced.
char new
The character old is replaced with.
Description
Replaces each old character with new in the given string str.
Return
pointer to the string str itself.
-
void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count, int pad)¶
Copy one buffer to another with padding
Parameters
void *dest
Where to copy to
size_t dest_len
The destination buffer size
const void *src
Where to copy from
size_t count
The number of bytes to copy
int pad
Character to use for padding if space is left in destination.
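A minimal userspace sketch of the copy-then-pad behaviour follows. sketch_memcpy_and_pad() is a hypothetical name, and the handling of a source that does not fit (cut at dest_len, no padding) is an assumption of this sketch, since the parameter list above does not spell it out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies count bytes and fills the rest of dest with the pad byte. */
void sketch_memcpy_and_pad(void *dest, size_t dest_len, const void *src,
                           size_t count, int pad)
{
    assert(dest != NULL && src != NULL);

    if (dest_len > count) {
        memcpy(dest, src, count);
        /* Space is left in the destination: fill it with pad. */
        memset((char *)dest + count, pad, dest_len - count);
    } else {
        /* Assumption: a source that does not fit is cut at dest_len. */
        memcpy(dest, src, dest_len);
    }
}
```

This is typically useful for fixed-width, space-padded fields that are not NUL-terminated.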
String Manipulation¶
-
unsafe_memcpy¶
unsafe_memcpy (dst, src, bytes, justification)
memcpy implementation with no FORTIFY bounds checking
Parameters
dst
Destination memory address to write to
src
Source memory address to read from
bytes
How many bytes to write to dst from src
justification
Free-form text or comment describing why the use is needed
Description
This should be used for corner cases where the compiler cannot do the right thing, or during transitions between APIs, etc. It should be used very rarely, and includes a place for justification detailing where bounds checking has happened, and why existing solutions cannot be employed.
-
char *strncpy(char *const p, const char *q, __kernel_size_t size)¶
Copy a string to memory with non-guaranteed NUL padding
Parameters
char * const p
pointer to destination of copy
const char *q
pointer to NUL-terminated source string to copy
__kernel_size_t size
bytes to write at p
Description
If strlen(q) >= size, the copy of q will stop after size bytes, and p will NOT be NUL-terminated.
If strlen(q) < size, following the copy of q, trailing NUL bytes will be written to p until size total bytes have been written.
Do not use this function. While FORTIFY_SOURCE tries to avoid over-reads of q, it cannot defend against writing unterminated results to p. Using strncpy() remains ambiguous and fragile.
Instead, please choose an alternative, so that the expectation of p's contents is unambiguous:
p needs to be: | padded to size | not padded |
---|---|---|
NUL-terminated | strscpy_pad() | strscpy() |
not NUL-terminated | strtomem_pad() | strtomem() |
Note strscpy*()'s differing return values for detecting truncation, and strtomem*()'s expectation that the destination is marked with __nonstring when it is a character array.
-
__kernel_size_t strnlen(const char *const p, __kernel_size_t maxlen)¶
Return bounded count of characters in a NUL-terminated string
Parameters
const char * const p
pointer to NUL-terminated string to count.
__kernel_size_t maxlen
maximum number of characters to count.
Description
Returns number of characters in p (NOT including the final NUL), or maxlen, if no NUL has been found up to there.
-
strlen¶
strlen (p)
Return count of characters in a NUL-terminated string
Parameters
p
pointer to NUL-terminated string to count.
Description
Do not use this function unless the string length is known at compile-time. When p is unterminated, this function may crash or return unexpected counts that could lead to memory content exposures. Prefer strnlen().
Returns number of characters in p (NOT including the final NUL).
-
size_t strlcpy(char *const p, const char *const q, size_t size)¶
Copy a string into another string buffer
Parameters
char * const p
pointer to destination of copy
const char * const q
pointer to NUL-terminated source string to copy
size_t size
maximum number of bytes to write at p
Description
If strlen(q) >= size, the copy of q will be truncated at size - 1 bytes. p will always be NUL-terminated.
Do not use this function. While FORTIFY_SOURCE tries to avoid over-reads when calculating strlen(q), it is still possible. Prefer strscpy(), though note its different return values for detecting truncation.
Returns strlen(q), the length of the source string, regardless of truncation. If the return value is >= size, the copy was truncated.
-
ssize_t strscpy(char *const p, const char *const q, size_t size)¶
Copy a C-string into a sized buffer
Parameters
char * const p
Where to copy the string to
const char * const q
Where to copy the string from
size_t size
Size of destination buffer
Description
Copy the source string q, or as much of it as fits, into the destination p buffer. The behavior is undefined if the string buffers overlap. The destination p buffer is always NUL terminated, unless it's zero-sized.
Preferred to strlcpy() since the API doesn't require reading memory from the source q string beyond the specified size bytes, and since the return value is easier to error-check than strlcpy()'s. In addition, the implementation is robust to the string changing out from underneath it, unlike the current strlcpy() implementation.
Preferred to strncpy() since it always returns a valid string, and doesn't unnecessarily force the tail of the destination buffer to be zero padded. If padding is desired please use strscpy_pad().
Returns the number of characters copied in p (not including the trailing NUL) or -E2BIG if size is 0 or the copy of q was truncated.
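The truncation contract of strscpy() can be sketched in userspace as below. sketch_strscpy() is a hypothetical name and a byte-at-a-time illustration only; it does not reproduce the kernel implementation, just the documented results (always NUL-terminated unless size is 0, and -E2BIG on truncation).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Copies at most size - 1 characters and always terminates dst. */
ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    if (size == 0)
        return -E2BIG;           /* nothing can be written, not even a NUL */

    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';

    /* src[i] is still within the first size bytes of the source. */
    return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

A caller detects truncation with a single comparison, `if (sketch_strscpy(buf, name, sizeof(buf)) < 0)`, instead of comparing a length against the buffer size as strlcpy() requires.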
-
size_t strlcat(char *const p, const char *const q, size_t avail)¶
Append a string to an existing string
Parameters
char * const p
pointer to NUL-terminated string to append to
const char * const q
pointer to NUL-terminated string to append from
size_t avail
Maximum bytes available in p
Description
Appends NUL-terminated string q after the NUL-terminated string at p, but will not write beyond avail bytes total, potentially truncating the copy from q. p will stay NUL-terminated only if a NUL already existed within the avail bytes of p. If so, the resulting number of bytes copied from q will be at most "avail - strlen(p) - 1".
Do not use this function. While FORTIFY_SOURCE tries to avoid read and write overflows, this is only possible when the sizes of p and q are known to the compiler. Prefer building the string with formatting, via scnprintf(), seq_buf, or similar.
Returns total bytes that _would_ have been contained by p regardless of truncation, similar to snprintf(). If return value is >= avail, the string has been truncated.
-
char *strcat(char *const p, const char *q)¶
Append a string to an existing string
Parameters
char * const p
pointer to NUL-terminated string to append to
const char *q
pointer to NUL-terminated source string to append from
Description
Do not use this function. While FORTIFY_SOURCE tries to avoid read and write overflows, this is only possible when the destination buffer size is known to the compiler. Prefer building the string with formatting, via scnprintf() or similar. At the very least, use strncat().
Returns p.
-
char *strncat(char *const p, const char *const q, __kernel_size_t count)¶
Append a string to an existing string
Parameters
char * const p
pointer to NUL-terminated string to append to
const char * const q
pointer to source string to append from
__kernel_size_t count
Maximum bytes to read from q
Description
Appends at most count bytes from q (stopping at the first NUL byte) after the NUL-terminated string at p. p will be NUL-terminated.
Do not use this function. While FORTIFY_SOURCE tries to avoid read and write overflows, this is only possible when the sizes of p and q are known to the compiler. Prefer building the string with formatting, via scnprintf() or similar.
Returns p.
-
char *strcpy(char *const p, const char *const q)¶
Copy a string into another string buffer
Parameters
char * const p
pointer to destination of copy
const char * const q
pointer to NUL-terminated source string to copy
Description
Do not use this function. While FORTIFY_SOURCE tries to avoid overflows, this is only possible when the sizes of q and p are known to the compiler. Prefer strscpy(), though note its different return values for detecting truncation.
Returns p.
-
int strncasecmp(const char *s1, const char *s2, size_t len)¶
Case insensitive, length-limited string comparison
Parameters
const char *s1
One string
const char *s2
The other string
size_t len
the maximum number of characters to compare
-
char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)¶
copy a string from src to dest returning a pointer to the new end of dest, including src's NUL-terminator. May overrun dest.
Parameters
char *__restrict__ dest
pointer to end of string being copied into. Must be large enough to receive copy.
const char *__restrict__ src
pointer to the beginning of string being copied from. Must not overlap dest.
Description
stpcpy differs from strcpy in a key way: the return value is a pointer to the new NUL-terminating character in dest. (For strcpy, the return value is a pointer to the start of dest). This interface is considered unsafe as it doesn't perform bounds checking of the inputs. As such it's not recommended for usage. Instead, its definition is provided in case the compiler lowers other libcalls to stpcpy.
-
int strcmp(const char *cs, const char *ct)¶
Compare two strings
Parameters
const char *cs
One string
const char *ct
Another string
-
int strncmp(const char *cs, const char *ct, size_t count)¶
Compare two length-limited strings
Parameters
const char *cs
One string
const char *ct
Another string
size_t count
The maximum number of bytes to compare
-
char *strchr(const char *s, int c)¶
Find the first occurrence of a character in a string
Parameters
const char *s
The string to be searched
int c
The character to search for
Description
Note that the NUL-terminator is considered part of the string, and can be searched for.
-
char *strchrnul(const char *s, int c)¶
Find and return a character in a string, or end of string
Parameters
const char *s
The string to be searched
int c
The character to search for
Description
Returns pointer to first occurrence of 'c' in s. If c is not found, then return a pointer to the null byte at the end of s.
-
char *strrchr(const char *s, int c)¶
Find the last occurrence of a character in a string
Parameters
const char *s
The string to be searched
int c
The character to search for
-
char *strnchr(const char *s, size_t count, int c)¶
Find a character in a length limited string
Parameters
const char *s
The string to be searched
size_t count
The number of characters to be searched
int c
The character to search for
Description
Note that the NUL-terminator is considered part of the string, and can be searched for.
-
size_t strspn(const char *s, const char *accept)¶
Calculate the length of the initial substring of s which contains only characters in accept
Parameters
const char *s
The string to be searched
const char *accept
The string to search for
-
size_t strcspn(const char *s, const char *reject)¶
Calculate the length of the initial substring of s which does not contain characters in reject
Parameters
const char *s
The string to be searched
const char *reject
The string to avoid
-
char *strpbrk(const char *cs, const char *ct)¶
Find the first occurrence of a set of characters
Parameters
const char *cs
The string to be searched
const char *ct
The characters to search for
-
char *strsep(char **s, const char *ct)¶
Split a string into tokens
Parameters
char **s
The string to be searched
const char *ct
The characters to search for
Description
strsep() updates s to point after the token, ready for the next call.
It returns empty tokens, too, behaving exactly like the libc function of that name. In fact, it was stolen from glibc2 and de-fancy-fied. Same semantics, slimmer shape. ;)
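The cursor update and the empty-token behaviour can be sketched in userspace. sketch_strsep() is a hypothetical name built on the standard strpbrk(); it is meant as an illustration of the semantics described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the next token and advances *s past the delimiter. */
char *sketch_strsep(char **s, const char *ct)
{
    char *begin, *end;

    assert(s != NULL && ct != NULL);

    begin = *s;
    if (begin == NULL)
        return NULL;             /* previous call consumed the last token */

    end = strpbrk(begin, ct);    /* first delimiter, if any */
    if (end)
        *end++ = '\0';           /* terminate token, step past delimiter */
    *s = end;                    /* NULL once no delimiter remains */

    return begin;
}
```

Splitting "a,,b" on "," with this sketch yields "a", then an empty string, then "b", and finally NULL, so callers that want to ignore empty fields have to skip them explicitly.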
-
void *memset(void *s, int c, size_t count)¶
Fill a region of memory with the given value
Parameters
void *s
Pointer to the start of the area.
int c
The byte to fill the area with
size_t count
The size of the area.
Description
Do not use memset() to access IO space, use memset_io() instead.
-
void *memset16(uint16_t *s, uint16_t v, size_t count)¶
Fill a memory area with a uint16_t
Parameters
uint16_t *s
Pointer to the start of the area.
uint16_t v
The value to fill the area with
size_t count
The number of values to store
Description
Differs from memset() in that it fills with a uint16_t instead of a byte. Remember that count is the number of uint16_ts to store, not the number of bytes.
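The element-count convention is the usual source of mistakes with the memset16/32/64 helpers. The userspace sketch below uses the hypothetical name sketch_memset16() and a plain loop to show that count is in units of uint16_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores count copies of v; count is in elements, not bytes. */
void *sketch_memset16(uint16_t *s, uint16_t v, size_t count)
{
    uint16_t *p = s;

    assert(s != NULL || count == 0);

    while (count--)
        *p++ = v;
    return s;
}
```

For an array declared as `uint16_t buf[4]`, the matching count is 4 (for example `sizeof(buf) / sizeof(buf[0])`), not `sizeof(buf)`, which would be 8 and overrun the array.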
-
void *memset32(uint32_t *s, uint32_t v, size_t count)¶
Fill a memory area with a uint32_t
Parameters
uint32_t *s
Pointer to the start of the area.
uint32_t v
The value to fill the area with
size_t count
The number of values to store
Description
Differs from memset() in that it fills with a uint32_t instead of a byte. Remember that count is the number of uint32_ts to store, not the number of bytes.
-
void *memset64(uint64_t *s, uint64_t v, size_t count)¶
Fill a memory area with a uint64_t
Parameters
uint64_t *s
Pointer to the start of the area.
uint64_t v
The value to fill the area with
size_t count
The number of values to store
Description
Differs from memset() in that it fills with a uint64_t instead of a byte. Remember that count is the number of uint64_ts to store, not the number of bytes.
-
void *memcpy(void *dest, const void *src, size_t count)¶
Copy one area of memory to another
Parameters
void *dest
Where to copy to
const void *src
Where to copy from
size_t count
The size of the area.
Description
You should not use this function to access IO space, use memcpy_toio() or memcpy_fromio() instead.
-
void *memmove(void *dest, const void *src, size_t count)¶
Copy one area of memory to another
Parameters
void *dest
Where to copy to
const void *src
Where to copy from
size_t count
The size of the area.
Description
Unlike memcpy(), memmove() copes with overlapping areas.
-
__visible int memcmp(const void *cs, const void *ct, size_t count)¶
Compare two areas of memory
Parameters
const void *cs
One area of memory
const void *ct
Another area of memory
size_t count
The size of the area.
-
int bcmp(const void *a, const void *b, size_t len)¶
returns 0 if and only if the buffers have identical contents.
Parameters
const void *a
pointer to first buffer.
const void *b
pointer to second buffer.
size_t len
size of buffers.
Description
The sign or magnitude of a non-zero return value has no particular meaning, and architectures may implement their own more efficient bcmp(). So while this particular implementation is a simple (tail) call to memcmp, do not rely on anything but whether the return value is zero or non-zero.
-
void *memscan(void *addr, int c, size_t size)¶
Find a character in an area of memory.
Parameters
void *addr
The memory area
int c
The byte to search for
size_t size
The size of the area.
Description
returns the address of the first occurrence of c, or 1 byte past the area if c is not found
-
char *strstr(const char *s1, const char *s2)¶
Find the first substring in a NUL terminated string
Parameters
const char *s1
The string to be searched
const char *s2
The string to search for
-
char *strnstr(const char *s1, const char *s2, size_t len)¶
Find the first substring in a length-limited string
Parameters
const char *s1
The string to be searched
const char *s2
The string to search for
size_t len
the maximum number of characters to search
-
void *memchr(const void *s, int c, size_t n)¶
Find a character in an area of memory.
Parameters
const void *s
The memory area
int c
The byte to search for
size_t n
The size of the area.
Description
returns the address of the first occurrence of c, or NULL if c is not found
-
void *memchr_inv(const void *start, int c, size_t bytes)¶
Find an unmatching character in an area of memory.
Parameters
const void *start
The memory area
int c
Find a character other than c
size_t bytes
The size of the area.
Description
returns the address of the first character other than c, or NULL if the whole buffer contains just c.
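A common use of memchr_inv() is checking that a buffer is entirely zero (or entirely some fill byte). The userspace sketch below, with the hypothetical name sketch_memchr_inv(), shows the semantics with a simple byte loop; it makes no attempt to mirror how the kernel implements the search.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first byte that differs from c, or NULL if none does. */
void *sketch_memchr_inv(const void *start, int c, size_t bytes)
{
    const unsigned char *p = start;
    unsigned char value = (unsigned char)c;
    size_t i;

    assert(start != NULL || bytes == 0);

    for (i = 0; i < bytes; i++) {
        if (p[i] != value)
            return (void *)(p + i);
    }
    return NULL;                 /* the whole area contains just c */
}
```

A zero check then reads `if (sketch_memchr_inv(buf, 0, len) == NULL)`, which is true exactly when every byte of the area is 0.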
-
sysfs_match_string¶
sysfs_match_string (_a, _s)
matches given string in an array
Parameters
_a
array of strings
_s
string to match with
Description
Helper for __sysfs_match_string(). Calculates the size of _a automatically.
-
bool strstarts(const char *str, const char *prefix)¶
does str start with prefix?
Parameters
const char *str
string to examine
const char *prefix
prefix to look for.
-
void memzero_explicit(void *s, size_t count)¶
Fill a region of memory (e.g. sensitive keying data) with 0s.
Parameters
void *s
Pointer to the start of the area.
size_t count
The size of the area.
Note
usually using memset() is just fine (!), but in cases where clearing out _local_ data at the end of a scope is necessary, memzero_explicit() should be used instead in order to prevent the compiler from optimising away zeroing.
Description
memzero_explicit() doesn't need an arch-specific version as it just invokes the one of memset() implicitly.
-
const char *kbasename(const char *path)¶
return the last part of a pathname.
Parameters
const char *path
path to extract the filename from.
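A hedged userspace sketch of the "last part of a pathname" rule follows. sketch_kbasename() is a hypothetical name; it assumes the last component is whatever follows the final '/', so a path ending in '/' yields an empty string in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer into path, just past the last '/' if there is one. */
const char *sketch_kbasename(const char *path)
{
    const char *tail;

    assert(path != NULL);

    tail = strrchr(path, '/');
    return tail ? tail + 1 : path;
}
```

No memory is allocated: the result points into the caller's string and stays valid only as long as path does.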
-
strtomem_pad¶
strtomem_pad (dest, src, pad)
Copy NUL-terminated string to non-NUL-terminated buffer
Parameters
dest
Pointer of destination character array (marked as __nonstring)
src
Pointer to NUL-terminated string
pad
Padding character to fill any remaining bytes of dest after copy
Description
This is a replacement for strncpy() uses where the destination is not a NUL-terminated string, but with bounds checking on the source size, and an explicit padding character. If padding is not required, use strtomem().
Note that the size of dest is not an argument, as the length of dest must be discoverable by the compiler.
-
strtomem¶
strtomem (dest, src)
Copy NUL-terminated string to non-NUL-terminated buffer
Parameters
dest
Pointer of destination character array (marked as __nonstring)
src
Pointer to NUL-terminated string
Description
This is a replacement for strncpy() uses where the destination is not a NUL-terminated string, but with bounds checking on the source size, and without trailing padding. If padding is required, use strtomem_pad().
Note that the size of dest is not an argument, as the length of dest must be discoverable by the compiler.
-
memset_after¶
memset_after (obj, v, member)
Set a value after a struct member to the end of a struct
Parameters
obj
Address of target struct instance
v
Byte value to repeatedly write
member
after which struct member to start writing bytes
Description
This is good for clearing padding following the given member.
-
memset_startat¶
memset_startat (obj, v, member)
Set a value starting at a member to the end of a struct
Parameters
obj
Address of target struct instance
v
Byte value to repeatedly write
member
struct member to start writing at
Description
Note that if there is padding between the prior member and the target member, memset_after() should be used to clear the prior padding.
-
size_t str_has_prefix(const char *str, const char *prefix)¶
Test if a string has a given prefix
Parameters
const char *str
The string to test
const char *prefix
The string to see if str starts with
Description
A common way to test a prefix of a string is to do:
strncmp(str, prefix, sizeof(prefix) - 1)
But this can lead to bugs due to typos, or if prefix is a pointer and not a constant. Instead use str_has_prefix().
Return
strlen(prefix) if str starts with prefix
0 if str does not start with prefix
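The return convention (prefix length on a match, 0 otherwise) lets a caller test and skip the prefix in one step. The userspace sketch below uses the hypothetical name sketch_str_has_prefix() and takes the length with strlen(), which avoids the sizeof-on-a-pointer pitfall mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns strlen(prefix) if str starts with prefix, 0 otherwise. */
size_t sketch_str_has_prefix(const char *str, const char *prefix)
{
    size_t len;

    assert(str != NULL && prefix != NULL);

    /* strlen() is correct for pointers and arrays alike. */
    len = strlen(prefix);
    return strncmp(str, prefix, len) == 0 ? len : 0;
}
```

A typical caller writes `len = sketch_str_has_prefix(arg, "mode="); if (len) value = arg + len;`. Note that an empty prefix returns 0 in this sketch, which is indistinguishable from "no match".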
-
char *kstrdup(const char *s, gfp_t gfp)¶
allocate space for and copy an existing string
Parameters
const char *s
the string to duplicate
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
Return
newly allocated copy of s or NULL in case of error
-
const char *kstrdup_const(const char *s, gfp_t gfp)¶
conditionally duplicate an existing const string
Parameters
const char *s
the string to duplicate
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
Note
Strings allocated by kstrdup_const should be freed by kfree_const and must not be passed to krealloc().
Return
source string if it is in the .rodata section; otherwise falls back to kstrdup.
-
char *kstrndup(const char *s, size_t max, gfp_t gfp)¶
allocate space for and copy an existing string
Parameters
const char *s
the string to duplicate
size_t max
read at most max chars from s
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
Note
Use kmemdup_nul() instead if the size is known exactly.
Return
newly allocated copy of s or NULL in case of error
-
void *kmemdup(const void *src, size_t len, gfp_t gfp)¶
duplicate region of memory
Parameters
const void *src
memory region to duplicate
size_t len
memory region length
gfp_t gfp
GFP mask to use
Return
newly allocated copy of src or NULL in case of error. The result is physically contiguous. Use kfree() to free.
-
char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)¶
Create a NUL-terminated string from unterminated data
Parameters
const char *s
The data to stringify
size_t len
The size of the data
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
Return
newly allocated copy of s with NUL-termination or NULL in case of error
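As a rough illustration of what kmemdup_nul() provides, here is a userspace sketch under stated assumptions: malloc() stands in for kmalloc(), so there is no gfp argument, and model_memdup_nul() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace model of kmemdup_nul(): copy len bytes of possibly
 * unterminated data and append a NUL terminator.
 */
static char *model_memdup_nul(const char *s, size_t len)
{
	char *buf;

	assert(s != NULL);
	buf = malloc(len + 1);	/* one extra byte for the terminator */
	if (!buf)
		return NULL;
	memcpy(buf, s, len);
	buf[len] = '\0';
	return buf;
}
```

The caller owns the result and releases it with free() in this model; the kernel counterpart is released with kfree().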
-
void *memdup_user(const void __user *src, size_t len)¶
duplicate memory region from user space
Parameters
const void __user *src
source address in user space
size_t len
number of bytes to copy
Return
an ERR_PTR() on failure. Result is physically contiguous, to be freed by kfree().
-
void *vmemdup_user(const void __user *src, size_t len)¶
duplicate memory region from user space
Parameters
const void __user *src
source address in user space
size_t len
number of bytes to copy
Return
an ERR_PTR() on failure. Result may not be physically contiguous. Use kvfree() to free.
-
char *strndup_user(const char __user *s, long n)¶
duplicate an existing string from user space
Parameters
const char __user *s
The string to duplicate
long n
Maximum number of bytes to copy, including the trailing NUL.
Return
newly allocated copy of s or an ERR_PTR() in case of error
-
void *memdup_user_nul(const void __user *src, size_t len)¶
duplicate memory region from user space and NUL-terminate
Parameters
const void __user *src
source address in user space
size_t len
number of bytes to copy
Return
an ERR_PTR() on failure.
Basic Kernel Library Functions¶
The Linux kernel provides more basic utility functions.
Bit Operations¶
-
void set_bit(long nr, volatile unsigned long *addr)¶
Atomically set a bit in memory
Parameters
long nr
the bit to set
volatile unsigned long *addr
the address to start counting from
Description
This is a relaxed atomic operation (no implied memory barriers).
Note that nr may be almost arbitrarily large; this function is not restricted to acting on a single-word quantity.
-
void clear_bit(long nr, volatile unsigned long *addr)¶
Clears a bit in memory
Parameters
long nr
Bit to clear
volatile unsigned long *addr
Address to start counting from
Description
This is a relaxed atomic operation (no implied memory barriers).
-
void change_bit(long nr, volatile unsigned long *addr)¶
Toggle a bit in memory
Parameters
long nr
Bit to change
volatile unsigned long *addr
Address to start counting from
Description
This is a relaxed atomic operation (no implied memory barriers).
Note that nr may be almost arbitrarily large; this function is not restricted to acting on a single-word quantity.
-
bool test_and_set_bit(long nr, volatile unsigned long *addr)¶
Set a bit and return its old value
Parameters
long nr
Bit to set
volatile unsigned long *addr
Address to count from
Description
This is an atomic fully-ordered operation (implied full memory barrier).
-
bool test_and_clear_bit(long nr, volatile unsigned long *addr)¶
Clear a bit and return its old value
Parameters
long nr
Bit to clear
volatile unsigned long *addr
Address to count from
Description
This is an atomic fully-ordered operation (implied full memory barrier).
-
bool test_and_change_bit(long nr, volatile unsigned long *addr)¶
Change a bit and return its old value
Parameters
long nr
Bit to change
volatile unsigned long *addr
Address to count from
Description
This is an atomic fully-ordered operation (implied full memory barrier).
-
void ___set_bit(unsigned long nr, volatile unsigned long *addr)¶
Set a bit in memory
Parameters
unsigned long nr
the bit to set
volatile unsigned long *addr
the address to start counting from
Description
Unlike set_bit(), this function is non-atomic. If it is called on the same region of memory concurrently, the effect may be that only one operation succeeds.
-
void ___clear_bit(unsigned long nr, volatile unsigned long *addr)¶
Clears a bit in memory
Parameters
unsigned long nr
the bit to clear
volatile unsigned long *addr
the address to start counting from
Description
Unlike clear_bit(), this function is non-atomic. If it is called on the same region of memory concurrently, the effect may be that only one operation succeeds.
-
void ___change_bit(unsigned long nr, volatile unsigned long *addr)¶
Toggle a bit in memory
Parameters
unsigned long nr
the bit to change
volatile unsigned long *addr
the address to start counting from
Description
Unlike change_bit(), this function is non-atomic. If it is called on the same region of memory concurrently, the effect may be that only one operation succeeds.
-
bool ___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)¶
Set a bit and return its old value
Parameters
unsigned long nr
Bit to set
volatile unsigned long *addr
Address to count from
Description
This operation is non-atomic. If two instances of this operation race, one can appear to succeed but actually fail.
-
bool ___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)¶
Clear a bit and return its old value
Parameters
unsigned long nr
Bit to clear
volatile unsigned long *addr
Address to count from
Description
This operation is non-atomic. If two instances of this operation race, one can appear to succeed but actually fail.
-
bool ___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)¶
Change a bit and return its old value
Parameters
unsigned long nr
Bit to change
volatile unsigned long *addr
Address to count from
Description
This operation is non-atomic. If two instances of this operation race, one can appear to succeed but actually fail.
-
bool _test_bit(unsigned long nr, volatile const unsigned long *addr)¶
Determine whether a bit is set
Parameters
unsigned long nr
bit number to test
const volatile unsigned long *addr
Address to start counting from
-
bool _test_bit_acquire(unsigned long nr, volatile const unsigned long *addr)¶
Determine, with acquire semantics, whether a bit is set
Parameters
unsigned long nr
bit number to test
const volatile unsigned long *addr
Address to start counting from
-
void clear_bit_unlock(long nr, volatile unsigned long *addr)¶
Clear a bit in memory, for unlock
Parameters
long nr
the bit to clear
volatile unsigned long *addr
the address to start counting from
Description
This operation is atomic and provides release barrier semantics.
-
void __clear_bit_unlock(long nr, volatile unsigned long *addr)¶
Clears a bit in memory
Parameters
long nr
Bit to clear
volatile unsigned long *addr
Address to start counting from
Description
This is a non-atomic operation but implies a release barrier before the memory operation. It can be used for an unlock if no other CPUs can concurrently modify other bits in the word.
-
bool test_and_set_bit_lock(long nr, volatile unsigned long *addr)¶
Set a bit and return its old value, for lock
Parameters
long nr
Bit to set
volatile unsigned long *addr
Address to count from
Description
This operation is atomic and provides acquire barrier semantics if the returned value is 0. It can be used to implement bit locks.
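The lock and unlock pair described above can be approximated in userspace with C11 atomics. This is only a sketch of the ordering idea, assuming that a fetch-or with acquire ordering and a fetch-and with release ordering are acceptable stand-ins; the model_* names and MODEL_LOCK_BIT are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock bit number inside a flags word. */
#define MODEL_LOCK_BIT 0

/* Models test_and_set_bit_lock(): atomic fetch-or with acquire ordering. */
static bool model_test_and_set_bit_lock(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask;

	assert(nr < CHAR_BIT * sizeof(unsigned long));
	mask = 1UL << nr;
	return (atomic_fetch_or_explicit(addr, mask, memory_order_acquire) & mask) != 0;
}

/* Models clear_bit_unlock(): atomic fetch-and with release ordering. */
static void model_clear_bit_unlock(unsigned int nr, atomic_ulong *addr)
{
	assert(nr < CHAR_BIT * sizeof(unsigned long));
	atomic_fetch_and_explicit(addr, ~(1UL << nr), memory_order_release);
}

/* Try to take the bit lock once; returns true if the caller now owns it. */
static bool model_bit_trylock(atomic_ulong *flags)
{
	/* An old value of 0 means the bit was clear, so we acquired the lock. */
	return !model_test_and_set_bit_lock(MODEL_LOCK_BIT, flags);
}
```

The other bits of the word stay usable as ordinary flags, since both operations touch only the selected bit.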
-
bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)¶
Clear a bit in memory and test if bottom byte is negative, for unlock.
Parameters
long nr
the bit to clear
volatile unsigned long *addr
the address to start counting from
Description
This operation is atomic and provides release barrier semantics.
This is a bit of a one-trick-pony for the filemap code, which clears PG_locked and tests PG_waiters.
Bitmap Operations¶
Bitmaps provide an array of bits, implemented using an array of unsigned longs. The number of valid bits in a given bitmap does _not_ need to be an exact multiple of BITS_PER_LONG.
The possible unused bits in the last, partially used word of a bitmap are 'don't care'. The implementation makes no particular effort to keep them zero. It ensures that their value will not affect the results of any operation. The bitmap operations that return Boolean (bitmap_empty, for example) or scalar (bitmap_weight, for example) results carefully filter out these unused bits from impacting their results.
The byte ordering of bitmaps is more natural on little endian architectures. See the big-endian headers include/asm-ppc64/bitops.h and include/asm-s390/bitops.h for the best explanations of this ordering.
The DECLARE_BITMAP(name,bits) macro, in linux/types.h, can be used to declare an array named 'name' of just enough unsigned longs to contain all bit positions from 0 to 'bits' - 1.
The available bitmap operations and their rough meaning in the case that the bitmap is a single unsigned long are thus:
The generated code is more efficient when nbits is known at compile-time and at most BITS_PER_LONG.
bitmap_zero(dst, nbits) *dst = 0UL
bitmap_fill(dst, nbits) *dst = ~0UL
bitmap_copy(dst, src, nbits) *dst = *src
bitmap_and(dst, src1, src2, nbits) *dst = *src1 & *src2
bitmap_or(dst, src1, src2, nbits) *dst = *src1 | *src2
bitmap_xor(dst, src1, src2, nbits) *dst = *src1 ^ *src2
bitmap_andnot(dst, src1, src2, nbits) *dst = *src1 & ~(*src2)
bitmap_complement(dst, src, nbits) *dst = ~(*src)
bitmap_equal(src1, src2, nbits) Are *src1 and *src2 equal?
bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap?
bitmap_subset(src1, src2, nbits) Is *src1 a subset of *src2?
bitmap_empty(src, nbits) Are all bits zero in *src?
bitmap_full(src, nbits) Are all bits set in *src?
bitmap_weight(src, nbits) Hamming Weight: number of set bits
bitmap_weight_and(src1, src2, nbits) Hamming Weight of and'ed bitmap
bitmap_set(dst, pos, nbits) Set specified bit area
bitmap_clear(dst, pos, nbits) Clear specified bit area
bitmap_find_next_zero_area(buf, len, pos, n, mask) Find bit free area
bitmap_find_next_zero_area_off(buf, len, pos, n, mask, mask_off) as above
bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n
bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
bitmap_cut(dst, src, first, n, nbits) Cut n bits from first, copy rest
bitmap_replace(dst, old, new, mask, nbits) *dst = (*old & ~(*mask)) | (*new & *mask)
bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src)
bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit)
bitmap_onto(dst, orig, relmap, nbits) *dst = orig relative to relmap
bitmap_fold(dst, orig, sz, nbits) dst bits = orig bits mod sz
bitmap_parse(buf, buflen, dst, nbits) Parse bitmap dst from kernel buf
bitmap_parse_user(ubuf, ulen, dst, nbits) Parse bitmap dst from user buf
bitmap_parselist(buf, dst, nbits) Parse bitmap dst from kernel buf
bitmap_parselist_user(ubuf, ulen, dst, nbits) Parse bitmap dst from user buf
bitmap_find_free_region(bitmap, bits, order) Find and allocate bit region
bitmap_release_region(bitmap, pos, order) Free specified bit region
bitmap_allocate_region(bitmap, pos, order) Allocate specified bit region
bitmap_from_arr32(dst, buf, nbits) Copy nbits from u32[] buf to dst
bitmap_from_arr64(dst, buf, nbits) Copy nbits from u64[] buf to dst
bitmap_to_arr32(buf, src, nbits) Copy nbits from src to u32[] buf
bitmap_to_arr64(buf, src, nbits) Copy nbits from src to u64[] buf
bitmap_get_value8(map, start) Get 8bit value from map at start
bitmap_set_value8(map, value, start) Set 8bit value to map at start
Note that bitmap_zero() and bitmap_fill() operate over whole unsigned longs; that is, bits past the end of the bitmap, up to the next unsigned long boundary, will be zeroed or filled as well. Consider using bitmap_clear() or bitmap_set() to zero or fill exactly the intended bits.
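The difference between whole-word and exact-range clearing can be seen in a small userspace sketch. The model_* functions and MODEL_* macros below are hypothetical stand-ins written for this illustration; they only mirror the behaviour described in the note above.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define MODEL_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
#define MODEL_BITS_TO_LONGS(n) \
	(((n) + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

/* Models bitmap_zero(): clears every word that holds any of the nbits bits. */
static void model_bitmap_zero(unsigned long *dst, unsigned int nbits)
{
	assert(dst != NULL);
	memset(dst, 0, MODEL_BITS_TO_LONGS(nbits) * sizeof(unsigned long));
}

/* Models bitmap_clear(): clears exactly bits [start, start + len). */
static void model_bitmap_clear(unsigned long *map, unsigned int start,
			       unsigned int len)
{
	assert(map != NULL);
	for (unsigned int i = start; i < start + len; i++)
		map[i / MODEL_BITS_PER_LONG] &= ~(1UL << (i % MODEL_BITS_PER_LONG));
}
```

With a 10-bit bitmap stored in one word, the first function wipes the entire word while the second leaves bits 10 and above untouched.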
Also the following operations in asm/bitops.h apply to bitmaps:
set_bit(bit, addr) *addr |= bit
clear_bit(bit, addr) *addr &= ~bit
change_bit(bit, addr) *addr ^= bit
test_bit(bit, addr) Is bit set in *addr?
test_and_set_bit(bit, addr) Set bit and return old value
test_and_clear_bit(bit, addr) Clear bit and return old value
test_and_change_bit(bit, addr) Change bit and return old value
find_first_zero_bit(addr, nbits) Position first zero bit in *addr
find_first_bit(addr, nbits) Position first set bit in *addr
find_next_zero_bit(addr, nbits, bit) Position next zero bit in *addr >= bit
find_next_bit(addr, nbits, bit) Position next set bit in *addr >= bit
find_next_and_bit(addr1, addr2, nbits, bit) Same as find_next_bit, but in (*addr1 & *addr2)
-
void __bitmap_shift_right(unsigned long *dst, const unsigned long *src, unsigned shift, unsigned nbits)¶
logical right shift of the bits in a bitmap
Parameters
unsigned long *dst
destination bitmap
const unsigned long *src
source bitmap
unsigned shift
shift by this many bits
unsigned nbits
bitmap size, in bits
Description
Shifting right (dividing) means moving bits in the MS -> LS bit direction. Zeros are fed into the vacated MS positions and the LS bits shifted off the bottom are lost.
-
void __bitmap_shift_left(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits)¶
logical left shift of the bits in a bitmap
Parameters
unsigned long *dst
destination bitmap
const unsigned long *src
source bitmap
unsigned int shift
shift by this many bits
unsigned int nbits
bitmap size, in bits
Description
Shifting left (multiplying) means moving bits in the LS -> MS direction. Zeros are fed into the vacated LS bit positions and those MS bits shifted off the top are lost.
-
void bitmap_cut(unsigned long *dst, const unsigned long *src, unsigned int first, unsigned int cut, unsigned int nbits)¶
remove bit region from bitmap and right shift remaining bits
Parameters
unsigned long *dst
destination bitmap, might overlap with src
const unsigned long *src
source bitmap
unsigned int first
start bit of region to be removed
unsigned int cut
number of bits to remove
unsigned int nbits
bitmap size, in bits
Description
Set the n-th bit of dst iff either n is less than first and the n-th bit of src is set, or first <= n < nbits and the m-th bit of src is set, where m = n + cut.
In pictures, example for a big-endian 32-bit architecture:
The src bitmap is:
31                                   63
|                                    |
10000000 11000001 11110010 00010101  10000000 11000001 01110010 00010101
                |  |              |                                    |
               16  14             0                                   32
If cut is 3 and first is 14, bits 14-16 in src are cut and dst is:
31                                   63
|                                    |
10110000 00011000 00110010 00010101  00010000 00011000 00101110 01000010
                   |              |                                    |
                   14 (bit 17     0                                   32
                       from src)
Note that dst and src might overlap partially or entirely.
This is implemented in the obvious way, with a shift and carry step for each moved bit. Optimisation is left as an exercise for the compiler.
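The rule for which source bit lands in each destination bit can be written as a bit-by-bit userspace model. This is a sketch of the semantics only, not the shift-and-carry implementation mentioned above; it assumes dst and src are either the same buffer or do not overlap, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

static bool model_test_bit(const unsigned long *map, unsigned int n)
{
	return (map[n / MODEL_BITS_PER_LONG] >> (n % MODEL_BITS_PER_LONG)) & 1UL;
}

static void model_assign_bit(unsigned long *map, unsigned int n, bool value)
{
	unsigned long mask = 1UL << (n % MODEL_BITS_PER_LONG);

	if (value)
		map[n / MODEL_BITS_PER_LONG] |= mask;
	else
		map[n / MODEL_BITS_PER_LONG] &= ~mask;
}

/*
 * Model of the bitmap_cut() semantics: bits below first are copied as is,
 * bits at or above first come from position n + cut in src, and positions
 * whose source would lie at or beyond nbits become zero.
 */
static void model_bitmap_cut(unsigned long *dst, const unsigned long *src,
			     unsigned int first, unsigned int cut,
			     unsigned int nbits)
{
	assert(first <= nbits);
	for (unsigned int n = 0; n < nbits; n++) {
		unsigned int m = n < first ? n : n + cut;

		model_assign_bit(dst, n, m < nbits && model_test_bit(src, m));
	}
}
```

Walking n upwards means every source bit is read before its position is overwritten, which is why the in-place case (dst == src) is safe in this model.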
-
unsigned long bitmap_find_next_zero_area_off(unsigned long *map, unsigned long size, unsigned long start, unsigned int nr, unsigned long align_mask, unsigned long align_offset)¶
find a contiguous aligned zero area
Parameters
unsigned long *map
The address to base the search on
unsigned long size
The bitmap size in bits
unsigned long start
The bitnumber to start searching at
unsigned int nr
The number of zeroed bits we're looking for
unsigned long align_mask
Alignment mask for zero area
unsigned long align_offset
Alignment offset for zero area.
Description
The align_mask should be one less than a power of 2; the effect is that the bit offset of every zero area this function finds, plus align_offset, is a multiple of that power of 2.
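The alignment rule can be illustrated with a naive userspace search. This is a sketch under stated assumptions, not the kernel algorithm: it scans one candidate offset at a time, the model_* names are hypothetical, and returning size to mean "not found" is a convention of this model only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

static bool model_test_bit(const unsigned long *map, unsigned long n)
{
	return (map[n / MODEL_BITS_PER_LONG] >> (n % MODEL_BITS_PER_LONG)) & 1UL;
}

/*
 * Find the lowest index >= start such that bits [index, index + nr) are
 * all zero and (index + align_offset) is a multiple of (align_mask + 1).
 */
static unsigned long model_find_zero_area_off(const unsigned long *map,
					      unsigned long size,
					      unsigned long start,
					      unsigned int nr,
					      unsigned long align_mask,
					      unsigned long align_offset)
{
	/* align_mask must be one less than a power of two. */
	assert((align_mask & (align_mask + 1)) == 0);

	for (unsigned long index = start; index + nr <= size; index++) {
		unsigned int i;

		if ((index + align_offset) & align_mask)
			continue;	/* candidate is not aligned */
		for (i = 0; i < nr; i++)
			if (model_test_bit(map, index + i))
				break;	/* area is not entirely zero */
		if (i == nr)
			return index;
	}
	return size;	/* model convention: a result >= size means not found */
}
```

With align_mask 3 the accepted offsets are those where index + align_offset is a multiple of 4, so a non-zero align_offset shifts the whole grid of candidates.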
-
int bitmap_parse_user(const char __user *ubuf, unsigned int ulen, unsigned long *maskp, int nmaskbits)¶
convert an ASCII hex string in a user buffer into a bitmap
Parameters
const char __user *ubuf
pointer to user buffer containing string.
unsigned int ulen
buffer size in bytes. If string is smaller than this then it must be terminated with a '\0'.
unsigned long *maskp
pointer to bitmap array that will contain result.
int nmaskbits
size of bitmap, in bits.
-
int bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp, int nmaskbits)¶
convert bitmap to list or hex format ASCII string
Parameters
bool list
indicates whether the bitmap must be printed in list format (true) or in hex format (false)
char *buf
page aligned buffer into which string is placed
const unsigned long *maskp
pointer to bitmap to convert
int nmaskbits
size of bitmap, in bits
Description
Output format is a comma-separated list of decimal numbers and ranges if list is specified, or hex digits grouped into comma-separated sets of 8 digits/set.
It is assumed that buf is a pointer into a PAGE_SIZE, page-aligned area and that sufficient storage remains at buf to accommodate the bitmap_print_to_pagebuf() output. Returns the number of characters actually printed to buf, excluding the terminating '\0'.
-
int bitmap_print_bitmask_to_buf(char *buf, const unsigned long *maskp, int nmaskbits, loff_t off, size_t count)¶
convert bitmap to hex bitmask format ASCII string
Parameters
char *buf
buffer into which string is placed
const unsigned long *maskp
pointer to bitmap to convert
int nmaskbits
size of bitmap, in bits
loff_t off
offset in the string from which we are copying; the data is copied to buf
size_t count
the maximum number of bytes to print
Description
bitmap_print_to_pagebuf() is used indirectly via its cpumap wrapper cpumap_print_to_pagebuf() or directly by drivers to export a hexadecimal bitmask and a decimal list to userspace through the sysfs ABI.
Drivers might use a normal attribute for this kind of ABI. A normal attribute typically has a show entry as below:
static ssize_t example_attribute_show(struct device *dev,
		struct device_attribute *attr, char *buf)
{
	...
	return bitmap_print_to_pagebuf(true, buf, &mask, nr_trig_max);
}
The show entry of an attribute has no offset and count parameters, which means the file is limited to one page only. The bitmap_print_to_pagebuf() API works well for this kind of normal attribute, with a buf parameter and without offset and count:
bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
			int nmaskbits)
{
}
The problem is that once we have a large bitmap, the bitmask or list may be longer than one page. For a list in particular, it could be as complex as 0,3,5,7,9,... and there is no simple way to know its exact size. It turns out that bin_attribute is a way to break this limit. A bin_attribute has a show entry as below:
static ssize_t
example_bin_attribute_show(struct file *filp, struct kobject *kobj,
		struct bin_attribute *attr, char *buf,
		loff_t offset, size_t count)
{
	...
}
With the new offset and count parameters, the sysfs ABI is able to support a file size of more than one page. For example, offset could be >= 4096.
bitmap_print_bitmask_to_buf() and bitmap_print_list_to_buf(), with their cpumap wrappers cpumap_print_bitmask_to_buf() and cpumap_print_list_to_buf(), enable those drivers to support large bitmasks and lists after they move to bin_attribute. As a result, the corresponding parameters such as off and count have to be passed from the bin_attribute show entry to this API.
The role of cpumap_print_bitmask_to_buf() and cpumap_print_list_to_buf() is similar to that of cpumap_print_to_pagebuf(). The difference is that bitmap_print_to_pagebuf() mainly serves sysfs attributes, with the assumption that the destination buffer is exactly one page and won't be more than one page. cpumap_print_bitmask_to_buf() and cpumap_print_list_to_buf(), on the other hand, mainly serve bin_attribute, which is not restricted to exactly one page, so they can break the size limit of the converted decimal list and hexadecimal bitmask.
WARNING!
This function is not a replacement for sprintf() or bitmap_print_to_pagebuf(). It is intended to work around the sysfs limitations discussed above and should be used carefully in the general case for the following reasons:
Time complexity is O(nbits^2/count), compared to O(nbits) for snprintf().
Memory complexity is O(nbits), compared to O(1) for snprintf().
off and count are NOT offset and number of bits to print.
If printing part of a bitmap as a list, the resulting string is not a correct list representation of the bitmap. In particular, some bits within or outside the related interval may be erroneously set or unset. The format of the string may be broken, so a bitmap_parselist-like parser may fail to parse it.
If printing the whole bitmap as a list by parts, the user must order the calls of the function such that the offset is incremented linearly.
If printing the whole bitmap as a list by parts, the user must keep the bitmap unchanged between the very first and very last call. Otherwise the concatenated result may be incorrect, and the format may be broken.
Returns the number of characters actually printed to buf
Returns the number of characters actually printed to buf
-
int bitmap_print_list_to_buf(char *buf, const unsigned long *maskp, int nmaskbits, loff_t off, size_t count)¶
convert bitmap to decimal list format ASCII string
Parameters
char *buf
buffer into which string is placed
const unsigned long *maskp
pointer to bitmap to convert
int nmaskbits
size of bitmap, in bits
loff_t off
offset in the string from which we are copying; the data is copied to buf
size_t count
the maximum number of bytes to print
Description
Everything is the same as for bitmap_print_bitmask_to_buf() above, except the print format.
-
int bitmap_parselist(const char *buf, unsigned long *maskp, int nmaskbits)¶
convert list format ASCII string to bitmap
Parameters
const char *buf
read user string from this buffer; must be terminated with a '\0' or '\n'.
unsigned long *maskp
write resulting mask here
int nmaskbits
number of bits in mask to be written
Description
Input format is a comma-separated list of decimal numbers and ranges. Consecutively set bits are shown as two hyphen-separated decimal numbers, the smallest and largest bit numbers set in the range. Optionally, each range can be postfixed to denote that only parts of it should be set. The range will be divided into groups of a specific size, and only the defined number of bits from each group will be used. Syntax: range:used_size/group_size
Example
0-1023:2/256 ==> 0,1,256,257,512,513,768,769
The value 'N' can be used as a dynamically substituted token for the maximum allowed value; i.e., (nmaskbits - 1). Keep in mind that it is dynamic, so if system changes cause the bitmap width to change, such as more cores in a CPU list, then any ranges using N will also change.
Return
0 on success, -errno on invalid input strings. Error values:
-EINVAL: wrong region format
-EINVAL: invalid character in string
-ERANGE: bit number specified too large for mask
-EOVERFLOW: integer overflow in the input parameters
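The range:used_size/group_size syntax described above can be modelled by a small userspace routine that sets the bits for one already-parsed region. This is a sketch of the expansion rule only, with no string parsing; the model_* names and MODEL_* macros are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
#define MODEL_BITS_TO_LONGS(n) \
	(((n) + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

static bool model_test_bit(const unsigned long *map, unsigned int n)
{
	return (map[n / MODEL_BITS_PER_LONG] >> (n % MODEL_BITS_PER_LONG)) & 1UL;
}

static void model_set_bit(unsigned long *map, unsigned int n)
{
	map[n / MODEL_BITS_PER_LONG] |= 1UL << (n % MODEL_BITS_PER_LONG);
}

/*
 * Set the bits selected by one region "start-end:used/group": the range
 * is split into groups of `group` bits and the first `used` bits of each
 * group are set, never going past `end`.
 */
static void model_set_region(unsigned long *map, unsigned int start,
			     unsigned int end, unsigned int used,
			     unsigned int group)
{
	assert(start <= end && group > 0 && used <= group);
	for (unsigned int base = start; base <= end; base += group)
		for (unsigned int i = 0; i < used && base + i <= end; i++)
			model_set_bit(map, base + i);
}
```

Calling model_set_region(map, 0, 1023, 2, 256) on a zeroed 1024-bit map reproduces the example above: bits 0, 1, 256, 257, 512, 513, 768 and 769 end up set.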
-
int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen, unsigned long *maskp, int nmaskbits)¶
convert user buffer's list format ASCII string to bitmap
Parameters
const char __user *ubuf
pointer to user buffer containing string.
unsigned int ulen
buffer size in bytes. If string is smaller than this then it must be terminated with a '\0'.
unsigned long *maskp
pointer to bitmap array that will contain result.
int nmaskbits
size of bitmap, in bits.
Description
Wrapper for bitmap_parselist(), providing it with user buffer.
-
int bitmap_parse(const char *start, unsigned int buflen, unsigned long *maskp, int nmaskbits)¶
convert an ASCII hex string into a bitmap.
Parameters
const char *start
pointer to buffer containing string.
unsigned int buflen
buffer size in bytes. If string is smaller than this then it must be terminated with a '\0' or '\n'. In that case, UINT_MAX may be provided instead of string length.
unsigned long *maskp
pointer to bitmap array that will contain result.
int nmaskbits
size of bitmap, in bits.
Description
Commas group hex digits into chunks. Each chunk defines exactly 32 bits of the resultant bitmask. No chunk may specify a value larger than 32 bits (-EOVERFLOW), and if a chunk specifies a smaller value then leading 0-bits are prepended. -EINVAL is returned for illegal characters. Grouping such as "1,,5", ",44", "," or "" is allowed. Leading, embedded and trailing whitespace is accepted.
-
void bitmap_remap(unsigned long *dst, const unsigned long *src, const unsigned long *old, const unsigned long *new, unsigned int nbits)¶
Apply map defined by a pair of bitmaps to another bitmap
Parameters
unsigned long *dst
remapped result
const unsigned long *src
subset to be remapped
const unsigned long *old
defines domain of map
const unsigned long *new
defines range of map
unsigned int nbits
number of bits in each of these bitmaps
Description
Let old and new define a mapping of bit positions, such that whatever position is held by the n-th set bit in old is mapped to the n-th set bit in new. In the more general case, allowing for the possibility that the weight 'w' of new is less than the weight of old, map the position of the n-th set bit in old to the position of the m-th set bit in new, where m == n % w.
If either of the old and new bitmaps are empty, or if src and dst point to the same location, then this routine copies src to dst.
The positions of unset bits in old are mapped to themselves (the identity map).
Apply the above specified mapping to src, placing the result in dst, clearing any bits previously set in dst.
For example, let's say that old has bits 4 through 7 set, and new has bits 12 through 15 set. This defines the mapping of bit position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, with all other bit positions unchanged. So if, say, src comes into this routine with bits 1, 5 and 7 set, then dst should leave with bits 1, 13 and 15 set.
-
int bitmap_bitremap(int oldbit, const unsigned long *old, const unsigned long *new, int bits)¶
Apply map defined by a pair of bitmaps to a single bit
Parameters
int oldbit
bit position to be mapped
const unsigned long *old
defines domain of map
const unsigned long *new
defines range of map
int bits
number of bits in each of these bitmaps
Description
Let old and new define a mapping of bit positions, such that whatever position is held by the n-th set bit in old is mapped to the n-th set bit in new. In the more general case, allowing for the possibility that the weight 'w' of new is less than the weight of old, map the position of the n-th set bit in old to the position of the m-th set bit in new, where m == n % w.
The positions of unset bits in old are mapped to themselves (the identity map).
Apply the above specified mapping to bit position oldbit, returning the new bit position.
For example, let's say that old has bits 4 through 7 set, and new has bits 12 through 15 set. This defines the mapping of bit position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, with all other bit positions unchanged. So if, say, oldbit is 5, then this routine returns 13.
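The mapping rule, including the m == n % w wrap-around, can be sketched in userspace as follows. The model_* functions are hypothetical and use plain loops rather than the kernel's bit-search helpers; the parameters are named old_map and new_map here purely for readability.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

static bool model_test_bit(const unsigned long *map, unsigned int n)
{
	return (map[n / MODEL_BITS_PER_LONG] >> (n % MODEL_BITS_PER_LONG)) & 1UL;
}

/* Ordinal of the set bit at pos (0 for the lowest set bit), or -1. */
static int model_pos_to_ord(const unsigned long *map, unsigned int pos,
			    unsigned int nbits)
{
	int ord = 0;

	if (pos >= nbits || !model_test_bit(map, pos))
		return -1;
	for (unsigned int i = 0; i < pos; i++)
		ord += model_test_bit(map, i);
	return ord;
}

/* Position of the ord-th set bit (counting from 0), or -1 if none. */
static int model_ord_to_pos(const unsigned long *map, int ord,
			    unsigned int nbits)
{
	for (unsigned int i = 0; i < nbits; i++)
		if (model_test_bit(map, i) && ord-- == 0)
			return (int)i;
	return -1;
}

/* Model of bitmap_bitremap(): map oldbit through the old -> new mapping. */
static int model_bitremap(int oldbit, const unsigned long *old_map,
			  const unsigned long *new_map, unsigned int nbits)
{
	int w = 0;
	int n;

	assert(oldbit >= 0);
	for (unsigned int i = 0; i < nbits; i++)
		w += model_test_bit(new_map, i);	/* weight of new */
	n = model_pos_to_ord(old_map, (unsigned int)oldbit, nbits);
	if (n < 0 || w == 0)
		return oldbit;			/* identity map */
	return model_ord_to_pos(new_map, n % w, nbits);
}
```

With old holding bits 4 through 7 and new holding bits 12 through 15, bit 5 maps to 13 and bit 1 maps to itself, matching the example above.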
-
int bitmap_find_free_region(unsigned long *bitmap, unsigned int bits, int order)¶
find a contiguous aligned mem region
Parameters
unsigned long *bitmap
array of unsigned longs corresponding to the bitmap
unsigned int bits
number of bits in the bitmap
int order
region size (log base 2 of number of bits) to find
Description
Find a region of free (zero) bits in a bitmap of bits bits and allocate them (set them to one). Only consider regions of length a power (order) of two, aligned to that power of two, which makes the search algorithm much faster.
Return the bit offset in bitmap of the allocated region, or -errno on failure.
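Restricting the search to naturally aligned power-of-two regions means only bits / 2^order candidates need checking. A userspace sketch of that search, assuming -ENOMEM as the failure value and using hypothetical model_* names:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

static bool model_test_bit(const unsigned long *map, unsigned int n)
{
	return (map[n / MODEL_BITS_PER_LONG] >> (n % MODEL_BITS_PER_LONG)) & 1UL;
}

static void model_set_bit(unsigned long *map, unsigned int n)
{
	map[n / MODEL_BITS_PER_LONG] |= 1UL << (n % MODEL_BITS_PER_LONG);
}

/*
 * Find the first aligned region of 2^order zero bits, mark it allocated
 * by setting its bits, and return its starting bit offset.
 */
static int model_find_free_region(unsigned long *map, unsigned int bits,
				  int order)
{
	unsigned int len;

	assert(order >= 0 && order < 31);
	len = 1U << order;
	for (unsigned int pos = 0; pos + len <= bits; pos += len) {
		unsigned int i;

		for (i = 0; i < len; i++)
			if (model_test_bit(map, pos + i))
				break;
		if (i < len)
			continue;	/* region is busy, try the next one */
		for (i = 0; i < len; i++)
			model_set_bit(map, pos + i);
		return (int)pos;
	}
	return -ENOMEM;	/* assumption: failure value used by this model */
}
```

Stepping pos by the region length is what keeps every candidate aligned to that power of two.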
-
void bitmap_release_region(unsigned long *bitmap, unsigned int pos, int order)¶
release allocated bitmap region
Parameters
unsigned long *bitmap
array of unsigned longs corresponding to the bitmap
unsigned int pos
beginning of bit region to release
int order
region size (log base 2 of number of bits) to release
Description
This is the complement to bitmap_find_free_region() and releases the found region (by clearing it in the bitmap).
No return value.
-
int bitmap_allocate_region(unsigned long *bitmap, unsigned int pos, int order)¶
allocate bitmap region
Parameters
unsigned long *bitmap
array of unsigned longs corresponding to the bitmap
unsigned int pos
beginning of bit region to allocate
int order
region size (log base 2 of number of bits) to allocate
Description
Allocate (set bits in) a specified region of a bitmap.
Return 0 on success, or -EBUSY if specified region wasn't free (not all bits were zero).
-
void bitmap_copy_le(unsigned long *dst, const unsigned long *src, unsigned int nbits)¶
copy a bitmap, putting the bits into little-endian order.
Parameters
unsigned long *dst
destination buffer
const unsigned long *src
bitmap to copy
unsigned int nbits
number of bits in the bitmap
Description
Require nbits % BITS_PER_LONG == 0.
-
void bitmap_from_arr32(unsigned long *bitmap, const u32 *buf, unsigned int nbits)¶
copy the contents of u32 array of bits to bitmap
Parameters
unsigned long *bitmap
array of unsigned longs, the destination bitmap
const u32 *buf
array of u32 (in host byte order), the source bitmap
unsigned int nbits
number of bits in bitmap
-
void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits)¶
copy the contents of bitmap to a u32 array of bits
Parameters
u32 *buf
array of u32 (in host byte order), the dest bitmap
const unsigned long *bitmap
array of unsigned longs, the source bitmap
unsigned int nbits
number of bits in bitmap
-
void bitmap_from_arr64(unsigned long *bitmap, const u64 *buf, unsigned int nbits)¶
copy the contents of u64 array of bits to bitmap
Parameters
unsigned long *bitmap
array of unsigned longs, the destination bitmap
const u64 *buf
array of u64 (in host byte order), the source bitmap
unsigned int nbits
number of bits in bitmap
-
void bitmap_to_arr64(u64 *buf, const unsigned long *bitmap, unsigned int nbits)¶
copy the contents of bitmap to a u64 array of bits
Parameters
u64 *buf
array of u64 (in host byte order), the dest bitmap
const unsigned long *bitmap
array of unsigned longs, the source bitmap
unsigned int nbits
number of bits in bitmap
-
int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp, int nmaskbits, loff_t off, size_t count)¶
convert bitmap to list or hex format ASCII string
Parameters
bool list
indicates whether the bitmap must be printed as a list. true: print in decimal list format; false: print in hexadecimal bitmask format
char *buf
buffer into which string is placed
const unsigned long *maskp
pointer to bitmap to convert
int nmaskbits
size of bitmap, in bits
loff_t off
offset in the string from which we are copying; the data is copied to buf
size_t count
the maximum number of bytes to print
-
int bitmap_pos_to_ord(const unsigned long *buf, unsigned int pos, unsigned int nbits)¶
find ordinal of set bit at given position in bitmap
Parameters
const unsigned long *buf
pointer to a bitmap
unsigned int pos
a bit position in buf (0 <= pos < nbits)
unsigned int nbits
number of valid bit positions in buf
Description
Map the bit at position pos in buf (of length nbits) to the ordinal of which set bit it is. If it is not set or if pos is not a valid bit position, map to -1.
If for example, just bits 4 through 7 are set in buf, then pos values 4 through 7 will get mapped to 0 through 3, respectively, and other pos values will get mapped to -1. When pos value 7 gets mapped to (returns) ord value 3 in this example, that means that bit 7 is the 3rd (starting with 0th) set bit in buf.
The bit positions 0 through nbits - 1 are valid positions in buf.
-
void bitmap_onto(unsigned long *dst, const unsigned long *orig, const unsigned long *relmap, unsigned int bits)¶
translate one bitmap relative to another
Parameters
unsigned long *dst
resulting translated bitmap
const unsigned long *orig
original untranslated bitmap
const unsigned long *relmap
bitmap relative to which translated
unsigned int bits
number of bits in each of these bitmaps
Description
Set the n-th bit of dst iff there exists some m such that the n-th bit of relmap is set, the m-th bit of orig is set, and the n-th bit of relmap is also the m-th _set_ bit of relmap. (If you understood the previous sentence the first time you read it, you're overqualified for your current job.)
In other words, orig is mapped onto (surjectively) dst, using the map { <n, m> | the n-th bit of relmap is the m-th set bit of relmap }.
Any set bits in orig at or above bit number W, where W is the weight of (number of set bits in) relmap, are mapped nowhere. In particular, if for all bits m set in orig, m >= W, then dst will end up empty. In situations where the possibility of such an empty result is not desired, one way to avoid it is to use the bitmap_fold() operator, below, to first fold the orig bitmap over itself so that all its set bits x are in the range 0 <= x < W. The bitmap_fold() operator does this by setting the bit (m % W) in dst, for each bit (m) set in orig.
- Example [1] for bitmap_onto():
Let's say relmap has bits 30-39 set, and orig has bits 1, 3, 5, 7, 9 and 11 set. Then on return from this routine, dst will have bits 31, 33, 35, 37 and 39 set.
When bit 0 is set in orig, it means turn on the bit in dst corresponding to whatever is the first bit (if any) that is turned on in relmap. Since bit 0 was off in the above example, we leave off that bit (bit 30) in dst.
When bit 1 is set in orig (as in the above example), it means turn on the bit in dst corresponding to whatever is the second bit that is turned on in relmap. The second bit in relmap that was turned on in the above example was bit 31, so we turned on bit 31 in dst.
Similarly, we turned on bits 33, 35, 37 and 39 in dst, because they were the 4th, 6th, 8th and 10th set bits in relmap, and the 4th, 6th, 8th and 10th bits of orig (i.e. bits 3, 5, 7 and 9) were also set.
When bit 11 is set in orig, it means turn on the bit in dst corresponding to whatever is the twelfth bit that is turned on in relmap. In the above example, there were only ten bits turned on in relmap (30..39), so the fact that bit 11 was set in orig had no effect on dst.
- Example [2] for bitmap_fold() + bitmap_onto():
Let's say relmap has these ten bits set:
40 41 42 43 45 48 53 61 74 95
(for the curious, that's 40 plus the first ten terms of the Fibonacci sequence.)
Further let's say we use the following code, invoking bitmap_fold() then bitmap_onto(), as suggested above to avoid the possibility of an empty dst result:
unsigned long *tmp; // a temporary bitmap's bits
bitmap_fold(tmp, orig, bitmap_weight(relmap, bits), bits);
bitmap_onto(dst, tmp, relmap, bits);
Then this table shows what various values of dst would be, for various orig's. I list the zero-based positions of each set bit. The tmp column shows the intermediate result, as computed by using
bitmap_fold()
to fold the orig bitmap modulo ten (the weight of relmap):
- 1(1,2)
For these marked lines, if we hadn't first done
bitmap_fold()
into tmp, then the dst result would have been empty.
If either of orig or relmap is empty (no set bits), then dst will be returned empty.
If (as explained above) the only set bits in orig are in positions m where m >= W, (where W is the weight of relmap) then dst will once again be returned empty.
All bits in dst not set by the above rule are cleared.
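The rules for bitmap_onto() and bitmap_fold() can be checked with a small userspace model. This is a sketch, not the kernel code: it stores one bool per bit instead of packed unsigned longs, and the names model_fold, model_onto and MODEL_BITS are made up for this illustration. dst must not alias orig or relmap.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_BITS 256	/* arbitrary bitmap size chosen for this sketch */

/* Model of bitmap_fold(): set bit (m % sz) in dst for each bit m set in orig. */
static void model_fold(bool *dst, const bool *orig, unsigned int sz,
		       unsigned int nbits)
{
	memset(dst, 0, nbits * sizeof(*dst));
	for (unsigned int m = 0; sz && m < nbits; m++)
		if (orig[m])
			dst[m % sz] = true;
}

/*
 * Model of bitmap_onto(): if bit n of relmap is its m-th set bit and
 * bit m of orig is set, then bit n of dst is set. All other bits of
 * dst are cleared.
 */
static void model_onto(bool *dst, const bool *orig, const bool *relmap,
		       unsigned int bits)
{
	unsigned int m = 0;	/* ordinal of the current set bit in relmap */

	memset(dst, 0, bits * sizeof(*dst));
	for (unsigned int n = 0; n < bits; n++) {
		if (!relmap[n])
			continue;
		if (orig[m])
			dst[n] = true;
		m++;
	}
}
```

Feeding Example [1] into model_onto() (relmap bits 30-39, orig bits 1, 3, 5, 7, 9 and 11) yields dst bits 31, 33, 35, 37 and 39. With the relmap of Example [2] and orig bits 78, 102 and 211, folding modulo ten first gives tmp bits 8, 2 and 1, which map onto dst bits 74, 42 and 41.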
-
void bitmap_fold(unsigned long *dst, const unsigned long *orig, unsigned int sz, unsigned int nbits)¶
fold larger bitmap into smaller, modulo specified size
Parameters
unsigned long *dst
resulting smaller bitmap
const unsigned long *orig
original larger bitmap
unsigned int sz
specified size
unsigned int nbits
number of bits in each of these bitmaps
Description
For each bit oldbit in orig, set bit oldbit mod sz in dst.
Clear all other bits in dst. See further the comment and
Example [2] for bitmap_onto()
for why and how to use this.
-
unsigned long bitmap_find_next_zero_area(unsigned long *map, unsigned long size, unsigned long start, unsigned int nr, unsigned long align_mask)¶
find a contiguous aligned zero area
Parameters
unsigned long *map
The address to base the search on
unsigned long size
The bitmap size in bits
unsigned long start
The bitnumber to start searching at
unsigned int nr
The number of zeroed bits we're looking for
unsigned long align_mask
Alignment mask for zero area
Description
The align_mask should be one less than a power of 2; the effect is that the bit offset of every zero area this function finds is a multiple of that power of 2. An align_mask of 0 means no alignment is required.
-
bool bitmap_or_equal(const unsigned long *src1, const unsigned long *src2, const unsigned long *src3, unsigned int nbits)¶
Check whether the or of two bitmaps is equal to a third
Parameters
const unsigned long *src1
Pointer to bitmap 1
const unsigned long *src2
Pointer to bitmap 2 will be or'ed with bitmap 1
const unsigned long *src3
Pointer to bitmap 3. Compare to the result of *src1 | *src2
unsigned int nbits
number of bits in each of these bitmaps
Return
True if (*src1 | *src2) == *src3, false otherwise
-
BITMAP_FROM_U64¶
BITMAP_FROM_U64 (n)
Represent u64 value in the format suitable for bitmap.
Parameters
n
u64 value
Description
Linux bitmaps are internally arrays of unsigned longs, i.e. 32-bit integers in 32-bit environment, and 64-bit integers in 64-bit one.
There are four combinations of endianness and length of the word in linux ABIs: LE64, BE64, LE32 and BE32.
On 64-bit kernels 64-bit LE and BE numbers are naturally ordered in bitmaps and therefore don't require any special handling.
On 32-bit kernels 32-bit LE ABI orders lo word of 64-bit number in memory prior to hi, and 32-bit BE orders hi word prior to lo. The bitmap on the other hand is represented as an array of 32-bit words and the position of bit N may therefore be calculated as: word #(N/32) and bit #(N%32) in that word. For example, bit #42 is located at 10th position of 2nd word. It matches 32-bit LE ABI, and we can simply let the compiler store 64-bit values in memory as it usually does. But for BE we need to swap hi and lo words manually.
With all that, the macro BITMAP_FROM_U64()
does explicit reordering of hi and
lo parts of u64. For LE32 it does nothing, and for BE environment it swaps
hi and lo words, as is expected by bitmap.
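The word/bit arithmetic above can be illustrated with a short userspace sketch. It models the layout a bitmap expects with 32-bit words (lo word first), independent of host endianness. The helper names are hypothetical and this is not the macro's actual definition.

```c
#include <stdint.h>

/*
 * Split a 64-bit value into the two 32-bit words a bitmap expects:
 * word 0 holds bits 0..31, word 1 holds bits 32..63.
 */
static void model_u64_to_words32(uint64_t n, uint32_t words[2])
{
	words[0] = (uint32_t)(n & 0xffffffffu);	/* lo word comes first */
	words[1] = (uint32_t)(n >> 32);		/* hi word comes second */
}

/* Bit N lives in word N / 32, at position N % 32 within that word. */
static int model_test_bit32(const uint32_t *words, unsigned int bit)
{
	return (int)((words[bit / 32] >> (bit % 32)) & 1);
}
```

For the value with only bit 42 set, words[0] is 0 and words[1] has bit 10 set, which is the "10th position of 2nd word" mentioned in the text.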
-
void bitmap_from_u64(unsigned long *dst, u64 mask)¶
Check and swap words within u64.
Parameters
unsigned long *dst
destination bitmap
u64 mask
source bitmap
Description
In 32-bit Big Endian kernel, when using (u32 *)(&val)[*] to read u64 mask, we will get the wrong word. That is (u32 *)(&val)[0] gets the upper 32 bits, but we expect the lower 32 bits of u64.
-
unsigned long bitmap_get_value8(const unsigned long *map, unsigned long start)¶
get an 8-bit value within a memory region
Parameters
const unsigned long *map
address to the bitmap memory region
unsigned long start
bit offset of the 8-bit value; must be a multiple of 8
Description
Returns the 8-bit value located at the start bit offset within the map memory region.
-
void bitmap_set_value8(unsigned long *map, unsigned long value, unsigned long start)¶
set an 8-bit value within a memory region
Parameters
unsigned long *map
address to the bitmap memory region
unsigned long value
the 8-bit value; values wider than 8 bits may clobber bitmap
unsigned long start
bit offset of the 8-bit value; must be a multiple of 8
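A userspace sketch of the two 8-bit accessors, assuming start is a multiple of 8 as required above. The model_ names and MODEL_BITS_PER_LONG are stand-ins for illustration. The code is not copied from the kernel headers.

```c
#include <limits.h>
#include <stddef.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read the byte that starts at bit offset 'start'. */
static unsigned long model_get_value8(const unsigned long *map,
				      unsigned long start)
{
	size_t index = start / MODEL_BITS_PER_LONG;
	unsigned long offset = start % MODEL_BITS_PER_LONG;

	return (map[index] >> offset) & 0xFF;
}

/* Overwrite the byte that starts at bit offset 'start'. */
static void model_set_value8(unsigned long *map, unsigned long value,
			     unsigned long start)
{
	size_t index = start / MODEL_BITS_PER_LONG;
	unsigned long offset = start % MODEL_BITS_PER_LONG;

	map[index] &= ~(0xFFUL << offset);	/* clear the old byte */
	map[index] |= value << offset;		/* value is not masked here */
}
```

Because value is not masked in this sketch, passing something wider than 8 bits spills into the neighbouring bits, which is the clobbering the parameter description warns about.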
Command-line Parsing¶
-
int get_option(char **str, int *pint)¶
Parse integer from an option string
Parameters
char **str
option string
int *pint
(optional output) integer value parsed from str
Read an int from an option string; if available accept a subsequent comma as well.
When pint is NULL the function can be used as a validator of the current option in the string.
Return values:
0 - no int in string
1 - int found, no subsequent comma
2 - int found including a subsequent comma
3 - hyphen found to denote a range
A leading hyphen without an integer is treated as the no-integer case, but we consume it for the sake of simplification.
-
char *get_options(const char *str, int nints, int *ints)¶
Parse a string into a list of integers
Parameters
const char *str
String to be parsed
int nints
size of integer array
int *ints
integer array (must have room for at least one element)
This function parses a string containing a comma-separated list of integers, a hyphen-separated range of _positive_ integers, or a combination of both. The parse halts when the array is full, or when no more numbers can be retrieved from the string.
When nints is 0, the function just validates the given str and returns the number of parseable integers as described below.
Return
The first element is filled by the number of collected integers in the range. The rest is what was parsed from the str.
Return value is the character in the string which caused the parse to end (typically a null terminator, if str is completely parseable).
-
unsigned long long memparse(const char *ptr, char **retptr)¶
parse a string with mem suffixes into a number
Parameters
const char *ptr
Where parse begins
char **retptr
(output) Optional pointer to next char after parse completes
Parses a string into a number. The number stored at ptr is potentially suffixed with K, M, G, T, P, E.
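The suffix handling can be modelled in userspace with strtoull(). This is a sketch under assumptions: model_memparse is a hypothetical name, each suffix is taken to be a power of 1024 (K = 2^10 up to E = 2^60), and lowercase suffixes are accepted as well.

```c
#include <stdlib.h>

/* Userspace model: parse a number, then scale by an optional size suffix. */
static unsigned long long model_memparse(const char *ptr, char **retptr)
{
	char *end;
	unsigned long long ret = strtoull(ptr, &end, 0);
	unsigned int shift = 0;

	switch (*end) {
	case 'E': case 'e': shift = 60; break;
	case 'P': case 'p': shift = 50; break;
	case 'T': case 't': shift = 40; break;
	case 'G': case 'g': shift = 30; break;
	case 'M': case 'm': shift = 20; break;
	case 'K': case 'k': shift = 10; break;
	default: break;
	}

	if (shift) {
		ret <<= shift;
		end++;		/* consume the suffix character */
	}
	if (retptr)
		*retptr = end;	/* first character after the parsed value */

	return ret;
}
```

For example, "64K" parses to 65536 with retptr left at the terminating null, and "2M,rest" parses to 2097152 with retptr left at the comma.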
Error Pointers¶
-
IS_ERR_VALUE¶
IS_ERR_VALUE (x)
Detect an error pointer.
Parameters
x
The pointer to check.
Description
Like IS_ERR()
, but does not generate a compiler warning if result is unused.
-
void *ERR_PTR(long error)¶
Create an error pointer.
Parameters
long error
A negative error code.
Description
Encodes error into a pointer value. Users should consider the result opaque and not assume anything about how the error is encoded.
Return
A pointer with error encoded within its value.
-
long PTR_ERR(__force const void *ptr)¶
Extract the error code from an error pointer.
Parameters
__force const void *ptr
An error pointer.
Return
The error code within ptr.
-
bool IS_ERR(__force const void *ptr)¶
Detect an error pointer.
Parameters
__force const void *ptr
The pointer to check.
Return
true if ptr is an error pointer, false otherwise.
-
bool IS_ERR_OR_NULL(__force const void *ptr)¶
Detect an error pointer or a null pointer.
Parameters
__force const void *ptr
The pointer to check.
Description
Like IS_ERR()
, but also returns true for a null pointer.
-
void *ERR_CAST(__force const void *ptr)¶
Explicitly cast an error-valued pointer to another pointer type
Parameters
__force const void *ptr
The pointer to cast.
Description
Explicitly cast an error-valued pointer to another pointer type in such a way as to make it clear that's what's going on.
-
int PTR_ERR_OR_ZERO(__force const void *ptr)¶
Extract the error code from a pointer if it has one.
Parameters
__force const void *ptr
A potential error pointer.
Description
Convenience function that can be used inside a function that returns
an error code to propagate errors received as error pointers.
For example, return PTR_ERR_OR_ZERO(ptr);
replaces:
if (IS_ERR(ptr))
        return PTR_ERR(ptr);
else
        return 0;
Return
The error code within ptr if it is an error pointer; 0 otherwise.
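How the error-pointer helpers fit together can be shown with a userspace model. The encoding below (small negative values stored in the top of the address range, bounded by an assumed MODEL_MAX_ERRNO of 4095) is one possible scheme chosen for illustration. Callers of the real API should keep treating the encoding as opaque. All model_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095	/* assumed size of the reserved error range */

/* Encode a negative error code in a pointer value. */
static void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the error code from an error pointer. */
static long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for values in the reserved range at the top of the address space. */
static bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Error code if ptr is an error pointer, 0 otherwise. */
static int model_ptr_err_or_zero(const void *ptr)
{
	return model_is_err(ptr) ? (int)model_ptr_err(ptr) : 0;
}
```

In this model a NULL pointer is not an error pointer, which is why a separate "or NULL" check exists in the API above.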
Sorting¶
-
void sort_r(void *base, size_t num, size_t size, cmp_r_func_t cmp_func, swap_r_func_t swap_func, const void *priv)¶
sort an array of elements
Parameters
void *base
pointer to data to sort
size_t num
number of elements
size_t size
size of each element
cmp_r_func_t cmp_func
pointer to comparison function
swap_r_func_t swap_func
pointer to swap function or NULL
const void *priv
third argument passed to comparison function
Description
This function does a heapsort on the given array. You may provide a swap_func function if you need to do something more than a memory copy (e.g. fix up pointers or auxiliary data), but the built-in swap avoids a slow retpoline and so is significantly faster.
Sorting time is O(n log n) both on average and worst-case. While quicksort is slightly faster on average, it suffers from exploitable O(n*n) worst-case behavior and extra memory requirements that make it less suitable for kernel use.
-
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)¶
sort a list
Parameters
void *priv
private data, opaque to list_sort(), passed to cmp
struct list_head *head
the list to sort
list_cmp_func_t cmp
the elements comparison function
Description
The comparison function cmp must return > 0 if a should sort after b ("a > b" if you want an ascending sort), and <= 0 if a should sort before b or their original order should be preserved. It is always called with the element that came first in the input in a, and list_sort is a stable sort, so it is not necessary to distinguish the a < b and a == b cases.
This is compatible with two styles of cmp function:
- The traditional style which returns <0 / =0 / >0, or
- Returning a boolean 0/1.
The latter offers a chance to save a few cycles in the comparison (which is used by e.g. plug_ctx_cmp() in block/blk-mq.c).
A good way to write a multi-word comparison is:
if (a->high != b->high)
        return a->high > b->high;
if (a->middle != b->middle)
        return a->middle > b->middle;
return a->low > b->low;
This mergesort is as eager as possible while always performing at least 2:1 balanced merges. Given two pending sublists of size 2^k, they are merged to a size-2^(k+1) list as soon as we have 2^k following elements.
Thus, it will avoid cache thrashing as long as 3*2^k elements can fit into the cache. Not quite as good as a fully-eager bottom-up mergesort, but it does use 0.2*n fewer comparisons, so is faster in the common case that everything fits into L1.
The merging is controlled by "count", the number of elements in the pending lists. This is beautifully simple code, but rather subtle.
Each time we increment "count", we set one bit (bit k) and clear bits k-1 .. 0. Each time this happens (except the very first time for each bit, when count increments to 2^k), we merge two lists of size 2^k into one list of size 2^(k+1).
This merge happens exactly when the count reaches an odd multiple of 2^k, which is when we have 2^k elements pending in smaller lists, so it's safe to merge away two lists of size 2^k.
After this happens twice, we have created two lists of size 2^(k+1), which will be merged into a list of size 2^(k+2) before we create a third list of size 2^(k+1), so there are never more than two pending.
The number of pending lists of size 2^k is determined by the state of bit k of "count" plus two extra pieces of information:
The state of bit k-1 (when k == 0, consider bit -1 always set), and
Whether the higher-order bits are zero or non-zero (i.e. is count >= 2^(k+1)).
There are six states we distinguish. "x" represents some arbitrary bits, and "y" represents some arbitrary non-zero bits:
0: 00x: 0 pending of size 2^k; x pending of sizes < 2^k
1: 01x: 0 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
2: x10x: 0 pending of size 2^k; 2^k + x pending of sizes < 2^k
3: x11x: 1 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
4: y00x: 1 pending of size 2^k; 2^k + x pending of sizes < 2^k
5: y01x: 2 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k (merge and loop back to state 2)
We gain lists of size 2^k in the 2->3 and 4->5 transitions (because bit k-1 is set while the more significant bits are non-zero) and merge them away in the 5->2 transition. Note in particular that just before the 5->2 transition, all lower-order bits are 11 (state 3), so there is one list of each smaller size.
When we reach the end of the input, we merge all the pending lists, from smallest to largest. If you work through cases 2 to 5 above, you can see that the number of elements we merge with a list of size 2^k varies from 2^(k-1) (cases 3 and 5 when x == 0) to 2^(k+1) - 1 (second merge of case 5 when x == 2^(k-1) - 1).
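The role of "count" can be explored with a small simulation that tracks only the sizes of the pending sublists, not the list nodes themselves. This is a sketch written from the description above, not the list_sort() source. The names model_pending and MODEL_MAX_PENDING are hypothetical. The number of trailing one bits in count selects how deep in the pending stack the merge happens, and no merge is done when no bit remains above the lowest clear bit.

```c
#include <stddef.h>

#define MODEL_MAX_PENDING 64	/* ample for the element counts used here */

/*
 * Feed 'total' elements one at a time and record the sizes of the
 * pending sublists in sizes[], oldest first. Returns the number of
 * pending sublists, or -1 if a merge would combine unequal sizes.
 */
static int model_pending(size_t total, size_t sizes[MODEL_MAX_PENDING])
{
	int n = 0;

	for (size_t count = 0; count < total; count++) {
		size_t bits = count;
		int depth = 0;

		/* Skip one pending list per trailing one bit of count. */
		for (; bits & 1; bits >>= 1)
			depth++;

		/* Merge only if a set bit remains above the lowest clear bit. */
		if (bits) {
			int a = n - 1 - depth;	/* newer list of the pair */
			int b = a - 1;		/* older list of the pair */

			if (b < 0 || sizes[a] != sizes[b])
				return -1;
			sizes[b] += sizes[a];
			for (int i = a; i < n - 1; i++)
				sizes[i] = sizes[i + 1];
			n--;
		}

		if (n == MODEL_MAX_PENDING)
			return -1;
		sizes[n++] = 1;	/* the new element becomes a list of size 1 */
	}

	return n;
}
```

After 15 elements the model holds pending lists of sizes 8, 4, 2 and 1; after 16 elements it holds 8, 4, 2, 1 and 1, so the two size-1 lists are merged only once a following element has arrived. The final smallest-to-largest merge pass at end of input is not modelled.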
Text Searching¶
INTRODUCTION
The textsearch infrastructure provides text searching facilities for both linear and non-linear data. Individual search algorithms are implemented in modules and chosen by the user.
ARCHITECTURE
User
+----------------+
|        finish()|<--------------(6)-----------------+
|get_next_block()|<--------------(5)---------------+ |
|                |                     Algorithm   | |
|                |                    +------------------------------+
|                |                    |  init()   find()   destroy() |
|                |                    +------------------------------+
|                |       Core API           ^       ^          ^
|                |      +---------------+  (2)     (4)        (8)
|             (1)|----->| prepare()     |---+       |          |
|             (3)|----->| find()/next() |-----------+          |
|             (7)|----->| destroy()     |----------------------+
+----------------+      +---------------+
(1) User configures a search by calling textsearch_prepare() specifying
the search parameters such as the pattern and algorithm name.
(2) Core requests the algorithm to allocate and initialize a search
configuration according to the specified parameters.
(3) User starts the search(es) by calling textsearch_find() or
textsearch_next() to fetch subsequent occurrences. A state variable
is provided to the algorithm to store persistent variables.
(4) Core eventually resets the search offset and forwards the find()
request to the algorithm.
(5) Algorithm calls get_next_block() provided by the user continuously
to fetch the data to be searched in block by block.
(6) Algorithm invokes finish() after the last call to get_next_block
to clean up any leftovers from get_next_block. (Optional)
(7) User destroys the configuration by calling textsearch_destroy().
(8) Core notifies the algorithm to destroy algorithm specific
allocations. (Optional)
USAGE
Before a search can be performed, a configuration must be created by calling textsearch_prepare(), specifying the searching algorithm, the pattern to look for and flags. As a flag, you can set TS_IGNORECASE to perform case-insensitive matching, but it might slow down the algorithm, so use it at your own risk. The returned configuration may then be used an arbitrary number of times and even in parallel, as long as a separate struct ts_state variable is provided to every instance.
The actual search is performed either by calling textsearch_find_continuous() for linear data, or by providing your own get_next_block() implementation and calling textsearch_find(). Both functions return the position of the first occurrence of the pattern or UINT_MAX if no match was found. Subsequent occurrences can be found by calling textsearch_next() regardless of the linearity of the data.
Once you're done using a configuration it must be given back via textsearch_destroy().
EXAMPLE:
unsigned int pos;
int err;
struct ts_config *conf;
struct ts_state state;
const char *pattern = "chicken";
const char *example = "We dance the funky chicken";

conf = textsearch_prepare("kmp", pattern, strlen(pattern),
                          GFP_KERNEL, TS_AUTOLOAD);
if (IS_ERR(conf)) {
        err = PTR_ERR(conf);
        goto errout;
}

pos = textsearch_find_continuous(conf, &state, example, strlen(example));
if (pos != UINT_MAX)
        panic("Oh my god, dancing chickens at %u\n", pos);

textsearch_destroy(conf);
-
int textsearch_register(struct ts_ops *ops)¶
register a textsearch module
Parameters
struct ts_ops *ops
operations lookup table
Description
This function must be called by textsearch modules to announce their presence. The specified ops must have name set to a unique identifier and the callbacks find(), init(), get_pattern(), and get_pattern_len() must be implemented.
Returns 0 or -EEXIST if another module has already registered with the same name.
-
int textsearch_unregister(struct ts_ops *ops)¶
unregister a textsearch module
Parameters
struct ts_ops *ops
operations lookup table
Description
This function must be called by textsearch modules to announce their disappearance, for example when the module gets unloaded. The ops parameter must be the same as the one during the registration.
Returns 0 on success or -ENOENT if no matching textsearch registration was found.
-
unsigned int textsearch_find_continuous(struct ts_config *conf, struct ts_state *state, const void *data, unsigned int len)¶
search a pattern in continuous/linear data
Parameters
struct ts_config *conf
search configuration
struct ts_state *state
search state
const void *data
data to search in
unsigned int len
length of data
Description
A simplified version of textsearch_find()
for continuous/linear data.
Call textsearch_next()
to retrieve subsequent matches.
Returns the position of first occurrence of the pattern or
UINT_MAX
if no occurrence was found.
-
struct ts_config *textsearch_prepare(const char *algo, const void *pattern, unsigned int len, gfp_t gfp_mask, int flags)¶
Prepare a search
Parameters
const char *algo
name of search algorithm
const void *pattern
pattern data
unsigned int len
length of pattern
gfp_t gfp_mask
allocation mask
int flags
search flags
Description
Looks up the search algorithm module and creates a new textsearch configuration for the specified pattern.
Returns a new textsearch configuration according to the specified parameters or an ERR_PTR(). If a zero length pattern is passed, this function returns ERR_PTR(-EINVAL).
Note
- The format of the pattern may not be compatible between
the various search algorithms.
-
void textsearch_destroy(struct ts_config *conf)¶
destroy a search configuration
Parameters
struct ts_config *conf
search configuration
Description
Releases all references of the configuration and frees up the memory.
-
unsigned int textsearch_next(struct ts_config *conf, struct ts_state *state)¶
continue searching for a pattern
Parameters
struct ts_config *conf
search configuration
struct ts_state *state
search state
Description
Continues a search looking for more occurrences of the pattern.
textsearch_find()
must be called to find the first occurrence
in order to reset the state.
Returns the position of the next occurrence of the pattern or UINT_MAX if no match was found.
-
unsigned int textsearch_find(struct ts_config *conf, struct ts_state *state)¶
start searching for a pattern
Parameters
struct ts_config *conf
search configuration
struct ts_state *state
search state
Description
Returns the position of first occurrence of the pattern or UINT_MAX if no match was found.
-
void *textsearch_get_pattern(struct ts_config *conf)¶
return head of the pattern
Parameters
struct ts_config *conf
search configuration
-
unsigned int textsearch_get_pattern_len(struct ts_config *conf)¶
return length of the pattern
Parameters
struct ts_config *conf
search configuration
CRC and Math Functions in Linux¶
Arithmetic Overflow Checking¶
-
check_add_overflow¶
check_add_overflow (a, b, d)
Calculate addition with overflow checking
Parameters
a
first addend
b
second addend
d
pointer to store sum
Description
Returns 0 on success.
*d holds the results of the attempted addition, but is not considered "safe for use" on a non-zero return value, which indicates that the sum has overflowed or been truncated.
-
check_sub_overflow¶
check_sub_overflow (a, b, d)
Calculate subtraction with overflow checking
Parameters
a
minuend; value to subtract from
b
subtrahend; value to subtract from a
d
pointer to store difference
Description
Returns 0 on success.
*d holds the results of the attempted subtraction, but is not considered "safe for use" on a non-zero return value, which indicates that the difference has underflowed or been truncated.
-
check_mul_overflow¶
check_mul_overflow (a, b, d)
Calculate multiplication with overflow checking
Parameters
a
first factor
b
second factor
d
pointer to store product
Description
Returns 0 on success.
*d holds the results of the attempted multiplication, but is not considered "safe for use" on a non-zero return value, which indicates that the product has overflowed or been truncated.
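The three helpers above are typically chained so that a size computation fails cleanly instead of wrapping. The sketch below uses the GCC/Clang __builtin_*_overflow() builtins as a userspace stand-in. The model_ macros and the model_total_bytes() caller are hypothetical names, not kernel definitions.

```c
#include <stdbool.h>

/* Userspace stand-ins: return nonzero when the result does not fit in *d. */
#define model_check_add_overflow(a, b, d) __builtin_add_overflow(a, b, d)
#define model_check_sub_overflow(a, b, d) __builtin_sub_overflow(a, b, d)
#define model_check_mul_overflow(a, b, d) __builtin_mul_overflow(a, b, d)

/*
 * Hypothetical caller: compute count * size + header into *total.
 * Returns false, leaving *total unsafe to use, if any step overflows.
 */
static bool model_total_bytes(unsigned int count, unsigned int size,
			      unsigned int header, unsigned int *total)
{
	unsigned int payload;

	if (model_check_mul_overflow(count, size, &payload))
		return false;
	if (model_check_add_overflow(payload, header, total))
		return false;

	return true;
}
```

The check is made against the type of the destination, so storing 200 + 100 into an unsigned char also reports a non-zero result. That is the "truncated" case mentioned in the descriptions.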
-
check_shl_overflow¶
check_shl_overflow (a, s, d)
Calculate a left-shifted value and check overflow
Parameters
a
Value to be shifted
s
How many bits left to shift
d
Pointer to where to store the result
Description
Computes *d = (a << s)
Returns true if '*d' cannot hold the result or when 'a << s' doesn't make sense. Example conditions:
'a << s' causes bits to be lost when stored in *d.
's' is garbage (e.g. negative) or so large that the result of 'a << s' is guaranteed to be 0.
'a' is negative.
'a << s' sets the sign bit, if any, in '*d'.
'*d' will hold the results of the attempted shift, but is not considered "safe for use" if true is returned.
-
overflows_type¶
overflows_type (n, T)
helper for checking the overflows between value, variables, or data type
Parameters
n
source constant value or variable to be checked
T
destination variable or data type proposed to store n
Description
Compares the n expression for whether or not it can safely fit in the storage of the type in T. n and T can have different types. If n is a constant expression, this will also resolve to a constant expression.
Return
true if overflow can occur, false otherwise.
-
castable_to_type¶
castable_to_type (n, T)
like __same_type(), but also allows for casted literals
Parameters
n
variable or constant value
T
variable or data type
Description
Unlike the __same_type() macro, this allows a constant value as the first argument. If this value would not overflow into an assignment of the second argument's type, it returns true. Otherwise, this falls back to __same_type().
-
size_t size_mul(size_t factor1, size_t factor2)¶
Calculate size_t multiplication with saturation at SIZE_MAX
Parameters
size_t factor1
first factor
size_t factor2
second factor
Return
calculate factor1 * factor2, both promoted to size_t, with any overflow causing the return value to be SIZE_MAX. The lvalue must be size_t to avoid implicit type conversion.
-
size_t size_add(size_t addend1, size_t addend2)¶
Calculate size_t addition with saturation at SIZE_MAX
Parameters
size_t addend1
first addend
size_t addend2
second addend
Return
calculate addend1 + addend2, both promoted to size_t, with any overflow causing the return value to be SIZE_MAX. The lvalue must be size_t to avoid implicit type conversion.
-
size_t size_sub(size_t minuend, size_t subtrahend)¶
Calculate size_t subtraction with saturation at SIZE_MAX
Parameters
size_t minuend
value to subtract from
size_t subtrahend
value to subtract from minuend
Return
calculate minuend - subtrahend, both promoted to size_t, with any overflow causing the return value to be SIZE_MAX. For composition with the size_add() and size_mul() helpers, neither argument may be SIZE_MAX (or the result will be forced to SIZE_MAX). The lvalue must be size_t to avoid implicit type conversion.
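The saturating behaviour of the three size helpers, and why SIZE_MAX is "sticky" when they are composed, can be sketched in portable C. The model_ functions below are illustrative re-implementations under that reading of the text, not the kernel's inline functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply, saturating at SIZE_MAX on overflow. */
static size_t model_size_mul(size_t factor1, size_t factor2)
{
	if (factor2 != 0 && factor1 > SIZE_MAX / factor2)
		return SIZE_MAX;
	return factor1 * factor2;
}

/* Add, saturating at SIZE_MAX on overflow. */
static size_t model_size_add(size_t addend1, size_t addend2)
{
	if (addend1 > SIZE_MAX - addend2)
		return SIZE_MAX;
	return addend1 + addend2;
}

/*
 * Subtract, saturating at SIZE_MAX on underflow. A SIZE_MAX input is
 * treated as an earlier saturated result and propagated unchanged.
 */
static size_t model_size_sub(size_t minuend, size_t subtrahend)
{
	if (minuend == SIZE_MAX || subtrahend == SIZE_MAX ||
	    minuend < subtrahend)
		return SIZE_MAX;
	return minuend - subtrahend;
}
```

Once one step saturates, every later step keeps returning SIZE_MAX, so a chained expression such as model_size_sub(model_size_add(SIZE_MAX, 1), 8) still yields SIZE_MAX and an allocator given that size can refuse the request.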
-
array_size¶
array_size (a, b)
Calculate size of 2-dimensional array.
Parameters
a
dimension one
b
dimension two
Description
Calculates size of 2-dimensional array: a * b.
Return
number of bytes needed to represent the array or SIZE_MAX on overflow.
-
array3_size¶
array3_size (a, b, c)
Calculate size of 3-dimensional array.
Parameters
a
dimension one
b
dimension two
c
dimension three
Description
Calculates size of 3-dimensional array: a * b * c.
Return
number of bytes needed to represent the array or SIZE_MAX on overflow.
-
flex_array_size¶
flex_array_size (p, member, count)
Calculate size of a flexible array member within an enclosing structure.
Parameters
p
Pointer to the structure.
member
Name of the flexible array member.
count
Number of elements in the array.
Description
Calculates size of a flexible array of count number of member elements, at the end of structure p.
Return
number of bytes needed or SIZE_MAX on overflow.
-
struct_size¶
struct_size (p, member, count)
Calculate size of structure with trailing flexible array.
Parameters
p
Pointer to the structure.
member
Name of the array member.
count
Number of elements in the array.
Description
Calculates size of memory needed for structure of p followed by an array of count number of member elements.
Return
number of bytes needed or SIZE_MAX on overflow.
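What the struct_size() calculation amounts to can be sketched for one concrete, hypothetical structure. struct model_msg and model_msg_struct_size() are made up for this example. The real macro is generic over the structure and member it is given.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a trailing flexible array member. */
struct model_msg {
	size_t count;
	unsigned int items[];
};

/*
 * Bytes needed for a struct model_msg followed by 'count' items,
 * or SIZE_MAX if that size cannot be represented.
 */
static size_t model_msg_struct_size(size_t count)
{
	const size_t header = sizeof(struct model_msg);
	const size_t elem = sizeof(unsigned int);

	if (count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;	/* saturate instead of wrapping */

	return header + count * elem;
}
```

Saturating at SIZE_MAX means an oversized count leads to an allocation failure rather than to a buffer that is silently too small.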
-
struct_size_t¶
struct_size_t (type, member, count)
Calculate size of structure with trailing flexible array
Parameters
type
structure type name.
member
Name of the array member.
count
Number of elements in the array.
Description
Calculates size of memory needed for structure type followed by an
array of count number of member elements. Prefer using struct_size()
when possible instead, to keep calculations associated with a specific
instance variable of type type.
Return
number of bytes needed or SIZE_MAX on overflow.
CRC Functions¶
-
uint8_t crc4(uint8_t c, uint64_t x, int bits)¶
calculate the 4-bit crc of a value.
Parameters
uint8_t c
starting crc4
uint64_t x
value to checksum
int bits
number of bits in x to checksum
Description
Returns the crc4 value of x, using polynomial 0b10111.
The x value is treated as left-aligned, and bits above bits are ignored in the crc calculations.
-
u8 crc7_be(u8 crc, const u8 *buffer, size_t len)¶
update the CRC7 for the data buffer
Parameters
u8 crc
previous CRC7 value
const u8 *buffer
data pointer
size_t len
number of bytes in the buffer
Context
any
Description
Returns the updated CRC7 value. The CRC7 is left-aligned in the byte (the lsbit is always 0), as that makes the computation easier, and all callers want it in that form.
-
void crc8_populate_msb(u8 table[CRC8_TABLE_SIZE], u8 polynomial)¶
fill crc table for given polynomial in reverse bit order.
Parameters
u8 table[CRC8_TABLE_SIZE]
table to be filled.
u8 polynomial
polynomial for which table is to be filled.
-
void crc8_populate_lsb(u8 table[CRC8_TABLE_SIZE], u8 polynomial)¶
fill crc table for given polynomial in regular bit order.
Parameters
u8 table[CRC8_TABLE_SIZE]
table to be filled.
u8 polynomial
polynomial for which table is to be filled.
-
u8 crc8(const u8 table[CRC8_TABLE_SIZE], const u8 *pdata, size_t nbytes, u8 crc)¶
calculate a crc8 over the given input data.
Parameters
const u8 table[CRC8_TABLE_SIZE]
crc table used for calculation.
const u8 *pdata
pointer to data buffer.
size_t nbytes
number of bytes in data buffer.
u8 crc
previous returned crc8 value.
-
u16 crc16(u16 crc, u8 const *buffer, size_t len)¶
compute the CRC-16 for the data buffer
Parameters
u16 crc
previous CRC value
u8 const *buffer
data pointer
size_t len
number of bytes in the buffer
Description
Returns the updated CRC value.
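The "previous CRC value" parameter allows a checksum to be computed in pieces. The bitwise sketch below shows that property. It assumes the reflected form (0xA001) of the polynomial x^16 + x^15 + x^2 + 1, which is an assumption about the variant rather than something stated in the text above, and model_crc16 is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 model, least significant bit first, assumed polynomial 0xA001. */
static uint16_t model_crc16(uint16_t crc, const uint8_t *buffer, size_t len)
{
	while (len--) {
		crc ^= *buffer++;
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 1)
				crc = (uint16_t)((crc >> 1) ^ 0xA001);
			else
				crc = (uint16_t)(crc >> 1);
		}
	}

	return crc;
}
```

Feeding the result of one call in as the crc argument of the next gives the same value as a single call over the concatenated data, so a buffer can be checksummed chunk by chunk.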
-
u32 __pure crc32_le_generic(u32 crc, unsigned char const *p, size_t len, const u32 (*tab)[256], u32 polynomial)¶
Calculate bitwise little-endian Ethernet AUTODIN II CRC32/CRC32C
Parameters
u32 crc
seed value for computation. ~0 for Ethernet, sometimes 0 for other uses, or the previous crc32/crc32c value if computing incrementally.
unsigned char const *p
pointer to buffer over which CRC32/CRC32C is run
size_t len
length of buffer p
const u32 (*tab)[256]
little-endian Ethernet table
u32 polynomial
CRC32/CRC32c LE polynomial
-
u32 crc32_generic_shift(u32 crc, size_t len, u32 polynomial)¶
Append len 0 bytes to crc, in logarithmic time
Parameters
u32 crc
The original little-endian CRC (i.e. lsbit is x^31 coefficient)
size_t len
The number of bytes. crc is multiplied by x^(8*len)
u32 polynomial
The modulus used to reduce the result to 32 bits.
Description
It's possible to parallelize CRC computations by computing a CRC over separate ranges of a buffer, then summing them. This shifts the given CRC by 8*len bits (i.e. produces the same effect as appending len bytes of zero to the data), in time proportional to log(len).
-
u32 __pure crc32_be_generic(u32 crc, unsigned char const *p, size_t len, const u32 (*tab)[256], u32 polynomial)¶
Calculate bitwise big-endian Ethernet AUTODIN II CRC32
Parameters
u32 crc
seed value for computation. ~0 for Ethernet, sometimes 0 for other uses, or the previous crc32 value if computing incrementally.
unsigned char const *p
pointer to buffer over which CRC32 is run
size_t len
length of buffer p
const u32 (*tab)[256]
big-endian Ethernet table
u32 polynomial
CRC32 BE polynomial
-
u16 crc_ccitt(u16 crc, u8 const *buffer, size_t len)¶
recompute the CRC (CRC-CCITT variant) for the data buffer
Parameters
u16 crc
previous CRC value
u8 const *buffer
data pointer
size_t len
number of bytes in the buffer
-
u16 crc_ccitt_false(u16 crc, u8 const *buffer, size_t len)¶
recompute the CRC (CRC-CCITT-FALSE variant) for the data buffer
Parameters
u16 crc
previous CRC value
u8 const *buffer
data pointer
size_t len
number of bytes in the buffer
-
u16 crc_itu_t(u16 crc, const u8 *buffer, size_t len)¶
Compute the CRC-ITU-T for the data buffer
Parameters
u16 crc
previous CRC value
const u8 *buffer
data pointer
size_t len
number of bytes in the buffer
Description
Returns the updated CRC value
Base 2 log and power Functions¶
-
bool is_power_of_2(unsigned long n)¶
check if a value is a power of two
Parameters
unsigned long n
the value to check
Description
Determine whether some value is a power of two, where zero is not considered a power of two.
Return
true if n is a power of 2, otherwise false.
-
unsigned long __roundup_pow_of_two(unsigned long n)¶
round up to nearest power of two
Parameters
unsigned long n
value to round up
-
unsigned long __rounddown_pow_of_two(unsigned long n)¶
round down to nearest power of two
Parameters
unsigned long n
value to round down
-
const_ilog2¶
const_ilog2 (n)
log base 2 of 32-bit or a 64-bit constant unsigned value
Parameters
n
parameter
Description
Use this where sparse expects a true constant expression, e.g. for array indices.
-
ilog2¶
ilog2 (n)
log base 2 of 32-bit or a 64-bit unsigned value
Parameters
n
parameter
Description
Constant-capable base 2 logarithm. This can be used to initialise global variables from constant data, hence the massive ternary operator construction.
Selects the appropriately-sized optimised version depending on sizeof(n).
-
roundup_pow_of_two¶
roundup_pow_of_two (n)
round the given value up to nearest power of two
Parameters
n
parameter
Description
Rounds the given value up to the nearest power of two. The result is undefined when n == 0. This can be used to initialise global variables from constant data.
-
rounddown_pow_of_two¶
rounddown_pow_of_two (n)
round the given value down to nearest power of two
Parameters
n
parameter
Description
Rounds the given value down to the nearest power of two. The result is undefined when n == 0. This can be used to initialise global variables from constant data.
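The rounding behaviour can be sketched in userspace as follows. All names ending in _sketch are hypothetical, and the loop-based logarithm stands in for whatever bit-scan primitive a real implementation would use.

```c
#include <assert.h>

/* Index of the highest set bit; n must be non-zero. */
static unsigned int ilog2_sketch(unsigned long n)
{
    unsigned int log = 0;

    assert(n != 0);
    while (n >>= 1)
        log++;
    return log;
}

/*
 * Smallest power of two >= n. The caller must pass n != 0, and n must
 * not exceed the largest power of two representable in unsigned long.
 */
static unsigned long roundup_pow_of_two_sketch(unsigned long n)
{
    assert(n != 0);
    /* n == 1 is already a power of two; otherwise step past ilog2(n - 1). */
    return n == 1 ? 1UL : 1UL << (ilog2_sketch(n - 1) + 1);
}

/* Largest power of two <= n. The caller must pass n != 0. */
static unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
    return 1UL << ilog2_sketch(n);
}
```

The assertions make the n == 0 precondition explicit; the documented macros leave that case undefined instead.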
-
order_base_2¶
order_base_2 (n)
calculate the (rounded up) base 2 order of the argument
Parameters
n
parameter
Description
The first few values calculated by this routine:
ob2(0) = 0, ob2(1) = 0, ob2(2) = 1, ob2(3) = 2, ob2(4) = 2, ob2(5) = 3, and so on.
-
bits_per¶
bits_per (n)
calculate the number of bits required for the argument
Parameters
n
parameter
Description
This is constant-capable and can be used for compile time initializations, e.g. bitfields.
The first few values calculated by this routine: bf(0) = 1, bf(1) = 1, bf(2) = 2, bf(3) = 2, bf(4) = 3, and so on.
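The value tables for order_base_2 and bits_per can be reproduced with a short userspace sketch. The _sketch names are hypothetical, and unlike the documented macros these functions are not usable in constant expressions.

```c
/* Index of the highest set bit; only meaningful for n != 0. */
static unsigned int ilog2_sketch(unsigned long n)
{
    unsigned int log = 0;

    while (n >>= 1)
        log++;
    return log;
}

/* Smallest order such that (1UL << order) >= n, with ob2(0) defined as 0. */
static unsigned int order_base_2_sketch(unsigned long n)
{
    return n > 1 ? ilog2_sketch(n - 1) + 1 : 0;
}

/* Number of bits needed to store the value n; 0 and 1 both need one bit. */
static unsigned int bits_per_sketch(unsigned long n)
{
    return n < 2 ? 1 : ilog2_sketch(n) + 1;
}
```

The two differ at exact powers of two: order_base_2 answers "how many doublings cover n items", while bits_per answers "how wide a field holds the value n".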
Integer power Functions¶
-
u64 int_pow(u64 base, unsigned int exp)¶
computes the exponentiation of the given base and exponent
Parameters
u64 base
base which will be raised to the given power
unsigned int exp
power to be raised to
Description
Computes: pow(base, exp), i.e. base raised to the exp power
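One common way to compute an integer power is exponentiation by squaring, sketched below in userspace. The name int_pow_sketch is hypothetical, and overflow simply wraps modulo 2^64.

```c
#include <stdint.h>

static uint64_t int_pow_sketch(uint64_t base, unsigned int exp)
{
    uint64_t result = 1;

    while (exp) {
        if (exp & 1)
            result *= base; /* this bit of exp contributes the current power */
        exp >>= 1;
        base *= base;       /* square for the next bit of exp */
    }
    return result;
}
```

The loop runs once per bit of exp, so the cost grows with log(exp) rather than with exp.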
-
unsigned long int_sqrt(unsigned long x)¶
computes the integer square root
Parameters
unsigned long x
integer of which to calculate the sqrt
Description
Computes: floor(sqrt(x))
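A floor square root can be computed without floating point using the binary digit-by-digit method. The following userspace sketch shows the idea; int_sqrt_sketch is a hypothetical name and this is not presented as the kernel's exact code.

```c
#include <limits.h>

static unsigned long int_sqrt_sketch(unsigned long x)
{
    /* Start at the highest power of four that fits in unsigned long. */
    unsigned long m = 1UL << (sizeof(unsigned long) * CHAR_BIT - 2);
    unsigned long y = 0;

    /* Drop to the highest power of four not exceeding x. */
    while (m > x)
        m >>= 2;

    /* Decide one result bit per iteration, from the top down. */
    while (m != 0) {
        unsigned long b = y + m;

        y >>= 1;
        if (x >= b) {
            x -= b;
            y += m;
        }
        m >>= 2;
    }
    return y;
}
```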
-
u32 int_sqrt64(u64 x)¶
strongly typed int_sqrt function when minimum 64 bit input is expected.
Parameters
u64 x
64bit integer of which to calculate the sqrt
Division Functions¶
-
do_div¶
do_div (n, base)
returns 2 values: calculate remainder and update new dividend
Parameters
n
uint64_t dividend (will be updated)
base
uint32_t divisor
Description
Summary:
uint32_t remainder = n % base;
n = n / base;
Return
(uint32_t)remainder
NOTE
macro parameter n is evaluated multiple times, beware of side effects!
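The in-place update is easy to get wrong, so here is a userspace sketch of the same contract written as a function taking a pointer. The names do_div_sketch and to_decimal_sketch are hypothetical; the real do_div is a macro that takes n by name.

```c
#include <stdint.h>

/* Divide *n by base in place and return the remainder. */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
    uint32_t remainder = (uint32_t)(*n % base);

    *n /= base;
    return remainder;
}

/*
 * Typical use: peel off decimal digits, least significant first.
 * buf must have room for 20 characters; returns the digit count.
 */
static unsigned int to_decimal_sketch(uint64_t value, char *buf)
{
    unsigned int count = 0;

    do {
        buf[count++] = (char)('0' + do_div_sketch(&value, 10));
    } while (value != 0);
    return count;
}
```

The return value is the remainder, not the quotient; the quotient is left in the first argument.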
-
u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)¶
unsigned 64bit divide with 32bit divisor with remainder
Parameters
u64 dividend
unsigned 64bit dividend
u32 divisor
unsigned 32bit divisor
u32 *remainder
pointer to unsigned 32bit remainder
Return
sets *remainder, then returns dividend / divisor
Description
This is commonly provided by 32bit archs to provide an optimized 64bit divide.
-
s64 div_s64_rem(s64 dividend, s32 divisor, s32 *remainder)¶
signed 64bit divide with 32bit divisor with remainder
Parameters
s64 dividend
signed 64bit dividend
s32 divisor
signed 32bit divisor
s32 *remainder
pointer to signed 32bit remainder
Return
sets *remainder, then returns dividend / divisor
-
u64 div64_u64_rem(u64 dividend, u64 divisor, u64 *remainder)¶
unsigned 64bit divide with 64bit divisor and remainder
Parameters
u64 dividend
unsigned 64bit dividend
u64 divisor
unsigned 64bit divisor
u64 *remainder
pointer to unsigned 64bit remainder
Return
sets *remainder, then returns dividend / divisor
-
u64 div64_u64(u64 dividend, u64 divisor)¶
unsigned 64bit divide with 64bit divisor
Parameters
u64 dividend
unsigned 64bit dividend
u64 divisor
unsigned 64bit divisor
Return
dividend / divisor
-
s64 div64_s64(s64 dividend, s64 divisor)¶
signed 64bit divide with 64bit divisor
Parameters
s64 dividend
signed 64bit dividend
s64 divisor
signed 64bit divisor
Return
dividend / divisor
-
u64 div_u64(u64 dividend, u32 divisor)¶
unsigned 64bit divide with 32bit divisor
Parameters
u64 dividend
unsigned 64bit dividend
u32 divisor
unsigned 32bit divisor
Description
This is the most common 64bit divide and should be used if possible, as many 32bit archs can optimize this variant better than a full 64bit divide.
Return
dividend / divisor
-
s64 div_s64(s64 dividend, s32 divisor)¶
signed 64bit divide with 32bit divisor
Parameters
s64 dividend
signed 64bit dividend
s32 divisor
signed 32bit divisor
Return
dividend / divisor
-
DIV64_U64_ROUND_UP¶
DIV64_U64_ROUND_UP (ll, d)
unsigned 64bit divide with 64bit divisor rounded up
Parameters
ll
unsigned 64bit dividend
d
unsigned 64bit divisor
Description
Divide unsigned 64bit dividend by unsigned 64bit divisor and round up.
Return
dividend / divisor rounded up
-
DIV64_U64_ROUND_CLOSEST¶
DIV64_U64_ROUND_CLOSEST (dividend, divisor)
unsigned 64bit divide with 64bit divisor rounded to nearest integer
Parameters
dividend
unsigned 64bit dividend
divisor
unsigned 64bit divisor
Description
Divide unsigned 64bit dividend by unsigned 64bit divisor and round to closest integer.
Return
dividend / divisor rounded to nearest integer
-
DIV_U64_ROUND_CLOSEST¶
DIV_U64_ROUND_CLOSEST (dividend, divisor)
unsigned 64bit divide with 32bit divisor rounded to nearest integer
Parameters
dividend
unsigned 64bit dividend
divisor
unsigned 32bit divisor
Description
Divide unsigned 64bit dividend by unsigned 32bit divisor and round to closest integer.
Return
dividend / divisor rounded to nearest integer
-
DIV_S64_ROUND_CLOSEST¶
DIV_S64_ROUND_CLOSEST (dividend, divisor)
signed 64bit divide with 32bit divisor rounded to nearest integer
Parameters
dividend
signed 64bit dividend
divisor
signed 32bit divisor
Description
Divide signed 64bit dividend by signed 32bit divisor and round to closest integer.
Return
dividend / divisor rounded to nearest integer
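The rounding variants above differ only in the bias added before a truncating division. A userspace sketch using plain C division follows; the _sketch names are hypothetical, and the sums are assumed not to overflow.

```c
#include <stdint.h>

/* Round up: bias by divisor - 1 before truncating. */
static uint64_t div_round_up_sketch(uint64_t dividend, uint64_t divisor)
{
    return (dividend + divisor - 1) / divisor;
}

/* Round to nearest: bias by half the divisor before truncating. */
static uint64_t div_round_closest_sketch(uint64_t dividend, uint64_t divisor)
{
    return (dividend + divisor / 2) / divisor;
}

/*
 * Signed round to nearest: the bias must push away from zero, so its
 * sign depends on whether the quotient is positive or negative.
 */
static int64_t div_s64_round_closest_sketch(int64_t dividend, int32_t divisor)
{
    if ((dividend > 0) == (divisor > 0))
        return (dividend + divisor / 2) / divisor;
    return (dividend - divisor / 2) / divisor;
}
```

On 32-bit architectures the plain / operator on 64-bit operands is the reason the dedicated helpers exist, so kernel code would use them instead of the division written here.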
-
unsigned long gcd(unsigned long a, unsigned long b)¶
calculate and return the greatest common divisor of 2 unsigned longs
Parameters
unsigned long a
first value
unsigned long b
second value
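For reference, the result can be reproduced with the classic Euclidean algorithm. This userspace sketch (hypothetical name gcd_sketch) illustrates the expected values only and makes no claim about which algorithm the kernel uses.

```c
static unsigned long gcd_sketch(unsigned long a, unsigned long b)
{
    /* Replace (a, b) with (b, a mod b) until the remainder is zero. */
    while (b != 0) {
        unsigned long r = a % b;

        a = b;
        b = r;
    }
    return a;
}
```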
UUID/GUID¶
-
void generate_random_uuid(unsigned char uuid[16])¶
generate a random UUID
Parameters
unsigned char uuid[16]
where to put the generated UUID
Description
Random UUID interface
Used to create a Boot ID or a filesystem UUID/GUID, but can be useful for other kernel drivers.
-
bool uuid_is_valid(const char *uuid)¶
checks if a UUID string is valid
Parameters
const char *uuid
UUID string to check
Description
Checks whether the UUID string follows the format:
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
where x is a hex digit.
Return
true if input is valid UUID string.
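A userspace sketch of the format check, assuming only the first 36 characters are examined and that anything following them is not inspected. The names uuid_is_valid_sketch and UUID_STRING_LEN_SKETCH are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>

#define UUID_STRING_LEN_SKETCH 36

static bool uuid_is_valid_sketch(const char *uuid)
{
    for (unsigned int i = 0; i < UUID_STRING_LEN_SKETCH; i++) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            /* Group separators sit at fixed offsets. */
            if (uuid[i] != '-')
                return false;
        } else if (!isxdigit((unsigned char)uuid[i])) {
            /* Also rejects a premature NUL, so short strings fail safely. */
            return false;
        }
    }
    return true;
}
```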
Kernel IPC facilities¶
IPC utilities¶
-
int ipc_init(void)¶
initialise ipc subsystem
Parameters
void
no arguments
Description
The various sysv ipc resources (semaphores, messages and shared memory) are initialised.
A callback routine is registered into the memory hotplug notifier chain: since msgmni scales to lowmem, this callback routine will be called upon successful memory add / remove to recompute msgmni.
-
void ipc_init_ids(struct ipc_ids *ids)¶
initialise ipc identifiers
Parameters
struct ipc_ids *ids
ipc identifier set
Description
Set up the sequence range to use for the ipc identifier range (limited below ipc_mni) then initialise the keys hashtable and ids idr.
-
void ipc_init_proc_interface(const char *path, const char *header, int ids, int (*show)(struct seq_file*, void*))¶
create a proc interface for sysipc types using a seq_file interface.
Parameters
const char *path
Path in procfs
const char *header
Banner to be printed at the beginning of the file.
int ids
ipc id table to iterate.
int (*show)(struct seq_file *, void *)
show routine.
-
struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)¶
find a key in an ipc identifier set
Parameters
struct ipc_ids *ids
ipc identifier set
key_t key
key to find
Description
Returns the locked pointer to the ipc structure if found or NULL otherwise. If key is found ipc points to the owning ipc structure
Called with writer ipc_ids.rwsem held.
-
int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)¶
add an ipc identifier
Parameters
struct ipc_ids *ids
ipc identifier set
struct kern_ipc_perm *new
new ipc permission set
int limit
limit for the number of used ids
Description
Add an entry 'new' to the ipc ids idr. The permissions object is initialised and the first free entry is set up and the index assigned is returned. The 'new' entry is returned in a locked state on success.
On failure the entry is not locked and a negative err-code is returned. The caller must use ipc_rcu_putref() to free the identifier.
Called with writer ipc_ids.rwsem held.
-
int ipcget_new(struct ipc_namespace *ns, struct ipc_ids *ids, const struct ipc_ops *ops, struct ipc_params *params)¶
create a new ipc object
Parameters
struct ipc_namespace *ns
ipc namespace
struct ipc_ids *ids
ipc identifier set
const struct ipc_ops *ops
the actual creation routine to call
struct ipc_params *params
its parameters
Description
This routine is called by sys_msgget(), sys_semget() and sys_shmget() when the key is IPC_PRIVATE.
-
int ipc_check_perms(struct ipc_namespace *ns, struct kern_ipc_perm *ipcp, const struct ipc_ops *ops, struct ipc_params *params)¶
check security and permissions for an ipc object
Parameters
struct ipc_namespace *ns
ipc namespace
struct kern_ipc_perm *ipcp
ipc permission set
const struct ipc_ops *ops
the actual security routine to call
struct ipc_params *params
its parameters
Description
This routine is called by sys_msgget(), sys_semget() and sys_shmget() when the key is not IPC_PRIVATE and that key already exists in the ids IDR.
On success, the ipc id is returned.
It is called with ipc_ids.rwsem and ipcp->lock held.
-
int ipcget_public(struct ipc_namespace *ns, struct ipc_ids *ids, const struct ipc_ops *ops, struct ipc_params *params)¶
get an ipc object or create a new one
Parameters
struct ipc_namespace *ns
ipc namespace
struct ipc_ids *ids
ipc identifier set
const struct ipc_ops *ops
the actual creation routine to call
struct ipc_params *params
its parameters
Description
This routine is called by sys_msgget(), sys_semget() and sys_shmget() when the key is not IPC_PRIVATE. It adds a new entry if the key is not found and does some permission / security checks if the key is found.
On success, the ipc id is returned.
-
void ipc_kht_remove(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)¶
remove an ipc from the key hashtable
Parameters
struct ipc_ids *ids
ipc identifier set
struct kern_ipc_perm *ipcp
ipc perm structure containing the key to remove
Description
ipc_ids.rwsem (as a writer) and the spinlock for this ID are held before this function is called, and remain locked on the exit.
-
int ipc_search_maxidx(struct ipc_ids *ids, int limit)¶
search for the highest assigned index
Parameters
struct ipc_ids *ids
ipc identifier set
int limit
known upper limit for highest assigned index
Description
The function determines the highest assigned index in ids. It is intended to be called when ids->max_idx needs to be updated. Updating ids->max_idx is necessary when the current highest index ipc object is deleted. If no ipc object is allocated, then -1 is returned.
ipc_ids.rwsem needs to be held by the caller.
-
void ipc_rmid(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)¶
remove an ipc identifier
Parameters
struct ipc_ids *ids
ipc identifier set
struct kern_ipc_perm *ipcp
ipc perm structure containing the identifier to remove
Description
ipc_ids.rwsem (as a writer) and the spinlock for this ID are held before this function is called, and remain locked on the exit.
-
void ipc_set_key_private(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)¶
switch the key of an existing ipc to IPC_PRIVATE
Parameters
struct ipc_ids *ids
ipc identifier set
struct kern_ipc_perm *ipcp
ipc perm structure containing the key to modify
Description
ipc_ids.rwsem (as a writer) and the spinlock for this ID are held before this function is called, and remain locked on the exit.
-
int ipcperms(struct ipc_namespace *ns, struct kern_ipc_perm *ipcp, short flag)¶
check ipc permissions
Parameters
struct ipc_namespace *ns
ipc namespace
struct kern_ipc_perm *ipcp
ipc permission set
short flag
desired permission set
Description
Check user, group, other permissions for access to ipc resources. Returns 0 if allowed.
flag will most probably be 0 or S_...UGO from <linux/stat.h>
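The user, group, other selection can be sketched in userspace as below. This is a simplified illustration under stated assumptions: struct ipc_perm_sketch and ipcperms_sketch are hypothetical, and creator ids, supplementary groups, capabilities and security-module hooks are deliberately left out.

```c
/* Hypothetical, simplified stand-in for the fields such a check consults. */
struct ipc_perm_sketch {
    unsigned int uid;    /* owner user id */
    unsigned int gid;    /* owner group id */
    unsigned short mode; /* rwxrwxrwx-style permission bits */
};

/*
 * Return 0 if the caller (euid/egid) may access the object with the
 * access bits in flag, -1 otherwise.
 */
static int ipcperms_sketch(const struct ipc_perm_sketch *p,
                           unsigned int euid, unsigned int egid, short flag)
{
    /* Fold the user/group/other copies of the request into the low 3 bits. */
    int requested = (flag >> 6) | (flag >> 3) | flag;
    int granted = p->mode;

    if (euid == p->uid)
        granted >>= 6; /* compare against the owner bits */
    else if (egid == p->gid)
        granted >>= 3; /* compare against the group bits */

    /* Deny if any requested bit is missing from the selected class. */
    return (requested & ~granted & 0007) ? -1 : 0;
}
```

With mode 0640, the owner may read and write, a group member may only read, and everyone else is denied.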
-
void kernel_to_ipc64_perm(struct kern_ipc_perm *in, struct ipc64_perm *out)¶
convert kernel ipc permissions to user
Parameters
struct kern_ipc_perm *in
kernel permissions
struct ipc64_perm *out
new style ipc permissions
Description
Turn the kernel object in into a set of permissions descriptions for returning to userspace (out).
-
void ipc64_perm_to_ipc_perm(struct ipc64_perm *in, struct ipc_perm *out)¶
convert new ipc permissions to old
Parameters
struct ipc64_perm *in
new style ipc permissions
struct ipc_perm *out
old style ipc permissions
Description
Turn the new style permissions object in into a compatibility object and store it into the out pointer.
-
struct kern_ipc_perm *ipc_obtain_object_idr(struct ipc_ids *ids, int id)¶
Parameters
struct ipc_ids *ids
ipc identifier set
int id
ipc id to look for
Description
Look for an id in the ipc ids idr and return associated ipc object.
Call inside the RCU critical section. The ipc object is not locked on exit.
-
struct kern_ipc_perm *ipc_obtain_object_check(struct ipc_ids *ids, int id)¶
Parameters
struct ipc_ids *ids
ipc identifier set
int id
ipc id to look for
Description
Similar to ipc_obtain_object_idr() but also checks the ipc object sequence number.
Call inside the RCU critical section. The ipc object is not locked on exit.
-
int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids, const struct ipc_ops *ops, struct ipc_params *params)¶
Common sys_*get() code
Parameters
struct ipc_namespace *ns
namespace
struct ipc_ids *ids
ipc identifier set
const struct ipc_ops *ops
operations to be called on ipc object creation, permission checks and further checks
struct ipc_params *params
the parameters needed by the previous operations.
Description
Common routine called by sys_msgget(), sys_semget() and sys_shmget().
-
int ipc_update_perm(struct ipc64_perm *in, struct kern_ipc_perm *out)¶
update the permissions of an ipc object
Parameters
struct ipc64_perm *in
the permission given as input.
struct kern_ipc_perm *out
the permission of the ipc to set.
-
struct kern_ipc_perm *ipcctl_obtain_check(struct ipc_namespace *ns, struct ipc_ids *ids, int id, int cmd, struct ipc64_perm *perm, int extra_perm)¶
retrieve an ipc object and check permissions
Parameters
struct ipc_namespace *ns
ipc namespace
struct ipc_ids *ids
the table of ids where to look for the ipc
int id
the id of the ipc to retrieve
int cmd
the cmd to check
struct ipc64_perm *perm
the permission to set
int extra_perm
one extra permission parameter used by msq
Description
This function does some common audit and permissions check for some IPC_XXX cmd and is called from semctl_down, shmctl_down and msgctl_down.
It retrieves the ipc object with the given id in the given table, performs some audit and permission checks depending on the given cmd, and returns a pointer to the ipc object or, otherwise, the corresponding error.
Call holding both the rwsem and the RCU read lock.
-
int ipc_parse_version(int *cmd)¶
ipc call version
Parameters
int *cmd
pointer to command
Description
Return IPC_64 for new style IPC and IPC_OLD for old style IPC. The cmd value is turned from an encoding command and version into just the command code.
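The decoding step can be sketched in userspace as follows. The constants are assumptions chosen to mirror the usual Linux values (a version flag of 0x0100 OR-ed into the command), and every name ending in _SKETCH or _sketch is hypothetical.

```c
#define IPC_OLD_SKETCH 0      /* assumed value for old style IPC */
#define IPC_64_SKETCH  0x0100 /* assumed version flag OR-ed into cmd */

static int ipc_parse_version_sketch(int *cmd)
{
    if (*cmd & IPC_64_SKETCH) {
        /* Strip the version flag so only the command code remains. */
        *cmd ^= IPC_64_SKETCH;
        return IPC_64_SKETCH;
    }
    return IPC_OLD_SKETCH;
}
```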
-
struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t *pos)¶
Find and lock the ipc structure based on seq pos
Parameters
struct ipc_ids *ids
ipc identifier set
loff_t *pos
expected position
Description
The function finds an ipc structure, based on the sequence file position pos. If there is no ipc structure at position pos, then the successor is selected.
If a structure is found, then it is locked (both rcu_read_lock() and ipc_lock_object()) and pos is set to the position needed to locate the found ipc structure.
If nothing is found (i.e. EOF), pos is not modified.
The function returns the found ipc structure, or NULL at EOF.
FIFO Buffer¶
kfifo interface¶
-
DECLARE_KFIFO_PTR¶
DECLARE_KFIFO_PTR (fifo, type)
macro to declare a fifo pointer object
Parameters
fifo
name of the declared fifo
type
type of the fifo elements
-
DECLARE_KFIFO¶
DECLARE_KFIFO (fifo, type, size)
macro to declare a fifo object
Parameters
fifo
name of the declared fifo
type
type of the fifo elements
size
the number of elements in the fifo, this must be a power of 2
-
INIT_KFIFO¶
INIT_KFIFO (fifo)
Initialize a fifo declared by DECLARE_KFIFO
Parameters
fifo
name of the declared fifo datatype
-
DEFINE_KFIFO¶
DEFINE_KFIFO (fifo, type, size)
macro to define and initialize a fifo
Parameters
fifo
name of the declared fifo datatype
type
type of the fifo elements
size
the number of elements in the fifo, this must be a power of 2
Note
the macro can be used for global and local fifo data type variables.
-
kfifo_initialized¶
kfifo_initialized (fifo)
Check if the fifo is initialized
Parameters
fifo
address of the fifo to check
Description
Returns true if fifo is initialized, otherwise false.
Assumes the fifo was 0 before.
-
kfifo_esize¶
kfifo_esize (fifo)
returns the size of the element managed by the fifo
Parameters
fifo
address of the fifo to be used
-
kfifo_recsize¶
kfifo_recsize (fifo)
returns the size of the record length field
Parameters
fifo
address of the fifo to be used
-
kfifo_size¶
kfifo_size (fifo)
returns the size of the fifo in elements
Parameters
fifo
address of the fifo to be used
-
kfifo_reset¶
kfifo_reset (fifo)
removes the entire fifo content
Parameters
fifo
address of the fifo to be used
Note
Usage of kfifo_reset() is dangerous. It should only be called when the fifo is exclusively locked or when it is ensured that no other thread is accessing the fifo.
-
kfifo_reset_out¶
kfifo_reset_out (fifo)
skip fifo content
Parameters
fifo
address of the fifo to be used
Note
The usage of kfifo_reset_out() is safe as long as it is only called from the reader thread and there is only one concurrent reader. Otherwise it is dangerous and must be handled in the same way as kfifo_reset().
-
kfifo_len¶
kfifo_len (fifo)
returns the number of used elements in the fifo
Parameters
fifo
address of the fifo to be used
-
kfifo_is_empty¶
kfifo_is_empty (fifo)
returns true if the fifo is empty
Parameters
fifo
address of the fifo to be used
-
kfifo_is_empty_spinlocked¶
kfifo_is_empty_spinlocked (fifo, lock)
returns true if the fifo is empty using a spinlock for locking
Parameters
fifo
address of the fifo to be used
lock
spinlock to be used for locking
-
kfifo_is_empty_spinlocked_noirqsave¶
kfifo_is_empty_spinlocked_noirqsave (fifo, lock)
returns true if the fifo is empty using a spinlock for locking, doesn't disable interrupts
Parameters
fifo
address of the fifo to be used
lock
spinlock to be used for locking
-
kfifo_is_full¶
kfifo_is_full (fifo)
returns true if the fifo is full
Parameters
fifo
address of the fifo to be used
-
kfifo_avail¶
kfifo_avail (fifo)
returns the number of unused elements in the fifo
Parameters
fifo
address of the fifo to be used
-
kfifo_skip¶
kfifo_skip (fifo)
skip output data
Parameters
fifo
address of the fifo to be used
-
kfifo_peek_len¶
kfifo_peek_len (fifo)
gets the size of the next fifo record
Parameters
fifo
address of the fifo to be used
Description
This function returns the size of the next fifo record in number of bytes.
-
kfifo_alloc¶
kfifo_alloc (fifo, size, gfp_mask)
dynamically allocates a new fifo buffer
Parameters
fifo
pointer to the fifo
size
the number of elements in the fifo, this must be a power of 2
gfp_mask
get_free_pages mask, passed to kmalloc()
Description
This macro dynamically allocates a new fifo buffer.
The number of elements will be rounded-up to a power of 2.
The fifo will be released with kfifo_free().
Returns 0 if no error, otherwise an error code.
-
kfifo_free¶
kfifo_free (fifo)
frees the fifo
Parameters
fifo
the fifo to be freed
-
kfifo_init¶
kfifo_init (fifo, buffer, size)
initialize a fifo using a preallocated buffer
Parameters
fifo
the fifo to assign the buffer
buffer
the preallocated buffer to be used
size
the size of the internal buffer, this has to be a power of 2
Description
This macro initializes a fifo using a preallocated buffer.
The number of elements will be rounded-up to a power of 2. Return 0 if no error, otherwise an error code.
-
kfifo_put¶
kfifo_put (fifo, val)
put data into the fifo
Parameters
fifo
address of the fifo to be used
val
the data to be added
Description
This macro copies the given value into the fifo. It returns 0 if the fifo was full. Otherwise it returns the number of processed elements.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
-
kfifo_get¶
kfifo_get (fifo, val)
get data from the fifo
Parameters
fifo
address of the fifo to be used
val
address where to store the data
Description
This macro reads the data from the fifo. It returns 0 if the fifo was empty. Otherwise it returns the number of processed elements.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
-
kfifo_peek¶
kfifo_peek (fifo, val)
get data from the fifo without removing
Parameters
fifo
address of the fifo to be used
val
address where to store the data
Description
This reads the data from the fifo without removing it from the fifo. It returns 0 if the fifo was empty. Otherwise it returns the number of processed elements.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
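The return-value convention of the single-element put, get and peek operations, and the reason the size must be a power of 2, can be seen in a small userspace ring buffer. This is a sketch under stated assumptions: all names ending in _sketch are hypothetical, and the code omits the memory barriers a real lock-free single-reader/single-writer fifo needs, so it is only safe in one thread.

```c
#define FIFO_SIZE_SKETCH 4u /* must be a power of 2 so masking works */

struct int_fifo_sketch {
    unsigned int in;  /* free-running write counter */
    unsigned int out; /* free-running read counter */
    int data[FIFO_SIZE_SKETCH];
};

static unsigned int fifo_len_sketch(const struct int_fifo_sketch *f)
{
    /* Unsigned wrap-around keeps the difference correct. */
    return f->in - f->out;
}

/* Returns 0 if the fifo was full, otherwise 1 (elements processed). */
static int fifo_put_sketch(struct int_fifo_sketch *f, int val)
{
    if (fifo_len_sketch(f) == FIFO_SIZE_SKETCH)
        return 0;
    f->data[f->in & (FIFO_SIZE_SKETCH - 1)] = val;
    f->in++;
    return 1;
}

/* Returns 0 if the fifo was empty, otherwise 1; does not consume. */
static int fifo_peek_sketch(const struct int_fifo_sketch *f, int *val)
{
    if (fifo_len_sketch(f) == 0)
        return 0;
    *val = f->data[f->out & (FIFO_SIZE_SKETCH - 1)];
    return 1;
}

/* Returns 0 if the fifo was empty, otherwise 1; consumes one element. */
static int fifo_get_sketch(struct int_fifo_sketch *f, int *val)
{
    if (!fifo_peek_sketch(f, val))
        return 0;
    f->out++;
    return 1;
}
```

Because the counters are never reduced modulo the size, full (in - out == size) and empty (in == out) are distinguishable without sacrificing a slot.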
-
kfifo_in¶
kfifo_in (fifo, buf, n)
put data into the fifo
Parameters
fifo
address of the fifo to be used
buf
the data to be added
n
number of elements to be added
Description
This macro copies the given buffer into the fifo and returns the number of copied elements.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
-
kfifo_in_spinlocked¶
kfifo_in_spinlocked (fifo, buf, n, lock)
put data into the fifo using a spinlock for locking
Parameters
fifo
address of the fifo to be used
buf
the data to be added
n
number of elements to be added
lock
pointer to the spinlock to use for locking
Description
This macro copies the given values buffer into the fifo and returns the number of copied elements.
-
kfifo_in_spinlocked_noirqsave¶
kfifo_in_spinlocked_noirqsave (fifo, buf, n, lock)
put data into fifo using a spinlock for locking, don't disable interrupts
Parameters
fifo
address of the fifo to be used
buf
the data to be added
n
number of elements to be added
lock
pointer to the spinlock to use for locking
Description
This is a variant of kfifo_in_spinlocked() but uses spin_lock/unlock() for locking and doesn't disable interrupts.
-
kfifo_out¶
kfifo_out (fifo, buf, n)
get data from the fifo
Parameters
fifo
address of the fifo to be used
buf
pointer to the storage buffer
n
max. number of elements to get
Description
This macro gets some data from the fifo and returns the number of elements copied.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
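The bulk in and out operations return how many elements were actually copied, which may be fewer than requested. A userspace byte-fifo sketch shows the clamping and the two-part copy around the end of the buffer. All names ending in _sketch are hypothetical, and the code has no memory barriers, so it is single-threaded only.

```c
#include <string.h>

#define BYTE_FIFO_SIZE_SKETCH 8u /* must be a power of 2 */

struct byte_fifo_sketch {
    unsigned int in;  /* free-running write counter */
    unsigned int out; /* free-running read counter */
    unsigned char data[BYTE_FIFO_SIZE_SKETCH];
};

static unsigned int min_uint_sketch(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Copy up to n bytes in; returns the number actually copied. */
static unsigned int fifo_in_sketch(struct byte_fifo_sketch *f,
                                   const void *buf, unsigned int n)
{
    unsigned int avail = BYTE_FIFO_SIZE_SKETCH - (f->in - f->out);
    unsigned int off = f->in & (BYTE_FIFO_SIZE_SKETCH - 1);
    unsigned int first;

    n = min_uint_sketch(n, avail);
    /* First part runs to the end of the array, the rest wraps to index 0. */
    first = min_uint_sketch(n, BYTE_FIFO_SIZE_SKETCH - off);
    memcpy(f->data + off, buf, first);
    memcpy(f->data, (const unsigned char *)buf + first, n - first);
    f->in += n;
    return n;
}

/* Copy up to n bytes out; returns the number actually copied. */
static unsigned int fifo_out_sketch(struct byte_fifo_sketch *f,
                                    void *buf, unsigned int n)
{
    unsigned int used = f->in - f->out;
    unsigned int off = f->out & (BYTE_FIFO_SIZE_SKETCH - 1);
    unsigned int first;

    n = min_uint_sketch(n, used);
    first = min_uint_sketch(n, BYTE_FIFO_SIZE_SKETCH - off);
    memcpy(buf, f->data + off, first);
    memcpy((unsigned char *)buf + first, f->data, n - first);
    f->out += n;
    return n;
}
```

Callers should always check the returned count instead of assuming the full request was satisfied.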
-
kfifo_out_spinlocked¶
kfifo_out_spinlocked (fifo, buf, n, lock)
get data from the fifo using a spinlock for locking
Parameters
fifo
address of the fifo to be used
buf
pointer to the storage buffer
n
max. number of elements to get
lock
pointer to the spinlock to use for locking
Description
This macro gets the data from the fifo and returns the number of elements copied.
-
kfifo_out_spinlocked_noirqsave¶
kfifo_out_spinlocked_noirqsave (fifo, buf, n, lock)
get data from the fifo using a spinlock for locking, don't disable interrupts
Parameters
fifo
address of the fifo to be used
buf
pointer to the storage buffer
n
max. number of elements to get
lock
pointer to the spinlock to use for locking
Description
This is a variant of kfifo_out_spinlocked() which uses spin_lock/unlock() for locking and doesn't disable interrupts.
-
kfifo_from_user¶
kfifo_from_user (fifo, from, len, copied)
puts some data from user space into the fifo
Parameters
fifo
address of the fifo to be used
from
pointer to the data to be added
len
the length of the data to be added
copied
pointer to output variable to store the number of copied bytes
Description
This macro copies at most len bytes from the from buffer into the fifo, depending on the available space, and returns -EFAULT or 0.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
-
kfifo_to_user¶
kfifo_to_user (fifo, to, len, copied)
copies data from the fifo into user space
Parameters
fifo
address of the fifo to be used
to
where the data must be copied
len
the size of the destination buffer
copied
pointer to output variable to store the number of copied bytes
Description
This macro copies at most len bytes from the fifo into the to buffer and returns -EFAULT or 0.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
-
kfifo_dma_in_prepare¶
kfifo_dma_in_prepare (fifo, sgl, nents, len)
setup a scatterlist for DMA input
Parameters
fifo
address of the fifo to be used
sgl
pointer to the scatterlist array
nents
number of entries in the scatterlist array
len
number of elements to transfer
Description
This macro fills a scatterlist for DMA input. It returns the number of entries in the scatterlist array.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use these macros.
-
kfifo_dma_in_finish¶
kfifo_dma_in_finish (fifo, len)
finish a DMA IN operation
Parameters
fifo
address of the fifo to be used
len
number of bytes received
Description
This macro finishes a DMA IN operation. The in counter will be updated by the len parameter. No error checking will be done.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use these macros.
-
kfifo_dma_out_prepare¶
kfifo_dma_out_prepare (fifo, sgl, nents, len)
setup a scatterlist for DMA output
Parameters
fifo
address of the fifo to be used
sgl
pointer to the scatterlist array
nents
number of entries in the scatterlist array
len
number of elements to transfer
Description
This macro fills a scatterlist for DMA output with at most len bytes to transfer. It returns the number of entries in the scatterlist array. A zero means there is no space available and the scatterlist is not filled.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use these macros.
-
kfifo_dma_out_finish¶
kfifo_dma_out_finish (fifo, len)
finish a DMA OUT operation
Parameters
fifo
address of the fifo to be used
len
number of bytes transferred
Description
This macro finishes a DMA OUT operation. The out counter will be updated by the len parameter. No error checking will be done.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use these macros.
-
kfifo_out_peek¶
kfifo_out_peek (fifo, buf, n)
gets some data from the fifo
Parameters
fifo
address of the fifo to be used
buf
pointer to the storage buffer
n
max. number of elements to get
Description
This macro gets the data from the fifo and returns the number of elements copied. The data is not removed from the fifo.
Note that with only one concurrent reader and one concurrent writer, you don't need extra locking to use this macro.
relay interface support¶
Relay interface support is designed to provide an efficient mechanism for tools and facilities to relay large amounts of data from kernel space to user space.
relay interface¶
-
int relay_buf_full(struct rchan_buf *buf)¶
boolean, is the channel buffer full?
Parameters
struct rchan_buf *buf
channel buffer
Returns 1 if the buffer is full, 0 otherwise.
-
void relay_reset(struct rchan *chan)¶
reset the channel
Parameters
struct rchan *chan
the channel
This has the effect of erasing all data from all channel buffers and restarting the channel in its initial state. The buffers are not freed, so any mappings are still in effect.
NOTE. Care should be taken that the channel isn't actually being used by anything when this call is made.
-
struct rchan *relay_open(const char *base_filename, struct dentry *parent, size_t subbuf_size, size_t n_subbufs, const struct rchan_callbacks *cb, void *private_data)¶
create a new relay channel
Parameters
const char *base_filename
base name of files to create, NULL for buffering only
struct dentry *parent
dentry of parent directory, NULL for root directory or buffer
size_t subbuf_size
size of sub-buffers
size_t n_subbufs
number of sub-buffers
const struct rchan_callbacks *cb
client callback functions
void *private_data
user-defined data
Returns channel pointer if successful, NULL otherwise.
Creates a channel buffer for each cpu using the sizes and attributes specified. The created channel buffer files will be named base_filename0...base_filenameN-1. File permissions will be S_IRUSR.
If opening a buffer (parent = NULL) that you later wish to register in a filesystem, call relay_late_setup_files() once the parent dentry is available.
-
int relay_late_setup_files(struct rchan *chan, const char *base_filename, struct dentry *parent)¶
triggers file creation
Parameters
struct rchan *chan
channel to operate on
const char *base_filename
base name of files to create
struct dentry *parent
dentry of parent directory, NULL for root directory
Returns 0 if successful, non-zero otherwise.
Use to setup files for a previously buffer-only channel created by relay_open() with a NULL parent dentry.
For example, this is useful for performing early tracing in kernel, before VFS is up, and then exposing the early results once the dentry is available.
-
size_t relay_switch_subbuf(struct rchan_buf *buf, size_t length)¶
switch to a new sub-buffer
Parameters
struct rchan_buf *buf
channel buffer
size_t length
size of current event
Returns either the length passed in or 0 if full.
Performs sub-buffer-switch tasks such as invoking callbacks, updating padding counts, waking up readers, etc.
-
void relay_subbufs_consumed(struct rchan *chan, unsigned int cpu, size_t subbufs_consumed)¶
update the buffer's sub-buffers-consumed count
Parameters
struct rchan *chan
the channel
unsigned int cpu
the cpu associated with the channel buffer to update
size_t subbufs_consumed
number of sub-buffers to add to current buf's count
Adds to the channel buffer's consumed sub-buffer count. subbufs_consumed should be the number of sub-buffers newly consumed, not the total consumed.
NOTE. Kernel clients don't need to call this function if the channel mode is 'overwrite'.
-
void relay_close(struct rchan *chan)¶
close the channel
Parameters
struct rchan *chan
the channel
Closes all channel buffers and frees the channel.
-
void relay_flush(struct rchan *chan)¶
flush the channel
Parameters
struct rchan *chan
the channel
Flushes all channel buffers, i.e. forces buffer switch.
-
int relay_mmap_buf(struct rchan_buf *buf, struct vm_area_struct *vma)¶
mmap channel buffer to process address space
Parameters
struct rchan_buf *buf
relay channel buffer
struct vm_area_struct *vma
vm_area_struct describing memory to be mapped
Returns 0 if ok, negative on error
Caller should already have grabbed mmap_lock.
-
void *relay_alloc_buf(struct rchan_buf *buf, size_t *size)¶
allocate a channel buffer
Parameters
struct rchan_buf *buf
the buffer struct
size_t *size
total size of the buffer
Returns a pointer to the resulting buffer, NULL if unsuccessful. The passed in size will get page aligned, if it isn't already.
-
struct rchan_buf *relay_create_buf(struct rchan *chan)¶
allocate and initialize a channel buffer
Parameters
struct rchan *chan
the relay channel
Returns channel buffer if successful, NULL otherwise.
Parameters
struct kref *kref
target kernel reference that contains the relay channel
Should only be called from kref_put().
-
void relay_destroy_buf(struct rchan_buf *buf)¶
destroy an rchan_buf struct and associated buffer
Parameters
struct rchan_buf *buf
the buffer struct
Parameters
struct kref *kref
target kernel reference that contains the relay buffer
Removes the file from the filesystem, which also frees the rchan_buf struct and the channel buffer. Should only be called from kref_put().
-
int relay_buf_empty(struct rchan_buf *buf)¶
boolean, is the channel buffer empty?
Parameters
struct rchan_buf *buf
channel buffer
Returns 1 if the buffer is empty, 0 otherwise.
-
void wakeup_readers(struct irq_work *work)¶
wake up readers waiting on a channel
Parameters
struct irq_work *work
contains the channel buffer
This is the function used to defer reader waking
-
void __relay_reset(struct rchan_buf *buf, unsigned int init)¶
reset a channel buffer
Parameters
struct rchan_buf *buf
the channel buffer
unsigned int init
1 if this is a first-time initialization
See relay_reset() for description of effect.
-
void relay_close_buf(struct rchan_buf *buf)¶
close a channel buffer
Parameters
struct rchan_buf *buf
channel buffer
Marks the buffer finalized and restores the default callbacks. The channel buffer and channel buffer data structure are then freed automatically when the last reference is given up.
Parameters
struct inode *inode
the inode
struct file *filp
the file
Increments the channel buffer refcount.
-
int relay_file_mmap(struct file *filp, struct vm_area_struct *vma)¶
mmap file op for relay files
Parameters
struct file *filp
the file
struct vm_area_struct *vma
the vma describing what to map
Calls upon relay_mmap_buf() to map the file into user space.
-
__poll_t relay_file_poll(struct file *filp, poll_table *wait)¶
poll file op for relay files
Parameters
struct file *filp
the file
poll_table *wait
poll table
Poll implementation.
Parameters
struct inode *inode
the inode
struct file *filp
the file
Decrements the channel refcount, as the filesystem is no longer using it.
-
size_t relay_file_read_subbuf_avail(size_t read_pos, struct rchan_buf *buf)¶
return bytes available in sub-buffer
Parameters
size_t read_pos
file read position
struct rchan_buf *buf
relay channel buffer
-
size_t relay_file_read_start_pos(struct rchan_buf *buf)¶
find the first available byte to read
Parameters
struct rchan_buf *buf
relay channel buffer
If the read_pos is in the middle of padding, return the position of the first actually available byte, otherwise return the original value.
-
size_t relay_file_read_end_pos(struct rchan_buf *buf, size_t read_pos, size_t count)¶
return the new read position
Parameters
struct rchan_buf *buf
relay channel buffer
size_t read_pos
file read position
size_t count
number of bytes to be read
Module Support¶
Kernel module auto-loading¶
-
int __request_module(bool wait, const char *fmt, ...)¶
try to load a kernel module
Parameters
bool wait
wait (or not) for the operation to complete
const char *fmt
printf style format string for the name of the module
...
arguments as specified in the format string
Description
Load a module using the user mode module loader. The function returns zero on success, or a negative errno code or positive exit code from "modprobe" on failure. Note that a successful module load does not mean the module did not then unload and exit on an error of its own. Callers must check that the service they requested is now available, not blindly invoke it.
If module auto-loading support is disabled then this function simply returns -ENOENT.
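The return-value convention and the "check the service, do not blindly invoke it" advice can be sketched as a small userspace model. Everything here is hypothetical (the enum, model_classify(), model_can_use_service() and the service_available flag, which stands in for whatever lookup a real caller would repeat after the load attempt); it is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

enum model_modload_result {
	MODEL_LOADED,          /* 0: the loader reported success */
	MODEL_KERNEL_ERROR,    /* negative errno code */
	MODEL_MODPROBE_FAILED  /* positive exit code from "modprobe" */
};

static enum model_modload_result model_classify(int ret)
{
	if (ret == 0)
		return MODEL_LOADED;
	return ret < 0 ? MODEL_KERNEL_ERROR : MODEL_MODPROBE_FAILED;
}

/* A zero return is necessary but not sufficient: re-check the service. */
static bool model_can_use_service(int ret, bool service_available)
{
	return model_classify(ret) == MODEL_LOADED && service_available;
}

/* Usage: the module loaded, then exited on an error of its own. */
static void model_modload_demo(void)
{
	assert(model_classify(0) == MODEL_LOADED);
	assert(!model_can_use_service(0, false));
}
```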
Module debugging¶
Enabling CONFIG_MODULE_STATS enables module debugging statistics which are useful to monitor and root cause memory pressure issues with module loading. These statistics are useful to allow us to improve production workloads.
The current module debugging statistics supported help keep track of module loading failures to enable improvements either for kernel module auto-loading usage (request_module()) or interactions with userspace. Statistics are provided to track all possible failures in the finit_module() path and memory wasted in this process space. Each of the failure counters is associated with a type of module loading failure which is known to incur a certain amount of memory allocation loss. In the worst case loading a module will fail after a 3-step memory allocation process:
a) memory allocated with kernel_read_file_from_fd()
b) module decompression processes the file read from kernel_read_file_from_fd(), and vmap() is used to map the decompressed module to a new local buffer which represents a copy of the decompressed module passed from userspace. The buffer from kernel_read_file_from_fd() is freed right away.
c) layout_and_allocate() allocates space for the final resting place where we would keep the module if it were to be processed successfully.
If a failure occurs after all three of these allocations, only one counter is incremented, by the sum of the allocated bytes freed as a result of this failure. Likewise, if module loading failed only after step b), a separate counter is used and incremented by the bytes freed and not used during both of those allocations.
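The accounting rule for the three-step process described above can be modelled as a sum of every allocation made up to the point of failure. This is a simplified userspace sketch, not the kernel's bookkeeping: the enum, the struct, and model_wasted_bytes() are hypothetical names, and the byte sizes are supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Where in the three-step process the load failed (illustrative names). */
enum model_fail_stage {
	MODEL_FAIL_AFTER_READ,        /* after the initial file read */
	MODEL_FAIL_AFTER_DECOMPRESS,  /* after decompression */
	MODEL_FAIL_AFTER_LAYOUT       /* after layout_and_allocate() */
};

struct model_alloc_sizes {
	size_t read_bytes;        /* buffer from the file read */
	size_t decompress_bytes;  /* decompressed copy, 0 if not compressed */
	size_t layout_bytes;      /* final module layout */
};

/* Sum every allocation made up to and including the failing stage. */
static size_t model_wasted_bytes(const struct model_alloc_sizes *s,
				 enum model_fail_stage stage)
{
	size_t wasted;

	assert(s != NULL);
	wasted = s->read_bytes;
	if (stage >= MODEL_FAIL_AFTER_DECOMPRESS)
		wasted += s->decompress_bytes;
	if (stage == MODEL_FAIL_AFTER_LAYOUT)
		wasted += s->layout_bytes;
	return wasted;
}
```

In this model a single call yields the amount that one failure adds to the one counter matching its stage.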
Virtual memory space can be limited, for example on x86 virtual memory size defaults to 128 MiB. We should strive to limit and avoid wasting virtual memory allocations when possible. These module debugging statistics help to evaluate how much memory is being wasted on bootup due to module loading failures.
All counters are designed to be incremental. Atomic counters are used so as to remain simple and avoid delays and deadlocks.
dup_failed_modules - tracks duplicate failed modules¶
Linked list of modules which failed to be loaded because a module with the same name was already being processed or already loaded. The finit_module() system call incurs heavy virtual memory allocations. In the worst case an finit_module() system call can end up allocating virtual memory 3 times, once for each step of the allocation process described under Module debugging.
In practice on a typical boot today most finit_module() calls fail due to the module with the same name already being loaded or about to be processed. All virtual memory allocated to these failed modules will be freed with no functional use.
To help with this, dup_failed_modules allows us to track modules which failed to load because a module was already loaded or being processed. There are only two points at which we can fail such calls. We list them below along with the number of virtual memory allocation calls:
FAIL_DUP_MOD_BECOMING: at the end of early_mod_check() before layout_and_allocate().
- with module decompression: 2 virtual memory allocation calls
- without module decompression: 1 virtual memory allocation call
FAIL_DUP_MOD_LOAD: after layout_and_allocate() on add_unformed_module().
- with module decompression: 3 virtual memory allocation calls
- without module decompression: 2 virtual memory allocation calls
We should strive to get this list to be as small as possible. If this list is not empty it is a reflection of possible work or optimizations possible either in-kernel or in userspace.
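The allocation-call counts for the two failure points can be derived rather than memorised: one call for the file read, one more if decompression ran, and one more if layout_and_allocate() was reached. The sketch below encodes that reading; the enum and model_vmalloc_calls() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative mirror of the two failure points named above. */
enum model_dup_fail {
	MODEL_FAIL_DUP_MOD_BECOMING,  /* end of early_mod_check() */
	MODEL_FAIL_DUP_MOD_LOAD       /* after layout_and_allocate() */
};

/* Number of virtual memory allocation calls made before the failure. */
static int model_vmalloc_calls(enum model_dup_fail reason, bool decompressed)
{
	int calls = 1;  /* the initial file read */

	assert(reason == MODEL_FAIL_DUP_MOD_BECOMING ||
	       reason == MODEL_FAIL_DUP_MOD_LOAD);
	if (decompressed)
		calls++;  /* module decompression */
	if (reason == MODEL_FAIL_DUP_MOD_LOAD)
		calls++;  /* layout_and_allocate() */
	return calls;
}
```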
module statistics debugfs counters¶
The total amount of virtual memory allocation space wasted during module loading can be computed as the sum:
invalid_kread_bytes + invalid_decompress_bytes + invalid_becoming_bytes + invalid_mod_bytes
The following debugfs counters are available to inspect module loading failures:
total_mod_size: total bytes ever used by all modules we've dealt with on this system
total_text_size: total bytes of the .text and .init.text ELF section sizes we've dealt with on this system
invalid_kread_bytes: bytes allocated and then freed on failures which happen due to the initial kernel_read_file_from_fd(). kernel_read_file_from_fd() uses vmalloc(). These should typically not happen unless your system is under memory pressure.
invalid_decompress_bytes: number of bytes allocated and freed due to memory allocations in the module decompression path that use vmap(). These typically should not happen unless your system is under memory pressure.
invalid_becoming_bytes: total number of bytes allocated and freed that were used to read the kernel module userspace wants us to read before we promote it to be processed to be added to our modules linked list. These failures can happen if a check fails between a successful kernel_read_file_from_fd() call and the point right before we allocate our private memory for the module, which would be kept if the module is successfully loaded. The most common reason for this failure is when userspace is racing to load a module which it does not yet see loaded. The first module to succeed in add_unformed_module() will add a module to our modules list and subsequent loads of modules with the same name will error out at the end of early_mod_check(). The check for module_patient_check_exists() at the end of early_mod_check() prevents duplicate allocations on layout_and_allocate() for modules already being processed. These duplicate failed modules are non-fatal, however they typically are indicative of userspace not seeing a module in userspace loaded yet and unnecessarily trying to load a module before the kernel even has a chance to begin to process prior requests. Although duplicate failures can be non-fatal, we should try to reduce vmalloc() pressure proactively, so ideally after boot this will be as close to 0 as possible. If module decompression was used we also add to this counter the cost of the initial kernel_read_file_from_fd() of the compressed module. If module decompression was not used the value represents the total allocated and freed bytes in kernel_read_file_from_fd() calls for these types of failures. These failures can occur because:
module_sig_check() - module signature checks
elf_validity_cache_copy() - some ELF validation issue
early_mod_check():
blacklisting
failed to rewrite section headers
version magic
live patch requirements didn't check out
the module was detected as being already present
invalid_mod_bytes: the total number of bytes allocated and freed due to failures after we did all the sanity checks of the module which userspace passed to us and after our first check that the module is unique. A module can still fail to load if we detect the module is loaded after we allocate space for it with layout_and_allocate(). We do this check right before processing the module as live and running its initialization routines. Note that if you have a failure of this type it also means the respective kernel_read_file_from_fd() memory space was also freed and not used, and so we increment this counter with twice the size of the module. Additionally, if you used module decompression the size of the compressed module is also added to this counter.
modcount: how many modules we've loaded in our kernel life time
failed_kreads: how many modules failed due to failed kernel_read_file_from_fd()
failed_decompress: how many failed module decompression attempts we've had. These really should not happen unless your compression / decompression is broken.
failed_becoming: how many modules failed after we read them with kernel_read_file_from_fd() and before we allocate memory for them with layout_and_allocate(). This counter is never incremented if you manage to validate the module and call layout_and_allocate() for it.
failed_load_modules: how many modules failed once we've allocated our private space for our module using layout_and_allocate(). These failures should hopefully mostly be dealt with already. Races in theory could still exist here, but it would just mean the kernel had started processing two threads concurrently up to early_mod_check() and one thread won. These failures are good signs the kernel or userspace is doing something seriously stupid or that could be improved. We should strive to fix these, but it is perhaps not easy to fix them. A recent example is the module requests incurred for frequency modules: a separate module request was being issued for each CPU on a system.
Inter Module support¶
Refer to the files in kernel/module/ for more information.
Hardware Interfaces¶
DMA Channels¶
-
int request_dma(unsigned int dmanr, const char *device_id)¶
request and reserve a system DMA channel
Parameters
unsigned int dmanr
DMA channel number
const char * device_id
reserving device ID string, used in /proc/dma
-
void free_dma(unsigned int dmanr)¶
free a reserved system DMA channel
Parameters
unsigned int dmanr
DMA channel number
Resources Management¶
-
struct resource *request_resource_conflict(struct resource *root, struct resource *new)¶
request and reserve an I/O or memory resource
Parameters
struct resource *root
root resource descriptor
struct resource *new
resource descriptor desired by caller
Description
Returns 0 for success, conflict resource on error.
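To make the "0 for success, conflict resource on error" contract concrete, here is a userspace model of inserting into a sorted list of children under a root. struct model_res and model_request_conflict() are hypothetical stand-ins written for this example; locking and the flags, name and descriptor fields of a real resource are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a resource; ranges are inclusive. */
struct model_res {
	unsigned long start, end;
	struct model_res *parent, *sibling, *child;
};

/*
 * Try to link new_res under root, keeping children sorted by address.
 * Returns NULL on success, or the resource that new_res conflicts with
 * (root itself when new_res does not fit inside root).
 */
static struct model_res *model_request_conflict(struct model_res *root,
						struct model_res *new_res)
{
	struct model_res **p;

	assert(root != NULL && new_res != NULL);
	if (new_res->end < new_res->start || new_res->start < root->start ||
	    new_res->end > root->end)
		return root;

	p = &root->child;
	for (;;) {
		struct model_res *tmp = *p;

		if (!tmp || tmp->start > new_res->end) {
			/* Free slot before tmp (or at the tail): link in. */
			new_res->sibling = tmp;
			new_res->parent = root;
			*p = new_res;
			return NULL;
		}
		p = &tmp->sibling;
		if (tmp->end < new_res->start)
			continue;  /* tmp lies entirely below new_res */
		return tmp;        /* overlap: report the conflict */
	}
}
```

Returning the conflicting entry rather than an error code lets the caller report which existing range is in the way.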
-
int find_next_iomem_res(resource_size_t start, resource_size_t end, unsigned long flags, unsigned long desc, struct resource *res)¶
Finds the lowest iomem resource that covers part of [start..end].
Parameters
resource_size_t start
start address of the resource searched for
resource_size_t end
end address of same resource
unsigned long flags
flags which the resource must have
unsigned long desc
descriptor the resource must have
struct resource *res
return ptr, if resource found
Description
If a resource is found, returns 0 and *res is overwritten with the part of the resource that's within [start..end]; if none is found, returns -ENODEV. Returns -EINVAL for invalid parameters.
The caller must specify start, end, flags, and desc (which may be IORES_DESC_NONE).
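The clipping behaviour and the three return values can be sketched over a flat, sorted table. This is a userspace model under stated assumptions: struct model_range and model_find_next() are hypothetical, the table is assumed sorted and non-overlapping, the flags and desc matching is omitted, and treating start >= end as invalid is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_range {
	unsigned long start, end;  /* inclusive bounds */
};

/*
 * Find the lowest entry of tbl that covers part of [start..end] and
 * store the clipped overlap in *res.
 */
static int model_find_next(const struct model_range *tbl, size_t n,
			   unsigned long start, unsigned long end,
			   struct model_range *res)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	if (!res || start >= end)
		return -EINVAL;

	for (i = 0; i < n; i++) {
		if (tbl[i].end < start)
			continue;  /* entirely below the window */
		if (tbl[i].start > end)
			break;     /* sorted: nothing further overlaps */
		res->start = tbl[i].start > start ? tbl[i].start : start;
		res->end = tbl[i].end < end ? tbl[i].end : end;
		return 0;
	}
	return -ENODEV;
}
```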
-
int reallocate_resource(struct resource *root, struct resource *old, resource_size_t newsize, struct resource_constraint *constraint)¶
allocate a slot in the resource tree given range & alignment. The resource will be relocated if the new size cannot be reallocated in the current location.
Parameters
struct resource *root
root resource descriptor
struct resource *old
resource descriptor desired by caller
resource_size_t newsize
new size of the resource descriptor
struct resource_constraint *constraint
the size and alignment constraints to be met.
-
struct resource *lookup_resource(struct resource *root, resource_size_t start)¶
find an existing resource by a resource start address
Parameters
struct resource *root
root resource descriptor
resource_size_t start
resource start address
Description
Returns a pointer to the resource if found, NULL otherwise
-
struct resource *insert_resource_conflict(struct resource *parent, struct resource *new)¶
Inserts resource in the resource tree
Parameters
struct resource *parent
parent of the new resource
struct resource *new
new resource to insert
Description
Returns 0 on success, conflict resource if the resource can't be inserted.
This function is equivalent to request_resource_conflict when no conflict happens. If a conflict happens, and the conflicting resources entirely fit within the range of the new resource, then the new resource is inserted and the conflicting resources become children of the new resource.
This function is intended for producers of resources, such as FW modules and bus drivers.
-
resource_size_t resource_alignment(struct resource *res)¶
calculate resource's alignment
Parameters
struct resource *res
resource pointer
Description
Returns alignment on success, 0 (invalid alignment) on failure.
-
void release_mem_region_adjustable(resource_size_t start, resource_size_t size)¶
release a previously reserved memory region
Parameters
resource_size_t start
resource start address
resource_size_t size
resource region size
Description
This interface is intended for memory hot-delete. The requested region is released from a currently busy memory resource. The requested region must either match exactly or fit into a single busy resource entry. In the latter case, the remaining resource is adjusted accordingly. Existing children of the busy memory resource must be immutable in the request.
Note
Additional release conditions, such as overlapping region, can be supported after they are confirmed as valid cases.
When a busy memory resource gets split into two entries, the code assumes that all children remain in the lower address entry for simplicity. Enhance this logic when necessary.
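The three outcomes described above (exact match, trimming one end, and splitting into two entries) can be modelled with plain inclusive ranges. This is a hypothetical userspace sketch: struct model_span and model_release() are invented for illustration and ignore children, locking and allocation of the second entry.

```c
#include <assert.h>
#include <stddef.h>

struct model_span {
	unsigned long start, end;  /* inclusive bounds */
};

/*
 * Release [rel.start..rel.end] from one busy span. Writes the surviving
 * pieces to out[0..1] and returns how many there are (0, 1 or 2), or -1
 * if the request does not fit inside the busy span.
 */
static int model_release(struct model_span busy, struct model_span rel,
			 struct model_span out[2])
{
	int n = 0;

	assert(out != NULL);
	if (rel.start > rel.end || rel.start < busy.start || rel.end > busy.end)
		return -1;

	if (rel.start > busy.start) {  /* piece below the hole */
		out[n].start = busy.start;
		out[n].end = rel.start - 1;
		n++;
	}
	if (rel.end < busy.end) {      /* piece above the hole */
		out[n].start = rel.end + 1;
		out[n].end = busy.end;
		n++;
	}
	return n;
}
```

A return of 2 corresponds to the split case, where the busy entry becomes a lower and an upper entry around the released region.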
-
void merge_system_ram_resource(struct resource *res)¶
mark the System RAM resource mergeable and try to merge it with adjacent, mergeable resources
Parameters
struct resource *res
resource descriptor
Description
This interface is intended for memory hotplug, whereby lots of contiguous system ram resources are added (e.g., via add_memory*()) by a driver, and the actual resource boundaries are not of interest (e.g., it might be relevant for DIMMs). Only resources that are marked mergeable, that have the same parent, and that don't have any children are considered. All mergeable resources must be immutable during the request.
Note
The caller has to make sure that no pointers to resources that are marked mergeable are used anymore after this call - the resource might be freed and the pointer might be stale!
release_mem_region_adjustable() will split on demand on memory hotunplug
-
int request_resource(struct resource *root, struct resource *new)¶
request and reserve an I/O or memory resource
Parameters
struct resource *root
root resource descriptor
struct resource *new
resource descriptor desired by caller
Description
Returns 0 for success, negative error code on error.
-
int release_resource(struct resource *old)¶
release a previously reserved resource
Parameters
struct resource *old
resource pointer
-
int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end, void *arg, int (*func)(struct resource*, void*))¶
Walks through iomem resources and calls func() with matching resource ranges.
Parameters
unsigned long desc
I/O resource descriptor. Use IORES_DESC_NONE to skip desc check.
unsigned long flags
I/O resource flags
u64 start
start addr
u64 end
end addr
void *arg
function argument for the callback func
int (*func)(struct resource *, void *)
callback function that is called for each qualifying resource area
Description
All the memory ranges which overlap start,end and also match flags and desc are valid candidates.
NOTE
For a new descriptor search, define a new IORES_DESC in <linux/ioport.h> and set it in 'desc' of a target resource entry.
-
int region_intersects(resource_size_t start, size_t size, unsigned long flags, unsigned long desc)¶
determine intersection of region with known resources
Parameters
resource_size_t start
region start address
size_t size
size of region
unsigned long flags
flags of resource (in iomem_resource)
unsigned long desc
descriptor of resource (in iomem_resource) or IORES_DESC_NONE
Description
Check if the specified region partially overlaps or fully eclipses a resource identified by flags and desc (optional with IORES_DESC_NONE). Return REGION_DISJOINT if the region does not overlap flags/desc, return REGION_MIXED if the region overlaps flags/desc and another resource, and return REGION_INTERSECTS if the region overlaps flags/desc and no other defined resource. Note that REGION_INTERSECTS is also returned in the case when the specified region overlaps RAM and undefined memory holes.
region_intersects() is used by memory remapping functions to ensure the user is not remapping RAM and is a vast speed up over walking through the resource table page by page.
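The three return values can be illustrated by counting overlaps with matching and non-matching entries. The following is a userspace model only: the MODEL_* enum, struct model_entry and model_intersects() are hypothetical, and the "matches" flag stands in for the flags/desc comparison.

```c
#include <assert.h>
#include <stddef.h>

enum model_region { MODEL_DISJOINT, MODEL_INTERSECTS, MODEL_MIXED };

struct model_entry {
	unsigned long start, end;  /* inclusive bounds */
	int matches;               /* 1 if flags/desc match the query */
};

/* Classify [start..start+size-1] against a flat table of resources. */
static enum model_region model_intersects(const struct model_entry *tbl,
					  size_t n, unsigned long start,
					  unsigned long size)
{
	unsigned long end;
	int type = 0, other = 0;
	size_t i;

	assert(size > 0);
	end = start + size - 1;
	for (i = 0; i < n; i++) {
		if (tbl[i].end < start || tbl[i].start > end)
			continue;  /* no overlap */
		if (tbl[i].matches)
			type++;
		else
			other++;
	}
	if (type == 0)
		return MODEL_DISJOINT;
	/* Holes are not entries, so they still yield INTERSECTS. */
	return other ? MODEL_MIXED : MODEL_INTERSECTS;
}
```

Because undefined holes are simply absent from the table, a region spanning a matching entry and a hole is reported as intersecting, which mirrors the note above.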
-
int allocate_resource(struct resource *root, struct resource *new, resource_size_t size, resource_size_t min, resource_size_t max, resource_size_t align, resource_size_t (*alignf)(void*, const struct resource*, resource_size_t, resource_size_t), void *alignf_data)¶
allocate empty slot in the resource tree given range & alignment. The resource will be reallocated with a new size if it was already allocated
Parameters
struct resource *root
root resource descriptor
struct resource *new
resource descriptor desired by caller
resource_size_t size
requested resource region size
resource_size_t min
minimum boundary to allocate
resource_size_t max
maximum boundary to allocate
resource_size_t align
alignment requested, in bytes
resource_size_t (*alignf)(void *, const struct resource *, resource_size_t, resource_size_t)
alignment function, optional, called if not NULL
void *alignf_data
arbitrary data to pass to the alignf function
-
int insert_resource(struct resource *parent, struct resource *new)¶
Inserts a resource in the resource tree
Parameters
struct resource *parent
parent of the new resource
struct resource *new
new resource to insert
Description
Returns 0 on success, -EBUSY if the resource can't be inserted.
This function is intended for producers of resources, such as FW modules and bus drivers.
-
void insert_resource_expand_to_fit(struct resource *root, struct resource *new)¶
Insert a resource into the resource tree
Parameters
struct resource *root
root resource descriptor
struct resource *new
new resource to insert
Description
Insert a resource into the resource tree, possibly expanding it in order to make it encompass any conflicting resources.
-
int remove_resource(struct resource *old)¶
Remove a resource in the resource tree
Parameters
struct resource *old
resource to remove
Description
Returns 0 on success, -EINVAL if the resource is not valid.
This function removes a resource previously inserted by insert_resource() or insert_resource_conflict(), and moves the children (if any) up to where they were before. insert_resource() and insert_resource_conflict() insert a new resource, and move any conflicting resources down to the children of the new resource.
insert_resource(), insert_resource_conflict() and remove_resource() are intended for producers of resources, such as FW modules and bus drivers.
-
int adjust_resource(struct resource *res, resource_size_t start, resource_size_t size)¶
modify a resource's start and size
Parameters
struct resource *res
resource to modify
resource_size_t start
new start value
resource_size_t size
new size
Description
Given an existing resource, change its start and size to match the arguments. Returns 0 on success, -EBUSY if it can't fit. Existing children of the resource are assumed to be immutable.
-
struct resource *__request_region(struct resource *parent, resource_size_t start, resource_size_t n, const char *name, int flags)¶
create a new busy resource region
Parameters
struct resource *parent
parent resource descriptor
resource_size_t start
resource start address
resource_size_t n
resource region size
const char *name
reserving caller's ID string
int flags
IO resource flags
-
void __release_region(struct resource *parent, resource_size_t start, resource_size_t n)¶
release a previously reserved resource region
Parameters
struct resource *parent
parent resource descriptor
resource_size_t start
resource start address
resource_size_t n
resource region size
Description
The described resource region must match a currently busy region.
-
int devm_request_resource(struct device *dev, struct resource *root, struct resource *new)¶
request and reserve an I/O or memory resource
Parameters
struct device *dev
device for which to request the resource
struct resource *root
root of the resource tree from which to request the resource
struct resource *new
descriptor of the resource to request
Description
This is a device-managed version of request_resource(). There is usually no need to release resources requested by this function explicitly since that will be taken care of when the device is unbound from its driver. If for some reason the resource needs to be released explicitly, because of ordering issues for example, drivers must call devm_release_resource() rather than the regular release_resource().
When a conflict is detected between any existing resources and the newly requested resource, an error message will be printed.
Returns 0 on success or a negative error code on failure.
-
void devm_release_resource(struct device *dev, struct resource *new)¶
release a previously requested resource
Parameters
struct device *dev
device for which to release the resource
struct resource *new
descriptor of the resource to release
Description
Releases a resource previously requested using devm_request_resource().
-
struct resource *devm_request_free_mem_region(struct device *dev, struct resource *base, unsigned long size)¶
find free region for device private memory
Parameters
struct device *dev
device struct to bind the resource to
struct resource *base
resource tree to look in
unsigned long size
size in bytes of the device memory to add
Description
This function tries to find an empty range of physical address big enough to contain the new resource, so that it can later be hotplugged as ZONE_DEVICE memory, which in turn allocates struct pages.
-
struct resource *alloc_free_mem_region(struct resource *base, unsigned long size, unsigned long align, const char *name)¶
find a free region relative to base
Parameters
struct resource *base
resource that will parent the new resource
unsigned long size
size in bytes of memory to allocate from base
unsigned long align
alignment requirements for the allocation
const char *name
resource name
Description
Buses like CXL, that can dynamically instantiate new memory regions, need a method to allocate physical address space for those regions. Allocate and insert a new resource to cover a free range in the span of base that is not claimed by any descendant of base.
MTRR Handling¶
-
int arch_phys_wc_add(unsigned long base, unsigned long size)¶
add a WC MTRR and handle errors if PAT is unavailable
Parameters
unsigned long base
Physical base address
unsigned long size
Size of region
Description
If PAT is available, this does nothing. If PAT is unavailable, it attempts to add a WC MTRR covering size bytes starting at base and logs an error if this fails.
The caller should provide a power of two size on an equivalent power of two boundary.
Drivers must store the return value to pass to arch_phys_wc_del(), but drivers should not try to interpret that return value.
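The "power of two size on an equivalent power of two boundary" requirement can be checked with two bit tests. The helper below is a hypothetical userspace sketch of that check, not part of the MTRR API; a driver would apply the same reasoning to its own base and size before calling arch_phys_wc_add().

```c
#include <assert.h>
#include <stdbool.h>

/* True when size is a power of two and base sits on a multiple of size. */
static bool model_wc_range_ok(unsigned long base, unsigned long size)
{
	bool pow2 = size != 0 && (size & (size - 1)) == 0;

	return pow2 && (base & (size - 1)) == 0;
}

/* Usage: a 16 MiB window must start on a 16 MiB boundary. */
static void model_wc_range_demo(void)
{
	assert(model_wc_range_ok(0xd0000000UL, 0x01000000UL));
	assert(!model_wc_range_ok(0xd0800000UL, 0x01000000UL));
}
```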
Security Framework¶
-
int security_init(void)¶
initializes the security framework
Parameters
void
no arguments
Description
This should be called early in the kernel initialization sequence.
-
void security_add_hooks(struct security_hook_list *hooks, int count, const char *lsm)¶
Add a module's hooks to the hook lists.
Parameters
struct security_hook_list *hooks
the hooks to add
int count
the number of hooks to add
const char *lsm
the name of the security module
Description
Each LSM has to register its hooks with the infrastructure.
Parameters
struct cred *cred
the cred that needs a blob
gfp_t gfp
allocation type
Description
Allocate the cred blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
Parameters
struct cred *cred
the cred that needs a blob
Description
Allocate the cred blob for all the modules
Parameters
struct file *file
the file that needs a blob
Description
Allocate the file blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
Parameters
struct inode *inode
the inode that needs a blob
Description
Allocate the inode blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
-
int lsm_task_alloc(struct task_struct *task)¶
allocate a composite task blob
Parameters
struct task_struct *task
the task that needs a blob
Description
Allocate the task blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
-
int lsm_ipc_alloc(struct kern_ipc_perm *kip)¶
allocate a composite ipc blob
Parameters
struct kern_ipc_perm *kip
the ipc that needs a blob
Description
Allocate the ipc blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
-
int lsm_msg_msg_alloc(struct msg_msg *mp)¶
allocate a composite msg_msg blob
Parameters
struct msg_msg *mp
the msg_msg that needs a blob
Description
Allocate the msg_msg blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
-
void lsm_early_task(struct task_struct *task)¶
during initialization allocate a composite task blob
Parameters
struct task_struct *task
the task that needs a blob
Description
Allocate the task blob for all the modules
-
int lsm_superblock_alloc(struct super_block *sb)¶
allocate a composite superblock blob
Parameters
struct super_block *sb
the superblock that needs a blob
Description
Allocate the superblock blob for all the modules
Returns 0, or -ENOMEM if memory can't be allocated.
-
int security_binder_set_context_mgr(const struct cred *mgr)¶
Check if becoming binder ctx mgr is ok
Parameters
const struct cred *mgr
task credentials of current binder process
Description
Check whether mgr is allowed to be the binder context manager.
Return
Returns 0 if permission is granted.
-
int security_binder_transaction(const struct cred *from, const struct cred *to)¶
Check if a binder transaction is allowed
Parameters
const struct cred *from
sending process
const struct cred *to
receiving process
Description
Check whether from is allowed to invoke a binder transaction call to to.
Return
Returns 0 if permission is granted.
-
int security_binder_transfer_binder(const struct cred *from, const struct cred *to)¶
Check if a binder transfer is allowed
Parameters
const struct cred *from
sending process
const struct cred *to
receiving process
Description
Check whether from is allowed to transfer a binder reference to to.
Return
Returns 0 if permission is granted.
-
int security_binder_transfer_file(const struct cred *from, const struct cred *to, struct file *file)¶
Check if a binder file xfer is allowed
Parameters
const struct cred *from
sending process
const struct cred *to
receiving process
struct file *file
file being transferred
Description
Check whether from is allowed to transfer file to to.
Return
Returns 0 if permission is granted.
-
int security_ptrace_access_check(struct task_struct *child, unsigned int mode)¶
Check if tracing is allowed
Parameters
struct task_struct *child
target process
unsigned int mode
PTRACE_MODE flags
Description
Check permission before allowing the current process to trace the child process. Security modules may also want to perform a process tracing check during an execve in the bprm_set_creds hook of binprm_security_ops if the process is being traced and its security attributes would be changed by the execve.
Return
Returns 0 if permission is granted.
-
int security_ptrace_traceme(struct task_struct *parent)¶
Check if tracing is allowed
Parameters
struct task_struct *parent
tracing process
Description
Check that the parent process has sufficient permission to trace the current process before allowing the current process to present itself to the parent process for tracing.
Return
Returns 0 if permission is granted.
-
int security_capget(struct task_struct *target, kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted)¶
Get the capability sets for a process
Parameters
struct task_struct *target
target process
kernel_cap_t *effective
effective capability set
kernel_cap_t *inheritable
inheritable capability set
kernel_cap_t *permitted
permitted capability set
Description
Get the effective, inheritable, and permitted capability sets for the target process. The hook may also perform permission checking to determine if the current process is allowed to see the capability sets of the target process.
Return
Returns 0 if the capability sets were successfully obtained.
-
int security_capset(struct cred *new, const struct cred *old, const kernel_cap_t *effective, const kernel_cap_t *inheritable, const kernel_cap_t *permitted)¶
Set the capability sets for a process
Parameters
struct cred *new
new credentials for the target process
const struct cred *old
current credentials of the target process
const kernel_cap_t *effective
effective capability set
const kernel_cap_t *inheritable
inheritable capability set
const kernel_cap_t *permitted
permitted capability set
Description
Set the effective, inheritable, and permitted capability sets for the current process.
Return
Returns 0 and updates new if permission is granted.
-
int security_capable(const struct cred *cred, struct user_namespace *ns, int cap, unsigned int opts)¶
Check if a process has the necessary capability
Parameters
const struct cred *cred
credentials to examine
struct user_namespace *ns
user namespace
int cap
capability requested
unsigned int opts
capability check options
Description
Check whether the process has the cap capability in the indicated credentials. cap contains the capability, as defined in <include/linux/capability.h>. opts contains options for the capable check, as defined in <include/linux/security.h>.
Return
Returns 0 if the capability is granted.
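The return convention (0 when the capability is granted) can be illustrated with a small user-space sketch. Everything in it is hypothetical and only mimics the shape of the check: struct demo_cred, demo_capable() and the 64-bit mask layout are assumptions made for illustration, not kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a credential's effective capability set. */
struct demo_cred {
	uint64_t cap_effective; /* bit N set => capability N is held */
};

/*
 * Hypothetical check following the documented convention:
 * return 0 if the capability is granted, a negative errno otherwise.
 */
static int demo_capable(const struct demo_cred *cred, int cap)
{
	assert(cred != NULL);
	if (cap < 0 || cap > 63)
		return -EINVAL; /* outside the range this sketch models */
	return (cred->cap_effective & (UINT64_C(1) << cap)) ? 0 : -EPERM;
}
```

A caller in this style tests for equality with 0 rather than for a "true" value, since success is encoded as zero.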
-
int security_quotactl(int cmds, int type, int id, struct super_block *sb)¶
Check if a quotactl() syscall is allowed for this fs
Parameters
int cmds
commands
int type
type
int id
id
struct super_block *sb
filesystem
Description
Check whether the quotactl syscall is allowed for this sb.
Return
Returns 0 if permission is granted.
Parameters
struct dentry *dentry
dentry
Description
Check whether QUOTAON is allowed for dentry.
Return
Returns 0 if permission is granted.
-
int security_syslog(int type)¶
Check if accessing the kernel message ring is allowed
Parameters
int type
SYSLOG_ACTION_* type
Description
Check permission before accessing the kernel message ring or changing logging to the console. See the syslog(2) manual page for an explanation of the type values.
Return
Returns 0 if permission is granted.
-
int security_settime64(const struct timespec64 *ts, const struct timezone *tz)¶
Check if changing the system time is allowed
Parameters
const struct timespec64 *ts
new time
const struct timezone *tz
timezone
Description
Check permission to change the system time. struct timespec64 is defined in <include/linux/time64.h> and timezone is defined in <include/linux/time.h>.
Return
Returns 0 if permission is granted.
-
int security_vm_enough_memory_mm(struct mm_struct *mm, long pages)¶
Check if allocating a new mem map is allowed
Parameters
struct mm_struct *mm
mm struct
long pages
number of pages
Description
Check permissions for allocating a new virtual mapping. If all LSMs return a positive value, __vm_enough_memory() will be called with cap_sys_admin set. If at least one LSM returns 0 or negative, __vm_enough_memory() will be called with cap_sys_admin cleared.
Return
Returns 0 if permission is granted by the LSM infrastructure to the caller.
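The aggregation rule in the description (cap_sys_admin stays set only when every LSM returns a positive value) can be sketched in user space as follows. The function name demo_cap_sys_admin() and the array of per-LSM results are assumptions for illustration; the real hook iterates over registered LSMs rather than an array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical aggregation of per-LSM results: cap_sys_admin stays
 * set only if every LSM returned a positive value; any result that
 * is 0 or negative clears it.
 */
static int demo_cap_sys_admin(const int *lsm_results, size_t n)
{
	int cap_sys_admin = 1;
	size_t i;

	assert(lsm_results != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (lsm_results[i] <= 0) {
			cap_sys_admin = 0;
			break; /* one non-positive answer is enough */
		}
	}
	return cap_sys_admin;
}
```

The resulting flag is what would then be handed to __vm_enough_memory() in the flow the description outlines.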
-
int security_bprm_creds_for_exec(struct linux_binprm *bprm)¶
Prepare the credentials for exec()
Parameters
struct linux_binprm *bprm
binary program information
Description
If the setup in prepare_exec_creds did not set up bprm->cred->security properly for executing bprm->file, update the LSM's portion of bprm->cred->security to be what commit_creds needs to install for the new program. This hook may also optionally check permissions (e.g. for transitions between security domains). The hook must set bprm->secureexec to 1 if AT_SECURE should be set to request libc enable secure mode. bprm contains the linux_binprm structure.
Return
Returns 0 if the hook is successful and permission is granted.
-
int security_bprm_creds_from_file(struct linux_binprm *bprm, struct file *file)¶
Update linux_binprm creds based on file
Parameters
struct linux_binprm *bprm
binary program information
struct file *file
associated file
Description
If file is setpcap, suid, sgid or otherwise marked to change privilege upon exec, update bprm->cred to reflect that change. This is called after finding the binary that will be executed without an interpreter. This ensures that the credentials will not be derived from a script that the binary will need to reopen, which when reopened may end up being a completely different file. This hook may also optionally check permissions (e.g. for transitions between security domains). The hook must set bprm->secureexec to 1 if AT_SECURE should be set to request libc enable secure mode. The hook must add to bprm->per_clear any personality flags that should be cleared from current->personality. bprm contains the linux_binprm structure.
Return
Returns 0 if the hook is successful and permission is granted.
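The two obligations named in the description (setting secureexec and adding to per_clear) can be pictured with a reduced user-space model. struct demo_binprm, its file_changes_priv field, and the DEMO_PER_CLEAR_MASK value are all invented for this sketch and do not reflect the real struct linux_binprm layout or any real personality bits.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: an invented mask standing in for personality bits that
 * must not survive a privilege-changing exec. */
#define DEMO_PER_CLEAR_MASK 0x00c0UL

/* Hypothetical, reduced stand-in for struct linux_binprm. */
struct demo_binprm {
	unsigned int secureexec;  /* 1 => AT_SECURE should be set */
	unsigned long per_clear;  /* personality flags to clear */
	int file_changes_priv;    /* 1 if the file is marked to change privilege */
};

static int demo_creds_from_file(struct demo_binprm *bprm)
{
	assert(bprm != NULL);
	if (bprm->file_changes_priv) {
		bprm->secureexec = 1;
		/* Add to per_clear; bits set by earlier code are kept. */
		bprm->per_clear |= DEMO_PER_CLEAR_MASK;
	}
	return 0; /* success, permission granted */
}
```

Using |= rather than plain assignment reflects the wording "add to bprm->per_clear": the hook contributes flags without discarding ones already requested.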
-
int security_bprm_check(struct linux_binprm *bprm)¶
Mediate binary handler search
Parameters
struct linux_binprm *bprm
binary program information
Description
This hook mediates the point when a search for a binary handler will begin. It allows a check against the bprm->cred->security value which was set in the preceding creds_for_exec call. The argv list and envp list are reliably available in bprm. This hook may be called multiple times during a single execve. bprm contains the linux_binprm structure.
Return
Returns 0 if the hook is successful and permission is granted.
-
void security_bprm_committing_creds(struct linux_binprm *bprm)¶
Install creds for a process during exec()
Parameters
struct linux_binprm *bprm
binary program information
Description
Prepare to install the new security attributes of a process being transformed by an execve operation, based on the old credentials pointed to by current->cred and the information set in bprm->cred by the bprm_creds_for_exec hook. bprm points to the linux_binprm structure. This hook is a good place to perform state changes on the process such as closing open file descriptors to which access will no longer be granted when the attributes are changed. This is called immediately before commit_creds().
-
void security_bprm_committed_creds(struct linux_binprm *bprm)¶
Tidy up after cred install during exec()
Parameters
struct linux_binprm *bprm
binary program information
Description
Tidy up after the installation of the new security attributes of a process being transformed by an execve operation. The new credentials have, by this point, been set to current->cred. bprm points to the linux_binprm structure. This hook is a good place to perform state changes on the process such as clearing out non-inheritable signal state. This is called immediately after commit_creds().
-
int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc)¶
Duplicate a fs_context LSM blob
Parameters
struct fs_context *fc
destination filesystem context
struct fs_context *src_fc
source filesystem context
Description
Allocate and attach a security structure to fc->security. This pointer is initialised to NULL by the caller. fc indicates the new filesystem context. src_fc indicates the original filesystem context.
Return
Returns 0 on success or a negative error code on failure.
-
int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param)¶
Configure a filesystem context
Parameters
struct fs_context *fc
filesystem context
struct fs_parameter *param
filesystem parameter
Description
Userspace provided a parameter to configure a superblock. The LSM can consume the parameter or return it to the caller for use elsewhere.
Return
If the parameter is used by the LSM, it should return 0; if it is returned to the caller, -ENOPARAM is returned; otherwise a negative error code is returned.
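The three-way return convention can be sketched in user space. The parameter name "demo_label", the function demo_parse_param(), and the local DEMO_ENOPARAM constant are assumptions for illustration; ENOPARAM is a kernel-internal error code and is not available from user-space headers.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Assumption: local stand-in for the kernel-internal ENOPARAM value. */
#define DEMO_ENOPARAM 519

/*
 * Hypothetical LSM-side handler. "demo_label" is an invented
 * parameter name used only for this sketch.
 */
static int demo_parse_param(const char *key, const char *value)
{
	assert(key != NULL);
	if (strcmp(key, "demo_label") != 0)
		return -DEMO_ENOPARAM; /* not ours: hand it back to the caller */
	if (value == NULL || value[0] == '\0')
		return -EINVAL;        /* ours, but malformed */
	return 0;                      /* consumed by the LSM */
}
```

A caller following this convention would treat -DEMO_ENOPARAM as "keep processing the parameter elsewhere" and any other negative value as a hard failure.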
-
int security_sb_alloc(struct super_block *sb)¶
Allocate a super_block LSM blob
Parameters
struct super_block *sb
filesystem superblock
Description
Allocate and attach a security structure to the sb->s_security field. The s_security field is initialized to NULL when the structure is allocated. sb contains the super_block structure to be modified.
Return
Returns 0 if operation was successful.
-
void security_sb_delete(struct super_block *sb)¶
Release super_block LSM associated objects
Parameters
struct super_block *sb
filesystem superblock
Description
Release objects tied to a superblock (e.g. inodes). sb contains the super_block structure being released.
-
void security_sb_free(struct super_block *sb)¶
Free a super_block LSM blob
Parameters
struct super_block *sb
filesystem superblock
Description
Deallocate and clear the sb->s_security field. sb contains the super_block structure to be modified.
-
int security_sb_kern_mount(struct super_block *sb)¶
Check if a kernel mount is allowed
Parameters
struct super_block *sb
filesystem superblock
Description
Mount this sb if allowed by permissions.
Return
Returns 0 if permission is granted.
-
int security_sb_show_options(struct seq_file *m, struct super_block *sb)¶
Output the mount options for a superblock
Parameters
struct seq_file *m
output file
struct super_block *sb
filesystem superblock
Description
Show (print on m) mount options for this sb.
Return
Returns 0 on success, negative values on failure.
Parameters
struct dentry *dentry
superblock handle
Description
Check permission before obtaining filesystem statistics for the mnt mountpoint. dentry is a handle on the superblock for the filesystem.
Return
Returns 0 if permission is granted.
-
int security_sb_mount(const char *dev_name, const struct path *path, const char *type, unsigned long flags, void *data)¶
Check permission for mounting a filesystem
Parameters
const char *dev_name
filesystem backing device
const struct path *path
mount point
const char *type
filesystem type
unsigned long flags
mount flags
void *data
filesystem specific data
Description
Check permission before an object specified by dev_name is mounted on the mount point named by path. For an ordinary mount, dev_name identifies a device if the file system type requires a device. For a remount (flags & MS_REMOUNT), dev_name is irrelevant. For a loopback/bind mount (flags & MS_BIND), dev_name identifies the pathname of the object being mounted.
Return
Returns 0 if permission is granted.
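How the meaning of dev_name depends on flags can be sketched as a small classifier. This is a user-space illustration under stated assumptions: struct demo_mount_req and demo_classify_dev_name() are hypothetical, and the DEMO_MS_* constants are local copies of the flag bits rather than the definitions from a system header. Checking the remount bit first is a choice made by this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: local copies of the two flag bits, prefixed DEMO_ so
 * they cannot clash with system headers. */
#define DEMO_MS_REMOUNT 32UL
#define DEMO_MS_BIND    4096UL

/* How dev_name should be interpreted for a given request. */
enum demo_dev_name_role {
	DEMO_DEVICE,      /* ordinary mount: a device, if the fs type needs one */
	DEMO_IRRELEVANT,  /* remount: dev_name carries no meaning */
	DEMO_SOURCE_PATH  /* bind mount: pathname of the object being mounted */
};

/* Hypothetical request record; not a kernel structure. */
struct demo_mount_req {
	const char *dev_name;
	unsigned long flags;
};

static enum demo_dev_name_role
demo_classify_dev_name(const struct demo_mount_req *req)
{
	assert(req != NULL);
	if (req->flags & DEMO_MS_REMOUNT)
		return DEMO_IRRELEVANT;
	if (req->flags & DEMO_MS_BIND)
		return DEMO_SOURCE_PATH;
	return DEMO_DEVICE;
}
```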
-
int security_sb_umount(struct vfsmount *mnt, int flags)¶
Check permission for unmounting a filesystem
Parameters
struct vfsmount *mnt
mounted filesystem
int flags
unmount flags
Description
Check permission before the mnt file system is unmounted.
Return
Returns 0 if permission is granted.
-
int security_sb_pivotroot(const struct path *old_path, const struct path *new_path)¶
Check permissions for pivoting the rootfs
Parameters
const struct path *old_path
new location for current rootfs
const struct path *new_path
location of the new rootfs
Description
Check permission before pivoting the root filesystem.
Return
Returns 0 if permission is granted.
-
int security_move_mount(const struct path *from_path, const struct path *to_path)¶
Check permissions for moving a mount
Parameters
const struct path *from_path
source mount point
const struct path *to_path
destination mount point
Description
Check permission before a mount is moved.
Return
Returns 0 if permission is granted.
-
int security_path_notify(const struct path *path, u64 mask, unsigned int obj_type)¶
Check if setting a watch is allowed
Parameters
const struct path *path
file path
u64 mask
event mask
unsigned int obj_type
file path type
Description
Check permissions before setting a watch on events as defined by mask, on an object at path, whose type is defined by obj_type.
Return
Returns 0 if permission is granted.
Parameters
struct inode *inode
the inode
Description
Allocate and attach a security structure to inode->i_security. The i_security field is initialized to NULL when the inode structure is allocated.
Return
Return 0 if operation was successful.
Parameters
struct inode *inode
the inode
Description
Deallocate the inode security structure and set inode->i_security to NULL.
-
int security_inode_init_security_anon(struct inode *inode, const struct qstr *name, const struct inode *context_inode)¶
Initialize an anonymous inode
Parameters
struct inode *inode
the inode
const struct qstr *name
the anonymous inode class
const struct inode *context_inode
an optional related inode
Description
Set up the incore security field for the new anonymous inode and return whether the inode creation is permitted by the security module or not.
Return
Returns 0 on success, -EACCES if the security module denies the creation of this inode, or another -errno upon other errors.
-
int security_path_rmdir(const struct path *dir, struct dentry *dentry)¶
Check if removing a directory is allowed
Parameters
const struct path *dir
parent directory
struct dentry *dentry
directory to remove
Description
Check the permission to remove a directory.
Return
Returns 0 if permission is granted.
-
int security_path_symlink(const struct path *dir, struct dentry *dentry, const char *old_name)¶
Check if creating a symbolic link is allowed
Parameters
const struct path *dir
parent directory
struct dentry *dentry
symbolic link
const char *old_name
file pathname
Description
Check the permission to create a symbolic link to a file.
Return
Returns 0 if permission is granted.
-
int security_path_link(struct dentry *old_dentry, const struct path *new_dir, struct dentry *new_dentry)¶
Check if creating a hard link is allowed
Parameters
struct dentry *old_dentry
existing file
const struct path *new_dir
new parent directory
struct dentry *new_dentry
new link
Description
Check permission before creating a new hard link to a file.
Return
Returns 0 if permission is granted.
Parameters
const struct path *path
file
Description
Check permission before truncating the file indicated by path. Note that truncation permissions may also be checked based on already opened files, using the security_file_truncate() hook.
Return
Returns 0 if permission is granted.
-
int security_path_chmod(const struct path *path, umode_t mode)¶
Check if changing the file's mode is allowed
Parameters
const struct path *path
file
umode_t mode
new mode
Description
Check for permission to change a mode of the file path. The new mode is specified in mode which is a bitmask of constants from <include/uapi/linux/stat.h>.
Return
Returns 0 if permission is granted.
-
int security_path_chown(const struct path *path, kuid_t uid, kgid_t gid)¶
Check if changing the file's owner/group is allowed
Parameters
const struct path *path
file
kuid_t uid
file owner
kgid_t gid
file group
Description
Check for permission to change owner/group of a file or directory.
Return
Returns 0 if permission is granted.
Parameters
const struct path *path
directory
Description
Check for permission to change root directory.
Return
Returns 0 if permission is granted.
-
int security_inode_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry)¶
Check if creating a hard link is allowed
Parameters
struct dentry *old_dentry
existing file
struct inode *dir
new parent directory
struct dentry *new_dentry
new link
Description
Check permission before creating a new hard link to a file.
Return
Returns 0 if permission is granted.
-
int security_inode_unlink(struct inode *dir, struct dentry *dentry)¶
Check if removing a hard link is allowed
Parameters
struct inode *dir
parent directory
struct dentry *dentry
file
Description
Check the permission to remove a hard link to a file.
Return
Returns 0 if permission is granted.
-
int security_inode_symlink(struct inode *dir, struct dentry *dentry, const char *old_name)¶
Check if creating a symbolic link is allowed
Parameters
struct inode *dir
parent directory
struct dentry *dentry
symbolic link
const char *old_name
existing filename
Description
Check the permission to create a symbolic link to a file.
Return
Returns 0 if permission is granted.
-
int security_inode_rmdir(struct inode *dir, struct dentry *dentry)¶
Check if removing a directory is allowed
Parameters
struct inode *dir
parent directory
struct dentry *dentry
directory to be removed
Description
Check the permission to remove a directory.
Return
Returns 0 if permission is granted.
-
int security_inode_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev)¶
Check if creating a special file is allowed
Parameters
struct inode *dir
parent directory
struct dentry *dentry
new file
umode_t mode
new file mode
dev_t dev
device number
Description
Check permissions when creating a special file (or a socket or a fifo file created via the mknod system call). Note that if mknod operation is being done for a regular file, then the create hook will be called and not this hook.
Return
Returns 0 if permission is granted.
-
int security_inode_rename(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry, unsigned int flags)¶
Check if renaming a file is allowed
Parameters
struct inode *old_dir
parent directory of the old file
struct dentry *old_dentry
the old file
struct inode *new_dir
parent directory of the new file
struct dentry *new_dentry
the new file
unsigned int flags
flags
Description
Check for permission to rename a file or directory.
Return
Returns 0 if permission is granted.
Parameters
struct dentry *dentry
link
Description
Check the permission to read the symbolic link.
Return
Returns 0 if permission is granted.
-
int security_inode_follow_link(struct dentry *dentry, struct inode *inode, bool rcu)¶
Check if following a symbolic link is allowed
Parameters
struct dentry *dentry
link dentry
struct inode *inode
link inode
bool rcu
true if in RCU-walk mode
Description
Check permission to follow a symbolic link when looking up a pathname. If rcu is true, inode is not stable.
Return
Returns 0 if permission is granted.
-
int security_inode_permission(struct inode *inode, int mask)¶
Check if accessing an inode is allowed
Parameters
struct inode *inode
inode
int mask
access mask
Description
Check permission before accessing an inode. This hook is called by the existing Linux permission function, so a security module can use it to provide additional checking for existing Linux permission checks. Notice that this hook is called when a file is opened (as well as many other operations), whereas the file_security_ops permission hook is called when the actual read/write operations are performed.
Return
Returns 0 if permission is granted.
Parameters
const struct path *path
file
Description
Check permission before obtaining file attributes.
Return
Returns 0 if permission is granted.
-
int security_inode_setxattr(struct mnt_idmap *idmap, struct dentry *dentry, const char *name, const void *value, size_t size, int flags)¶
Check if setting file xattrs is allowed
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct dentry *dentry
file
const char *name
xattr name
const void *value
xattr value
size_t size
size of xattr value
int flags
flags
Description
Check permission before setting the extended attributes.
Return
Returns 0 if permission is granted.
-
int security_inode_set_acl(struct mnt_idmap *idmap, struct dentry *dentry, const char *acl_name, struct posix_acl *kacl)¶
Check if setting posix acls is allowed
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct dentry *dentry
file
const char *acl_name
acl name
struct posix_acl *kacl
acl struct
Description
Check permission before setting posix acls. The posix acls in kacl are identified by acl_name.
Return
Returns 0 if permission is granted.
-
int security_inode_get_acl(struct mnt_idmap *idmap, struct dentry *dentry, const char *acl_name)¶
Check if reading posix acls is allowed
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct dentry *dentry
file
const char *acl_name
acl name
Description
Check permission before getting posix acls. The posix acls are identified by acl_name.
Return
Returns 0 if permission is granted.
-
int security_inode_remove_acl(struct mnt_idmap *idmap, struct dentry *dentry, const char *acl_name)¶
Check if removing a posix acl is allowed
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct dentry *dentry
file
const char *acl_name
acl name
Description
Check permission before removing posix acls. The posix acls are identified by acl_name.
Return
Returns 0 if permission is granted.
-
void security_inode_post_setxattr(struct dentry *dentry, const char *name, const void *value, size_t size, int flags)¶
Update the inode after a setxattr operation
Parameters
struct dentry *dentry
file
const char *name
xattr name
const void *value
xattr value
size_t size
xattr value size
int flags
flags
Description
Update inode security field after successful setxattr operation.
-
int security_inode_getxattr(struct dentry *dentry, const char *name)¶
Check if xattr access is allowed
Parameters
struct dentry *dentry
file
const char *name
xattr name
Description
Check permission before obtaining the extended attributes identified by name for dentry.
Return
Returns 0 if permission is granted.
Parameters
struct dentry *dentry
file
Description
Check permission before obtaining the list of extended attribute names for dentry.
Return
Returns 0 if permission is granted.
-
int security_inode_removexattr(struct mnt_idmap *idmap, struct dentry *dentry, const char *name)¶
Check if removing an xattr is allowed
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct dentry *dentry
file
const char *name
xattr name
Description
Check permission before removing the extended attribute identified by name for dentry.
Return
Returns 0 if permission is granted.
-
int security_inode_need_killpriv(struct dentry *dentry)¶
Check if security_inode_killpriv() is required
Parameters
struct dentry *dentry
associated dentry
Description
Called when an inode has been changed to determine if security_inode_killpriv() should be called.
Return
Returns <0 on error to abort the inode change operation, returns 0 if security_inode_killpriv() does not need to be called, and returns >0 if security_inode_killpriv() does need to be called.
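The three-way return value drives a simple caller-side flow, sketched below in user space. All names here (struct demo_dentry and the demo_* functions) are hypothetical mock-ups; the sketch only shows how a negative, zero, or positive result would steer the caller, not how the kernel implements it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry carrying a privilege-granting label. */
struct demo_dentry {
	int lookup_fails;    /* 1 => the "need" check itself errors out */
	int has_priv_label;  /* 1 => a label must be stripped */
	int killpriv_calls;  /* how many times the strip step ran */
};

static int demo_need_killpriv(const struct demo_dentry *d)
{
	if (d->lookup_fails)
		return -EIO;              /* < 0: abort the inode change */
	return d->has_priv_label ? 1 : 0; /* > 0: strip needed, 0: nothing to do */
}

static int demo_killpriv(struct demo_dentry *d)
{
	d->has_priv_label = 0;
	d->killpriv_calls++;
	return 0;
}

static int demo_change_inode(struct demo_dentry *d)
{
	int ret;

	assert(d != NULL);
	ret = demo_need_killpriv(d);
	if (ret < 0)
		return ret;               /* error: do not change the inode */
	if (ret > 0) {
		ret = demo_killpriv(d);
		if (ret)
			return ret;       /* strip failed: the operation fails too */
	}
	return 0; /* the inode change itself would proceed here */
}
```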
-
int security_inode_killpriv(struct mnt_idmap *idmap, struct dentry *dentry)¶
The setuid bit is removed, update LSM state
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct dentry *dentry
associated dentry
Description
The dentry's setuid bit is being removed. Remove similar security labels. Called with the dentry->d_inode->i_mutex held.
Return
Returns 0 on success. If an error is returned, the operation causing the setuid bit removal fails.
-
int security_inode_getsecurity(struct mnt_idmap *idmap, struct inode *inode, const char *name, void **buffer, bool alloc)¶
Get the xattr security label of an inode
Parameters
struct mnt_idmap *idmap
idmap of the mount
struct inode *inode
inode
const char *name
xattr name
void **buffer
security label buffer
bool alloc
allocation flag
Description
Retrieve a copy of the extended attribute representation of the security label associated with name for inode via buffer. Note that name is the remainder of the attribute name after the security prefix has been removed. alloc is used to specify if the call should return a value via the buffer or just the value length.
Return
Returns size of buffer on success.
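The role of the alloc flag can be sketched with a user-space mock. struct demo_inode and demo_getsecurity() are hypothetical, the use of malloc() stands in for whatever allocator a real implementation would use, and counting the trailing NUL in the returned size is an assumption of this sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical label store for a single inode; not a kernel structure. */
struct demo_inode {
	const char *label; /* NUL-terminated label, or NULL if none */
};

/*
 * Hypothetical getter mirroring the documented contract: return the
 * label size on success, and copy the label into a freshly allocated
 * *buffer only when alloc is true.
 */
static int demo_getsecurity(const struct demo_inode *inode,
			    void **buffer, bool alloc)
{
	size_t len;

	assert(inode != NULL);
	if (inode->label == NULL)
		return -ENODATA;
	len = strlen(inode->label) + 1; /* sketch counts the trailing NUL */
	if (alloc) {
		assert(buffer != NULL);
		*buffer = malloc(len);
		if (*buffer == NULL)
			return -ENOMEM;
		memcpy(*buffer, inode->label, len);
	}
	return (int)len;
}
```

With alloc set to false the caller learns only the length, which is useful for sizing decisions; with alloc set to true the caller owns the returned buffer and is responsible for freeing it.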
-
int security_inode_setsecurity(struct inode *inode, const char *name, const void *value, size_t size, int flags)¶
Set the xattr security label of an inode
Parameters
struct inode *inode
inode
const char *name
xattr name
const void *value
security label
size_t size
length of security label
int flags
flags
Description
Set the security label associated with name for inode from the extended attribute value value. size indicates the size of the value in bytes. flags may be XATTR_CREATE, XATTR_REPLACE, or 0. Note that name is the remainder of the attribute name after the security. prefix has been removed.
Return
Returns 0 on success.
Parameters
struct inode *inode
inode
u32 *secid
secid to return
Description
Get the secid associated with the inode. In case of failure, secid will be set to zero.
-
int security_kernfs_init_security(struct kernfs_node *kn_dir, struct kernfs_node *kn)¶
Init LSM context for a kernfs node
Parameters
struct kernfs_node *kn_dir
parent kernfs node
struct kernfs_node *kn
the kernfs node to initialize
Description
Initialize the security context of a newly created kernfs node based on its own and its parent's attributes.
Return
Returns 0 if permission is granted.
Parameters
struct file *file
file
int mask
requested permissions
Description
Check file permissions before accessing an open file. This hook is called by various operations that read or write files. A security module can use this hook to perform additional checking on these operations, e.g. to revalidate permissions on use to support privilege bracketing or policy changes. Notice that this hook is used when the actual read/write operations are performed, whereas the inode_security_ops hook is called when a file is opened (as well as many other operations). Although this hook can be used to revalidate permissions for various system call operations that read or write files, it does not address the revalidation of permissions for memory-mapped files. Security modules must handle this separately if they need such revalidation.
Return
Returns 0 if permission is granted.
Parameters
struct file *file
the file
Description
Allocate and attach a security structure to the file->f_security field. The security field is initialized to NULL when the structure is first created.
Return
Return 0 if the hook is successful and permission is granted.
Parameters
struct file *file
the file
Description
Deallocate and free any security structures stored in file->f_security.
-
int security_mmap_file(struct file *file, unsigned long prot, unsigned long flags)¶
Check if mmap'ing a file is allowed
Parameters
struct file *file
file
unsigned long prot
protection applied by the kernel
unsigned long flags
flags
Description
Check permissions for a mmap operation. The file may be NULL, e.g. if mapping anonymous memory.
Return
Returns 0 if permission is granted.
-
int security_mmap_addr(unsigned long addr)¶
Check if mmap'ing an address is allowed
Parameters
unsigned long addr
address
Description
Check permissions for a mmap operation at addr.
Return
Returns 0 if permission is granted.
-
int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot, unsigned long prot)¶
Check if changing memory protections is allowed
Parameters
struct vm_area_struct *vma
memory region
unsigned long reqprot
application requested protection
unsigned long prot
protection applied by the kernel
Description
Check permissions before changing memory access permissions.
Return
Returns 0 if permission is granted.
Parameters
struct file *file
file
unsigned int cmd
lock operation (e.g. F_RDLCK, F_WRLCK)
Description
Check permission before performing file locking operations. Note the hook mediates both flock and fcntl style locks.
Return
Returns 0 if permission is granted.
-
int security_file_fcntl(struct file *file, unsigned int cmd, unsigned long arg)¶
Check if fcntl() op is allowed
Parameters
struct file *file
file
unsigned int cmd
fcntl command
unsigned long arg
command argument
Description
Check permission before allowing the file operation specified by cmd to be performed on the file file. Note that arg sometimes represents a user space pointer; in other cases, it may be a simple integer value. When arg represents a user space pointer, it should never be used by the security module.
Return
Returns 0 if permission is granted.
Parameters
struct file *file
the file
Description
Save owner security information (typically from current->security) in file->f_security for later use by the send_sigiotask hook.
Return
Returns 0 on success.
-
int security_file_send_sigiotask(struct task_struct *tsk, struct fown_struct *fown, int sig)¶
Check if sending SIGIO/SIGURG is allowed
Parameters
struct task_struct *tsk
target task
struct fown_struct *fown
signal sender
int sig
signal to be sent, SIGIO is sent if 0
Description
Check permission for the file owner fown to send SIGIO or SIGURG to the process tsk. Note that this hook is sometimes called from interrupt. Note that the fown_struct, fown, is never outside the context of a struct file, so the file structure (and associated security information) can always be obtained: container_of(fown, struct file, f_owner).
Return
Returns 0 if permission is granted.
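The container_of() step mentioned in the description recovers the enclosing structure from a pointer to one of its embedded members. A user-space sketch of the idea follows; demo_container_of, struct demo_fown, and struct demo_file are simplified stand-ins invented for this example, and it assumes the owner structure is embedded directly in the file structure, as the description implies.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the container_of() idea: step back from a
 * member's address by that member's offset within the outer type. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct demo_fown {
	int signum;
};

struct demo_file {
	void *f_security;
	struct demo_fown f_owner; /* embedded, so its address lies inside demo_file */
};

static struct demo_file *demo_file_from_fown(struct demo_fown *fown)
{
	assert(fown != NULL);
	return demo_container_of(fown, struct demo_file, f_owner);
}
```

Once the enclosing file is recovered, its associated security information can be read from the f_security field, which is the point the description makes.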
Parameters
struct file *file
file being received
Description
This hook allows security modules to control the ability of a process to receive an open file descriptor via socket IPC.
Return
Returns 0 if permission is granted.
Parameters
struct file *file
Description
Save open-time permission checking state for later use upon file_permission, and recheck access if anything has changed since inode_permission.
Return
Returns 0 if permission is granted.
Parameters
struct file *file
file
Description
Check permission before truncating a file, i.e. using ftruncate. Note that truncation permission may also be checked based on the path, using the path_truncate hook.
Return
Returns 0 if permission is granted.
-
int security_task_alloc(struct task_struct *task, unsigned long clone_flags)¶
Allocate a task's LSM blob
Parameters
struct task_struct *task
the task
unsigned long clone_flags
flags indicating what is being shared
Description
Handle allocation of task-related resources.
Return
Returns 0 on success, negative values on failure.
-
void security_task_free(struct task_struct *task)¶
Free a task's LSM blob and related resources
Parameters
struct task_struct *task
task
Description
Handle release of task-related resources. Note that this can be called from interrupt context.
-
int security_cred_alloc_blank(struct cred *cred, gfp_t gfp)¶
Allocate the min memory to allow cred_transfer
Parameters
struct cred *cred
credentials
gfp_t gfp
gfp flags
Description
Only allocate sufficient memory and attach to cred such that cred_transfer() will not get ENOMEM.
Return
Returns 0 on success, negative values on failure.
Parameters
struct cred *cred
credentials
Description
Deallocate and clear the cred->security field in a set of credentials.
-
int security_prepare_creds(struct cred *new, const struct cred *old, gfp_t gfp)¶
Prepare a new set of credentials
Parameters
struct cred *new
new credentials
const struct cred *old
original credentials
gfp_t gfp
gfp flags
Description
Prepare a new set of credentials by copying the data from the old set.
Return
Returns 0 on success, negative values on failure.
-
void security_transfer_creds(struct cred *new, const struct cred *old)¶
Transfer creds
Parameters
struct cred *new
target credentials
const struct cred *old
original credentials
Description
Transfer data from original creds to new creds.
-
int security_kernel_act_as(struct cred *new, u32 secid)¶
Set the kernel credentials to act as secid
Parameters
struct cred *new
credentials
u32 secid
secid
Description
Set the credentials for a kernel service to act as (subjective context). The current task must be the one that nominated secid.
Return
Returns 0 if successful.
-
int security_kernel_create_files_as(struct cred *new, struct inode *inode)¶
Set file creation context using an inode
Parameters
struct cred *new
target credentials
struct inode *inode
reference inode
Description
Set the file creation context in a set of credentials to be the same as the objective context of the specified inode. The current task must be the one that nominated inode.
Return
Returns 0 if successful.
-
int security_kernel_module_request(char *kmod_name)¶
Check if loading a module is allowed
Parameters
char *kmod_name
module name
Description
Check whether the current task may trigger the kernel to automatically upcall to userspace so that userspace loads a kernel module with the given name.
Return
Returns 0 if successful.
-
int security_task_fix_setuid(struct cred *new, const struct cred *old, int flags)¶
Update LSM with new user id attributes
Parameters
struct cred *new
updated credentials
const struct cred *old
credentials being replaced
int flags
LSM_SETID_* flag values
Description
Update the module's state after setting one or more of the user identity attributes of the current process. The flags parameter indicates which of the set*uid system calls invoked this hook. new is the set of credentials that will be installed. Modifications should be made to this rather than to current->cred.
Return
Returns 0 on success.
-
int security_task_fix_setgid(struct cred *new, const struct cred *old, int flags)¶
Update LSM with new group id attributes
Parameters
struct cred *new
updated credentials
const struct cred *old
credentials being replaced
int flags
LSM_SETID_* flag value
Description
Update the module's state after setting one or more of the group identity attributes of the current process. The flags parameter indicates which of the set*gid system calls invoked this hook. new is the set of credentials that will be installed. Modifications should be made to this rather than to current->cred.
Return
Returns 0 on success.
-
int security_task_fix_setgroups(struct cred *new, const struct cred *old)¶
Update LSM with new supplementary groups
Parameters
struct cred *new
updated credentials
const struct cred *old
credentials being replaced
Description
Update the module's state after setting the supplementary group identity attributes of the current process. new is the set of credentials that will be installed. Modifications should be made to this rather than to current->cred.
Return
Returns 0 on success.
-
int security_task_setpgid(struct task_struct *p, pid_t pgid)¶
Check if setting the pgid is allowed
Parameters
struct task_struct *p
task being modified
pid_t pgid
new pgid
Description
Check permission before setting the process group identifier of the process p to pgid.
Return
Returns 0 if permission is granted.
-
int security_task_getpgid(struct task_struct *p)¶
Check if getting the pgid is allowed
Parameters
struct task_struct *p
task
Description
Check permission before getting the process group identifier of the process p.
Return
Returns 0 if permission is granted.
-
int security_task_getsid(struct task_struct *p)¶
Check if getting the session id is allowed
Parameters
struct task_struct *p
task
Description
Check permission before getting the session identifier of the process p.
Return
Returns 0 if permission is granted.
-
int security_task_setnice(struct task_struct *p, int nice)¶
Check if setting a task's nice value is allowed
Parameters
struct task_struct *p
target task
int nice
nice value
Description
Check permission before setting the nice value of p to nice.
Return
Returns 0 if permission is granted.
-
int security_task_setioprio(struct task_struct *p, int ioprio)¶
Check if setting a task's ioprio is allowed
Parameters
struct task_struct *p
target task
int ioprio
ioprio value
Description
Check permission before setting the ioprio value of p to ioprio.
Return
Returns 0 if permission is granted.
-
int security_task_getioprio(struct task_struct *p)¶
Check if getting a task's ioprio is allowed
Parameters
struct task_struct *p
task
Description
Check permission before getting the ioprio value of p.
Return
Returns 0 if permission is granted.
-
int security_task_prlimit(const struct cred *cred, const struct cred *tcred, unsigned int flags)¶
Check if getting/setting resource limits is allowed
Parameters
const struct cred *cred
current task credentials
const struct cred *tcred
target task credentials
unsigned int flags
LSM_PRLIMIT_* flag bits indicating a get/set/both
Description
Check permission before getting and/or setting the resource limits of another task.
Return
Returns 0 if permission is granted.
-
int security_task_setrlimit(struct task_struct *p, unsigned int resource, struct rlimit *new_rlim)¶
Check if setting a new rlimit value is allowed
Parameters
struct task_struct *p
target task's group leader
unsigned int resource
resource whose limit is being set
struct rlimit *new_rlim
new resource limit
Description
Check permission before setting the resource limits of process p for resource to new_rlim. The old resource limit values can be examined by dereferencing (p->signal->rlim + resource).
Return
Returns 0 if permission is granted.
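The pointer arithmetic mentioned above (p->signal->rlim + resource) can be sketched in user space as follows. The mock_* types, MOCK_NLIMITS, and the policy of refusing to raise the hard limit are assumptions made for this example only; they are not taken from any particular security module.

```c
#include <errno.h>

#define MOCK_NLIMITS 16u   /* hypothetical number of resources */

/* Hypothetical stand-ins for struct rlimit, signal_struct and task_struct. */
struct mock_rlimit {
    unsigned long rlim_cur;
    unsigned long rlim_max;
};

struct mock_signal {
    struct mock_rlimit rlim[MOCK_NLIMITS];
};

struct mock_task {
    struct mock_signal *signal;
};

/* Mock hook: compare the requested limit with the one currently in force. */
int mock_task_setrlimit(struct mock_task *p, unsigned int resource,
                        const struct mock_rlimit *new_rlim)
{
    const struct mock_rlimit *old_rlim;

    if (resource >= MOCK_NLIMITS)
        return -EINVAL;

    /* Same lookup as described above: base of the array plus the index. */
    old_rlim = p->signal->rlim + resource;

    /* Assumed policy: raising the hard limit is not permitted. */
    if (new_rlim->rlim_max > old_rlim->rlim_max)
        return -EPERM;

    return 0;
}
```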
-
int security_task_setscheduler(struct task_struct *p)¶
Check if setting sched policy/param is allowed
Parameters
struct task_struct *p
target task
Description
Check permission before setting scheduling policy and/or parameters of process p.
Return
Returns 0 if permission is granted.
-
int security_task_getscheduler(struct task_struct *p)¶
Check if getting scheduling info is allowed
Parameters
struct task_struct *p
target task
Description
Check permission before obtaining scheduling information for process p.
Return
Returns 0 if permission is granted.
-
int security_task_movememory(struct task_struct *p)¶
Check if moving memory is allowed
Parameters
struct task_struct *p
task
Description
Check permission before moving memory owned by process p.
Return
Returns 0 if permission is granted.
-
int security_task_kill(struct task_struct *p, struct kernel_siginfo *info, int sig, const struct cred *cred)¶
Check if sending a signal is allowed
Parameters
struct task_struct *p
target process
struct kernel_siginfo *info
signal information
int sig
signal value
const struct cred *cred
credentials of the signal sender, NULL if current
Description
Check permission before sending signal sig to p. info can be NULL, the constant 1, or a pointer to a kernel_siginfo structure. If info is 1 or SI_FROMKERNEL(info) is true, then the signal should be viewed as coming from the kernel and should typically be permitted. SIGIO signals are handled separately by the send_sigiotask hook in file_security_ops.
Return
Returns 0 if permission is granted.
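The three forms of info described above can be told apart before any dereference. The following user-space sketch assumes that SI_FROMKERNEL(info) amounts to a positive si_code; struct mock_siginfo, MOCK_INFO_FROM_KERNEL, and both functions are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kernel_siginfo; only si_code is modeled. */
struct mock_siginfo {
    int si_code;
};

/* Stand-in for the "constant 1" sentinel value of info. */
#define MOCK_INFO_FROM_KERNEL ((const struct mock_siginfo *)(uintptr_t)1)

/* Assumption: SI_FROMKERNEL(info) is modeled as a positive si_code. */
bool mock_signal_from_kernel(const struct mock_siginfo *info)
{
    if (info == NULL)
        return false;   /* no info: not treated as kernel-originated here */
    if (info == MOCK_INFO_FROM_KERNEL)
        return true;    /* sentinel value: must never be dereferenced */
    return info->si_code > 0;
}

/* Returns 0 to grant, -1 to deny (placeholder for a negative errno). */
int mock_task_kill(const struct mock_siginfo *info, bool policy_allows)
{
    /* Kernel-originated signals are typically permitted. */
    if (mock_signal_from_kernel(info))
        return 0;
    return policy_allows ? 0 : -1;
}
```

Checking the two sentinel values first matters because neither of them points at a real structure.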
-
int security_task_prctl(int option, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5)¶
Check if a prctl op is allowed
Parameters
int option
operation
unsigned long arg2
argument
unsigned long arg3
argument
unsigned long arg4
argument
unsigned long arg5
argument
Description
Check permission before performing a process control operation on the current process.
Return
Returns -ENOSYS if no-one wanted to handle this op; any other value causes prctl() to return immediately with that value.
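The return convention can be modeled as a chain of handlers where -ENOSYS means "not mine". The sketch below is a deliberately simplified user-space model, not the kernel's dispatch code: the hook type, MOCK_PR_EXAMPLE, and the rule that the first handler to answer wins are all assumptions made for this example.

```c
#include <errno.h>
#include <stddef.h>

#define MOCK_PR_EXAMPLE 1000   /* hypothetical prctl option number */

/* Hypothetical hook type; the real hook takes five arguments. */
typedef int (*mock_prctl_hook)(int option, unsigned long arg2);

/* A module that does not recognize any option. */
int mock_hook_ignores(int option, unsigned long arg2)
{
    (void)option;
    (void)arg2;
    return -ENOSYS;
}

/* A module that handles exactly one option. */
int mock_hook_handles_example(int option, unsigned long arg2)
{
    if (option != MOCK_PR_EXAMPLE)
        return -ENOSYS;
    return arg2 == 0 ? 0 : -EINVAL;
}

/* Simplified dispatcher: the first answer other than -ENOSYS is returned. */
int mock_security_task_prctl(const mock_prctl_hook *hooks, size_t n,
                             int option, unsigned long arg2)
{
    for (size_t i = 0; i < n; i++) {
        int rc = hooks[i](option, arg2);

        if (rc != -ENOSYS)
            return rc;
    }
    return -ENOSYS;   /* nobody handled it; the caller carries on */
}
```

A caller modeled on prctl() would return immediately on any result other than -ENOSYS and fall through to its own option handling otherwise.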
-
void security_task_to_inode(struct task_struct *p, struct inode *inode)¶
Set the security attributes of a task's inode
Parameters
struct task_struct *p
task
struct inode *inode
inode
Description
Set the security attributes for an inode based on an associated task's security attributes, e.g. for /proc/pid inodes.
Parameters
const struct cred *cred
prepared creds
Description
Check permission prior to creating a new user namespace.
Return
Returns 0 if successful, otherwise < 0 error code.
-
int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag)¶
Check if sysv ipc access is allowed
Parameters
struct kern_ipc_perm *ipcp
ipc permission structure
short flag
requested permissions
Description
Check permissions for access to IPC.
Return
Returns 0 if permission is granted.
-
void security_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid)¶
Get the sysv ipc object's secid
Parameters
struct kern_ipc_perm *ipcp
ipc permission structure
u32 *secid
secid pointer
Description
Get the secid associated with the ipc object. In case of failure, secid will be set to zero.
-
int security_msg_msg_alloc(struct msg_msg *msg)¶
Allocate a sysv ipc message LSM blob
Parameters
struct msg_msg *msg
message structure
Description
Allocate and attach a security structure to the msg->security field. The security field is initialized to NULL when the structure is first created.
Return
Returns 0 if operation was successful and permission is granted.
-
void security_msg_msg_free(struct msg_msg *msg)¶
Free a sysv ipc message LSM blob
Parameters
struct msg_msg *msg
message structure
Description
Deallocate the security structure for this message.
-
int security_msg_queue_alloc(struct kern_ipc_perm *msq)¶
Allocate a sysv ipc msg queue LSM blob
Parameters
struct kern_ipc_perm *msq
sysv ipc permission structure
Description
Allocate and attach a security structure to msq. The security field is initialized to NULL when the structure is first created.
Return
Returns 0 if operation was successful and permission is granted.
-
void security_msg_queue_free(struct kern_ipc_perm *msq)¶
Free a sysv ipc msg queue LSM blob
Parameters
struct kern_ipc_perm *msq
sysv ipc permission structure
Description
Deallocate security field msq->security for the message queue.
-
int security_msg_queue_associate(struct kern_ipc_perm *msq, int msqflg)¶
Check if a msg queue operation is allowed
Parameters
struct kern_ipc_perm *msq
sysv ipc permission structure
int msqflg
operation flags
Description
Check permission when a message queue is requested through the msgget system call. This hook is only called when returning the message queue identifier for an existing message queue, not when a new message queue is created.
Return
Returns 0 if permission is granted.
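The split between the alloc hook (new queue) and the associate hook (existing queue) can be pictured with a small user-space model of a get-or-create lookup. All names here are hypothetical, the table is a fixed-size array, and flags such as IPC_CREAT or IPC_EXCL are ignored for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_QUEUES 4

/* Hypothetical queue record with counters so the hook calls are observable. */
struct mock_queue {
    int key;
    bool in_use;
    int alloc_calls;
    int associate_calls;
};

struct mock_queue mock_queues[MOCK_MAX_QUEUES];

/* Mock alloc hook: runs once, when the queue is created. */
int mock_msg_queue_alloc(struct mock_queue *q)
{
    q->alloc_calls++;
    return 0;
}

/* Mock associate hook: runs when an existing queue is looked up. */
int mock_msg_queue_associate(struct mock_queue *q, int msqflg)
{
    (void)msqflg;
    q->associate_calls++;
    return 0;
}

/* Simplified msgget model: find the key, or create it in a free slot. */
struct mock_queue *mock_msgget(int key, int msqflg)
{
    struct mock_queue *free_slot = NULL;

    for (size_t i = 0; i < MOCK_MAX_QUEUES; i++) {
        struct mock_queue *q = &mock_queues[i];

        if (q->in_use && q->key == key) {
            /* Existing queue: only the associate hook is consulted. */
            return mock_msg_queue_associate(q, msqflg) == 0 ? q : NULL;
        }
        if (!q->in_use && free_slot == NULL)
            free_slot = q;
    }

    if (free_slot == NULL)
        return NULL;

    /* New queue: only the alloc hook is consulted. */
    free_slot->key = key;
    free_slot->alloc_calls = 0;
    free_slot->associate_calls = 0;
    if (mock_msg_queue_alloc(free_slot) != 0)
        return NULL;
    free_slot->in_use = true;
    return free_slot;
}
```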
-
int security_msg_queue_msgctl(struct kern_ipc_perm *msq, int cmd)¶
Check if a msg queue operation is allowed
Parameters
struct kern_ipc_perm *msq
sysv ipc permission structure
int cmd
operation
Description
Check permission when a message control operation specified by cmd is to be performed on the message queue with permissions in msq.
Return
Returns 0 if permission is granted.
-
int security_msg_queue_msgsnd(struct kern_ipc_perm *msq, struct msg_msg *msg, int msqflg)¶
Check if sending a sysv ipc message is allowed
Parameters
struct kern_ipc_perm *msq
sysv ipc permission structure
struct msg_msg *msg
message
int msqflg
operation flags
Description
Check permission before a message, msg, is enqueued on the message queue with permissions specified in msq.
Return
Returns 0 if permission is granted.
-
int security_msg_queue_msgrcv(struct kern_ipc_perm *msq, struct msg_msg *msg, struct task_struct *target, long type, int mode)¶
Check if receiving a sysv ipc msg is allowed
Parameters
struct kern_ipc_perm *msq
sysv ipc permission structure
struct msg_msg *msg
message
struct task_struct *target
target task
long type
type of message requested
int mode
operation flags
Description
Check permission before a message, msg, is removed from the message queue. The target task structure contains a pointer to the process that will be receiving the message (not equal to the current process when inline receives are being performed).
Return
Returns 0 if permission is granted.
-
int security_shm_alloc(struct kern_ipc_perm *shp)¶
Allocate a sysv shm LSM blob
Parameters
struct kern_ipc_perm *shp
sysv ipc permission structure
Description
Allocate and attach a security structure to the shp security field. The security field is initialized to NULL when the structure is first created.
Return
Returns 0 if operation was successful and permission is granted.
-
void security_shm_free(struct kern_ipc_perm *shp)¶
Free a sysv shm LSM blob
Parameters
struct kern_ipc_perm *shp
sysv ipc permission structure
Description
Deallocate the security structure shp->security for the memory segment.
-
int security_shm_associate(struct kern_ipc_perm *shp, int shmflg)¶
Check if a sysv shm operation is allowed
Parameters
struct kern_ipc_perm *shp
sysv ipc permission structure
int shmflg
operation flags
Description
Check permission when a shared memory region is requested through the shmget system call. This hook is only called when returning the shared memory region identifier for an existing region, not when a new shared memory region is created.
Return
Returns 0 if permission is granted.
-
int security_shm_shmctl(struct kern_ipc_perm *shp, int cmd)¶
Check if a sysv shm operation is allowed
Parameters
struct kern_ipc_perm *shp
sysv ipc permission structure
int cmd
operation
Description
Check permission when a shared memory control operation specified by cmd is to be performed on the shared memory region with permissions in shp.
Return
Returns 0 if permission is granted.
-
int security_shm_shmat(struct kern_ipc_perm *shp, char __user *shmaddr, int shmflg)¶
Check if a sysv shm attach operation is allowed
Parameters
struct kern_ipc_perm *shp
sysv ipc permission structure
char __user *shmaddr
address of memory region to attach
int shmflg
operation flags
Description
Check permissions prior to allowing the shmat system call to attach the shared memory segment with permissions shp to the data segment of the calling process. The attaching address is specified by shmaddr.
Return
Returns 0 if permission is granted.
-
int security_sem_alloc(struct kern_ipc_perm *sma)¶
Allocate a sysv semaphore LSM blob
Parameters
struct kern_ipc_perm *sma
sysv ipc permission structure
Description
Allocate and attach a security structure to the sma security field. The security field is initialized to NULL when the structure is first created.
Return
Returns 0 if operation was successful and permission is granted.
-
void security_sem_free(struct kern_ipc_perm *sma)¶
Free a sysv semaphore LSM blob
Parameters
struct kern_ipc_perm *sma
sysv ipc permission structure
Description
Deallocate security structure sma->security for the semaphore.
-
int security_sem_associate(struct kern_ipc_perm *sma, int semflg)¶
Check if a sysv semaphore operation is allowed
Parameters
struct kern_ipc_perm *sma
sysv ipc permission structure
int semflg
operation flags
Description
Check permission when a semaphore is requested through the semget system call. This hook is only called when returning the semaphore identifier for an existing semaphore, not when a new one must be created.
Return
Returns 0 if permission is granted.
-
int security_sem_semctl(struct kern_ipc_perm *sma, int cmd)¶
Check if a sysv semaphore operation is allowed
Parameters
struct kern_ipc_perm *sma
sysv ipc permission structure
int cmd
operation
Description
Check permission when a semaphore operation specified by cmd is to be performed on the semaphore.
Return
Returns 0 if permission is granted.
-
int security_sem_semop(struct kern_ipc_perm *sma, struct sembuf *sops, unsigned nsops, int alter)¶
Check if a sysv semaphore operation is allowed
Parameters
struct kern_ipc_perm *sma
sysv ipc permission structure
struct sembuf *sops
operations to perform
unsigned nsops
number of operations
int alter
flag indicating changes will be made
Description
Check permissions before performing operations on members of the semaphore set. If the alter flag is nonzero, the semaphore set may be modified.
Return
Returns 0 if permission is granted.
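One way a module might use the alter flag is to demand write access only when the semaphore set may be modified. The sketch below is a user-space model under that assumption; the permission bits and mock_sem_semop() are hypothetical and do not correspond to any real module's constants.

```c
#include <errno.h>

/* Hypothetical permission bits held by the caller on the semaphore set. */
#define MOCK_PERM_READ  0x1u
#define MOCK_PERM_WRITE 0x2u

/* Mock hook: read is always required, write only when alter is nonzero. */
int mock_sem_semop(unsigned int granted, int alter)
{
    unsigned int required = MOCK_PERM_READ;

    if (alter)
        required |= MOCK_PERM_WRITE;

    return (granted & required) == required ? 0 : -EACCES;
}
```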
-
int security_getprocattr(struct task_struct *p, const char *lsm, const char *name, char **value)¶
Read an attribute for a task
Parameters
struct task_struct *p
the task
const char *lsm
LSM name
const char *name
attribute name
char **value
attribute value
Description
Read attribute name for task p and store it into value if allowed.
Return
Returns the length of value on success, a negative value otherwise.
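The "length on success, negative value otherwise" convention, with the result handed back through a char ** parameter, can be sketched in user space like this. The attribute name "current", the label "unconfined", and the choice to report the length without the terminating NUL are assumptions of this example; the caller is assumed to own and free the returned buffer.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Mock attribute read: allocates *value and returns its length. */
int mock_getprocattr(const char *name, char **value)
{
    static const char label[] = "unconfined";   /* hypothetical label */
    size_t len = sizeof(label) - 1;             /* length without the NUL */

    /* Only one attribute name is recognized in this sketch. */
    if (strcmp(name, "current") != 0)
        return -EINVAL;

    *value = malloc(len + 1);
    if (*value == NULL)
        return -ENOMEM;

    memcpy(*value, label, len + 1);             /* copy including the NUL */
    return (int)len;
}
```

On failure the sketch leaves *value untouched, so a caller should test the return value before using the pointer.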
-
int security_setprocattr(const char *lsm, const char *name, void *value, size_t size)¶
Set an attribute for a task
Parameters
const char *lsm
LSM name
const char *name
attribute name
void *value
attribute value
size_t size
attribute value size
Description
Write (set) the current task's attribute name to value, of size size, if allowed.
Return
Returns bytes written on success, a negative value otherwise.
-
int security_netlink_send(struct sock *sk, struct sk_buff *skb)¶
Save info and check if netlink sending is allowed
Parameters
struct sock *sk
sending socket
struct sk_buff *skb
netlink message
Description
Save security information for a netlink message so that permission checking can be performed when the message is processed. The security information can be saved using the eff_cap field of the netlink_skb_parms structure. Also may be used to provide fine grained control over message transmission.
Return
Returns 0 if the information was successfully saved and message is allowed to be transmitted.
-
int security_post_notification(const struct cred *w_cred, const struct cred *cred, struct watch_notification *n)¶
Check if a watch notification can be posted
Parameters
const struct cred *w_cred
credentials of the task that set the watch
const struct cred *cred
credentials of the task which triggered the watch
struct watch_notification *n
the notification
Description
Check to see if a watch notification can be posted to a particular queue.
Return
Returns 0 if permission is granted.
Parameters
struct key *key
the key to watch
Description
Check to see if a process is allowed to watch for event notifications from a key or keyring.
Return
Returns 0 if permission is granted.
-
int security_socket_create(int family, int type, int protocol, int kern)¶
Check if creating a new socket is allowed
Parameters
int family
protocol family
int type
communications type
int protocol
requested protocol
int kern
set to 1 if a kernel socket is requested
Description
Check permissions prior to creating a new socket.
Return
Returns 0 if permission is granted.
-
int security_socket_post_create(struct socket *sock, int family, int type, int protocol, int kern)¶
Initialize a newly created socket
Parameters
struct socket *sock
socket
int family
protocol family
int type
communications type
int protocol
requested protocol
int kern
set to 1 if a kernel socket is requested
Description
This hook allows a module to update or allocate a per-socket security structure. Note that the security field was not added directly to the socket structure, but rather, the socket security information is stored in the associated inode. Typically, the inode alloc_security hook will allocate and attach security information to SOCK_INODE(sock)->i_security. This hook may be used to update the SOCK_INODE(sock)->i_security field with additional information that wasn't available when the inode was allocated.
Return
Returns 0 if permission is granted.
-
int security_socket_bind(struct socket *sock, struct sockaddr *address, int addrlen)¶
Check if a socket bind operation is allowed
Parameters
struct socket *sock
socket
struct sockaddr *address
requested bind address
int addrlen
length of address
Description
Check permission before socket protocol layer bind operation is performed and the socket sock is bound to the address specified in the address parameter.
Return
Returns 0 if permission is granted.
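A bind check usually has to validate addrlen before it interprets address. The following user-space sketch shows that ordering for IPv4 using the standard sockaddr types; mock_socket_bind(), its privileged parameter, and the policy of reserving ports below 1024 are assumptions chosen for this example.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Mock hook: returns 0 to grant, a negative errno to deny. */
int mock_socket_bind(const struct sockaddr *address, int addrlen,
                     bool privileged)
{
    const struct sockaddr_in *in4;

    /* The family field must be fully inside the buffer before reading it. */
    if (addrlen < (int)(offsetof(struct sockaddr, sa_family) +
                        sizeof(address->sa_family)))
        return -EINVAL;

    /* This sketch only has a policy for IPv4; other families pass. */
    if (address->sa_family != AF_INET)
        return 0;

    if (addrlen < (int)sizeof(*in4))
        return -EINVAL;

    in4 = (const struct sockaddr_in *)address;

    /* Assumed policy: ports below 1024 need a privileged caller. */
    if (ntohs(in4->sin_port) < 1024 && !privileged)
        return -EACCES;

    return 0;
}
```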
-
int security_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen)¶
Check if a socket connect operation is allowed
Parameters
struct socket *sock
socket
struct sockaddr *address
address of remote connection point
int addrlen
length of address
Description
Check permission before socket protocol layer connect operation attempts to connect socket sock to a remote address, address.
Return
Returns 0 if permission is granted.
-
int security_socket_listen(struct socket *sock, int backlog)¶
Check if a socket is allowed to listen
Parameters
struct socket *sock
socket
int backlog
connection queue size
Description
Check permission before socket protocol layer listen operation.
Return
Returns 0 if permission is granted.
-
int security_socket_accept(struct socket *sock, struct socket *newsock)¶
Check if a socket is allowed to accept connections
Parameters
struct socket *sock
listening socket
struct socket *newsock
newly created connection socket
Description
Check permission before accepting a new connection. Note that the new socket, newsock, has been created and some information copied to it, but the accept operation has not actually been performed.
Return
Returns 0 if permission is granted.
-
int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)¶
Check if sending a message is allowed
Parameters
struct socket *sock
sending socket
struct msghdr *msg
message to send
int size
size of message
Description
Check permission before transmitting a message to another socket.
Return
Returns 0 if permission is granted.
-
int security_socket_recvmsg(struct socket *sock, struct msghdr *msg, int size, int flags)¶
Check if receiving a message is allowed
Parameters
struct socket *sock
receiving socket
struct msghdr *msg
message to receive
int size
size of message
int flags
operational flags
Description
Check permission before receiving a message from a socket.
Return
Returns 0 if permission is granted.
Parameters
struct socket *sock
socket
Description
Check permission before reading the local address (name) of the socket object.
Return
Returns 0 if permission is granted.
Parameters
struct socket *sock
socket
Description
Check permission before reading the remote address (name) of a socket object.
Return
Returns 0 if permission is granted.
-
int security_socket_getsockopt(struct socket *sock, int level, int optname)¶
Check if reading a socket option is allowed
Parameters
struct socket *sock
socket
int level
option's protocol level
int optname
option name
Description
Check permissions before retrieving the options associated with socket sock.
Return
Returns 0 if permission is granted.
-
int security_socket_setsockopt(struct socket *sock, int level, int optname)¶
Check if setting a socket option is allowed
Parameters
struct socket *sock
socket
int level
option's protocol level
int optname
option name
Description
Check permissions before setting the options associated with socket sock.
Return
Returns 0 if permission is granted.
-
int security_socket_shutdown(struct socket *sock, int how)¶
Check if shutting down the socket is allowed
Parameters
struct socket *sock
socket
int how
flag indicating how sends and receives are handled
Description
Check permission before all or part of a connection on the socket sock is shut down.
Return
Returns 0 if permission is granted.
-
int security_socket_getpeersec_stream(struct socket *sock, sockptr_t optval, sockptr_t optlen, unsigned int len)¶
Get the remote peer label
Parameters
struct socket *sock
socket
sockptr_t optval
destination buffer
sockptr_t optlen
size of peer label copied into the buffer
unsigned int len
maximum size of the destination buffer
Description
This hook allows the security module to provide peer socket security state for unix or connected tcp sockets to userspace via getsockopt SO_GETPEERSEC. For tcp sockets this can be meaningful if the socket is associated with an ipsec SA.
Return
Returns 0 if all is well, otherwise, typical getsockopt return values.
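The relationship between optval, optlen, and len can be sketched with plain pointers in place of sockptr_t. This is a user-space model only: mock_getpeersec_stream() and its peer_label argument are hypothetical, and reporting the required size through optlen together with -ERANGE when the buffer is too small is an assumption about one reasonable behavior.

```c
#include <errno.h>
#include <string.h>

/* Mock hook: copy the peer label into optval if it fits within len bytes. */
int mock_getpeersec_stream(const char *peer_label, char *optval,
                           unsigned int *optlen, unsigned int len)
{
    unsigned int needed = (unsigned int)strlen(peer_label) + 1;

    /* Always report the size, so a caller can retry with a larger buffer. */
    *optlen = needed;

    if (needed > len)
        return -ERANGE;

    memcpy(optval, peer_label, needed);   /* includes the terminating NUL */
    return 0;
}
```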
-
int security_sk_alloc(struct sock *sk, int family, gfp_t priority)¶
Allocate and initialize a sock's LSM blob
Parameters
struct sock *sk
sock
int family
protocol family
gfp_t priority
gfp flags
Description
Allocate and attach a security structure to the sk->sk_security field, which is used to copy security attributes between local stream sockets.
Return
Returns 0 on success, error on failure.
- void security_sk_free(struct sock<