Fsphinx.addnodesdocument)}( rawsourcechildren]( translations LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget1/translations/zh_CN/admin-guide/cgroup-v1/cgroupsmodnameN classnameN refexplicitutagnamehhh ubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget1/translations/zh_TW/admin-guide/cgroup-v1/cgroupsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget1/translations/it_IT/admin-guide/cgroup-v1/cgroupsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget1/translations/ja_JP/admin-guide/cgroup-v1/cgroupsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget1/translations/ko_KR/admin-guide/cgroup-v1/cgroupsmodnameN classnameN refexplicituh1hhh ubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget1/translations/sp_SP/admin-guide/cgroup-v1/cgroupsmodnameN classnameN refexplicituh1hhh ubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h hh _documenthsourceNlineNubhsection)}(hhh](htitle)}(hControl Groupsh]hControl Groups}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhK/var/lib/git/docbuild/linux/Documentation/admin-guide/cgroup-v1/cgroups.rsthKubh paragraph)}(hcWritten by Paul Menage based on Documentation/admin-guide/cgroup-v1/cpusets.rsth](hWritten by Paul Menage <}(hhhhhNhNubh reference)}(hmenage@google.comh]hmenage@google.com}(hhhhhNhNubah}(h]h ]h"]h$]h&]refurimailto:menage@google.comuh1hhhubh:> based on Documentation/admin-guide/cgroup-v1/cpusets.rst}(hhhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh)}(h/Original copyright statements from cpusets.txt:h]h/Original copyright statements from cpusets.txt:}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh)}(h$Portions Copyright (C) 2004 BULL SA.h]h$Portions Copyright (C) 2004 BULL SA.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK hhhhubh)}(h7Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.h]h7Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK hhhhubh)}(h%Modified by Paul Jackson h](hModified by Paul Jackson <}(hjhhhNhNubh)}(h pj@sgi.comh]h pj@sgi.com}(hjhhhNhNubah}(h]h ]h"]h$]h&]refurimailto:pj@sgi.comuh1hhjubh>}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhhhhubh)}(h,Modified by Christoph Lameter h](hModified by Christoph Lameter <}(hj)hhhNhNubh)}(h cl@linux.comh]h cl@linux.com}(hj1hhhNhNubah}(h]h ]h"]h$]h&]refurimailto:cl@linux.comuh1hhj)ubh>}(hj)hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhhhhubhcomment)}(hXCONTENTS: 1. Control Groups 1.1 What are cgroups ? 1.2 Why are cgroups needed ? 1.3 How are cgroups implemented ? 1.4 What does notify_on_release do ? 1.5 What does clone_children do ? 1.6 How do I use cgroups ? 2. Usage Examples and Syntax 2.1 Basic Usage 2.2 Attaching processes 2.3 Mounting hierarchies by name 3. Kernel API 3.1 Overview 3.2 Synchronization 3.3 Subsystem API 4. Extended attributes usage 5. Questionsh]hXCONTENTS: 1. Control Groups 1.1 What are cgroups ? 1.2 Why are cgroups needed ? 1.3 How are cgroups implemented ? 1.4 What does notify_on_release do ? 1.5 What does clone_children do ? 1.6 How do I use cgroups ? 2. Usage Examples and Syntax 2.1 Basic Usage 2.2 Attaching processes 2.3 Mounting hierarchies by name 3. Kernel API 3.1 Overview 3.2 Synchronization 3.3 Subsystem API 4. Extended attributes usage 5. Questions}hjMsbah}(h]h ]h"]h$]h&] xml:spacepreserveuh1jKhhhhhhhK%ubh)}(hhh](h)}(h1. Control Groupsh]h1. Control Groups}(hj`hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj]hhhhhK'ubh)}(hhh](h)}(h1.1 What are cgroups ?h]h1.1 What are cgroups ?}(hjqhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjnhhhhhK*ubh)}(hControl Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour.h]hControl Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK,hjnhhubh)}(h Definitions:h]h Definitions:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK0hjnhhubh)}(hYA *cgroup* associates a set of tasks with a set of parameters for one or more subsystems.h](hA }(hjhhhNhNubhemphasis)}(h*cgroup*h]hcgroup}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhO associates a set of tasks with a set of parameters for one or more subsystems.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK2hjnhhubh)}(hXUA *subsystem* is a module that makes use of the task grouping facilities provided by cgroups to treat groups of tasks in particular ways. A subsystem is typically a "resource controller" that schedules a resource or applies per-cgroup limits, but it may be anything that wants to act on a group of processes, e.g. a virtualization subsystem.h](hA }(hjhhhNhNubj)}(h *subsystem*h]h subsystem}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhXL is a module that makes use of the task grouping facilities provided by cgroups to treat groups of tasks in particular ways. A subsystem is typically a “resource controller” that schedules a resource or applies per-cgroup limits, but it may be anything that wants to act on a group of processes, e.g. a virtualization subsystem.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhK5hjnhhubh)}(hXLA *hierarchy* is a set of cgroups arranged in a tree, such that every task in the system is in exactly one of the cgroups in the hierarchy, and a set of subsystems; each subsystem has system-specific state attached to each cgroup in the hierarchy. Each hierarchy has an instance of the cgroup virtual filesystem associated with it.h](hA }(hjhhhNhNubj)}(h *hierarchy*h]h hierarchy}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubhX? is a set of cgroups arranged in a tree, such that every task in the system is in exactly one of the cgroups in the hierarchy, and a set of subsystems; each subsystem has system-specific state attached to each cgroup in the hierarchy. Each hierarchy has an instance of the cgroup virtual filesystem associated with it.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKUser-level code may create and destroy cgroups by name in an instance of the cgroup virtual file system, specify and query to which cgroup a task is assigned, and list the task PIDs assigned to a cgroup. Those creations and assignments only affect the hierarchy associated with that instance of the cgroup file system.h]hX>User-level code may create and destroy cgroups by name in an instance of the cgroup virtual file system, specify and query to which cgroup a task is assigned, and list the task PIDs assigned to a cgroup. Those creations and assignments only affect the hierarchy associated with that instance of the cgroup file system.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKEhjnhhubh)}(hXOn their own, the only use for cgroups is for simple job tracking. The intention is that other subsystems hook into the generic cgroup support to provide new attributes for cgroups, such as accounting/limiting the resources which processes in a cgroup can access. For example, cpusets (see Documentation/admin-guide/cgroup-v1/cpusets.rst) allow you to associate a set of CPUs and a set of memory nodes with the tasks in each cgroup.h]hXOn their own, the only use for cgroups is for simple job tracking. The intention is that other subsystems hook into the generic cgroup support to provide new attributes for cgroups, such as accounting/limiting the resources which processes in a cgroup can access. For example, cpusets (see Documentation/admin-guide/cgroup-v1/cpusets.rst) allow you to associate a set of CPUs and a set of memory nodes with the tasks in each cgroup.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKKhjnhhubhtarget)}(h.. _cgroups-why-needed:h]h}(h]h ]h"]h$]h&]refidcgroups-why-neededuh1j'hKShjnhhhhubeh}(h]what-are-cgroupsah ]h"]1.1 what are cgroups ?ah$]h&]uh1hhj]hhhhhK*ubh)}(hhh](h)}(h1.2 Why are cgroups needed ?h]h1.2 Why are cgroups needed ?}(hj@hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj=hhhhhKVubh)}(hX{There are multiple efforts to provide process aggregations in the Linux kernel, mainly for resource-tracking purposes. Such efforts include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server namespaces. These all require the basic notion of a grouping/partitioning of processes, with newly forked processes ending up in the same group (cgroup) as their parent process.h]hX{There are multiple efforts to provide process aggregations in the Linux kernel, mainly for resource-tracking purposes. Such efforts include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server namespaces. These all require the basic notion of a grouping/partitioning of processes, with newly forked processes ending up in the same group (cgroup) as their parent process.}(hjNhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKXhj=hhubh)}(hXThe kernel cgroup patch provides the minimum essential kernel mechanisms required to efficiently implement such groups. It has minimal impact on the system fast paths, and provides hooks for specific subsystems such as cpusets to provide additional behaviour as desired.h]hXThe kernel cgroup patch provides the minimum essential kernel mechanisms required to efficiently implement such groups. It has minimal impact on the system fast paths, and provides hooks for specific subsystems such as cpusets to provide additional behaviour as desired.}(hj\hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhK_hj=hhubh)}(hXMultiple hierarchy support is provided to allow for situations where the division of tasks into cgroups is distinctly different for different subsystems - having parallel hierarchies allows each hierarchy to be a natural division of tasks, without having to handle complex combinations of tasks that would be present if several unrelated subsystems needed to be forced into the same tree of cgroups.h]hXMultiple hierarchy support is provided to allow for situations where the division of tasks into cgroups is distinctly different for different subsystems - having parallel hierarchies allows each hierarchy to be a natural division of tasks, without having to handle complex combinations of tasks that would be present if several unrelated subsystems needed to be forced into the same tree of cgroups.}(hjjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKehj=hhubh)}(hAt one extreme, each resource controller or subsystem could be in a separate hierarchy; at the other extreme, all subsystems would be attached to the same hierarchy.h]hAt one extreme, each resource controller or subsystem could be in a separate hierarchy; at the other extreme, all subsystems would be attached to the same hierarchy.}(hjxhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKmhj=hhubh)}(hXAs an example of a scenario (originally proposed by vatsa@in.ibm.com) that can benefit from multiple hierarchies, consider a large university server with various users - students, professors, system tasks etc. The resource planning for this server could be along the following lines::h](h4As an example of a scenario (originally proposed by }(hjhhhNhNubh)}(hvatsa@in.ibm.comh]hvatsa@in.ibm.com}(hjhhhNhNubah}(h]h ]h"]h$]h&]refurimailto:vatsa@in.ibm.comuh1hhjubh) that can benefit from multiple hierarchies, consider a large university server with various users - students, professors, system tasks etc. The resource planning for this server could be along the following lines:}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKqhj=hhubh literal_block)}(hX CPU : "Top cpuset" / \ CPUSet1 CPUSet2 | | (Professors) (Students) In addition (system tasks) are attached to topcpuset (so that they can run anywhere) with a limit of 20% Memory : Professors (50%), Students (30%), system (20%) Disk : Professors (50%), Students (30%), system (20%) Network : WWW browsing (20%), Network File System (60%), others (20%) / \ Professors (15%) students (5%)h]hX CPU : "Top cpuset" / \ CPUSet1 CPUSet2 | | (Professors) (Students) In addition (system tasks) are attached to topcpuset (so that they can run anywhere) with a limit of 20% Memory : Professors (50%), Students (30%), system (20%) Disk : Professors (50%), Students (30%), system (20%) Network : WWW browsing (20%), Network File System (60%), others (20%) / \ Professors (15%) students (5%)}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhKwhj=hhubh)}(hhBrowsers like Firefox/Lynx go into the WWW network class, while (k)nfsd goes into the NFS network class.h]hhBrowsers like Firefox/Lynx go into the WWW network class, while (k)nfsd goes into the NFS network class.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubh)}(huAt the same time Firefox/Lynx will share an appropriate CPU/Memory class depending on who launched it (prof/student).h]huAt the same time Firefox/Lynx will share an appropriate CPU/Memory class depending on who launched it (prof/student).}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubh)}(hXWith the ability to classify tasks differently for different resources (by putting those resource subsystems in different hierarchies), the admin can easily set up a script which receives exec notifications and depending on who is launching the browser he can::h]hXWith the ability to classify tasks differently for different resources (by putting those resource subsystems in different hierarchies), the admin can easily set up a script which receives exec notifications and depending on who is launching the browser he can:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubj)}(h?# echo browser_pid > /sys/fs/cgroup///tasksh]h?# echo browser_pid > /sys/fs/cgroup///tasks}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhKhj=hhubh)}(hWith only a single hierarchy, he now would potentially have to create a separate cgroup for every browser launched and associate it with appropriate network and other resource class. This may lead to proliferation of such cgroups.h]hWith only a single hierarchy, he now would potentially have to create a separate cgroup for every browser launched and associate it with appropriate network and other resource class. This may lead to proliferation of such cgroups.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubh)}(hAlso let's say that the administrator would like to give enhanced network access temporarily to a student's browser (since it is night and the user wants to do online gaming :)) OR give one of the student's simulation apps enhanced CPU power.h]hAlso let’s say that the administrator would like to give enhanced network access temporarily to a student’s browser (since it is night and the user wants to do online gaming :)) OR give one of the student’s simulation apps enhanced CPU power.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubh)}(hPWith ability to write PIDs directly to resource classes, it's just a matter of::h]hQWith ability to write PIDs directly to resource classes, it’s just a matter of:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubj)}(h~# echo pid > /sys/fs/cgroup/network//tasks (after some time) # echo pid > /sys/fs/cgroup/network//tasksh]h~# echo pid > /sys/fs/cgroup/network//tasks (after some time) # echo pid > /sys/fs/cgroup/network//tasks}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhKhj=hhubh)}(hWithout this ability, the administrator would have to split the cgroup into multiple separate ones and then associate the new cgroups with the new resource classes.h]hWithout this ability, the administrator would have to split the cgroup into multiple separate ones and then associate the new cgroups with the new resource classes.}(hj(hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhj=hhubeh}(h](why-are-cgroups-neededj4eh ]h"](1.2 why are cgroups needed ?cgroups-why-neededeh$]h&]uh1hhj]hhhhhKVexpect_referenced_by_name}j<j)sexpect_referenced_by_id}j4j)subh)}(hhh](h)}(h!1.3 How are cgroups implemented ?h]h!1.3 How are cgroups implemented ?}(hjFhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjChhhhhKubh)}(h-Control Groups extends the kernel as follows:h]h-Control Groups extends the kernel as follows:}(hjThhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjChhubh block_quote)}(hX- Each task in the system has a reference-counted pointer to a css_set. - A css_set contains a set of reference-counted pointers to cgroup_subsys_state objects, one for each cgroup subsystem registered in the system. There is no direct link from a task to the cgroup of which it's a member in each hierarchy, but this can be determined by following pointers through the cgroup_subsys_state objects. This is because accessing the subsystem state is something that's expected to happen frequently and in performance-critical code, whereas operations that require a task's actual cgroup assignments (in particular, moving between cgroups) are less common. A linked list runs through the cg_list field of each task_struct using the css_set, anchored at css_set->tasks. - A cgroup hierarchy filesystem can be mounted for browsing and manipulation from user space. - You can list all the tasks (by PID) attached to any cgroup. h]h bullet_list)}(hhh](h list_item)}(hFEach task in the system has a reference-counted pointer to a css_set. h]h)}(hEEach task in the system has a reference-counted pointer to a css_set.h]hEEach task in the system has a reference-counted pointer to a css_set.}(hjshhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjoubah}(h]h ]h"]h$]h&]uh1jmhjjubjn)}(hXA css_set contains a set of reference-counted pointers to cgroup_subsys_state objects, one for each cgroup subsystem registered in the system. There is no direct link from a task to the cgroup of which it's a member in each hierarchy, but this can be determined by following pointers through the cgroup_subsys_state objects. This is because accessing the subsystem state is something that's expected to happen frequently and in performance-critical code, whereas operations that require a task's actual cgroup assignments (in particular, moving between cgroups) are less common. A linked list runs through the cg_list field of each task_struct using the css_set, anchored at css_set->tasks. h]h)}(hXA css_set contains a set of reference-counted pointers to cgroup_subsys_state objects, one for each cgroup subsystem registered in the system. There is no direct link from a task to the cgroup of which it's a member in each hierarchy, but this can be determined by following pointers through the cgroup_subsys_state objects. This is because accessing the subsystem state is something that's expected to happen frequently and in performance-critical code, whereas operations that require a task's actual cgroup assignments (in particular, moving between cgroups) are less common. A linked list runs through the cg_list field of each task_struct using the css_set, anchored at css_set->tasks.h]hXA css_set contains a set of reference-counted pointers to cgroup_subsys_state objects, one for each cgroup subsystem registered in the system. There is no direct link from a task to the cgroup of which it’s a member in each hierarchy, but this can be determined by following pointers through the cgroup_subsys_state objects. This is because accessing the subsystem state is something that’s expected to happen frequently and in performance-critical code, whereas operations that require a task’s actual cgroup assignments (in particular, moving between cgroups) are less common. A linked list runs through the cg_list field of each task_struct using the css_set, anchored at css_set->tasks.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jmhjjubjn)}(h\A cgroup hierarchy filesystem can be mounted for browsing and manipulation from user space. h]h)}(h[A cgroup hierarchy filesystem can be mounted for browsing and manipulation from user space.h]h[A cgroup hierarchy filesystem can be mounted for browsing and manipulation from user space.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhKhjubah}(h]h ]h"]h$]h&]uh1jmhjjubjn)}(h cpuset.cpus /bin/echo 1 > cpuset.mems /bin/echo $$ > tasks sh # The subshell 'sh' is now running in cgroup Charlie # The next line should display '/Charlie' cat /proc/self/cgrouph]hXpmount -t tmpfs cgroup_root /sys/fs/cgroup mkdir /sys/fs/cgroup/cpuset mount -t cgroup cpuset -ocpuset /sys/fs/cgroup/cpuset cd /sys/fs/cgroup/cpuset mkdir Charlie cd Charlie /bin/echo 2-3 > cpuset.cpus /bin/echo 1 > cpuset.mems /bin/echo $$ > tasks sh # The subshell 'sh' is now running in cgroup Charlie # The next line should display '/Charlie' cat /proc/self/cgroup}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMKhjhhubeh}(h]how-do-i-use-cgroupsah ]h"]1.6 how do i use cgroups ?ah$]h&]uh1hhj]hhhhhM8ubeh}(h]id1ah ]h"]1. control groupsah$]h&]uh1hhhhhhhhK'ubh)}(hhh](h)}(h2. Usage Examples and Syntaxh]h2. Usage Examples and Syntax}(hj$hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj!hhhhhMZubh)}(hhh](h)}(h2.1 Basic Usageh]h2.1 Basic Usage}(hj5hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj2hhhhhM]ubh)}(hUCreating, modifying, using cgroups can be done through the cgroup virtual filesystem.h]hUCreating, modifying, using cgroups can be done through the cgroup virtual filesystem.}(hjChhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM_hj2hhubh)}(hATo mount a cgroup hierarchy with all available subsystems, type::h]h@To mount a cgroup hierarchy with all available subsystems, type:}(hjQhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMbhj2hhubj)}(h$# mount -t cgroup xxx /sys/fs/cgrouph]h$# mount -t cgroup xxx /sys/fs/cgroup}hj_sbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMdhj2hhubh)}(hThe "xxx" is not interpreted by the cgroup code, but will appear in /proc/mounts so may be any useful identifying string that you like.h]hThe “xxx” is not interpreted by the cgroup code, but will appear in /proc/mounts so may be any useful identifying string that you like.}(hjmhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMfhj2hhubh)}(hNote: Some subsystems do not work without some user input first. For instance, if cpusets are enabled the user will have to populate the cpus and mems files for each new cgroup created before that group can be used.h]hNote: Some subsystems do not work without some user input first. For instance, if cpusets are enabled the user will have to populate the cpus and mems files for each new cgroup created before that group can be used.}(hj{hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMihj2hhubh)}(hX&As explained in section `1.2 Why are cgroups needed?` you should create different hierarchies of cgroups for each single resource or group of resources you want to control. Therefore, you should mount a tmpfs on /sys/fs/cgroup and create directories for each cgroup resource or resource group::h](hAs explained in section }(hjhhhNhNubhtitle_reference)}(h`1.2 Why are cgroups needed?`h]h1.2 Why are cgroups needed?}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jhjubh you should create different hierarchies of cgroups for each single resource or group of resources you want to control. Therefore, you should mount a tmpfs on /sys/fs/cgroup and create directories for each cgroup resource or resource group:}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMmhj2hhubj)}(hF# mount -t tmpfs cgroup_root /sys/fs/cgroup # mkdir /sys/fs/cgroup/rg1h]hF# mount -t tmpfs cgroup_root /sys/fs/cgroup # mkdir /sys/fs/cgroup/rg1}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMshj2hhubh)}(hNTo mount a cgroup hierarchy with just the cpuset and memory subsystems, type::h]hMTo mount a cgroup hierarchy with just the cpuset and memory subsystems, type:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMvhj2hhubj)}(h;# mount -t cgroup -o cpuset,memory hier1 /sys/fs/cgroup/rg1h]h;# mount -t cgroup -o cpuset,memory hier1 /sys/fs/cgroup/rg1}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMyhj2hhubh)}(hXXWhile remounting cgroups is currently supported, it is not recommend to use it. Remounting allows changing bound subsystems and release_agent. Rebinding is hardly useful as it only works when the hierarchy is empty and release_agent itself should be replaced with conventional fsnotify. The support for remounting will be removed in the future.h]hXXWhile remounting cgroups is currently supported, it is not recommend to use it. Remounting allows changing bound subsystems and release_agent. Rebinding is hardly useful as it only works when the hierarchy is empty and release_agent itself should be replaced with conventional fsnotify. The support for remounting will be removed in the future.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM{hj2hhubh)}(h(To Specify a hierarchy's release_agent::h]h)To Specify a hierarchy’s release_agent:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubj)}(ha# mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \ xxx /sys/fs/cgroup/rg1h]ha# mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \ xxx /sys/fs/cgroup/rg1}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhj2hhubh)}(hHNote that specifying 'release_agent' more than once will return failure.h]hLNote that specifying ‘release_agent’ more than once will return failure.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubh)}(hXNote that changing the set of subsystems is currently only supported when the hierarchy consists of a single (root) cgroup. Supporting the ability to arbitrarily bind/unbind subsystems from an existing cgroup hierarchy is intended to be implemented in the future.h]hXNote that changing the set of subsystems is currently only supported when the hierarchy consists of a single (root) cgroup. Supporting the ability to arbitrarily bind/unbind subsystems from an existing cgroup hierarchy is intended to be implemented in the future.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubh)}(hThen under /sys/fs/cgroup/rg1 you can find a tree that corresponds to the tree of the cgroups in the system. For instance, /sys/fs/cgroup/rg1 is the cgroup that holds the whole system.h]hThen under /sys/fs/cgroup/rg1 you can find a tree that corresponds to the tree of the cgroups in the system. For instance, /sys/fs/cgroup/rg1 is the cgroup that holds the whole system.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubh)}(h2If you want to change the value of release_agent::h]h1If you want to change the value of release_agent:}(hj)hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubj)}(hC# echo "/sbin/new_release_agent" > /sys/fs/cgroup/rg1/release_agenth]hC# echo "/sbin/new_release_agent" > /sys/fs/cgroup/rg1/release_agent}hj7sbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhj2hhubh)}(h#It can also be changed via remount.h]h#It can also be changed via remount.}(hjEhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubh)}(h=If you want to create a new cgroup under /sys/fs/cgroup/rg1::h]h tasksh]h# /bin/echo $$ > tasks}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhj2hhubh)}(hQYou can also create cgroups inside your cgroup by using mkdir in this directory::h]hPYou can also create cgroups inside your cgroup by using mkdir in this directory:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubj)}(h# mkdir my_sub_csh]h# mkdir my_sub_cs}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhj2hhubh)}(h$To remove a cgroup, just use rmdir::h]h#To remove a cgroup, just use rmdir:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubj)}(h# rmdir my_sub_csh]h# rmdir my_sub_cs}hjsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhj2hhubh)}(hThis will fail if the cgroup is in use (has cgroups inside, or has processes attached, or is held alive by other subsystem-specific reference).h]hThis will fail if the cgroup is in use (has cgroups inside, or has processes attached, or is held alive by other subsystem-specific reference).}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj2hhubeh}(h] basic-usageah ]h"]2.1 basic usageah$]h&]uh1hhj!hhhhhM]ubh)}(hhh](h)}(h2.2 Attaching processesh]h2.2 Attaching processes}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMubj)}(h# /bin/echo PID > tasksh]h# /bin/echo PID > tasks}hj,sbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhjhhubh)}(hNote that it is PID, not PIDs. You can only attach ONE task at a time. If you have several tasks to attach, you have to do it one after another::h]hNote that it is PID, not PIDs. You can only attach ONE task at a time. If you have several tasks to attach, you have to do it one after another:}(hj:hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubj)}(hV# /bin/echo PID1 > tasks # /bin/echo PID2 > tasks ... # /bin/echo PIDn > tasksh]hV# /bin/echo PID1 > tasks # /bin/echo PID2 > tasks ... # /bin/echo PIDn > tasks}hjHsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhjhhubh)}(h4You can attach the current shell task by echoing 0::h]h3You can attach the current shell task by echoing 0:}(hjVhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubj)}(h# echo 0 > tasksh]h# echo 0 > tasks}hjdsbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhjhhubh)}(hX6You can use the cgroup.procs file instead of the tasks file to move all threads in a threadgroup at once. Echoing the PID of any task in a threadgroup to cgroup.procs causes all tasks in that threadgroup to be attached to the cgroup. Writing 0 to cgroup.procs moves all tasks in the writing task's threadgroup.h]hX8You can use the cgroup.procs file instead of the tasks file to move all threads in a threadgroup at once. Echoing the PID of any task in a threadgroup to cgroup.procs causes all tasks in that threadgroup to be attached to the cgroup. Writing 0 to cgroup.procs moves all tasks in the writing task’s threadgroup.}(hjrhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hNote: Since every task is always a member of exactly one cgroup in each mounted hierarchy, to remove a task from its current cgroup you must move it into a new cgroup (possibly the root cgroup) by writing to the new cgroup's tasks file.h]hNote: Since every task is always a member of exactly one cgroup in each mounted hierarchy, to remove a task from its current cgroup you must move it into a new cgroup (possibly the root cgroup) by writing to the new cgroup’s tasks file.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hoNote: Due to some restrictions enforced by some cgroup subsystems, moving a process to another cgroup can fail.h]hoNote: Due to some restrictions enforced by some cgroup subsystems, moving a process to another cgroup can fail.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubeh}(h]attaching-processesah ]h"]2.2 attaching processesah$]h&]uh1hhj!hhhhhMubh)}(hhh](h)}(h 2.3 Mounting hierarchies by nameh]h 2.3 Mounting hierarchies by name}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMubh)}(hX,Passing the name= option when mounting a cgroups hierarchy associates the given name with the hierarchy. This can be used when mounting a pre-existing hierarchy, in order to refer to it by name rather than by its set of active subsystems. Each hierarchy is either nameless, or has a unique name.h]hX,Passing the name= option when mounting a cgroups hierarchy associates the given name with the hierarchy. This can be used when mounting a pre-existing hierarchy, in order to refer to it by name rather than by its set of active subsystems. Each hierarchy is either nameless, or has a unique name.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hThe name should match [\w.-]+h]hThe name should match [w.-]+}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hWhen passing a name= option for a new hierarchy, you need to specify subsystems manually; the legacy behaviour of mounting all subsystems when none are explicitly specified is not supported when you give a subsystem a name.h]hWhen passing a name= option for a new hierarchy, you need to specify subsystems manually; the legacy behaviour of mounting all subsystems when none are explicitly specified is not supported when you give a subsystem a name.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hoThe name of the subsystem appears as part of the hierarchy description in /proc/mounts and /proc//cgroups.h]hoThe name of the subsystem appears as part of the hierarchy description in /proc/mounts and /proc//cgroups.}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubeh}(h]mounting-hierarchies-by-nameah ]h"] 2.3 mounting hierarchies by nameah$]h&]uh1hhj!hhhhhMubeh}(h]usage-examples-and-syntaxah ]h"]2. usage examples and syntaxah$]h&]uh1hhhhhhhhMZubh)}(hhh](h)}(h 3. Kernel APIh]h 3. Kernel API}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMubh)}(hhh](h)}(h 3.1 Overviewh]h 3.1 Overview}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMubh)}(hXEach kernel subsystem that wants to hook into the generic cgroup system needs to create a cgroup_subsys object. This contains various methods, which are callbacks from the cgroup system, along with a subsystem ID which will be assigned by the cgroup system.h]hXEach kernel subsystem that wants to hook into the generic cgroup system needs to create a cgroup_subsys object. This contains various methods, which are callbacks from the cgroup system, along with a subsystem ID which will be assigned by the cgroup system.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h1Other fields in the cgroup_subsys object include:h]h1Other fields in the cgroup_subsys object include:}(hj- hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubji)}(hhh](jn)}(hsubsys_id: a unique array index for the subsystem, indicating which entry in cgroup->subsys[] this subsystem should be managing. h]h)}(hsubsys_id: a unique array index for the subsystem, indicating which entry in cgroup->subsys[] this subsystem should be managing.h]hsubsys_id: a unique array index for the subsystem, indicating which entry in cgroup->subsys[] this subsystem should be managing.}(hjB hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj> ubah}(h]h ]h"]h$]h&]uh1jmhj; hhhhhNubjn)}(hjname: should be initialized to a unique subsystem name. Should be no longer than MAX_CGROUP_TYPE_NAMELEN. h]h)}(hiname: should be initialized to a unique subsystem name. Should be no longer than MAX_CGROUP_TYPE_NAMELEN.h]hiname: should be initialized to a unique subsystem name. Should be no longer than MAX_CGROUP_TYPE_NAMELEN.}(hjZ hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjV ubah}(h]h ]h"]h$]h&]uh1jmhj; hhhhhNubjn)}(hQearly_init: indicate if the subsystem needs early initialization at system boot. h]h)}(hPearly_init: indicate if the subsystem needs early initialization at system boot.h]hPearly_init: indicate if the subsystem needs early initialization at system boot.}(hjr hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjn ubah}(h]h ]h"]h$]h&]uh1jmhj; hhhhhNubeh}(h]h ]h"]h$]h&]jjuh1jhhhhMhj hhubh)}(hEach cgroup object created by the system has an array of pointers, indexed by subsystem ID; this pointer is entirely managed by the subsystem; the generic cgroup code will never touch this pointer.h]hEach cgroup object created by the system has an array of pointers, indexed by subsystem ID; this pointer is entirely managed by the subsystem; the generic cgroup code will never touch this pointer.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubeh}(h]overviewah ]h"] 3.1 overviewah$]h&]uh1hhjhhhhhMubh)}(hhh](h)}(h3.2 Synchronizationh]h3.2 Synchronization}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMubh)}(hXThere is a global mutex, cgroup_mutex, used by the cgroup system. This should be taken by anything that wants to modify a cgroup. It may also be taken to prevent cgroups from being modified, but more specific locks may be more appropriate in that situation.h]hXThere is a global mutex, cgroup_mutex, used by the cgroup system. This should be taken by anything that wants to modify a cgroup. It may also be taken to prevent cgroups from being modified, but more specific locks may be more appropriate in that situation.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM hj hhubh)}(h%See kernel/cgroup.c for more details.h]h%See kernel/cgroup.c for more details.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h]Subsystems can take/release the cgroup_mutex via the functions cgroup_lock()/cgroup_unlock().h]h]Subsystems can take/release the cgroup_mutex via the functions cgroup_lock()/cgroup_unlock().}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(hAccessing a task's cgroup pointer may be done in the following ways: - while holding cgroup_mutex - while holding the task's alloc_lock (via task_lock()) - inside an rcu_read_lock() section via rcu_dereference()h]hAccessing a task’s cgroup pointer may be done in the following ways: - while holding cgroup_mutex - while holding the task’s alloc_lock (via task_lock()) - inside an rcu_read_lock() section via rcu_dereference()}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubeh}(h]synchronizationah ]h"]3.2 synchronizationah$]h&]uh1hhjhhhhhMubh)}(hhh](h)}(h3.3 Subsystem APIh]h3.3 Subsystem API}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMubh)}(hEach subsystem should:h]hEach subsystem should:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubji)}(hhh](jn)}(h%add an entry in linux/cgroup_subsys.hh]h)}(hj h]h%add an entry in linux/cgroup_subsys.h}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj ubah}(h]h ]h"]h$]h&]uh1jmhj hhhhhNubjn)}(h8define a cgroup_subsys object called _cgrp_subsys h]h)}(h7define a cgroup_subsys object called _cgrp_subsysh]h7define a cgroup_subsys object called _cgrp_subsys}(hj0 hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM hj, ubah}(h]h ]h"]h$]h&]uh1jmhj hhhhhNubeh}(h]h ]h"]h$]h&]jjuh1jhhhhMhj hhubh)}(hEach subsystem may export the following methods. The only mandatory methods are css_alloc/free. Any others that are null are presumed to be successful no-ops.h]hEach subsystem may export the following methods. The only mandatory methods are css_alloc/free. Any others that are null are presumed to be successful no-ops.}(hjJ hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM"hj hhubh)}(h\``struct cgroup_subsys_state *css_alloc(struct cgroup *cgrp)`` (cgroup_mutex held by caller)h](hliteral)}(h>``struct cgroup_subsys_state *css_alloc(struct cgroup *cgrp)``h]h:struct cgroup_subsys_state *css_alloc(struct cgroup *cgrp)}(hj^ hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hjX ubh (cgroup_mutex held by caller)}(hjX hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM&hj hhubh)}(hXCalled to allocate a subsystem state object for a cgroup. The subsystem should allocate its subsystem state object for the passed cgroup, returning a pointer to the new object on success or a ERR_PTR() value. On success, the subsystem pointer should point to a structure of type cgroup_subsys_state (typically embedded in a larger subsystem-specific object), which will be initialized by the cgroup system. Note that this will be called at initialization to create the root subsystem state for this subsystem; this case can be identified by the passed cgroup object having a NULL parent (since it's the root of the hierarchy) and may be an appropriate place for initialization code.h]hXCalled to allocate a subsystem state object for a cgroup. The subsystem should allocate its subsystem state object for the passed cgroup, returning a pointer to the new object on success or a ERR_PTR() value. On success, the subsystem pointer should point to a structure of type cgroup_subsys_state (typically embedded in a larger subsystem-specific object), which will be initialized by the cgroup system. Note that this will be called at initialization to create the root subsystem state for this subsystem; this case can be identified by the passed cgroup object having a NULL parent (since it’s the root of the hierarchy) and may be an appropriate place for initialization code.O}(hjv hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM)hj hhubh)}(hE``int css_online(struct cgroup *cgrp)`` (cgroup_mutex held by caller)h](j] )}(h'``int css_online(struct cgroup *cgrp)``h]h#int css_online(struct cgroup *cgrp)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM5hj hhubh)}(hXeCalled after @cgrp successfully completed all allocations and made visible to cgroup_for_each_child/descendant_*() iterators. The subsystem may choose to fail creation by returning -errno. This callback can be used to implement reliable state sharing and propagation along the hierarchy. See the comment on cgroup_for_each_live_descendant_pre() for details.h]hXeCalled after @cgrp successfully completed all allocations and made visible to cgroup_for_each_child/descendant_*() iterators. The subsystem may choose to fail creation by returning -errno. This callback can be used to implement reliable state sharing and propagation along the hierarchy. See the comment on cgroup_for_each_live_descendant_pre() for details.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM8hj hhubh)}(hH``void css_offline(struct cgroup *cgrp);`` (cgroup_mutex held by caller)h](j] )}(h*``void css_offline(struct cgroup *cgrp);``h]h&void css_offline(struct cgroup *cgrp);}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM?hj hhubh)}(hXThis is the counterpart of css_online() and called iff css_online() has succeeded on @cgrp. This signifies the beginning of the end of @cgrp. @cgrp is being removed and the subsystem should start dropping all references it's holding on @cgrp. When all references are dropped, cgroup removal will proceed to the next step - css_free(). After this callback, @cgrp should be considered dead to the subsystem.h]hXThis is the counterpart of css_online() and called iff css_online() has succeeded on @cgrp. This signifies the beginning of the end of @cgrp. @cgrp is being removed and the subsystem should start dropping all references it’s holding on @cgrp. When all references are dropped, cgroup removal will proceed to the next step - css_free(). After this callback, @cgrp should be considered dead to the subsystem.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMBhj hhubh)}(hD``void css_free(struct cgroup *cgrp)`` (cgroup_mutex held by caller)h](j] )}(h&``void css_free(struct cgroup *cgrp)``h]h"void css_free(struct cgroup *cgrp)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMIhj hhubh)}(hXQThe cgroup system is about to free @cgrp; the subsystem should free its subsystem state object. By the time this method is called, @cgrp is completely unused; @cgrp->parent is still valid. (Note - can also be called for a newly-created cgroup if an error occurs after this subsystem's create() method has been called for the new cgroup).h]hXSThe cgroup system is about to free @cgrp; the subsystem should free its subsystem state object. By the time this method is called, @cgrp is completely unused; @cgrp->parent is still valid. (Note - can also be called for a newly-created cgroup if an error occurs after this subsystem’s create() method has been called for the new cgroup).}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMLhj hhubh)}(hb``int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)`` (cgroup_mutex held by caller)h](j] )}(hD``int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``h]h@int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMRhj hhubh)}(hCalled prior to moving one or more tasks into a cgroup; if the subsystem returns an error, this will abort the attach operation. @tset contains the tasks to be attached and is guaranteed to have at least one task in it.h]hCalled prior to moving one or more tasks into a cgroup; if the subsystem returns an error, this will abort the attach operation. @tset contains the tasks to be attached and is guaranteed to have at least one task in it.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMUhj hhubhdefinition_list)}(hhh]hdefinition_list_item)}(hIf there are multiple tasks in the taskset, then: - it's guaranteed that all are from the same thread group - @tset contains all tasks from the thread group whether or not they're switching cgroups - the first task is the leader h](hterm)}(h1If there are multiple tasks in the taskset, then:h]h1If there are multiple tasks in the taskset, then:}(hj9 hhhNhNubah}(h]h ]h"]h$]h&]uh1j7 hhhM^hj3 ubh definition)}(hhh]ji)}(hhh](jn)}(h7it's guaranteed that all are from the same thread grouph]h)}(hjQ h]h9it’s guaranteed that all are from the same thread group}(hjS hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM[hjO ubah}(h]h ]h"]h$]h&]uh1jmhjL ubjn)}(hW@tset contains all tasks from the thread group whether or not they're switching cgroupsh]h)}(hW@tset contains all tasks from the thread group whether or not they're switching cgroupsh]hY@tset contains all tasks from the thread group whether or not they’re switching cgroups}(hjj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM\hjf ubah}(h]h ]h"]h$]h&]uh1jmhjL ubjn)}(hthe first task is the leader h]h)}(hthe first task is the leaderh]hthe first task is the leader}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM^hj~ ubah}(h]h ]h"]h$]h&]uh1jmhjL ubeh}(h]h ]h"]h$]h&]jjuh1jhhhhM[hjI ubah}(h]h ]h"]h$]h&]uh1jG hj3 ubeh}(h]h ]h"]h$]h&]uh1j1 hhhM^hj. ubah}(h]h ]h"]h$]h&]uh1j, hj hhhNhNubh)}(hXEach @tset entry also contains the task's old cgroup and tasks which aren't switching cgroup can be skipped easily using the cgroup_taskset_for_each() iterator. Note that this isn't called on a fork. If this method returns 0 (success) then this should remain valid while the caller holds cgroup_mutex and it is ensured that either attach() or cancel_attach() will be called in future.h]hXEach @tset entry also contains the task’s old cgroup and tasks which aren’t switching cgroup can be skipped easily using the cgroup_taskset_for_each() iterator. Note that this isn’t called on a fork. If this method returns 0 (success) then this should remain valid while the caller holds cgroup_mutex and it is ensured that either attach() or cancel_attach() will be called in future.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM`hj hhubh)}(hQ``void css_reset(struct cgroup_subsys_state *css)`` (cgroup_mutex held by caller)h](j] )}(h3``void css_reset(struct cgroup_subsys_state *css)``h]h/void css_reset(struct cgroup_subsys_state *css)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMghj hhubh)}(hXeAn optional operation which should restore @css's configuration to the initial state. This is currently only used on the unified hierarchy when a subsystem is disabled on a cgroup through "cgroup.subtree_control" but should remain enabled because other subsystems depend on it. cgroup core makes such a css invisible by removing the associated interface files and invokes this callback so that the hidden subsystem can return to the initial neutral state. This prevents unexpected resource control from a hidden css and ensures that the configuration is in the initial state when it is made visible again later.h]hXkAn optional operation which should restore @css’s configuration to the initial state. This is currently only used on the unified hierarchy when a subsystem is disabled on a cgroup through “cgroup.subtree_control” but should remain enabled because other subsystems depend on it. cgroup core makes such a css invisible by removing the associated interface files and invokes this callback so that the hidden subsystem can return to the initial neutral state. This prevents unexpected resource control from a hidden css and ensures that the configuration is in the initial state when it is made visible again later.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMjhj hhubh)}(hf``void cancel_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)`` (cgroup_mutex held by caller)h](j] )}(hH``void cancel_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``h]hDvoid cancel_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMuhj hhubh)}(hXoCalled when a task attach operation has failed after can_attach() has succeeded. A subsystem whose can_attach() has some side-effects should provide this function, so that the subsystem can implement a rollback. If not, not necessary. This will be called only about subsystems whose can_attach() operation have succeeded. The parameters are identical to can_attach().h]hXoCalled when a task attach operation has failed after can_attach() has succeeded. A subsystem whose can_attach() has some side-effects should provide this function, so that the subsystem can implement a rollback. If not, not necessary. This will be called only about subsystems whose can_attach() operation have succeeded. The parameters are identical to can_attach().}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMxhj hhubh)}(h_``void attach(struct cgroup *cgrp, struct cgroup_taskset *tset)`` (cgroup_mutex held by caller)h](j] )}(hA``void attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``h]h=void attach(struct cgroup *cgrp, struct cgroup_taskset *tset)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM~hj hhubh)}(hCalled after the task has been attached to the cgroup, to allow any post-attachment activity that requires memory allocations or blocking. The parameters are identical to can_attach().h]hCalled after the task has been attached to the cgroup, to allow any post-attachment activity that requires memory allocations or blocking. The parameters are identical to can_attach().}(hj, hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h'``void fork(struct task_struct *task)``h]j] )}(hj< h]h#void fork(struct task_struct *task)}(hj> hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj: ubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h+Called when a task is forked into a cgroup.h]h+Called when a task is forked into a cgroup.}(hjQ hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h'``void exit(struct task_struct *task)``h]j] )}(hja h]h#void exit(struct task_struct *task)}(hjc hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj_ ubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(hCalled during task exit.h]hCalled during task exit.}(hjv hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h'``void free(struct task_struct *task)``h]j] )}(hj h]h#void free(struct task_struct *task)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h%Called when the task_struct is freed.h]h%Called when the task_struct is freed.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(h@``void bind(struct cgroup *root)`` (cgroup_mutex held by caller)h](j] )}(h"``void bind(struct cgroup *root)``h]hvoid bind(struct cgroup *root)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j\ hj ubh (cgroup_mutex held by caller)}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(hX Called when a cgroup subsystem is rebound to a different hierarchy and root cgroup. Currently this will only involve movement between the default hierarchy (which never has sub-cgroups) and a hierarchy that is being created/destroyed (and hence has no sub-cgroups).h]hX Called when a cgroup subsystem is rebound to a different hierarchy and root cgroup. Currently this will only involve movement between the default hierarchy (which never has sub-cgroups) and a hierarchy that is being created/destroyed (and hence has no sub-cgroups).}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubeh}(h] subsystem-apiah ]h"]3.3 subsystem apiah$]h&]uh1hhjhhhhhMubeh}(h] kernel-apiah ]h"] 3. kernel apiah$]h&]uh1hhhhhhhhMubh)}(hhh](h)}(h4. Extended attribute usageh]h4. Extended attribute usage}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMubh)}(hcgroup filesystem supports certain types of extended attributes in its directories and files. The current supported types are:h]hcgroup filesystem supports certain types of extended attributes in its directories and files. The current supported types are:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubjc)}(h6- Trusted (XATTR_TRUSTED) - Security (XATTR_SECURITY) h]ji)}(hhh](jn)}(hTrusted (XATTR_TRUSTED)h]h)}(hj h]hTrusted (XATTR_TRUSTED)}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj ubah}(h]h ]h"]h$]h&]uh1jmhj ubjn)}(hSecurity (XATTR_SECURITY) h]h)}(hSecurity (XATTR_SECURITY)h]hSecurity (XATTR_SECURITY)}(hj$ hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj ubah}(h]h ]h"]h$]h&]uh1jmhj ubeh}(h]h ]h"]h$]h&]jjuh1jhhhhMhj ubah}(h]h ]h"]h$]h&]uh1jbhhhMhj hhubh)}(h-Both require CAP_SYS_ADMIN capability to set.h]h-Both require CAP_SYS_ADMIN capability to set.}(hjD hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(hXLike in tmpfs, the extended attributes in cgroup filesystem are stored using kernel memory and it's advised to keep the usage at minimum. This is the reason why user defined extended attributes are not supported, since any user can do it and there's no limit in the value size.h]hXLike in tmpfs, the extended attributes in cgroup filesystem are stored using kernel memory and it’s advised to keep the usage at minimum. This is the reason why user defined extended attributes are not supported, since any user can do it and there’s no limit in the value size.}(hjR hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubh)}(hThe current known users for this feature are SELinux to limit cgroup usage in containers and systemd for assorted meta data like main PID in a cgroup (systemd creates a cgroup per service).h]hThe current known users for this feature are SELinux to limit cgroup usage in containers and systemd for assorted meta data like main PID in a cgroup (systemd creates a cgroup per service).}(hj` hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhj hhubeh}(h]extended-attribute-usageah ]h"]4. extended attribute usageah$]h&]uh1hhhhhhhhMubh)}(hhh](h)}(h 5. Questionsh]h 5. Questions}(hjy hhhNhNubah}(h]h ]h"]h$]h&]uh1hhjv hhhhhMubj)}(hXQ: what's up with this '/bin/echo' ? A: bash's builtin 'echo' command does not check calls to write() against errors. If you use it in the cgroup file system, you won't be able to tell whether a command succeeded or failed. Q: When I attach processes, only the first of the line gets really attached ! A: We can only return one error code per call to write(). So you should also put only ONE PID.h]hXQ: what's up with this '/bin/echo' ? A: bash's builtin 'echo' command does not check calls to write() against errors. If you use it in the cgroup file system, you won't be able to tell whether a command succeeded or failed. Q: When I attach processes, only the first of the line gets really attached ! A: We can only return one error code per call to write(). So you should also put only ONE PID.}hj sbah}(h]h ]h"]h$]h&]j[j\uh1jhhhMhjv hhubeh}(h] questionsah ]h"] 5. questionsah$]h&]uh1hhhhhhhhMubeh}(h]control-groupsah ]h"]control groupsah$]h&]uh1hhhhhhhhKubeh}(h]h ]h"]h$]h&]sourcehuh1hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksentryfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerj error_encodingutf-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh _destinationN _config_files]7/var/lib/git/docbuild/linux/Documentation/docutils.confafile_insertion_enabled raw_enabledKline_length_limitM'pep_referencesN pep_base_urlhttps://peps.python.org/pep_file_url_templatepep-%04drfc_referencesN rfc_base_url&https://datatracker.ietf.org/doc/html/ tab_widthKtrim_footnote_reference_spacesyntax_highlightlong smart_quotessmartquotes_locales]character_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xform image_loadinglinkembed_stylesheetcloak_email_addressessection_self_linkenvNubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}j4]j)asnameids}(j j jjj:j7j<j4j;j8jwjtjjjjjjjjjjjjjjj j j j j j j j js jp j j u nametypes}(j jj:j<j;jwjjjjjjjj j j j js j uh}(j hjj]j7jnj4j=j8j=jtjCjjzjjjjjj!jj2jjjjj jj j j j j j jp j j jv u footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startK id_counter collectionsCounter}j KsRparse_messages]transform_messages]hsystem_message)}(hhh]h)}(hhh]h8Hyperlink target "cgroups-why-needed" is not referenced.}hj2sbah}(h]h ]h"]h$]h&]uh1hhj/ubah}(h]h ]h"]h$]h&]levelKtypeINFOsourcehlineKSuh1j-uba transformerN include_log] decorationNhhub.