Chapter 4. Use Case Scenarios

This chapter provides use case scenarios that take advantage of cgroup functionality.

4.1. Prioritizing Database I/O

Running each instance of a database server inside its own dedicated virtual guest allows you to allocate resources to each database based on its priority. Consider the following example: a system is running two database servers inside two KVM guests. One of the databases is a high priority database and the other one a low priority database. When both database servers run simultaneously, the I/O throughput is decreased to accommodate requests from both databases equally; Figure 4.1, "I/O throughput without resource allocation" indicates this scenario - once the low priority database is started (around time 45), I/O throughput is the same for both database servers.

To prioritize the high priority database server, assign it to a cgroup with a high number of reserved I/O operations, and assign the low priority database server to a cgroup with a low number of reserved I/O operations. To achieve this, follow the steps in Procedure 4.1, "I/O throughput prioritization", all of which are performed on the host system.

Procedure 4.1. I/O throughput prioritization

1. Attach the blkio subsystem to the /cgroup/blkio cgroup:

   ~]# mkdir /cgroup/blkio
   ~]# mount -t cgroup -o blkio blkio /cgroup/blkio

2. Create a high and a low priority cgroup:

   ~]# mkdir /cgroup/blkio/high_prio
   ~]# mkdir /cgroup/blkio/low_prio

3. Acquire the PIDs of the processes that represent both virtual guests (in which the database servers are running) and move them to their specific cgroups. In our example, VM_high represents a virtual guest running a high priority database server, and VM_low represents a virtual guest running a low priority database server.
   For example:

   ~]# ps -eLf | grep qemu | grep VM_high | awk '{print $4}' | while read pid; do echo $pid >> /cgroup/blkio/high_prio/tasks; done
   ~]# ps -eLf | grep qemu | grep VM_low | awk '{print $4}' | while read pid; do echo $pid >> /cgroup/blkio/low_prio/tasks; done

4. Set a ratio of 10:1 for the high_prio and low_prio cgroups. Processes in those cgroups (that is, processes running the virtual guests that were added to those cgroups in the previous step) immediately use only the resources made available to them:

   ~]# echo 1000 > /cgroup/blkio/high_prio/blkio.weight
   ~]# echo 100 > /cgroup/blkio/low_prio/blkio.weight

   In our example, the low priority cgroup permits the low priority database server to use only about 10% of the I/O operations, whereas the high priority cgroup permits the high priority database server to use about 90% of the I/O operations.
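The thread-ID extraction in the pipeline above can be sketched in isolation. The following runs the same grep/awk stage against canned ps -eLf-style lines instead of a live system; the guest name and thread IDs are made up for illustration. Column 4 of ps -eLf output is the LWP (thread) ID, which is what gets written to the tasks file.

```shell
# Hypothetical helper: pull thread IDs (column 4 of ps -eLf output)
# from lines matching a given guest name.
extract_tids() {
    grep "$1" | awk '{print $4}'
}

# Canned input standing in for real ps -eLf output.
printf '%s\n' \
    'root 4631 1 4631 0 3 10:02 ? 00:00:19 qemu-kvm -name VM_high' \
    'root 4631 1 4633 0 3 10:02 ? 00:00:25 qemu-kvm -name VM_high' \
    'root 4631 1 4634 0 3 10:02 ? 00:00:21 qemu-kvm -name VM_high' \
    | extract_tids VM_high
# prints 4631, 4633, 4634 (one per line)
```

On a real host, each printed thread ID would be appended to the appropriate tasks file, exactly as the procedure's one-liners do.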
Figure 4.2, "I/O throughput with resource allocation" illustrates the outcome of limiting the low priority database and prioritizing the high priority database. As soon as the database servers are moved to their appropriate cgroups (around time 75), I/O throughput is divided between the two servers at a 10:1 ratio.

Alternatively, block device I/O throttling can be used for the low priority database to limit its number of read and write operations. For more information on the blkio subsystem, refer to Section 3.1, "blkio".

4.2. Prioritizing Network Traffic

When running multiple network-related services on a single server system, it is important to define network priorities among these services. Defining these priorities ensures that packets originating from certain services have a higher priority than packets originating from other services. For example, such priorities are useful when a server system simultaneously functions as an NFS and Samba server. The NFS traffic must be of high priority because users expect high throughput. The Samba traffic can be deprioritized to allow better performance of the NFS server.

The net_prio subsystem can be used to set network priorities for processes in cgroups. These priorities are then translated into Type of Service (TOS) bits and embedded into every packet. Follow the steps in Procedure 4.2, "Setting Network Priorities for File Sharing Services" to configure prioritization of two file sharing services (NFS and Samba).

Procedure 4.2.
Setting Network Priorities for File Sharing Services

1. Attach the net_prio subsystem to the /cgroup/net_prio cgroup:

   ~]# mkdir /cgroup/net_prio
   ~]# mount -t cgroup -o net_prio net_prio /cgroup/net_prio

2. Create two cgroups, one for each service:

   ~]# mkdir /cgroup/net_prio/nfs_high
   ~]# mkdir /cgroup/net_prio/samba_low

3. To automatically move the nfs services to the nfs_high cgroup, add the following line to the /etc/sysconfig/nfs file:

   CGROUP_DAEMON="net_prio:nfs_high"

   The smbd daemon does not have a configuration file in the /etc/sysconfig directory. To automatically move the smbd daemon to the samba_low cgroup, add the following line to the /etc/cgrules.conf file:

   *:smbd net_prio samba_low

   Note that this rule moves every smbd daemon, not only /usr/sbin/smbd, into the samba_low cgroup. You can define rules for the nmbd and winbindd daemons to be moved to the samba_low cgroup in a similar way.

4. Start the cgred service to load the configuration from the previous step:

   ~]# service cgred start
   Starting CGroup Rules Engine Daemon:                       [  OK  ]

5. For the purposes of this example, assume both services use the eth1 network interface. Define network priorities for each cgroup, where 1 denotes low priority and 10 denotes high priority:

   ~]# echo "eth1 1" > /cgroup/net_prio/samba_low/net_prio.ifpriomap
   ~]# echo "eth1 10" > /cgroup/net_prio/nfs_high/net_prio.ifpriomap

6. Start the nfs and smb services and check whether their processes have been moved into the correct cgroups:

   ~]# service smb start
   Starting SMB services:                                     [  OK  ]
   ~]# cat /cgroup/net_prio/samba_low/tasks
   16122
   16124
   ~]# service nfs start
   Starting NFS services:                                     [  OK  ]
   Starting NFS quotas:                                       [  OK  ]
   Starting NFS mountd:                                       [  OK  ]
   Stopping RPC idmapd:                                       [  OK  ]
   Starting RPC idmapd:                                       [  OK  ]
   Starting NFS daemon:                                       [  OK  ]
   ~]# cat /cgroup/net_prio/nfs_high/tasks
   16321
   16325
   16376

Network traffic originating from NFS now has a higher priority than traffic originating from Samba.
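As a quick sanity check (not part of the original procedure), the cgroup a process landed in can also be read from /proc/<pid>/cgroup. The sketch below parses canned lines in that format rather than reading from a live PID; the paths shown are the ones from this example.

```shell
# Hypothetical check: given /proc/<pid>/cgroup-style contents, print
# the cgroup path for a named subsystem. Each line has the form
# hierarchy-ID:subsystem(s):cgroup-path.
cgroup_of() {
    awk -F: -v subsys="$1" '$2 == subsys {print $3}'
}

# Canned input standing in for the contents of /proc/<pid>/cgroup.
printf '%s\n' \
    '4:net_prio:/nfs_high' \
    '3:memory:/' \
    '2:cpu,cpuacct:/' \
    | cgroup_of net_prio
# prints /nfs_high
```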
4.3. Per-group Division of CPU and Memory Resources

When a large number of users use a single system, it is practical to provide certain users with more resources than others. Consider the following example: in a hypothetical company, there are three departments - finance, sales, and engineering. Because engineers use the system and its resources for more tasks than the other departments, it is logical that they have more resources available in case all departments are running CPU and memory intensive tasks.

Cgroups provide a way to limit resources for each system group of users. For this example, assume that the following users have been created on the system:

~]$ grep home /etc/passwd
martin:x:500:500::/home/martin:/bin/bash
john:x:501:501::/home/john:/bin/bash
mark:x:502:502::/home/mark:/bin/bash
peter:x:503:503::/home/peter:/bin/bash
jenn:x:504:504::/home/jenn:/bin/bash
mike:x:505:505::/home/mike:/bin/bash

These users have been assigned to the following system groups:

~]$ grep -e "50[678]" /etc/group
finance:x:506:jenn,john
sales:x:507:mark,martin
engineering:x:508:peter,mike

For this example to work properly, you must have the libcgroup package installed. Using the /etc/cgconfig.conf and /etc/cgrules.conf files, you can create a hierarchy and a set of rules that determine the amount of resources for each user. To achieve this, follow the steps in Procedure 4.3, "Per-group CPU and memory resource management".

Procedure 4.3.
Per-group CPU and memory resource management

1. In the /etc/cgconfig.conf file, configure the following subsystems to be mounted and cgroups to be created:

   mount {
       cpu     = /cgroup/cpu_and_mem;
       cpuacct = /cgroup/cpu_and_mem;
       memory  = /cgroup/cpu_and_mem;
   }

   group finance {
       cpu {
           cpu.shares="250";
       }
       cpuacct {
           cpuacct.usage="0";
       }
       memory {
           memory.limit_in_bytes="2G";
           memory.memsw.limit_in_bytes="3G";
       }
   }

   group sales {
       cpu {
           cpu.shares="250";
       }
       cpuacct {
           cpuacct.usage="0";
       }
       memory {
           memory.limit_in_bytes="4G";
           memory.memsw.limit_in_bytes="6G";
       }
   }

   group engineering {
       cpu {
           cpu.shares="500";
       }
       cpuacct {
           cpuacct.usage="0";
       }
       memory {
           memory.limit_in_bytes="8G";
           memory.memsw.limit_in_bytes="16G";
       }
   }

   When loaded, the above configuration file mounts the cpu, cpuacct, and memory subsystems to a single cpu_and_mem cgroup. For more information on these subsystems, refer to Chapter 3, Subsystems and Tunable Parameters. Next, it creates a hierarchy in cpu_and_mem that contains three cgroups: sales, finance, and engineering. In each of these cgroups, custom parameters are set for each subsystem:

   cpu - the cpu.shares parameter determines the share of CPU resources available to each process in all cgroups. Setting the parameter to 250, 250, and 500 in the finance, sales, and engineering cgroups respectively means that processes started in these groups split the resources with a 1:1:2 ratio. Note that when a single process is running, it consumes as much CPU as necessary no matter which cgroup it is placed in. The CPU limitation only comes into effect when two or more processes compete for CPU resources.
cpuacct - the cpuacct.usage="0" parameter is used to reset values stored in the cpuacct.usage and cpuacct.usage_percpu files. These files report total CPU time (in nanoseconds) consumed by all processes in a cgroup.
memory - the memory.limit_in_bytes parameter represents the amount of memory that is made available to all processes within a certain cgroup. In our example, processes started in the finance cgroup have 2 GB of memory available, processes in the sales group have 4 GB of memory available, and processes in the engineering group have 8 GB of memory available. The memory.memsw.limit_in_bytes parameter specifies the total amount of memory and swap space processes may use. Should a process in the finance cgroup hit the 2 GB memory limit, it is allowed to use another 1 GB of swap space, thus totaling the configured 3 GB.
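The figures quoted above follow from simple arithmetic on the configured values. The sketch below (an illustration, not part of the procedure) recomputes the CPU split implied by cpu.shares under full contention and the swap allowance implied by the difference between the memsw and memory limits:

```shell
# Recompute the per-department figures from the example configuration:
# cpu.shares of 250, 250, 500 and (memory, memsw) limits in GB.
total_shares=$((250 + 250 + 500))
for grp in 'finance 250 2 3' 'sales 250 4 6' 'engineering 500 8 16'; do
    set -- $grp   # $1=name $2=cpu.shares $3=memory limit GB $4=memsw limit GB
    echo "$1: cpu $((100 * $2 / total_shares))%, swap allowance $(($4 - $3)) GB"
done
# prints:
#   finance: cpu 25%, swap allowance 1 GB
#   sales: cpu 25%, swap allowance 2 GB
#   engineering: cpu 50%, swap allowance 8 GB
```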
2. To define the rules which the cgrulesengd daemon uses to move processes to specific cgroups, configure the /etc/cgrules.conf file in the following way:

   #<user/group>         <controller(s)>         <cgroup>
   @finance              cpu,cpuacct,memory      finance
   @sales                cpu,cpuacct,memory      sales
   @engineering          cpu,cpuacct,memory      engineering

   The above configuration creates rules that assign a specific system group (for example, @finance) the resource controllers it may use (for example, cpu, cpuacct, memory) and a cgroup (for example, finance) which contains all processes originating from that system group. In our example, when the cgrulesengd daemon, started via the service cgred start command, detects a process that is started by a user that belongs to the finance system group (for example, jenn), that process is automatically moved to the /cgroup/cpu_and_mem/finance/tasks file and is subjected to the resource limitations set in the finance cgroup.

3. Start the cgconfig service to create the hierarchy of cgroups and set the needed parameters in all created cgroups:

   ~]# service cgconfig start
   Starting cgconfig service:                                 [  OK  ]

4. Start the cgred service to let the cgrulesengd daemon detect any processes started in the system groups configured in the /etc/cgrules.conf file:

   ~]# service cgred start
   Starting CGroup Rules Engine Daemon:                       [  OK  ]

   Note that cgred is the name of the service that starts the cgrulesengd daemon.

5. To make all of the changes above persistent across reboots, configure the cgconfig and cgred services to be started by default:

   ~]# chkconfig cgconfig on
   ~]# chkconfig cgred on
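How a rule resolves for a given system group can be illustrated with a toy matcher over cgrules.conf-style lines. The lookup function below is hypothetical, written only to mirror the table from this example; it is not part of libcgroup.

```shell
# cgrules.conf-style rules from the example above:
# <user/group> <controller(s)> <cgroup>
rules='@finance      cpu,cpuacct,memory      finance
@sales        cpu,cpuacct,memory      sales
@engineering  cpu,cpuacct,memory      engineering'

# Hypothetical lookup: destination cgroup for a system group name.
cgroup_for() {
    printf '%s\n' "$rules" | awk -v g="@$1" '$1 == g {print $3}'
}

cgroup_for engineering
# prints engineering
```

In the real setup this matching is performed by cgrulesengd, which then writes the process's PID into the matching cgroup's tasks file.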
6. To test whether this setup works, execute a CPU or memory intensive process and observe the results, for example, using the top utility. To test the CPU resource management, execute the following dd command under each user:

   ~]$ dd if=/dev/zero of=/dev/null bs=1024k

   The above command reads from /dev/zero and writes the output to /dev/null in chunks of 1024 KB. When the top utility is launched, you can see results similar to these:

     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    8201 peter     20   0  103m 1676  556 R 24.9  0.2  0:04.18 dd
    8202 mike      20   0  103m 1672  556 R 24.9  0.2  0:03.47 dd
    8199 jenn      20   0  103m 1676  556 R 12.6  0.2  0:02.87 dd
    8200 john      20   0  103m 1676  556 R 12.6  0.2  0:02.20 dd
    8197 martin    20   0  103m 1672  556 R 12.6  0.2  0:05.56 dd
    8198 mark      20   0  103m 1672  556 R 12.3  0.2  0:04.28 dd
    ⋮

   All processes have been correctly assigned to their cgroups and are only allowed to consume the CPU resources made available to them. If all but two processes are stopped, leaving one process in the finance cgroup and one in the engineering cgroup, the remaining resources are split between both processes at a 2:1 ratio in favor of engineering, as dictated by their cpu.shares settings (500 versus 250):

     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    8202 mike      20   0  103m 1676  556 R 66.4  0.2  0:06.35 dd
    8200 john      20   0  103m 1672  556 R 33.2  0.2  0:05.08 dd
    ⋮

Alternative method

Because the cgrulesengd daemon moves a process to a cgroup only after the appropriate conditions set by the rules in /etc/cgrules.conf have been fulfilled, that process may run for a few milliseconds in an incorrect cgroup. An alternative way to move processes to their specified cgroups is to use the pam_cgroup.so PAM module. This module moves processes to available cgroups according to the rules defined in the /etc/cgrules.conf file. Follow the steps in Procedure 4.4, "Using a PAM module to move processes to cgroups" to configure the pam_cgroup.so PAM module.

Procedure 4.4.
Using a PAM module to move processes to cgroups

1. Install the libcgroup-pam package from the optional Red Hat Enterprise Linux Yum repository:

   ~]# yum install libcgroup-pam --enablerepo=rhel-6-server-optional-rpms

2. Ensure that the PAM module has been installed and exists:

   ~]# ls /lib64/security/pam_cgroup.so
   /lib64/security/pam_cgroup.so

   Note that on 32-bit systems, the module is placed in the /lib/security directory.

3. Add the following line to the /etc/pam.d/su file to use the pam_cgroup.so module each time the su command is executed:

   session         optional        pam_cgroup.so

4. Log out all users that are affected by the cgroup settings in the /etc/cgrules.conf file to apply the above configuration.
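A small, assumed verification step (not in the original procedure): confirm that the session line is actually present in the PAM configuration. The sketch below checks canned pam.d-style input standing in for /etc/pam.d/su.

```shell
# Hypothetical check: does a pam.d-style stream contain the
# pam_cgroup.so session line added in the previous step?
has_pam_cgroup() {
    grep -Eq '^session[[:space:]]+optional[[:space:]]+pam_cgroup\.so'
}

# Canned input standing in for /etc/pam.d/su.
printf '%s\n' \
    'auth      sufficient  pam_rootok.so' \
    'session   optional    pam_cgroup.so' \
    | has_pam_cgroup && echo configured
# prints configured
```

On a real system, the same function could be fed the file directly, for example: has_pam_cgroup < /etc/pam.d/su.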
When using the pam_cgroup.so PAM module, you may disable the cgred service.

Revision History
Revision 1.0-18    Thu Feb 21 2013    Martin Prpič
    Red Hat Enterprise Linux 6.4 GA release of the Resource Management Guide. Includes various fixes and new content:
    - Final use case scenarios - 584631
    - Documentation for the perf_event controller - 807326
    - Documentation for common cgroup files - 807329
    - Documentation for OOM control and the notification API - 822400, 822401
    - CPU ceiling enforcement documentation - 828991

Revision 1.0-7     Wed Jun 20 2012    Martin Prpič
    Red Hat Enterprise Linux 6.3 GA release of the Resource Management Guide.
    - Added two use cases.
    - Added documentation for the net_prio subsystem.

Revision 1.0-6     Tue Dec 6 2011     Martin Prpič
    Red Hat Enterprise Linux 6.2 GA release of the Resource Management Guide.

Revision 1.0-5     Thu May 19 2011    Martin Prpič
    Red Hat Enterprise Linux 6.1 GA release of the Resource Management Guide.

Revision 1.0-4     Tue Mar 1 2011     Martin Prpič

Revision 1.0-3     Wed Nov 17 2010    Rüdiger Landmann

Revision 1.0-2     Thu Nov 11 2010    Rüdiger Landmann
    Remove pre-release feedback instructions

Revision 1.0-1     Wed Nov 10 2010    Rüdiger Landmann

Revision 1.0-0     Tue Nov 9 2010     Rüdiger Landmann
    Feature-complete version for GA