Power Management Guide

Chapter 3. Core Infrastructure and Mechanics

Important

To use the cpupower command featured in this chapter, ensure you have the cpupowerutils package installed.

3.1. CPU Idle States

CPUs with the x86 architecture support various states in which parts of the CPU are deactivated or run at lower performance settings. These states, known as C-states, allow systems to save power by partially deactivating CPUs that are not in use. C-states are numbered from C0 upwards, with higher numbers representing decreased CPU functionality and greater power saving. C-states of a given number are broadly similar across processors, although the exact feature set of a state may vary between processor families. C-states 0-3 are defined as follows:
C0
the operating or running state. In this state, the CPU is working and not idle at all.
C1, Halt
a state where the processor is not executing any instructions but is typically not in a lower power state. The CPU can continue processing with practically no delay. All processors offering C-states need to support this state. Pentium 4 processors support an enhanced C1 state called C1E, which does lower power consumption.
C2, Stop-Clock
a state where the clock is frozen for this processor but it keeps the complete state for its registers and caches, so after starting the clock again it can immediately start processing again. This is an optional state.
C3, Sleep
a state where the processor really goes to sleep and does not need to keep its cache up to date. Because of this, waking up from this state takes considerably longer than from C2. This is also an optional state.
To view available idle states and other statistics for the CPUidle driver, run the following command:
cpupower idle-info
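Where cpupower is not at hand, similar information can usually be read from sysfs. The following is a minimal sketch, assuming the kernel exposes the CPUidle driver under /sys/devices/system/cpu/cpu0/cpuidle with name, latency, and usage attributes per state; the function name list_idle_states is hypothetical and not part of any package.

```shell
#!/bin/sh
# List the idle states the kernel exposes for one CPU.
# Usage: list_idle_states [cpuidle-dir]
# The default directory is the usual sysfs location for cpu0 (an assumption
# about your system; pass another directory to inspect a different CPU).
list_idle_states() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for state in "$dir"/state*; do
        # Skip silently when the glob matched nothing.
        [ -d "$state" ] || continue
        name=$(cat "$state/name")
        latency=$(cat "$state/latency")   # exit latency in microseconds
        usage=$(cat "$state/usage")       # number of times the state was entered
        printf '%s latency=%sus entered=%s\n' "$name" "$latency" "$usage"
    done
}

list_idle_states
```

On a system without CPUidle support the function prints nothing.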
Recent Intel CPUs with the "Nehalem" microarchitecture feature a new C-state, C6, which can reduce the voltage supply of a CPU to zero, but typically reduces power consumption by between 80% and 90%. The kernel in Red Hat Enterprise Linux 6 includes optimizations for this new C-state.

3.2. Using CPUfreq Governors

One of the most effective ways to reduce power consumption and heat output on your system is to use CPUfreq. CPUfreq - also referred to as CPU speed scaling - allows the clock speed of the processor to be adjusted on the fly. This enables the system to run at a reduced clock speed to save power. The rules for shifting frequencies, whether to a faster or slower clock speed, and when to shift frequencies, are defined by the CPUfreq governor.
The governor defines the power characteristics of the system CPU, which in turn affects CPU performance. Each governor has its own unique behavior, purpose, and suitability in terms of workload. This section describes how to choose and configure a CPUfreq governor, the characteristics of each governor, and what kind of workload each governor is suitable for.

3.2.1. CPUfreq Governor Types

This section lists and describes the different types of CPUfreq governors available in Red Hat Enterprise Linux 6.
cpufreq_performance
The Performance governor forces the CPU to use the highest possible clock frequency. This frequency will be statically set, and will not change. As such, this particular governor offers no power saving benefit. It is only suitable for hours of heavy workload, and even then only during times wherein the CPU is rarely (or never) idle.
cpufreq_powersave
By contrast, the Powersave governor forces the CPU to use the lowest possible clock frequency. This frequency will be statically set, and will not change. As such, this particular governor offers maximum power savings, but at the cost of the lowest CPU performance.
The term "powersave" can sometimes be deceiving, though, since (in principle) a slow CPU on full load consumes more power than a fast CPU that is not loaded. As such, while it may be advisable to set the CPU to use the Powersave governor during times of expected low activity, any unexpected high loads during that time can cause the system to actually consume more power.
The Powersave governor is, in simple terms, more of a "speed limiter" for the CPU than a "power saver". It is most useful in systems and environments where overheating can be a problem.
cpufreq_ondemand
The Ondemand governor is a dynamic governor that allows the CPU to achieve maximum clock frequency when system load is high, and also minimum clock frequency when the system is idle. While this allows the system to adjust power consumption accordingly with respect to system load, it does so at the expense of latency between frequency switching. As such, latency can offset any performance/power saving benefits offered by the Ondemand governor if the system switches between idle and heavy workloads too often.
For most systems, the Ondemand governor can provide the best compromise between heat emission, power consumption, performance, and manageability. When the system is only busy at specific times of the day, the Ondemand governor will automatically switch between maximum and minimum frequency depending on the load without any further intervention.
cpufreq_userspace
The Userspace governor allows userspace programs (or any process running as root) to set the frequency. This governor is normally used in conjunction with the cpuspeed daemon. Of all the governors, Userspace is the most customizable; and depending on how it is configured, it can offer the best balance between performance and consumption for your system.
cpufreq_conservative
Like the Ondemand governor, the Conservative governor adjusts the clock frequency according to usage. However, while the Ondemand governor does so in a more aggressive manner (that is, from maximum to minimum and back), the Conservative governor switches between frequencies more gradually.
This means that the Conservative governor will adjust to a clock frequency that it deems fitting for the load, rather than simply choosing between maximum and minimum. While this can possibly provide significant savings in power consumption, it does so at an even greater latency than the Ondemand governor.

Note

You can enable a governor using cron jobs. This allows you to automatically set specific governors during specific times of the day. As such, you can specify a low-frequency governor during idle times (for example after work hours) and return to a higher-frequency governor during hours of heavy workload.
For instructions on how to enable a specific governor, refer to Procedure 3.2, "Enabling a CPUfreq Governor" in Section 3.2.2, "CPUfreq Setup".
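As an illustration of the cron-based approach, the sketch below chooses a governor from the hour of the day. The function name governor_for_hour and the 08:00 to 18:00 working window are assumptions made for this example; substitute the hours and governors that suit your workload.

```shell
#!/bin/sh
# Pick a governor for a given hour of the day (0-23).
# The 08:00-18:00 working window is an illustrative assumption.
governor_for_hour() {
    hour=${1#0}        # drop one leading zero so "08" is not read as octal
    hour=${hour:-0}    # an empty result means the hour was 0
    if [ "$hour" -ge 8 ] && [ "$hour" -lt 18 ]; then
        echo ondemand
    else
        echo powersave
    fi
}

# Show the governor that would be chosen right now.
governor_for_hour "$(date +%H)"

# In a script that cron runs hourly as root, the result could be applied with:
#   cpupower frequency-set --governor "$(governor_for_hour "$(date +%H)")"
```

Keeping the date call inside a script, rather than in the crontab entry itself, avoids having to escape the % character, which cron treats specially.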

3.2.2. CPUfreq Setup

Before selecting and configuring a CPUfreq governor, you need to add the appropriate CPUfreq driver first.

Procedure 3.1. How to Add a CPUfreq Driver

  1. Use the following command to view which CPUfreq drivers are available for your system:
    ls /lib/modules/[kernel version]/kernel/arch/[architecture]/kernel/cpu/cpufreq/
  2. Use modprobe to add the appropriate CPUfreq driver.
    modprobe [CPUfreq driver]
    When using the above command, be sure to remove the .ko filename suffix.

    Important

    When choosing an appropriate CPUfreq driver, always choose acpi-cpufreq over p4-clockmod. While using the p4-clockmod driver reduces the clock frequency of a CPU, it does not reduce the voltage. acpi-cpufreq, on the other hand, reduces voltage along with CPU clock frequency, allowing less power consumption and heat output for each unit reduction in performance.
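The two steps of the procedure can be combined into a short helper that lists driver names with the .ko suffix already removed. This is a sketch only: list_cpufreq_modules is a hypothetical name, and the x86 architecture directory in the example call is an assumption that you should replace with the directory that exists on your system.

```shell
#!/bin/sh
# Print the module names found in a CPUfreq driver directory, one per line,
# with the .ko suffix removed so each name can be passed to modprobe.
list_cpufreq_modules() {
    dir=$1
    for module in "$dir"/*.ko; do
        [ -e "$module" ] || continue    # directory empty or missing
        basename "$module" .ko
    done
}

# Example call for the running kernel; "x86" is an assumed architecture directory.
list_cpufreq_modules "/lib/modules/$(uname -r)/kernel/arch/x86/kernel/cpu/cpufreq"
```

Any name printed this way can be passed directly to modprobe, for example modprobe acpi-cpufreq.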
You can also view which governors are available for use for a specific CPU using:
cpupower frequency-info --governors
Some CPUfreq governors may not be available for you to use. In this case, use modprobe to add the necessary kernel modules that enable the specific CPUfreq governor you wish to use. These kernel modules are available in /lib/modules/[kernel version]/kernel/drivers/cpufreq/.

Procedure 3.2. Enabling a CPUfreq Governor

  1. If a specific governor is not listed as available for your CPU, use modprobe to enable the governor you wish to use. For example, if the ondemand governor is not available for your CPU, use the following command:
    modprobe cpufreq_ondemand
  2. Once a governor is listed as available for your CPU, you can enable it using:
    cpupower frequency-set --governor [governor]
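The check in step 1 can also be scripted. The sketch below assumes that the available governors for cpu0 are listed, separated by spaces, in the sysfs file scaling_available_governors; the function name governor_available is hypothetical.

```shell
#!/bin/sh
# Succeed if governor $2 is named in the space-separated list stored in file $1.
governor_available() {
    file=$1
    wanted=$2
    [ -r "$file" ] || return 1
    for gov in $(cat "$file"); do
        [ "$gov" = "$wanted" ] && return 0
    done
    return 1
}

# Usual sysfs location for cpu0 (an assumption; adjust for other CPUs).
list=/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
if governor_available "$list" ondemand; then
    echo "ondemand is available"
else
    echo "ondemand is not listed; try: modprobe cpufreq_ondemand"
fi
```

The comparison is an exact match on each word, so a partial name such as "power" does not match "powersave".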

3.2.3. Tuning CPUfreq Policy and Speed

Once you have chosen an appropriate CPUfreq governor, you can view CPU speed and policy information with the cpupower frequency-info command and further tune the speed of each CPU with options for cpupower frequency-set.
For cpupower frequency-info, the following options are available:
  • --freq - Shows the current speed of the CPU according to the CPUfreq core, in KHz.
  • --hwfreq - Shows the current speed of the CPU according to the hardware, in KHz (only available as root).
  • --driver - Shows what CPUfreq driver is used to set the frequency on this CPU.
  • --governors - Shows the CPUfreq governors available in this kernel. If you wish to use a CPUfreq governor that is not listed in this output, refer to Procedure 3.2, "Enabling a CPUfreq Governor" in Section 3.2.2, "CPUfreq Setup" for instructions on how to do so.
  • --affected-cpus - Lists CPUs that require frequency coordination software.
  • --policy - Shows the range of the current CPUfreq policy, in KHz, and the currently active governor.
  • --hwlimits - Lists available frequencies for the CPU, in KHz.
For cpupower frequency-set, the following options are available:
  • --min <freq> and --max <freq> - Set the policy limits of the CPU, in KHz.

    Important

    When setting policy limits, you should set --max before --min.
  • --freq <freq> - Set a specific clock speed for the CPU, in KHz. You can only set a speed within the policy limits of the CPU (as per --min and --max).
  • --governor <gov> - Set a new CPUfreq governor.

Note

If you do not have the cpupowerutils package installed, CPUfreq settings can be viewed in the tunables found in /sys/devices/system/cpu/[cpuid]/cpufreq/. Settings and values can be changed by writing to these tunables. Values are given in KHz. For example, to set the minimum clock speed of cpu0 to 360 MHz (360000 KHz), use:
echo 360000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
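The ordering advice for policy limits applies to the sysfs tunables as well. The following sketch writes the maximum before the minimum and refuses a minimum that is higher than the maximum. The function name set_policy_limits and the example frequencies are hypothetical; the tunable names scaling_max_freq and scaling_min_freq are the ones found in the cpufreq directory.

```shell
#!/bin/sh
# Write new policy limits (in KHz) into a cpufreq directory, maximum first,
# following the ordering advice above. Refuses a minimum above the maximum.
set_policy_limits() {
    dir=$1
    min_khz=$2
    max_khz=$3
    if [ "$min_khz" -gt "$max_khz" ]; then
        echo "minimum $min_khz KHz exceeds maximum $max_khz KHz" >&2
        return 1
    fi
    echo "$max_khz" > "$dir/scaling_max_freq" || return 1
    echo "$min_khz" > "$dir/scaling_min_freq" || return 1
}

# Example (run as root; the frequencies are illustrative and must lie
# within the hardware limits of your CPU):
#   set_policy_limits /sys/devices/system/cpu/cpu0/cpufreq 800000 1600000
```

The example call is left as a comment because it changes system settings.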

3.3. CPU Monitors

cpupower features a selection of monitors that provide idle and sleep state statistics and frequency information and report on processor topology. Some monitors are processor-specific, while others are compatible with any processor. Refer to the cpupower-monitor man page for details on what each monitor measures and which systems they are compatible with.
Use the following options with the cpupower monitor command:
  • -l - list all monitors available on your system.
  • -m <monitor1>, <monitor2> - display specific monitors. Their identifiers can be found by running -l.
  • command - display the idle statistics and CPU demands of a specific command.

3.4. CPU Power Saving Policies

cpupower provides ways to regulate your processor's power saving policies.
Use the following options with the cpupower set command:
--perf-bias <0-15>
Allows software on supported Intel processors to more actively contribute to determining the balance between optimum performance and saving power. This does not override other power saving policies. Assigned values range from 0 to 15, where 0 is optimum performance and 15 is optimum power efficiency.
By default, this option applies to all cores. To apply it only to individual cores, add the --cpu <cpulist> option.
--sched-mc <0|1|2>
Restricts the use of power by system processes to the cores in one CPU package before other CPU packages are drawn from. 0 sets no restrictions, 1 initially employs only a single CPU package, and 2 does this in addition to favouring semi-idle CPU packages for handling task wakeups.
--sched-smt <0|1|2>
Restricts the use of power by system processes to the thread siblings of one CPU core before drawing on other cores. 0 sets no restrictions, 1 initially employs only a single CPU package, and 2 does this in addition to favouring semi-idle CPU packages for handling task wakeups.

3.5. Suspend and Resume

When a system is suspended, the kernel calls on drivers to store their states and then unloads them. When the system is resumed, it reloads these drivers, which attempt to reprogram their devices. The drivers' ability to accomplish this task determines whether the system can be resumed successfully.
Video drivers are particularly problematic in this regard, because the Advanced Configuration and Power Interface (ACPI) specification does not require system firmware to be able to reprogram video hardware. Therefore, unless video drivers are able to program hardware from a completely uninitialized state, they may prevent the system from resuming.
Red Hat Enterprise Linux 6 includes greater support for new graphics chipsets, which ensures that suspend and resume will work on a greater number of platforms. Support for NVIDIA chipsets has been greatly improved, in particular for the GeForce 8800 series.

3.6. Tickless Kernel

Previously, the Linux kernel periodically interrupted each CPU on a system at a predetermined frequency - 100 Hz, 250 Hz, or 1000 Hz, depending on the platform. The kernel queried the CPU about the processes that it was executing, and used the results for process accounting and load balancing. The kernel performed this interrupt, known as the timer tick, regardless of the power state of the CPU. Therefore, even an idle CPU was responding to up to 1000 of these requests every second. On systems that implemented power saving measures for idle CPUs, the timer tick prevented the CPU from remaining idle long enough for the system to benefit from these power savings.
The kernel in Red Hat Enterprise Linux 6 runs tickless: that is, it replaces the old periodic timer interrupts with on-demand interrupts. Therefore, idle CPUs are allowed to remain idle until a new task is queued for processing, and CPUs that have entered lower power states can remain in these states longer.

3.7. Active-State Power Management

Active-State Power Management (ASPM) saves power in the Peripheral Component Interconnect Express (PCI Express or PCIe) subsystem by setting a lower power state for PCIe links when the devices to which they connect are not in use. ASPM controls the power state at both ends of the link, and saves power in the link even when the device at the end of the link is in a fully powered-on state.
When ASPM is enabled, device latency increases because of the time required to transition the link between different power states. ASPM has three policies to determine power states:
default
sets PCIe link power states according to the defaults specified by the firmware on the system (for example, BIOS). This is the default state for ASPM.
powersave
sets ASPM to save power wherever possible, regardless of the cost to performance.
performance
disables ASPM to allow PCIe links to operate with maximum performance.
ASPM support can be enabled or disabled by the pcie_aspm kernel parameter, where pcie_aspm=off disables ASPM and pcie_aspm=force enables ASPM, even on devices that do not support ASPM.
ASPM policies are set in /sys/module/pcie_aspm/parameters/policy, but can be also specified at boot time with the pcie_aspm.policy kernel parameter, where, for example, pcie_aspm.policy=performance will set the ASPM performance policy.
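To see which policy is currently in effect, you can read the policy file. The sketch below assumes that the file lists all policies on one line and marks the active one with square brackets, for example "default performance [powersave]"; the function name active_aspm_policy is hypothetical.

```shell
#!/bin/sh
# Extract the active entry from a policy string such as
# "default performance [powersave]", where brackets mark the current choice.
active_aspm_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    active_aspm_policy "$(cat "$policy_file")"
fi

# To change the policy at run time (as root), write its name to the file:
#   echo powersave > /sys/module/pcie_aspm/parameters/policy
```

If the file is absent, ASPM support is probably not built into the running kernel, and the script prints nothing.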

Warning - pcie_aspm=force can cause systems to stop responding

If pcie_aspm=force is set, hardware that does not support ASPM can cause the system to stop responding. Before setting pcie_aspm=force, ensure that all PCIe hardware on the system supports ASPM.

3.8. Aggressive Link Power Management

Aggressive Link Power Management (ALPM) is a power-saving technique that helps the disk save power by setting a SATA link to the disk to a low-power setting during idle time (that is, when there is no I/O). ALPM automatically sets the SATA link back to an active power state once I/O requests are queued to that link.
Power savings introduced by ALPM come at the expense of disk latency. As such, you should only use ALPM if you expect the system to experience long periods of idle I/O time.
ALPM is only available on SATA controllers that use the Advanced Host Controller Interface (AHCI). For more information about AHCI, refer to http://www.intel.com/technology/serialata/ahci.htm.
When available, ALPM is enabled by default. ALPM has three modes:
min_power
This mode sets the link to its lowest power state (SLUMBER) when there is no I/O on the disk. This mode is useful for times when an extended period of idle time is expected.
medium_power
This mode sets the link to the second lowest power state (PARTIAL) when there is no I/O on the disk. This mode is designed to allow transitions in link power states (for example during times of intermittent heavy I/O and idle I/O) with as small impact on performance as possible.
medium_power mode allows the link to transition between PARTIAL and fully-powered (that is, "ACTIVE") states, depending on the load. Note that it is not possible to transition a link directly from PARTIAL to SLUMBER or back; neither power state can transition to the other without transitioning through the ACTIVE state first.
max_performance
ALPM is disabled; the link does not enter any low-power state when there is no I/O on the disk.
To check whether your SATA host adapters actually support ALPM you can check if the file /sys/class/scsi_host/host*/link_power_management_policy exists. To change the settings simply write the values described in this section to these files or display the files to check for the current setting.
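The check described above can be written as a small loop over all SCSI hosts. This is a sketch under the assumption that the hosts appear under /sys/class/scsi_host; the function name show_alpm_policies is hypothetical.

```shell
#!/bin/sh
# Print the ALPM policy of every SCSI host under the given sysfs class
# directory that exposes a link_power_management_policy file.
show_alpm_policies() {
    base=${1:-/sys/class/scsi_host}
    for host in "$base"/host*; do
        file=$host/link_power_management_policy
        [ -r "$file" ] || continue      # host without ALPM support
        printf '%s: %s\n' "$(basename "$host")" "$(cat "$file")"
    done
}

show_alpm_policies

# To change a policy (as root), write one of the mode names to the file:
#   echo min_power > /sys/class/scsi_host/host0/link_power_management_policy
```

Hosts that do not appear in the output do not expose the file and therefore do not support ALPM.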

Important - some settings disable hot plugging

Setting ALPM to min_power or medium_power will automatically disable the "Hot Plug" feature.

3.9. Relatime Drive Access Optimization

The POSIX standard requires that operating systems maintain file system metadata that records when each file was last accessed. This timestamp is called atime, and maintaining it requires a constant series of write operations to storage. These writes keep storage devices and their links busy and powered up. Since few applications make use of the atime data, this storage device activity wastes power. Significantly, the write to storage would occur even if the file was not read from storage, but from cache. For some time, the Linux kernel has supported a noatime option for mount and would not write atime data to file systems mounted with this option. However, simply turning off this feature is problematic because some applications rely on atime data and will fail if it is not available.
The kernel used in Red Hat Enterprise Linux 6 supports another alternative - relatime. Relatime maintains atime data, but not for each time that a file is accessed. With this option enabled, atime data is written to the disk only if the file has been modified since the atime data was last updated (mtime), or if the file was last accessed more than a certain length of time ago (by default, one day).
By default, all file systems are now mounted with relatime enabled. You can suppress it for any particular file system by mounting that file system with the option norelatime.
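The relatime rule can be modelled in a few lines. The sketch below is a simplified illustration of the two conditions described in this section, not the kernel's implementation; the function name relatime_should_update is hypothetical, and the treatment of exactly equal timestamps is an assumption of the sketch.

```shell
#!/bin/sh
# Decide whether relatime would write a new atime, given timestamps in
# seconds since the epoch. Succeeds (exit status 0) when an update is due.
relatime_should_update() {
    atime=$1
    mtime=$2
    now=$3
    one_day=86400
    # The file was modified since atime was last updated.
    [ "$mtime" -ge "$atime" ] && return 0
    # The last recorded access is at least one day old.
    [ $((now - atime)) -ge "$one_day" ] && return 0
    return 1
}

# Example: a file last accessed at t=1000 and modified at t=2000 is due
# for an update when read at t=2500.
if relatime_should_update 1000 2000 2500; then
    echo "atime would be updated"
fi
```

A file that is read repeatedly without being modified therefore causes at most one atime write per day.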

3.10. Power Capping

Red Hat Enterprise Linux 6 supports the power capping features found in recent hardware, such as HP Dynamic Power Capping (DPC), and Intel Node Manager (NM) technology. Power capping allows administrators to limit the power consumed by servers, but it also allows managers to plan data centers more efficiently, because the risk of overloading existing power supplies is greatly diminished. Managers can place more servers within the same physical footprint and have confidence that if server power consumption is capped, the demand for power during heavy load will not exceed the power available.
HP Dynamic Power Capping
Dynamic Power Capping is a feature available on select ProLiant and BladeSystem servers that allows system administrators to cap the power consumption of a server or a group of servers. The cap is a definitive limit that the server will not exceed, regardless of its current workload. The cap has no effect until the server reaches its power consumption limit. At that point, a management processor adjusts CPU P-states and clock throttling to limit the power consumed.
Dynamic Power Capping modifies CPU behavior independently of the operating system. However, HP's integrated Lights-Out 2 (iLO2) firmware gives operating systems access to the management processor, so applications in user space can query it. The kernel used in Red Hat Enterprise Linux 6 includes a driver for HP iLO and iLO2 firmware, which allows programs to query management processors at /dev/hpilo/dXccbN. The kernel also includes an extension of the hwmon sysfs interface to support power capping features, and a hwmon driver for ACPI 4.0 power meters that use the sysfs interface. Together, these features allow the operating system and user-space tools to read the value configured for the power cap, together with the current power usage of the system.
For further details of HP Dynamic Power Capping, refer to HP Power Capping and HP Dynamic Power Capping for ProLiant Servers, available from http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01549455/c01549455.pdf
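As a rough illustration of reading these values from user space, the sketch below looks for hwmon devices that provide power1_average and power1_cap attributes and prints them in watts. It assumes the attributes follow the hwmon sysfs naming scheme and report microwatts; whether they are present, and in which directory, depends on the hardware and driver. The function names are hypothetical.

```shell
#!/bin/sh
# Convert a hwmon power reading from microwatts to watts (one decimal place).
microwatts_to_watts() {
    LC_ALL=C awk -v uw="$1" 'BEGIN { printf "%.1f\n", uw / 1000000 }'
}

# Report average power and configured cap for each hwmon device that
# provides both attributes, either directly or in its device subdirectory.
show_power_caps() {
    base=${1:-/sys/class/hwmon}
    for dev in "$base"/hwmon*; do
        for dir in "$dev" "$dev/device"; do
            [ -r "$dir/power1_average" ] && [ -r "$dir/power1_cap" ] || continue
            printf '%s: using %s W, cap %s W\n' "$(basename "$dev")" \
                "$(microwatts_to_watts "$(cat "$dir/power1_average")")" \
                "$(microwatts_to_watts "$(cat "$dir/power1_cap")")"
        done
    done
}

show_power_caps
```

On systems without a supported power meter the function prints nothing.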
Intel Node Manager
Intel Node Manager imposes a power cap on systems, using processor P-states and T-states to limit CPU performance and therefore power consumption. By setting a power management policy, administrators can configure systems to consume less power during times when system loads are low, for example, at night or on weekends.
Intel Node Manager adjusts CPU performance using Operating System-directed configuration and Power Management (OSPM) through the standard Advanced Configuration and Power Interface. When Intel Node Manager notifies the OSPM driver of changes to T-states, the driver makes corresponding changes to processor P-states. Similarly, when Intel Node Manager notifies the OSPM driver of changes to P-states, the driver changes T-states accordingly. These changes happen automatically and require no further input from the operating system. Administrators configure and monitor Intel Node Manager with Intel Data Center Manager (DCM) software.
For further details of Intel Node Manager, refer to Node Manager - A Dynamic Approach To Managing Power In The Data Center, available from http://communities.intel.com/docs/DOC-4766

3.11. Enhanced Graphics Power Management

Red Hat Enterprise Linux 6 saves power on graphics and display devices by eliminating several sources of unnecessary consumption.
LVDS reclocking
Low-voltage differential signaling (LVDS) is a system for carrying electronic signals over copper wire. One significant application of the system is to transmit pixel information to liquid crystal display (LCD) screens in notebook computers. All displays have a refresh rate - the rate at which they receive fresh data from a graphics controller and redraw the image on the screen. Typically, the screen receives fresh data sixty times per second (a frequency of 60 Hz). When a screen and graphics controller are linked by LVDS, the LVDS system uses power on every refresh cycle. When idle, the refresh rate of many LCD screens can be dropped to 30 Hz without any noticeable effect (unlike cathode ray tube (CRT) monitors, where a decrease in refresh rate produces a characteristic flicker). The driver for Intel graphics adapters built into the kernel used in Red Hat Enterprise Linux 6 performs this downclocking automatically, and saves around 0.5 W when the screen is idle.
Enabling memory self-refresh
Synchronous dynamic random access memory (SDRAM) - as used for video memory in graphics adapters - is recharged thousands of times per second so that individual memory cells retain the data that is stored in them. Apart from its main function of managing data as it flows in and out of memory, the memory controller is normally responsible for initiating these refresh cycles. However, SDRAM also has a low-power self-refresh mode. In this mode, the memory uses an internal timer to generate its own refresh cycles, which allows the system to shut down the memory controller without endangering data currently held in memory. The kernel used in Red Hat Enterprise Linux 6 can trigger memory self-refresh in Intel graphics adapters when they are idle, which saves around 0.8 W.
GPU clock reduction
Typical graphics processing units (GPUs) contain internal clocks that govern various parts of their internal circuitry. The kernel used in Red Hat Enterprise Linux 6 can reduce the frequency of some of the internal clocks in Intel and ATI GPUs. Reducing the number of cycles that GPU components perform in a given time saves the power that they would have consumed in the cycles that they did not have to perform. The kernel automatically reduces the speed of these clocks when the GPU is idle, and increases it when GPU activity increases. Reducing GPU clock cycles can save up to 5 W.
GPU powerdown
The Intel and ATI graphics drivers in Red Hat Enterprise Linux 6 can detect when no monitor is attached to an adapter and therefore shut down the GPU completely. This feature is especially significant for servers which do not have monitors attached to them regularly.

3.12. RFKill

Many computer systems contain radio transmitters, including Wi-Fi, Bluetooth, and 3G devices. These devices consume power, which is wasted when the device is not in use.
RFKill is a subsystem in the Linux kernel that provides an interface through which radio transmitters in a computer system can be queried, activated, and deactivated. When transmitters are deactivated, they can be placed in a state where software can reactivate them (a soft block) or where software cannot reactivate them (a hard block).
The RFKill core provides the application programming interface (API) for the subsystem. Kernel drivers that have been designed to support RFKill use this API to register with the kernel, and include methods for enabling and disabling the device. Additionally, the RFKill core provides notifications that user applications can interpret and ways for user applications to query transmitter states.
The RFKill interface is located at /dev/rfkill, which contains the current state of all radio transmitters on the system. Each device has its current RFKill state registered in sysfs. Additionally, RFKill issues uevents for each change of state in an RFKill-enabled device.
rfkill is a command-line tool with which you can query and change RFKill-enabled devices on the system. To obtain the tool, install the rfkill package.
Use the command rfkill list to obtain a list of devices, each of which has an index number associated with it, starting at 0. You can use this index number to tell rfkill to block or unblock a device, for example:
rfkill block 0
blocks the first RFKill-enabled device on the system.
You can also use rfkill to block certain categories of devices, or all RFKill-enabled devices. For example:
rfkill block wifi
blocks all Wi-Fi devices on the system. To block all RFKill-enabled devices, run:
rfkill block all
To unblock devices, run rfkill unblock instead of rfkill block. To obtain a full list of device categories that rfkill can block, run rfkill help.
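The state that each device registers in sysfs can also be read without the rfkill tool. The following sketch assumes that devices appear under /sys/class/rfkill with name, type, and state attributes, and that state uses the values 0 (soft blocked), 1 (unblocked), and 2 (hard blocked). The function names are hypothetical.

```shell
#!/bin/sh
# Translate the numeric value of the sysfs "state" attribute into words.
rfkill_state_name() {
    case $1 in
        0) echo "soft blocked" ;;
        1) echo "unblocked" ;;
        2) echo "hard blocked" ;;
        *) echo "unknown" ;;
    esac
}

# Print name, type, and state for each device under an rfkill class directory.
show_rfkill_devices() {
    base=${1:-/sys/class/rfkill}
    for dev in "$base"/rfkill*; do
        [ -r "$dev/state" ] || continue
        printf '%s (%s): %s\n' "$(cat "$dev/name")" "$(cat "$dev/type")" \
            "$(rfkill_state_name "$(cat "$dev/state")")"
    done
}

show_rfkill_devices
```

The number in each rfkillN directory name is assumed to correspond to the index that rfkill list reports.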

3.13. Optimizations in User Space

Reducing the amount of work performed by system hardware is fundamental to saving power. Therefore, although the changes described in Chapter 3, Core Infrastructure and Mechanics permit the system to operate in various states of reduced power consumption, applications in user space that request unnecessary work from system hardware prevent the hardware from entering those states. During the development of Red Hat Enterprise Linux 6, audits were undertaken in the following areas to reduce unnecessary demands on hardware:
Reduced wakeups
Red Hat Enterprise Linux 6 uses a tickless kernel (refer to Section 3.6, "Tickless Kernel"), which allows the CPUs to remain in deeper idle states longer. However, the timer tick is not the only source of excessive CPU wakeups, and function calls from applications can also prevent the CPU from entering or remaining in idle states. Unnecessary function calls were reduced in over 50 applications.
Reduced storage and network IO
Input or output (IO) to storage devices and network interfaces forces devices to consume power. In storage and network devices that feature reduced power states when idle (for example, ALPM or ASPM), this traffic can prevent the device from entering or remaining in an idle state, and can prevent hard drives from spinning down when not in use. Excessive and unnecessary demands on storage have been minimized in several applications, in particular those demands that prevented hard drives from spinning down.
Initscript audit
Services that start automatically whether required or not have great potential to waste system resources. Services instead should default to "off" or "on demand" wherever possible. For example, the BlueZ service that enables Bluetooth support previously ran automatically when the system started, whether Bluetooth hardware was present or not. The BlueZ initscript now checks that Bluetooth hardware is present on the system before starting the service.

Chapter 4. Use Cases

This chapter describes two types of use case to illustrate the analysis and configuration methods described elsewhere in this guide. The first example considers typical servers and the second a typical laptop.

4.1. Example - Server

A typical standard server nowadays comes with basically all of the necessary hardware features supported in Red Hat Enterprise Linux 6. The first thing to take into consideration is the kinds of workloads for which the server will mainly be used. Based on this information you can decide which components can be optimized for power savings.
Regardless of the type of server, graphics performance is generally not required. Therefore, GPU power savings can be left turned on.
Webserver
A webserver needs network and disk I/O. Depending on the external connection speed 100 Mbit/s might be enough. If the machine serves mostly static pages, CPU performance might not be very important. Power-management choices might therefore include:
  • no disk or network plugins for tuned.
  • ALPM turned on.
  • ondemand governor turned on.
  • network card limited to 100 Mbit/s.
Compute server
A compute server mainly needs CPU. Power management choices might include:
  • depending on the jobs and where data storage happens, disk or network plugins for tuned; or for batch-mode systems, fully active tuned.
  • depending on utilization, perhaps the performance governor.
Mailserver
A mailserver needs mostly disk I/O and CPU. Power management choices might include:
  • ondemand governor turned on, because the last few percent of CPU performance are not important.
  • no disk or network plugins for tuned.
  • network speed should not be limited, because mail is often internal and can therefore benefit from a 1 Gbit/s or 10 Gbit/s link.
Fileserver
Fileserver requirements are similar to those of a mailserver, but depending on the protocol used, might require more CPU performance. Typically, Samba-based servers require more CPU than NFS, and NFS typically requires more than iSCSI. Even so, you should be able to use the ondemand governor.
Directory server
A directory server typically has lower requirements for disk I/O, especially if equipped with enough RAM. Network latency is important although network I/O less so. You might consider latency network tuning with a lower link speed, but you should test this carefully for your particular network.

4.2. Example - Laptop

Laptops are another very common place where power management can make a real difference. Because laptops by design already use drastically less energy than workstations or servers, the potential for absolute savings is smaller than for other machines. In battery mode, though, any saving can help to get a few more minutes of battery life out of a laptop. Although this section focuses on laptops in battery mode, you can certainly use some or all of these tunings while running on AC power as well.
Savings for single components usually make a bigger relative difference on laptops than they do on workstations. For example, a 1 Gbit/s network interface running at 100 Mbit/s saves around 3-4 watts. For a typical server with a total power consumption of around 400 watts, this saving is approximately 1 %. On a laptop with a total power consumption of around 40 watts, the power saving on just this one component amounts to 10 % of the total.
Specific power-saving optimizations on a typical laptop include:
  • Configure the system BIOS to disable all hardware that you do not use. Possible candidates include parallel or serial ports, card readers, webcams, WiFi, and Bluetooth.
  • Dim the display in darker environments where you do not need full illumination to read the screen comfortably. Use System → Preferences → Power Management on the GNOME desktop; Kickoff Application Launcher → Computer → System Settings → Advanced → Power Management on the KDE desktop; gnome-power-manager or xbacklight at the command line; or the function keys on your laptop.
  • Use the laptop-battery-powersave profile of tuned-adm to enable a whole set of power-saving mechanisms. Note that performance and latency for the hard drive and network interface are impacted.
Additionally (or alternatively) you can perform many small adjustments to various system settings:
  • use the ondemand governor (enabled by default in Red Hat Enterprise Linux 6)
  • enable laptop mode (part of the laptop-battery-powersave profile):
    echo 5 > /proc/sys/vm/laptop_mode
  • increase flush time to disk (part of the laptop-battery-powersave profile):
    echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
  • disable the NMI watchdog (part of the laptop-battery-powersave profile):
    echo 0 > /proc/sys/kernel/nmi_watchdog
  • enable AC97 audio power-saving (enabled by default in Red Hat Enterprise Linux 6):
    echo Y > /sys/module/snd_ac97_codec/parameters/power_save
  • enable multi-core power-saving (part of the laptop-battery-powersave profile):
    echo 1 > /sys/devices/system/cpu/sched_mc_power_savings
  • enable USB auto-suspend:
    for i in /sys/bus/usb/devices/*/power/autosuspend; do echo 1 > $i; done
    Note that USB auto-suspend does not work correctly with all USB devices.
  • enable minimum power setting for ALPM (part of the laptop-battery-powersave profile):
    for i in /sys/class/scsi_host/host*/link_power_management_policy; do echo min_power > $i; done
  • mount filesystem using relatime (default in Red Hat Enterprise Linux 6):
    mount -o remount,relatime mountpoint
  • activate best power saving mode for hard drives (part of the laptop-battery-powersave profile):
    hdparm -B 1 -S 200 /dev/sd*
  • disable CD-ROM polling (part of the laptop-battery-powersave profile):
    hal-disable-polling --device /dev/scd*
  • reduce screen brightness to 50 or less, for example:
    xbacklight -set 50
  • activate DPMS for screen idle:
    xset +dpms; xset dpms 0 0 300
  • reduce Wi-Fi power levels (part of the laptop-battery-powersave profile):
    for i in /sys/bus/pci/devices/*/power_level ; do echo 5 > $i ; done
  • deactivate Wi-Fi:
    for i in /sys/bus/pci/devices/*/rf_kill; do echo 1 > $i; done
  • limit wired network to 100 Mbit/s (part of the laptop-battery-powersave profile):
    ethtool -s eth0 advertise 0x0F

Tips for Developers

Every good programming textbook covers problems with memory allocation and the performance of specific functions. As you develop your software, be aware of issues that might increase power consumption on the systems on which the software runs. Although these considerations do not affect every line of code, you can optimize your code in areas which are frequent bottlenecks for performance.
Some techniques that are often problematic include:
  • using threads.
  • unnecessary CPU wake-ups and not using wake-ups efficiently. If you must wake up, do everything at once (race to idle) and as quickly as possible.
  • using [f]sync() unnecessarily.
  • unnecessary active polling or using short, regular timeouts. (React to events instead).
  • inefficient disk access. Use large buffers to avoid frequent disk access. Write one large block at a time.
  • inefficient use of timers. Group timers across applications (or even across systems) if possible.
  • excessive I/O, power consumption, or memory usage (including memory leaks).
  • performing unnecessary computation.
The following sections examine some of these areas in greater detail.
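As a minimal sketch of the "use large buffers" advice in the list above, the following C function collects many small records in one large stdio buffer so that data reaches the kernel in a few large writes rather than one small write per record. The function name write_records_buffered and the 1 MiB buffer size are illustrative assumptions, not values taken from this guide; choose a size that suits your workload.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes `count` copies of `record` to `path`
 * through one large, fully buffered stdio stream.
 * Returns 0 on success, -1 on failure. */
int write_records_buffered(const char *path, const char *record, size_t count)
{
    const size_t bufsize = 1 << 20;   /* 1 MiB: an assumed, tunable size */
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    char *buf = malloc(bufsize);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }
    /* setvbuf() must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
        fclose(fp);
        free(buf);
        return -1;
    }

    int rc = 0;
    for (size_t i = 0; i < count; i++) {
        if (fputs(record, fp) == EOF) {
            rc = -1;
            break;
        }
    }
    /* fclose() flushes whatever is still held in the buffer. */
    if (fclose(fp) == EOF)
        rc = -1;
    free(buf);   /* free the buffer only after the stream is closed */
    return rc;
}
```

A call such as write_records_buffered("./app.log", "entry\n", 1000) would normally reach the disk layer as a handful of large writes instead of a thousand small ones.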

A.1. Using Threads

It is widely believed that using threads makes applications perform better and faster, but this is not true in every case.
Python
Python uses the Global Interpreter Lock (GIL)[1], so threading is profitable only for larger I/O operations. Unladen-swallow [2] is a faster implementation of Python with which you might be able to optimize your code.
Perl
Perl threads were originally created for applications running on systems without forking (such as 32-bit Windows systems). In Perl threads, the data is copied for every single thread (copy-on-write). Data is not shared by default, because users should be able to define the level of data sharing. To share data, you must include the threads::shared module. In that case, however, the data is not only copied (copy-on-write): the module also creates tied variables for the data, which takes even more time and is even slower. [3]
C
C threads share the same memory, each thread has its own stack, and the kernel does not have to create new file descriptors or allocate new memory space. C can therefore make real use of additional CPUs as threads are added. To maximize the performance of your threads, use a low-level language like C or C++. If you use a scripting language, consider writing a C binding. Use profilers to identify poorly performing parts of your code. [4]

A.2. Wake-ups

Many applications scan configuration files for changes. In many cases, the scan is performed at a fixed interval, for example, every minute. This can be a problem, because it forces a disk to wake up from spindown. The best solution is to find a good interval or a good checking mechanism, or to check for changes with inotify and react to events. Inotify can check for a variety of changes on a file or a directory.
For example:
int fd;
fd = inotify_init();
int wd;
/* checking modification of a file - writing into */
wd = inotify_add_watch(fd, "./myConfig", IN_MODIFY);
if (wd < 0) {
  inotify_cant_be_used();
  switching_back_to_previous_checking();
}
...
fd_set rdfs;
struct timeval tv;
int retval;
FD_ZERO(&rdfs);
FD_SET(fd, &rdfs);   /* wait on the inotify descriptor */
tv.tv_sec = 5;
tv.tv_usec = 0;
retval = select(fd + 1, &rdfs, NULL, NULL, &tv);
if (retval == -1)
  perror("select");
else if (retval > 0) {
  /* an event is ready to be read from fd */
  do_some_stuff();
}
...
The advantage of this approach is the variety of checks that you can perform.
The main limitation is that only a limited number of watches are available on a system. The number can be obtained from /proc/sys/fs/inotify/max_user_watches, and although it can be changed, this is not recommended. Furthermore, if inotify fails, the code has to fall back to a different check method, which usually means many occurrences of #if #define in the source code.
For more information on inotify, refer to the inotify man page.
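The inotify approach described above can also be packaged as two small functions: one that sets up the watch once, and one that sleeps until the file changes or a timeout expires. This is only a sketch under stated assumptions: the names config_watch_open and config_watch_wait are hypothetical, poll() is used here instead of select(), and the caller is expected to fall back to its previous checking method when either function returns -1.

```c
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: creates an inotify descriptor that watches `path`
 * for modification. Returns the descriptor, or -1 if inotify cannot be used. */
int config_watch_open(const char *path)
{
    int fd = inotify_init();
    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, path, IN_MODIFY) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: sleeps until the watched file is modified or
 * `timeout_ms` milliseconds pass.
 * Returns 1 on modification, 0 on timeout, -1 on error. */
int config_watch_wait(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);   /* the process sleeps here */
    if (ready <= 0)
        return ready;                        /* 0 = timeout, -1 = error */

    /* Drain the queued events so that the next call sleeps again.
     * The buffer is aligned for struct inotify_event. */
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    if (read(fd, buf, sizeof(buf)) <= 0)
        return -1;
    return 1;
}
```

Keeping the descriptor open between calls means that a change made while the application is busy elsewhere is still queued and reported by the next call. Close the descriptor with close() when the watch is no longer needed.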

A.3. Fsync

Fsync is known as an I/O-expensive operation, but this is not completely true.
Firefox used to call the SQLite library each time the user clicked on a link to go to a new page. SQLite called fsync, and because of the file system settings (mainly ext3 in data-ordered mode), there was a long latency during which nothing happened. This could take a long time (up to 30 seconds) if another process was copying a large file at the same time.
However, in other cases, where fsync was not used at all, problems emerged with the switch to the ext4 file system. Ext3 was set to data-ordered mode, which flushed memory every few seconds and saved it to disk. With ext4 and laptop_mode, the interval between saves was longer, and data could be lost when the system was unexpectedly switched off. Ext4 has since been patched, but you must still consider the design of your applications carefully and use fsync as appropriate.
The following simple example of reading and writing into a configuration file shows how a backup of a file can be made or how data can be lost:
/* open and read configuration file e.g. ./myconfig */
fd = open("./myconfig", O_RDONLY);
read(fd, myconfig_buf, sizeof(myconfig_buf));
close(fd);
...
fd = open("./myconfig", O_WRONLY | O_TRUNC | O_CREAT, S_IRUSR | S_IWUSR);
write(fd, myconfig_buf, sizeof(myconfig_buf));
close(fd);
A better approach would be:
/* open and read configuration file e.g. ./myconfig */
fd = open("./myconfig", O_RDONLY);
read(fd, myconfig_buf, sizeof(myconfig_buf));
close(fd);
...
/* write the new contents to a separate file first */
fd = open("./myconfig.suffix", O_WRONLY | O_TRUNC | O_CREAT, S_IRUSR | S_IWUSR);
write(fd, myconfig_buf, sizeof(myconfig_buf));
fsync(fd); /* paranoia - optional */
...
close(fd);
rename("./myconfig", "./myconfig~"); /* paranoia - optional */
rename("./myconfig.suffix", "./myconfig");
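The write-then-rename pattern can be sketched as a reusable function with error checking. The name save_config and its parameters are hypothetical, and the temporary file is assumed to be on the same file system as the target so that rename() can replace the target in a single step. Unlike the example above, this sketch always calls fsync() and does not keep a backup copy; both are choices to weigh for your own application.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: replaces the file at `path` with `len` bytes from
 * `buf` by writing `tmp_path` first and then renaming it over `path`.
 * Returns 0 on success, -1 on failure. */
int save_config(const char *path, const char *tmp_path,
                const void *buf, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_TRUNC | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* write() may be partial */
        if (n <= 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Flush the new data to disk before the rename makes it visible. */
    int rc = fsync(fd);
    if (close(fd) < 0)
        rc = -1;
    if (rc < 0) {
        unlink(tmp_path);
        return -1;
    }
    /* If rename() fails, the old file at `path` is still intact. */
    return rename(tmp_path, path);
}
```

A call such as save_config("./myconfig", "./myconfig.suffix", myconfig_buf, sizeof(myconfig_buf)) leaves either the complete old file or the complete new file in place, never a truncated one.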

Revision History

Revision 1.0-23, Tue Feb 19 2013, Jack Reed
  • Second version for 6.4 GA release
Revision 1.0-22, Mon Feb 18 2013, Jack Reed
  • Version for 6.4 GA release
Revision 1.0-21, Wed Nov 7 2012, Jack Reed
  • Correcting minor error
Revision 1.0-20, Wed Oct 31 2012, Jack Reed
  • Removing two admonition titles
Revision 1.0-19, Wed Oct 31 2012, Jack Reed
  • Correcting minor error in admonition on cpupower alternative
Revision 1.0-18, Tue Oct 30 2012, Jack Reed
  • Added cpupower related admonitions, and updated function of --policy option
Revision 1.0-15, Thu Oct 18 2012, Jack Reed
  • Added CPU Monitors and CPU Power Saving Policies sections and updated some commands to reflect the new cpupower tool - BZ#860874
Revision 1.0-14, Fri Feb 10 2012, Jack Reed
  • Additional corrections to ASPM parameter formatting - BZ#732859
  • Added new tuned profiles - BZ#701924
Revision 1.0-12, Fri Dec 16 2011, Jack Reed
  • Further corrections to ASPM parameter formatting - BZ#732859
Revision 1.0-11, Thu Dec 15 2011, Jack Reed
  • Removed dead link from fsync appendix - BZ#706928
Revision 1.0-10, Thu Dec 15 2011, Jack Reed
  • Further edits to sample fsync config files - BZ#706928
Revision 1.0-9, Wed Dec 14 2011, Jack Reed
  • Removed reference to now-defunct relatime boot parameter - BZ#692859
Revision 1.0-8, Mon Dec 12 2011, Jack Reed
  • Corrected typing error - BZ#722798
  • Edited sample config file in fsync section - BZ#706928
  • Clarified availability of battery power preferences - BZ#740794
  • Corrected ASPM policy parameter formatting - BZ#732859
Revision 1.0-7, Thu Sep 29 2011, Jack Reed
  • Removed On Battery Power tab information from GNOME Power Manager section - BZ#740794
Revision 1.0-6, Tue May 24 2011, Rüdiger Landmann
  • rebuild
Revision 1.0-2, Fri Oct 22 2010, Rüdiger Landmann
  • correct minor errors in text
Revision 1.0-1, Thu Oct 7 2010, Rüdiger Landmann
  • remove "draft" tag
Revision 1.0-0, Thu Oct 7 2010, Rüdiger Landmann
  • GA release