Chapter 6. Profiling

Developers profile programs to focus attention on the areas of the program that have the largest impact on performance. The types of data collected include which sections of the program consume the most processor time, and where memory is allocated. Profiling collects data from the actual program execution. Thus, the quality of the data collected is influenced by the actual tasks being performed by the program. The tasks performed during profiling should be representative of actual use; this ensures that problems arising from realistic use of the program are addressed during development.

Red Hat Enterprise Linux 6 includes a number of different tools (Valgrind, OProfile, perf, and SystemTap) to collect profiling data. Each tool is suitable for performing specific types of profile runs, as described in the following sections.

6.1. Valgrind

Valgrind is an instrumentation framework for building dynamic analysis tools that can be used to profile applications in detail. Valgrind tools are generally used to automatically detect many memory management and threading problems. The Valgrind suite also includes tools that allow the building of new profiling tools as required.

Valgrind provides instrumentation for user-space binaries to check for errors, such as the use of uninitialized memory, improper allocation/freeing of memory, and improper arguments for system calls. Its profiling tools can be used by normal users on most binaries; however, compared to other profilers, Valgrind profile runs are significantly slower. To profile a binary, Valgrind rewrites its executable and instruments the rewritten binary. Valgrind's tools are most useful for looking for memory-related issues in user-space programs; they are not suitable for debugging time-specific issues or for kernel-space instrumentation/debugging.

Previously, Valgrind did not support the IBM System z architecture. As of Red Hat Enterprise Linux 6.1, this support has been added, so Valgrind now supports all hardware architectures that are supported by Red Hat Enterprise Linux 6.x.

6.1.1. Valgrind Tools

The Valgrind suite is composed of the following tools:

memcheck

This tool detects memory management problems in programs by checking all reads from and writes to memory and intercepting all calls to malloc, new, free, and delete. memcheck is perhaps the most used Valgrind tool, as memory management problems can be difficult to detect using other means. Such problems often remain undetected for long periods, eventually causing crashes that are difficult to diagnose.

cachegrind

cachegrind is a cache profiler that accurately pinpoints sources of cache misses in code by performing a detailed simulation of the I1, D1 and L2 caches in the CPU. It shows the number of cache misses, memory references, and instructions accruing to each line of source code; cachegrind also provides per-function, per-module, and whole-program summaries, and can even show counts for each individual machine instruction.

callgrind

Like cachegrind, callgrind can model cache behavior. However, the main purpose of callgrind is to record call graph data for the executed code.

massif

massif is a heap profiler; it measures how much heap memory a program uses, providing information on heap blocks, heap administration overheads, and stack sizes. Heap profilers are useful in finding ways to reduce heap memory usage. On systems that use virtual memory, programs with optimized heap memory usage are less likely to run out of memory, and may be faster as they require less paging.

helgrind

In programs that use the POSIX pthreads threading primitives, helgrind detects synchronization errors. Such errors are:

Misuses of the POSIX pthreads API
Potential deadlocks arising from lock ordering problems
Data races (that is, accessing memory without adequate locking)

Valgrind also allows you to develop your own profiling tools. In line with this, Valgrind includes the lackey tool, which is a sample that can be used as a template for generating your own tools.

6.1.2. Using Valgrind

The valgrind package and its dependencies install all the necessary tools for performing a Valgrind profile run. To profile a program with Valgrind, use:

valgrind --tool=toolname program

Refer to Section 6.1.1, "Valgrind Tools" for a list of arguments for toolname. In addition to the suite of Valgrind tools, none is also a valid argument for toolname; this argument allows you to run a program under Valgrind without performing any profiling. This is useful for debugging or benchmarking Valgrind itself.

You can also instruct Valgrind to send all of its information to a specific file. To do so, use the option --log-file=filename. For example, to check the memory usage of the executable file hello and send profile information to output, use:

valgrind --tool=memcheck --log-file=output hello
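Once a run has finished, the log file can be inspected with ordinary text tools. The following is a minimal sketch, assuming the log contains the "ERROR SUMMARY" line that memcheck writes at the end of a run; the function name memcheck_error_count is a hypothetical helper and not part of Valgrind.

```shell
# Hypothetical helper: print the error count from the last
# "ERROR SUMMARY" line in a memcheck log file.
# Argument: path to a file written with --log-file.
memcheck_error_count() {
    sed -n 's/^==[0-9]*== ERROR SUMMARY: \([0-9,]*\) errors .*/\1/p' "$1" | tail -n 1
}

# Typical use after the command shown above:
#   valgrind --tool=memcheck --log-file=output hello
#   memcheck_error_count output
```

A script can compare the printed count against zero to decide whether a profile run needs further attention.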

Refer to Section 6.1.4, "Valgrind Documentation" for more information on Valgrind, along with other available documentation on the Valgrind suite of tools.

6.1.3. Valgrind Plug-in for Eclipse

The Valgrind plug-in for Eclipse integrates several Valgrind tools into Eclipse. This allows Eclipse users to seamlessly include profiling capabilities into their workflow. At present, the Valgrind plug-in for Eclipse supports three Valgrind tools:

Memcheck
Massif
Cachegrind

To launch a Valgrind profile run, navigate to Run > Profile. This will open the Profile As dialog, from which you can select a tool for a profile run.

Figure 6.1. Profile As

To configure each tool for a profile run, navigate to Run > Profile Configuration. This will open the Profile Configuration menu.

Figure 6.2. Profile Configuration

The Valgrind plug-in for Eclipse is provided by the eclipse-valgrind package. For more information about this plug-in, refer to Valgrind Integration User Guide in the Eclipse Help Contents.

6.1.4. Valgrind Documentation

For more extensive information on Valgrind, refer to man valgrind. Red Hat Enterprise Linux 6 also provides a comprehensive Valgrind Documentation book, available as PDF and HTML in:

file:///usr/share/doc/valgrind-version/valgrind_manual.pdf
file:///usr/share/doc/valgrind-version/html/index.html

The Valgrind Integration User Guide in the Eclipse Help Contents also provides detailed information on the setup and usage of the Valgrind plug-in for Eclipse. This guide is provided by the eclipse-valgrind package.

6.2. OProfile

OProfile is a system-wide Linux profiler, capable of running at low overhead. It consists of a kernel driver and a daemon for collecting raw sample data, along with a suite of tools for parsing that data into meaningful information. OProfile is generally used by developers to determine which sections of code consume the most CPU time, and why.

During a profile run, OProfile uses the processor's performance monitoring hardware. Valgrind rewrites the binary of an application, and in turn instruments it. OProfile, on the other hand, profiles a running application as-is. It sets up the performance monitoring hardware to take a sample every x number of events (for example, cache misses or branch instructions). Each sample also contains information on where it occurred in the program.

OProfile's profiling methods consume fewer resources than Valgrind's. However, OProfile requires root privileges. OProfile is useful for finding "hot-spots" in code, and looking for their causes (for example, poor cache performance or branch mispredictions).

Using OProfile involves starting the OProfile daemon (oprofiled), running the program to be profiled, collecting the system profile data, and parsing it into a more understandable format. OProfile provides several tools for every step of this process.

6.2.1. OProfile Tools

The most useful OProfile commands include the following:

opcontrol: This tool is used to start/stop the OProfile daemon and configure a profile session.
opreport: The opreport command outputs binary image summaries, or per-symbol data, from OProfile profiling sessions.
opannotate: The opannotate command outputs annotated source and/or assembly from the profile data of an OProfile session.
oparchive: The oparchive command generates a directory populated with executable, debug, and OProfile sample files. This directory can be moved to another machine (via tar), where it can be analyzed offline.
opgprof: Like opreport, the opgprof command outputs profile data for a given binary image from an OProfile session. The output of opgprof is in gprof format.

For a complete list of OProfile commands, refer to man oprofile. For detailed information on each OProfile command, refer to its corresponding man page. Refer to Section 6.2.4, "OProfile Documentation" for other available documentation on OProfile.

6.2.2. Using OProfile

The oprofile package and its dependencies install all the necessary utilities for executing OProfile. To instruct OProfile to profile all the applications running on the system and to group the samples for the shared libraries with the application using the library, run the following command:

# opcontrol --no-vmlinux --separate=library --start

You can also start the OProfile daemon without collecting system data. To do so, use the option --start-daemon. The --stop option halts data collection, while --shutdown terminates the OProfile daemon.

Use opreport, opannotate, or opgprof to display the collected profiling data. By default, the data collected by the OProfile daemon is stored in /var/lib/oprofile/samples/.
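The steps above can be strung together in a small script. The following is a sketch only: the function names run and oprofile_session, the DRY_RUN variable, and the ./my_workload program are hypothetical, while the opcontrol and opreport invocations are the ones described in this section. With DRY_RUN set, each step is printed rather than executed, which is useful for reviewing the sequence without root privileges.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: profile one workload with OProfile (requires root).
# Arguments: the command to profile and its arguments.
oprofile_session() {
    run opcontrol --no-vmlinux --separate=library --start || return 1
    run "$@"                    # the workload to profile
    run opcontrol --stop        # halt data collection
    run opreport                # summarize the collected samples
    run opcontrol --shutdown    # terminate the OProfile daemon
}

# Print the steps for a hypothetical ./my_workload program.
DRY_RUN=1
oprofile_session ./my_workload
```

Unsetting DRY_RUN and running the script as root performs the same steps for real.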

OProfile conflict with Performance Counters for Linux (PCL) tools

Both OProfile and Performance Counters for Linux (PCL) use the same hardware Performance Monitoring Unit (PMU). If PCL or the NMI watchdog timer is using the hardware PMU, a message like the following is displayed when starting OProfile:

# opcontrol --start
Using default event: CPU_CLK_UNHALTED:100000:0:1:1
Error: counter 0 not available nmi_watchdog using this resource ? Try:
opcontrol --deinit
echo 0 > /proc/sys/kernel/nmi_watchdog

Stop any perf commands running on the system, then turn off the NMI watchdog and reload the OProfile kernel driver with the following commands:

# opcontrol --deinit

# echo 0 > /proc/sys/kernel/nmi_watchdog

6.2.3. OProfile Plug-in For Eclipse

The OProfile suite of tools provides powerful call profiling capabilities; as a plug-in, these capabilities are well ported into the Eclipse user interface. The OProfile Plug-in provides the following benefits:

Targeted Profiling

The OProfile Plug-in will allow Eclipse users to profile a specific binary, include related shared libraries/kernel modules, and even exclude binaries. This produces very targeted, detailed usage results on each binary, function, and symbol, down to individual line numbers in the source code.

User Interface Fully Integrated into CDT

The plug-in displays enriched OProfile results through Eclipse, just like any other plug-in. Double-clicking on a source line in the results brings users directly to the corresponding line in the Eclipse editor. This allows users to build, profile, and edit code through a single interface, making profiling a convenient experience for Eclipse users. In addition, profile runs are launched and configured the same way as C/C++ applications within Eclipse.

Fully Customizable Profiling Options

The Eclipse interface allows users to configure their profile run using all options available in the OProfile command line utility. The plug-in supports event configuration based on processor debugging registers (that is, counters), as well as interrupt-based profiling for kernels or processors that do not support hardware counters.

Ease of Use

The OProfile Plug-in provides generally useful defaults for all options, usable for a majority of profiling runs. In addition, it also features a "one-click profile" that executes a profile run using these defaults. Users can profile applications from start to finish, or select specific areas of code through a manual control dialog.

To launch an OProfile profile run, navigate to Run > Profile. This will open the Profile As dialog, from which you can select a tool for a profile run.

Figure 6.3. Profile As

To configure each tool for a profile run, navigate to Run > Profile Configuration. This will open the Profile Configuration menu.

Figure 6.4. Profile Configuration

The OProfile plug-in for Eclipse is provided by the eclipse-oprofile package. For more information about this plug-in, refer to OProfile Integration User Guide in the Eclipse Help Contents (also provided by eclipse-oprofile).

6.2.4. OProfile Documentation

For more extensive information on OProfile, refer to man oprofile. Red Hat Enterprise Linux 6 also provides two comprehensive guides to OProfile in file:///usr/share/doc/oprofile-version/:

OProfile Manual: A comprehensive manual with detailed instructions on the setup and use of OProfile is found at file:///usr/share/doc/oprofile-version/oprofile.html
OProfile Internals: Documentation on the internal workings of OProfile, useful for programmers interested in contributing to the OProfile upstream, can be found at file:///usr/share/doc/oprofile-version/internals.html

The OProfile Integration User Guide in the Eclipse Help Contents also provides detailed information on the setup and usage of the OProfile plug-in for Eclipse. This guide is provided by the eclipse-oprofile package.

6.3. SystemTap

SystemTap is a useful instrumentation platform for probing running processes and kernel activity on the Linux system. To execute a probe:

Write SystemTap scripts that specify which system events (for example, virtual file system reads, packet transmissions) should trigger specified actions (for example, print, parse, or otherwise manipulate data).
SystemTap translates the script into a C program, which it compiles into a kernel module.
SystemTap loads the kernel module to perform the actual probe.

SystemTap scripts are useful for monitoring system operation and diagnosing system issues with minimal intrusion into the normal operation of the system. You can quickly instrument a running system to test hypotheses without having to recompile and re-install instrumented code. To compile a SystemTap script that probes kernel-space, SystemTap uses information from three different kernel information packages:

kernel-variant-devel-version
kernel-variant-debuginfo-version
kernel-debuginfo-common-arch-version

Note

The kernel information package in Red Hat Enterprise Linux 6 is now named kernel-debuginfo-common-arch-version. It was originally kernel-debuginfo-common-version in Red Hat Enterprise Linux 5.

These kernel information packages must match the kernel to be probed. In addition, to compile SystemTap scripts for multiple kernels, the kernel information packages of each kernel must also be installed.
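To illustrate how the package names relate to the kernel being probed, the following sketch prints the three names for a given kernel release and architecture. It assumes the default kernel variant (so the variant part of each name is empty); the function name stap_kernel_packages is hypothetical.

```shell
# Hypothetical helper: print the kernel information package names
# that match one kernel.
# Arguments: kernel release (as printed by `uname -r`) and
# architecture (as printed by `uname -m`).
# Assumes the default kernel variant.
stap_kernel_packages() {
    release=$1
    arch=$2
    echo "kernel-devel-$release"
    echo "kernel-debuginfo-$release"
    echo "kernel-debuginfo-common-$arch-$release"
}

# Packages that must be installed to probe the running kernel.
stap_kernel_packages "$(uname -r)" "$(uname -m)"
```

Each printed name can be passed to rpm -q to check whether the matching package is installed. To compile scripts for several kernels, call the function once per kernel release.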

An important new feature has been added as of Red Hat Enterprise Linux 6.1: the --remote option. This allows users to build the SystemTap module locally, and then execute it remotely via SSH. The syntax is --remote [USER@]HOSTNAME, which sets the execution target to the specified SSH host, optionally using a different username. This option may be repeated to specify multiple execution targets. Passes 1-4 are completed locally as normal to build the script, and then pass 5 copies the module to the target and runs it.

The following sections describe other new SystemTap features available in the Red Hat Enterprise Linux 6 release.

6.3.1. SystemTap Compile Server

SystemTap in Red Hat Enterprise Linux 6 supports a compile server and client deployment. With this setup, the kernel information packages of all client systems in the network are installed on just one compile server host (or a few). When a client system attempts to compile a kernel module from a SystemTap script, it remotely accesses the kernel information it requires from the centralized compile server host.

A properly configured and maintained SystemTap compile server host offers the following benefits:

The system administrator can verify the integrity of kernel information packages before making the packages available to users.
The identity of a compile server can be authenticated using the Secure Socket Layer (SSL). SSL provides an encrypted network connection that prevents eavesdropping or tampering during transmission.
Individual users can run their own servers and authorize them for their own use as trusted.
System administrators can authorize one or more servers on the network as trusted for use by all users.
A server that has not been explicitly authorized is ignored, preventing any server impersonations and similar attacks.

6.3.2. SystemTap Support for Unprivileged Users

For security purposes, users in an enterprise setting are rarely given privileged (that is, root or sudo) access to their own machines. In addition, full SystemTap functionality should also be restricted to privileged users, as this can provide the ability to completely take control of a system.

SystemTap in Red Hat Enterprise Linux 6 features a new option to the SystemTap client: --unprivileged. This option allows an unprivileged user to run stap. Of course, several restrictions apply to unprivileged users that attempt to run stap.

Note

An unprivileged user is a member of the group stapusr but is not a member of the group stapdev (and is not root).
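The group rule in the note above can be checked from a shell. The following is a minimal sketch; the function name is_stap_unprivileged is hypothetical, and it assumes group names contain no whitespace.

```shell
# Hypothetical helper: succeed if the given user ID and group list
# describe an unprivileged SystemTap user, that is, a member of
# stapusr who is not a member of stapdev and is not root.
# Arguments: numeric user ID, followed by the user's group names.
is_stap_unprivileged() {
    uid=$1
    shift
    in_stapusr=no
    in_stapdev=no
    for group in "$@"; do
        case $group in
            stapusr) in_stapusr=yes ;;
            stapdev) in_stapdev=yes ;;
        esac
    done
    [ "$uid" -ne 0 ] && [ "$in_stapusr" = yes ] && [ "$in_stapdev" = no ]
}

# Classify the current user. The output of `id -Gn` is left unquoted
# on purpose so that each group name becomes a separate argument.
if is_stap_unprivileged "$(id -u)" $(id -Gn); then
    echo "unprivileged SystemTap user"
else
    echo "not an unprivileged SystemTap user"
fi
```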

Before loading any kernel modules created by unprivileged users, SystemTap verifies the integrity of the module using standard digital (cryptographic) signing techniques. Each time the --unprivileged option is used, the server checks the script against the constraints imposed for unprivileged users. If the checks are successful, the server compiles the script and signs the resulting module using a self-generated certificate. When the client attempts to load the module, staprun first verifies the signature of the module by checking it against a database of trusted signing certificates maintained and authorized by root.

Once a signed kernel module is successfully verified, staprun is assured that:

The module was created using a trusted systemtap server implementation.
The module was compiled using the --unprivileged option.
The module meets the restrictions required for use by an unprivileged user.
The module has not been tampered with since it was created.

6.3.3. SSL and Certificate Management

SystemTap in Red Hat Enterprise Linux 6 implements authentication and security via certificates and public/private key pairs. It is the responsibility of the system administrator to add the credentials (that is, certificates) of compile servers to a database of trusted servers. SystemTap uses this database to verify the identity of a compile server that the client attempts to access. Likewise, SystemTap also uses this method to verify kernel modules created by compile servers using the --unprivileged option.

6.3.3.1. Authorizing Compile Servers for Connection

The first time a compile server is started on a server host, the compile server automatically generates a certificate. This certificate verifies the compile server's identity during SSL authentication and module signing.

In order for clients to access the compile server (whether on the same server host or from a client machine), the system administrator must add the compile server's certificate to a database of trusted servers. Each client host intending to use compile servers maintains such a database. This allows individual users to customize their database of trusted servers, which can include a list of compile servers authorized for their own use only.

6.3.3.2. Authorizing Compile Servers for Module Signing (for Unprivileged Users)

Unprivileged users can only load signed, authorized SystemTap kernel modules. For modules to be recognized as such, they have to be created by a compile server whose certificate appears in a database of trusted signers; this database must be maintained on each host where the module will be loaded.

6.3.3.3. Automatic Authorization

Servers started using the stap-server initscript are automatically authorized to receive connections from all clients on the same host.

Servers started by other means are automatically authorized to receive connections from clients on the same host run by the user who started the server. This was implemented with convenience in mind; users are automatically authorized to connect to a server they started themselves, provided that both client and server are running on the same host.

Whenever root starts a compile server, all clients running on the same host automatically recognize the server as authorized. However, Red Hat advises that you refrain from doing so.

Similarly, a compile server initiated through stap-server is automatically authorized as a trusted signer on the host in which it runs. If the compile server was initiated through other means, it is not automatically authorized as such.

6.3.4. SystemTap Documentation

For more detailed information about SystemTap, refer to the following books (also provided by Red Hat):

SystemTap Beginner's Guide
SystemTap Tapset Reference
SystemTap Language Reference (documentation supplied by IBM)

The SystemTap Beginner's Guide and SystemTap Tapset Reference are also available locally when you install the systemtap package:

file:///usr/share/doc/systemtap-version/SystemTap_Beginners_Guide/index.html
file:///usr/share/doc/systemtap-version/SystemTap_Beginners_Guide.pdf
file:///usr/share/doc/systemtap-version/tapsets/index.html
file:///usr/share/doc/systemtap-version/tapsets.pdf

Section 6.3.1, "SystemTap Compile Server", Section 6.3.2, "SystemTap Support for Unprivileged Users", and Section 6.3.3, "SSL and Certificate Management" are all excerpts from the SystemTap Support for Unprivileged Users and Server Client Deployment whitepaper. This whitepaper also provides more details on each feature, along with a case study to help illustrate their application in a real-world environment.

6.4. Performance Counters for Linux (PCL) Tools and perf

Performance Counters for Linux (PCL) is a new kernel-based subsystem that provides a framework for collecting and analyzing performance data. The events that can be measured vary based on the performance monitoring hardware and the software configuration of the system. Red Hat Enterprise Linux 6 includes this kernel subsystem to collect data and the user-space tool perf to analyze the collected performance data.

The PCL subsystem can be used to measure hardware events, including retired instructions and processor clock cycles. It can also measure software events, including major page faults and context switches. For example, PCL counters can compute the Instructions Per Clock (IPC) from a process's counts of instructions retired and processor clock cycles. A low IPC ratio indicates the code makes poor use of the CPU. Other hardware events can also be used to diagnose poor CPU performance.

Performance counters can also be configured to record samples. The relative frequency of samples can be used to identify which regions of code have the greatest impact on performance.

6.4.1. Perf Tool Commands

Useful perf commands include the following:

perf stat: This perf command provides overall statistics for common performance events, including instructions executed and clock cycles consumed. Options allow selection of events other than the default measurement events.
perf record: This perf command records performance data into a file which can be later analyzed using perf report.
perf report: This perf command reads the performance data from a file and analyzes the recorded data.
perf list: This perf command lists the events available on a particular machine. These events will vary based on the performance monitoring hardware and the software configuration of the system.

Use perf help to obtain a complete list of perf commands. To retrieve man page information on each perf command, use perf help command.

6.4.2. Using Perf

Using the basic PCL infrastructure for collecting statistics or samples of program execution is relatively straightforward. This section provides simple examples of overall statistics and sampling.

To collect statistics on make and its children, use the following command:

# perf stat -- make all

The perf command collects a number of different hardware and software counters. It then prints the following information:

Performance counter stats for 'make all':

  244011.782059  task-clock-msecs   #      0.925 CPUs
          53328  context-switches   #      0.000 M/sec
            515  CPU-migrations     #      0.000 M/sec
        1843121  page-faults        #      0.008 M/sec
   789702529782  cycles             #   3236.330 M/sec
  1050912611378  instructions       #      1.331 IPC
   275538938708  branches           #   1129.203 M/sec
     2888756216  branch-misses      #      1.048 %
     4343060367  cache-references   #     17.799 M/sec
      428257037  cache-misses       #      1.755 M/sec

  263.779192511  seconds time elapsed
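The derived figures in this output follow directly from the raw counts. As an illustrative sketch, the following recomputes the IPC and the branch-miss percentage from the values above; the helper name ratio is hypothetical.

```shell
# Raw counter values copied from the 'make all' output above.
cycles=789702529782
instructions=1050912611378
branches=275538938708
branch_misses=2888756216

# Hypothetical helper: print ($1 / $2) * $3 to three decimal places.
ratio() {
    awk -v n="$1" -v d="$2" -v s="$3" 'BEGIN { printf "%.3f\n", s * n / d }'
}

# Instructions Per Clock: instructions retired divided by clock cycles.
echo "IPC: $(ratio "$instructions" "$cycles" 1)"
# Branch misses as a percentage of all branches.
echo "branch-miss rate: $(ratio "$branch_misses" "$branches" 100) %"
```

This prints an IPC of 1.331 and a branch-miss rate of 1.048 %, matching the figures reported by perf stat.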

The perf tool can also record samples. For example, to record data on the make command and its children, use:

# perf record -- make all

This prints out the file in which the samples are stored, along with the number of samples collected:

[ perf record: Woken up 42 times to write data ]
[ perf record: Captured and wrote 9.753 MB perf.data (~426109 samples) ]

As of Red Hat Enterprise Linux 6.4, the {} group syntax allows event groups to be created based on the way they are specified on the command line.

The existing --group or -g options remain the same; if either is specified for the record, stat, or top command, all the specified events become members of a single group with the first event as the group leader.

The new {} group syntax allows the creation of a group like:

# perf record -e '{cycles,faults}' ls

The above results in a single event group containing cycles and faults events, with the cycles event as the group leader.

All groups are created with regards to threads and CPUs. As such, recording an event group within two threads on a server with four CPUs will create eight separate groups.

It is possible to use a standard event modifier for a group. The modifier applies to all events in the group and updates each event's modifier settings.

# perf record -e '{faults:k,cache-references}:p'

The above command results in the :kp modifier being used for faults, and the :p modifier being used for the cache-references event.

Performance Counters for Linux (PCL) Tools conflict with OProfile

Both OProfile and Performance Counters for Linux (PCL) use the same hardware Performance Monitoring Unit (PMU). If OProfile is running when you attempt to use the PCL perf command, an error message like the following is displayed when starting perf:

Error: open_counter returned with 16 (Device or resource busy). /bin/dmesg may provide additional information.
Fatal: Not all events could be opened.

To use the perf command, first shut down OProfile:

# opcontrol --deinit

You can then analyze perf.data to determine the relative frequency of samples. The report output includes the command, object, and function for the samples. Use perf report to output an analysis of perf.data. For example, the following command produces a report of the executable that consumes the most time:

# perf report --sort=comm

The resulting output:

# Samples: 1083783860000
#
# Overhead  Command
# ........  ...............
#
    48.19%  xsltproc
    44.48%  pdfxmltex
     6.01%  make
     0.95%  perl
     0.17%  kernel-doc
     0.05%  xmllint
     0.05%  cc1
     0.03%  cp
     0.01%  xmlto
     0.01%  sh
     0.01%  docproc
     0.01%  ld
     0.01%  gcc
     0.00%  rm
     0.00%  sed
     0.00%  git-diff-files
     0.00%  bash
     0.00%  git-diff-index

The column on the left shows the relative frequency of the samples. This output shows that make spends most of its time in xsltproc and pdfxmltex. To reduce the time it takes make to complete, focus on xsltproc and pdfxmltex. To list the functions executed by xsltproc, run:

# perf report -n --comm=xsltproc

This generates:

comm: xsltproc
# Samples: 472520675377
#
# Overhead       Samples  Shared Object      Symbol
# ........  ............  .................  ......
#
    45.54%  215179861044  libxml2.so.2.7.6   [.] xmlXPathCmpNodesExt
    11.63%   54959620202  libxml2.so.2.7.6   [.] xmlXPathNodeSetAdd__internal_alias
     8.60%   40634845107  libxml2.so.2.7.6   [.] xmlXPathCompOpEval
     4.63%   21864091080  libxml2.so.2.7.6   [.] xmlXPathReleaseObject
     2.73%   12919672281  libxml2.so.2.7.6   [.] xmlXPathNodeSetSort__internal_alias
     2.60%   12271959697  libxml2.so.2.7.6   [.] valuePop
     2.41%   11379910918  libxml2.so.2.7.6   [.] xmlXPathIsNaN__internal_alias
     2.19%   10340901937  libxml2.so.2.7.6   [.] valuePush__internal_alias

6.5. ftrace

The ftrace framework provides users with several tracing capabilities, accessible through an interface much simpler than SystemTap's. This framework uses a set of virtual files in the debugfs file system; these files enable specific tracers. The ftrace function tracer outputs each function called in the kernel in real time; other tracers within the ftrace framework can also be used to analyze wakeup latency, task switches, kernel events, and the like.

You can also add new tracers for ftrace, making it a flexible solution for analyzing kernel events. The ftrace framework is useful for debugging or analyzing latencies and performance issues that take place outside of user-space. Unlike other profilers documented in this guide, ftrace is a built-in feature of the kernel.

6.5.1. Using ftrace

The Red Hat Enterprise Linux 6 kernels have been configured with the CONFIG_FTRACE=y option. This option provides the interfaces required by ftrace. To use ftrace, mount the debugfs file system as follows:

mount -t debugfs nodev /sys/kernel/debug

All the ftrace utilities are located in /sys/kernel/debug/tracing/. View the /sys/kernel/debug/tracing/available_tracers file to find out what tracers are available for your kernel:

cat /sys/kernel/debug/tracing/available_tracers

power wakeup irqsoff function sysprof sched_switch initcall nop

To use a specific tracer, write it to /sys/kernel/debug/tracing/current_tracer. For example, wakeup traces and records the maximum time it takes for the highest-priority task to be scheduled after the task wakes up. To use it:

echo wakeup > /sys/kernel/debug/tracing/current_tracer

To start or stop tracing, write to /sys/kernel/debug/tracing/tracing_on, as in:

echo 1 > /sys/kernel/debug/tracing/tracing_on (enables tracing)

echo 0 > /sys/kernel/debug/tracing/tracing_on (disables tracing)

The results of the trace can be viewed from the following files:

/sys/kernel/debug/tracing/trace: This file contains human-readable trace output.
/sys/kernel/debug/tracing/trace_pipe: This file contains the same output as /sys/kernel/debug/tracing/trace, but is meant to be piped into a command. Unlike /sys/kernel/debug/tracing/trace, reading from this file consumes its output.
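The steps in this section can be combined into one helper. The following is a sketch only: the function name trace_command and the TRACING_DIR variable are hypothetical, and the function assumes it is run as root with debugfs already mounted. TRACING_DIR defaults to the directory described above.

```shell
# Hypothetical helper: record a trace with the given tracer while a
# command runs, then print the human-readable result.
# Arguments: tracer name, then the command to run and its arguments.
# TRACING_DIR may be set to override the tracing directory.
trace_command() {
    tracer=$1
    shift
    dir=${TRACING_DIR:-/sys/kernel/debug/tracing}

    echo "$tracer" > "$dir/current_tracer" || return 1   # select the tracer
    echo 1 > "$dir/tracing_on"                           # enable tracing
    "$@"                                                 # the workload to trace
    echo 0 > "$dir/tracing_on"                           # disable tracing
    cat "$dir/trace"                                     # print the results
}

# Example (as root): trace wakeup latency while sleeping for one second.
#   trace_command wakeup sleep 1
```

Reading from trace rather than trace_pipe leaves the recorded output in place, so it can be inspected again after the function returns.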

6.5.2. ftrace Documentation

The ftrace framework is fully documented in the following files:

ftrace - Function Tracer: file:///usr/share/doc/kernel-doc-version/Documentation/trace/ftrace.txt
function tracer guts: file:///usr/share/doc/kernel-doc-version/Documentation/trace/ftrace-design.txt
