SystemTap Beginners Guide

Introduction to SystemTap

Edition 2

Red Hat, Inc.

Don Domingo

Engineering Services and Operations Content Services

William Cohen

Engineering Services and Operations Performance Tools

Edited by

Jacquelynn East

Red Hat Engineering Content Services

Legal Notice

This documentation is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
For more details see the file COPYING in the source distribution of Linux.

Table of Contents

Preface
1. Document Conventions
1.1. Typographic Conventions
1.2. Pull-quote Conventions
1.3. Notes and Warnings
2. Getting Help and Giving Feedback
2.1. Do You Need Help?
2.2. We Need Feedback!
1. Introduction
1.1. Documentation Goals
1.2. SystemTap Capabilities
2. Using SystemTap
2.1. Installation and Setup
2.1.1. Installing SystemTap
2.1.2. Installing Required Kernel Information RPMs
2.1.3. Initial Testing
2.2. Generating Instrumentation for Other Computers
2.3. Running SystemTap Scripts
2.3.1. SystemTap Flight Recorder Mode
3. Understanding How SystemTap Works
3.1. Architecture
3.2. SystemTap Scripts
3.2.1. Event
3.2.2. Systemtap Handler/Body
3.3. Basic SystemTap Handler Constructs
3.3.1. Variables
3.3.2. Conditional Statements
3.3.3. Command-Line Arguments
3.4. Associative Arrays
3.5. Array Operations in SystemTap
3.5.1. Assigning an Associated Value
3.5.2. Reading Values From Arrays
3.5.3. Incrementing Associated Values
3.5.4. Processing Multiple Elements in an Array
3.5.5. Clearing/Deleting Arrays and Array Elements
3.5.6. Using Arrays in Conditional Statements
3.5.7. Computing for Statistical Aggregates
3.6. Tapsets
4. Useful SystemTap Scripts
4.1. Network
4.1.1. Network Profiling
4.1.2. Tracing Functions Called in Network Socket Code
4.1.3. Monitoring Incoming TCP Connections
4.1.4. Monitoring Network Packets Drops in Kernel
4.2. Disk
4.2.1. Summarizing Disk Read/Write Traffic
4.2.2. Tracking I/O Time For Each File Read or Write
4.2.3. Track Cumulative IO
4.2.4. I/O Monitoring (By Device)
4.2.5. Monitoring Reads and Writes to a File
4.2.6. Monitoring Changes to File Attributes
4.3. Profiling
4.3.1. Counting Function Calls Made
4.3.2. Call Graph Tracing
4.3.3. Determining Time Spent in Kernel and User Space
4.3.4. Monitoring Polling Applications
4.3.5. Tracking Most Frequently Used System Calls
4.3.6. Tracking System Call Volume Per Process
4.4. Identifying Contended User-Space Locks
5. Understanding SystemTap Errors
5.1. Parse and Semantic Errors
5.2. Run Time Errors and Warnings
6. References
A. Revision History
Index

Preface

1. Document Conventions

This manual uses several conventions to highlight certain words and phrases and draw attention to specific pieces of information.
In PDF and paper editions, this manual uses typefaces drawn from the Liberation Fonts set. The Liberation Fonts set is also used in HTML editions if the set is installed on your system. If not, alternative but equivalent typefaces are displayed. Note: Red Hat Enterprise Linux 5 and later includes the Liberation Fonts set by default.

1.1. Typographic Conventions

Four typographic conventions are used to call attention to specific words and phrases. These conventions, and the circumstances they apply to, are as follows.
Mono-spaced Bold
Used to highlight system input, including shell commands, file names and paths. Also used to highlight keys and key combinations. For example:
To see the contents of the file my_next_bestselling_novel in your current working directory, enter the cat my_next_bestselling_novel command at the shell prompt and press Enter to execute the command.
The above includes a file name, a shell command and a key, all presented in mono-spaced bold and all distinguishable thanks to context.
Key combinations can be distinguished from an individual key by the plus sign that connects each part of a key combination. For example:
Press Enter to execute the command.
Press Ctrl+Alt+F2 to switch to a virtual terminal.
The first example highlights a particular key to press. The second example highlights a key combination: a set of three keys pressed simultaneously.
If source code is discussed, class names, methods, functions, variable names and returned values mentioned within a paragraph will be presented as above, in mono-spaced bold. For example:
File-related classes include filesystem for file systems, file for files, and dir for directories. Each class has its own associated set of permissions.
Proportional Bold
This denotes words or phrases encountered on a system, including application names; dialog box text; labeled buttons; check-box and radio button labels; menu titles and sub-menu titles. For example:
Choose System → Preferences → Mouse from the main menu bar to launch Mouse Preferences. In the Buttons tab, click the Left-handed mouse check box and click Close to switch the primary mouse button from the left to the right (making the mouse suitable for use in the left hand).
To insert a special character into a gedit file, choose Applications → Accessories → Character Map from the main menu bar. Next, choose Search → Find… from the Character Map menu bar, type the name of the character in the Search field and click Next. The character you sought will be highlighted in the Character Table. Double-click this highlighted character to place it in the Text to copy field and then click the Copy button. Now switch back to your document and choose Edit → Paste from the gedit menu bar.
The above text includes application names; system-wide menu names and items; application-specific menu names; and buttons and text found within a GUI interface, all presented in proportional bold and all distinguishable by context.
Mono-spaced Bold Italic or Proportional Bold Italic
Whether mono-spaced bold or proportional bold, the addition of italics indicates replaceable or variable text. Italics denotes text you do not input literally or displayed text that changes depending on circumstance. For example:
To connect to a remote machine using ssh, type ssh username@domain.name at a shell prompt. If the remote machine is example.com and your username on that machine is john, type ssh john@example.com.
The mount -o remount file-system command remounts the named file system. For example, to remount the /home file system, the command is mount -o remount /home.
To see the version of a currently installed package, use the rpm -q package command. It will return a result as follows: package-version-release.
Note the words in bold italics above - username, domain.name, file-system, package, version and release. Each word is a placeholder, either for text you enter when issuing a command or for text displayed by the system.
Aside from standard usage for presenting the title of a work, italics denotes the first use of a new and important term. For example:
Publican is a DocBook publishing system.

1.2. Pull-quote Conventions

Terminal output and source code listings are set off visually from the surrounding text.
Output sent to a terminal is set in mono-spaced roman and presented thus:
books        Desktop   documentation  drafts  mss    photos   stuff  svn
books_tests  Desktop1  downloads      images  notes  scripts  svgs
Source-code listings are also set in mono-spaced roman but add syntax highlighting as follows:
package org.jboss.book.jca.ex1;

import javax.naming.InitialContext;

public class ExClient
{
   public static void main(String args[]) throws Exception
   {
      InitialContext iniCtx = new InitialContext();
      Object         ref    = iniCtx.lookup("EchoBean");
      EchoHome       home   = (EchoHome) ref;
      Echo           echo   = home.create();

      System.out.println("Created Echo");
      System.out.println("Echo.echo('Hello') = " + echo.echo("Hello"));
   }
}

1.3. Notes and Warnings

Finally, we use three visual styles to draw attention to information that might otherwise be overlooked.

Note

Notes are tips, shortcuts or alternative approaches to the task at hand. Ignoring a note should have no negative consequences, but you might miss out on a trick that makes your life easier.

Important

Important boxes detail things that are easily missed: configuration changes that only apply to the current session, or services that need restarting before an update will apply. Ignoring a box labeled 'Important' will not cause data loss but may cause irritation and frustration.

Warning

Warnings should not be ignored. Ignoring warnings will most likely cause data loss.

2. Getting Help and Giving Feedback

2.1. Do You Need Help?

If you experience difficulty with a procedure described in this documentation, visit the Red Hat Customer Portal at http://access.redhat.com. Through the customer portal, you can:
  • search or browse through a knowledgebase of technical support articles about Red Hat products.
  • submit a support case to Red Hat Global Support Services (GSS).
  • access other product documentation.
Red Hat also hosts a large number of electronic mailing lists for discussion of Red Hat software and technology. You can find a list of publicly available mailing lists at https://www.redhat.com/mailman/listinfo. Click on the name of any mailing list to subscribe to that list or to access the list archives.

2.2. We Need Feedback!

If you find a typographical error in this manual, or if you have thought of a way to make this manual better, we would love to hear from you! Please submit a report in Bugzilla: http://bugzilla.redhat.com/ against the product Red_Hat_Enterprise_Linux.
When submitting a bug report, be sure to mention the manual's identifier: doc-SystemTap_Beginners_Guide
If you have a suggestion for improving the documentation, try to be as specific as possible when describing it. If you have found an error, please include the section number and some of the surrounding text so we can find it easily.

Chapter 1. Introduction

SystemTap is a tracing and probing tool that allows users to study and monitor the activities of the operating system (particularly, the kernel) in fine detail. It provides information similar to the output of tools like netstat, ps, top, and iostat; however, SystemTap is designed to provide more filtering and analysis options for collected information.

1.1. Documentation Goals

SystemTap provides the infrastructure to monitor the running Linux system for detailed analysis. This can assist administrators and developers in identifying the underlying cause of a bug or performance problem.
Without SystemTap, monitoring the activity of a running kernel would require a tedious instrument, recompile, install, and reboot sequence. SystemTap is designed to eliminate this, allowing users to gather the same information by simply running user-written SystemTap scripts.
However, SystemTap was initially designed for users with intermediate to advanced knowledge of the kernel. This makes SystemTap less useful to administrators or developers with limited knowledge of and experience with the Linux kernel. Moreover, much of the existing SystemTap documentation is similarly aimed at knowledgeable and experienced users. This makes learning the tool similarly difficult.
To lower these barriers the SystemTap Beginners Guide was written with the following goals:
  • To introduce users to SystemTap, familiarize them with its architecture, and provide setup instructions for all kernel types.
  • To provide pre-written SystemTap scripts for monitoring detailed activity in different components of the system, along with instructions on how to run them and analyze their output.

1.2. SystemTap Capabilities

  • Flexibility: SystemTap's framework allows users to develop simple scripts for investigating and monitoring a wide variety of kernel functions, system calls, and other events that occur in kernel-space. With this, SystemTap is not so much a tool as it is a system that allows you to develop your own kernel-specific forensic and monitoring tools.
  • Ease-Of-Use: as mentioned earlier, SystemTap allows users to probe kernel-space events without having to resort to the lengthy process of instrumenting, recompiling, installing, and rebooting the kernel.
Most of the SystemTap scripts enumerated in Chapter 4, Useful SystemTap Scripts demonstrate system forensics and monitoring capabilities not natively available with other similar tools (such as top, oprofile, or ps). These scripts are provided to give readers extensive examples of the application of SystemTap, which in turn will educate them further on the capabilities they can employ when writing their own SystemTap scripts.

Chapter 2. Using SystemTap

This chapter instructs users how to install SystemTap, and provides an introduction on how to run SystemTap scripts.

2.1. Installation and Setup

To deploy SystemTap, SystemTap packages along with the corresponding set of -devel, -debuginfo and -debuginfo-common-arch packages for the kernel need to be installed. To use SystemTap on more than one kernel where a system has multiple kernels installed, install the -devel and -debuginfo packages for each of those kernel versions.
These procedures will be discussed in detail in the following sections.

Important

Many users confuse -debuginfo with -debug. Remember that the deployment of SystemTap requires the installation of the -debuginfo package of the kernel, not the -debug version of the kernel.

2.1.1. Installing SystemTap

To deploy SystemTap, install the following RPMs:
  • systemtap
  • systemtap-runtime
Assuming that yum is installed on the system, these two RPMs can be installed with yum install systemtap systemtap-runtime. Install the required kernel information RPMs before using SystemTap.

2.1.2. Installing Required Kernel Information RPMs

SystemTap needs information about the kernel in order to place instrumentation in it (i.e. probe it). This information, which allows SystemTap to generate the code for the instrumentation, is contained in the matching -devel, -debuginfo, and -debuginfo-common-arch packages for the kernel. The necessary -devel and -debuginfo packages for the ordinary "vanilla" kernel are as follows:
  • kernel-debuginfo
  • kernel-debuginfo-common-arch
  • kernel-devel
Likewise, the necessary packages for the PAE kernel would be kernel-PAE-debuginfo, kernel-PAE-debuginfo-common-arch, and kernel-PAE-devel.
To determine what kernel your system is currently using, use:
uname -r
For example, if you wish to use SystemTap on kernel version 2.6.32-53.el6 on an i686 machine, then you would need to download and install the following RPMs:
  • kernel-debuginfo-2.6.32-53.el6.i686.rpm
  • kernel-debuginfo-common-i686-2.6.32-53.el6.i686.rpm
  • kernel-devel-2.6.32-53.el6.i686.rpm
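The naming pattern in the list above can be sketched as a small shell function. This is only an illustration: kernel_info_rpms is a hypothetical helper, not a SystemTap or RPM tool, and it assumes that the kernel version and the architecture are supplied as two separate strings, as they are in the example above.

```shell
#!/bin/bash
# Hypothetical helper (not part of SystemTap): print the kernel information
# RPM file names for the ordinary kernel, following the naming pattern of
# the example above.
kernel_info_rpms() {
    local version="$1"   # kernel version, e.g. 2.6.32-53.el6
    local arch="$2"      # architecture, e.g. i686
    printf 'kernel-debuginfo-%s.%s.rpm\n' "$version" "$arch"
    printf 'kernel-debuginfo-common-%s-%s.%s.rpm\n' "$arch" "$version" "$arch"
    printf 'kernel-devel-%s.%s.rpm\n' "$version" "$arch"
}

# Reproduce the list from the example above.
kernel_info_rpms 2.6.32-53.el6 i686
```

Comparing the printed names against the packages actually installed is one way to spot a version or architecture mismatch before running SystemTap.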

Important

The version, variant, and architecture of the -devel, -debuginfo and -debuginfo-common-arch packages must match the kernel to be probed with SystemTap exactly.
To obtain a list of the channels SystemTap needs on the system, use the following script:
#! /bin/bash
pkg=`rpm -q --whatprovides "redhat-release"`
releasever=`rpm -q --qf "%{version}" $pkg`
variant=`echo $releasever | tr -d "[:digit:]" | tr "[:upper:]" "[:lower:]" `
if test -z "$variant"; then
  echo "No Red Hat Enterprise Linux variant (workstation/client/server) found."
  exit 1
fi
version=`echo $releasever | tr -cd "[:digit:]"`
base=`uname -i`
echo "rhel-$base-$variant-$version"
echo "rhel-$base-$variant-$version-debuginfo"
echo "rhel-$base-$variant-optional-$version-debuginfo"
echo "rhel-$base-$variant-optional-$version"
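To see what the script's tr transformations do without querying the RPM database, the same steps can be applied to fixed sample values. In this sketch, channel_names is a hypothetical helper, and the release string 6Server and the architecture x86_64 are assumed sample inputs rather than values taken from a real system.

```shell
#!/bin/bash
# Hypothetical helper: derive the channel names from a release string and a
# hardware platform, using the same tr steps as the script above.
channel_names() {
    local releasever="$1"   # assumed sample value, e.g. 6Server
    local base="$2"         # assumed sample value, e.g. x86_64
    local variant version
    # Drop the digits and lowercase the rest: 6Server -> server
    variant="$(printf '%s' "$releasever" | tr -d '[:digit:]' | tr '[:upper:]' '[:lower:]')"
    # Keep only the digits: 6Server -> 6
    version="$(printf '%s' "$releasever" | tr -cd '[:digit:]')"
    echo "rhel-$base-$variant-$version"
    echo "rhel-$base-$variant-$version-debuginfo"
    echo "rhel-$base-$variant-optional-$version-debuginfo"
    echo "rhel-$base-$variant-optional-$version"
}

channel_names 6Server x86_64
```

With these sample inputs the first line printed is rhel-x86_64-server-6; the other three follow the same pattern with the -debuginfo and -optional parts added.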
After the channels have been added, install the required -devel, -debuginfo, and -debuginfo-common-arch packages for the kernel using the command debuginfo-install kernelname-version. Replace kernelname with the appropriate kernel variant name (for example, kernel-PAE), and version with the target kernel's version. For example, to install the required kernel information packages for the kernel-PAE-2.6.32-53.el6 kernel, run:
debuginfo-install kernel-PAE-2.6.32-53.el6

2.1.3. Initial Testing

If the kernel to be probed with SystemTap is currently being used, it is possible to immediately test whether the deployment was successful. If a different kernel is to be probed, reboot and load the appropriate kernel.
To start the test, run the command stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'. This command simply instructs SystemTap to print read performed then exit properly once a virtual file system read is detected. If the SystemTap deployment was successful, you should get output similar to the following:
Pass 1: parsed user script and 45 library script(s) in 340usr/0sys/358real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) in 290usr/260sys/568real ms.
Pass 3: translated to C into "/tmp/stapiArgLX/stap_e5886fa50499994e6a87aacdc43cd392_399.c" in 490usr/430sys/938real ms.
Pass 4: compiled C into "stap_e5886fa50499994e6a87aacdc43cd392_399.ko" in 3310usr/430sys/3714real ms.
Pass 5: starting run.
read performed
Pass 5: run completed in 10usr/40sys/73real ms.
The last three lines of the output (i.e. beginning with Pass 5) indicate that SystemTap was able to successfully create the instrumentation to probe the kernel, run the instrumentation, detect the event being probed (in this case, a virtual file system read), and execute a valid handler (print text then close it with no errors).
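The check described above can be automated with a few grep calls. The following is a minimal sketch: stap_smoke_test_ok is a hypothetical helper, and it assumes the three final lines appear exactly as in the sample output. The demonstration feeds it those sample lines, so it runs even on a machine without SystemTap.

```shell
#!/bin/bash
# Hypothetical helper (not part of SystemTap): read "stap -v" output on
# standard input and succeed only if both Pass 5 lines and the handler's
# message are present.
stap_smoke_test_ok() {
    local output
    output="$(cat)"
    printf '%s\n' "$output" | grep -q '^Pass 5: starting run\.$' &&
        printf '%s\n' "$output" | grep -q '^read performed$' &&
        printf '%s\n' "$output" | grep -q '^Pass 5: run completed'
}

# Demonstration on the last three lines of the sample output.
if stap_smoke_test_ok <<'EOF'
Pass 5: starting run.
read performed
Pass 5: run completed in 10usr/40sys/73real ms.
EOF
then
    echo "deployment test passed"
else
    echo "deployment test failed"
fi
```

On a system with SystemTap deployed, the output of the test command could be piped into the function instead; adding 2>&1 before the pipe is a reasonable precaution in case the Pass messages are written to standard error.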

2.2. Generating Instrumentation for Other Computers

When users run a SystemTap script, a kernel module is built out of that script. SystemTap then loads the module into the kernel, allowing it to extract the specified data directly from the kernel (refer to Procedure 3.1, "SystemTap Session" in Section 3.1, "Architecture" for more information).
Normally, SystemTap scripts can only be run on systems where SystemTap is deployed (as in Section 2.1, "Installation and Setup"). This could mean that to run SystemTap on ten systems, SystemTap needs to be deployed on all those systems. In some cases, this may be neither feasible nor desired. For instance, corporate policy may prohibit an administrator from installing RPMs that provide compilers or debug information on specific machines, which will prevent the deployment of SystemTap.
To work around this, use cross-instrumentation. Cross-instrumentation is the process of generating SystemTap instrumentation modules from a SystemTap script on one computer to be used on another computer. This process offers the following benefits:
  • The kernel information packages for various machines can be installed on a single host machine.
  • Each target machine only needs one RPM to be installed to use the generated SystemTap instrumentation module: systemtap-runtime.

Note

For the sake of simplicity, the following terms will be used throughout this section:
  • instrumentation module - the kernel module built from a SystemTap script; i.e. the SystemTap module is built on the host system, and will be loaded on the target kernel of the target system.
  • host system - the system on which the instrumentation modules (from SystemTap scripts) are compiled, to be loaded on target systems.
  • target system - the system on which the instrumentation module (built from SystemTap scripts on the host system) is loaded and run.
  • target kernel - the kernel of the target system. This is the kernel which loads/runs the instrumentation module.

Procedure 2.1. Configuring a Host System and Target Systems

  1. Install the systemtap-runtime RPM on each target system.
  2. Determine the kernel running on each target system by running uname -r on each target system.
  3. Install SystemTap on the host system. The instrumentation module will be built for the target systems on the host system. For instructions on how to install SystemTap, refer to Section 2.1.1, "Installing SystemTap".
  4. Using the target kernel version determined earlier, install the target kernel and related RPMs on the host system by the method described in Section 2.1.2, "Installing Required Kernel Information RPMs". If multiple target systems use different target kernels, repeat this step for each different kernel used on the target systems.
After performing Procedure 2.1, "Configuring a Host System and Target Systems", the instrumentation module (for any target system) can now be built on the host system.
To build the instrumentation module, run the following command on the host system (be sure to specify the appropriate values):
stap -r kernel_version script -m module_name -p4
Here, kernel_version refers to the version of the target kernel (the output of uname -r on the target machine), script refers to the script to be converted into an instrumentation module, and module_name is the desired name of the instrumentation module.

Note

To determine the architecture notation of a running kernel, run uname -m.
Once the instrumentation module is compiled, copy it to the target system and then load it using:
staprun module_name.ko
For example, to create the instrumentation module simple.ko for the target kernel 2.6.32-53.el6 from a short script passed inline with the -e option, use the following command:
stap -r 2.6.32-53.el6 -e 'probe vfs.read {exit()}' -m simple -p4
This will create a module named simple.ko. To use the instrumentation module simple.ko, copy it to the target system and run the following command (on the target system):
staprun simple.ko
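When several target systems run different kernels, the build command has to be repeated once per kernel version. The sketch below only prints the commands so they can be reviewed first; it does not run stap. The helper print_cross_build_cmds and the second kernel version are hypothetical, and restricting module names to lowercase letters, digits, and underscores is a conservative assumption rather than a documented rule.

```shell
#!/bin/bash
# Dry-run sketch: print (do not execute) one stap build command per target
# kernel. print_cross_build_cmds is a hypothetical helper, not a SystemTap tool.
print_cross_build_cmds() {
    local script="$1"; shift
    local base kver name
    base="$(basename "$script" .stp)"
    for kver in "$@"; do
        # Conservative assumption: keep the module name to lowercase letters,
        # digits, and underscores by replacing every other character.
        name="$(printf '%s_%s' "$base" "$kver" | tr 'A-Z' 'a-z' | tr -c 'a-z0-9_' '_')"
        printf 'stap -r %s %s -m %s -p4\n' "$kver" "$script" "$name"
    done
}

# The second kernel version is a made-up example of another target kernel.
print_cross_build_cmds simple.stp 2.6.32-53.el6 2.6.32-71.el6
```

Each printed command yields a separate module file, so every target system can be given the module built for its own kernel.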

Important

The host system must be the same architecture and running the same distribution of Linux as the target system in order for the built instrumentation module to work.

2.3. Running SystemTap Scripts

SystemTap scripts are run through the command stap. stap can run SystemTap scripts from standard input or from file.
Running stap and staprun requires elevated privileges to the system. However, not all users can be granted root access just to run SystemTap. In some cases, for instance, a non-privileged user may need to run SystemTap instrumentation on their machine.
To allow ordinary users to run SystemTap without root access, add them to both of these user groups:
stapdev
Members of this group can use stap to run SystemTap scripts, or staprun to run SystemTap instrumentation modules.
Running stap involves compiling SystemTap scripts into kernel modules and loading them into the kernel. This requires elevated privileges to the system, which are granted to stapdev members. Unfortunately, such privileges also grant effective root access to stapdev members. As such, only grant stapdev group membership to users who can be trusted with root access.
stapusr
Members of this group can only use staprun to run SystemTap instrumentation modules. In addition, they can only run those modules from /lib/modules/kernel_version/systemtap/. Note that this directory must be owned only by the root user, and must only be writable by the root user.

Running SystemTap Scripts

In order to run SystemTap scripts a user must be in both the stapdev and stapusr groups.
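A quick way to verify the group requirement is to inspect the list of groups a user belongs to. In this sketch, in_both_stap_groups is a hypothetical helper that takes a space-separated list of group names, such as the list printed by id -nG.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the space-separated group list
# contains both stapdev and stapusr.
in_both_stap_groups() {
    local list=" $1 "   # pad with spaces so whole names are matched
    case "$list" in *" stapdev "*) ;; *) return 1 ;; esac
    case "$list" in *" stapusr "*) ;; *) return 1 ;; esac
    return 0
}

# Check the current user; id -nG prints the names of the user's groups.
if in_both_stap_groups "$(id -nG)"; then
    echo "current user is in both stapdev and stapusr"
else
    echo "current user is missing stapdev and/or stapusr"
fi
```

Matching on space-padded names avoids false positives from groups whose names merely contain stapdev or stapusr as a substring.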
Below is a list of commonly used stap options:
-v
Makes the output of the SystemTap session more verbose. This option can be repeated (for example, stap -vvv script.stp) to provide more details on the script's execution. It is particularly useful if errors are encountered when running the script.
For more information about common SystemTap script errors, refer to Chapter 5, Understanding SystemTap Errors.
-o filename
Sends the standard output to file (filename).
-S size,count
Limit files to size megabytes and limit the number of files kept around to count. The file names will have a sequence number suffix. This option implements logrotate operations for SystemTap.
When used with -o, the -S will limit the size of log files.
-x process ID
Sets the SystemTap handler function target() to the specified process ID. For more information about target(), refer to SystemTap Functions.
-c command
Sets the SystemTap handler function target() to the specified command. The full path to the specified command must be used; for example, instead of specifying cp, use /bin/cp (as in stap script -c /bin/cp). For more information about target(), refer to SystemTap Functions.
-e 'script'
Uses the script string rather than a file as input for the SystemTap translator.
-F
Use SystemTap's Flight recorder mode and make the script a background process. For more information about flight recorder mode, refer to Section 2.3.1, "SystemTap Flight Recorder Mode".
stap can also be instructed to run scripts from standard input using the switch -. To illustrate:

Example 2.1. Running Scripts From Standard Input

echo "probe timer.s(1) {exit()}" | stap -

Example 2.1, "Running Scripts From Standard Input" instructs stap to run the script passed by echo to standard input. Any stap options to be used should be inserted before the - switch; for instance, to make the example in Example 2.1, "Running Scripts From Standard Input" more verbose, the command would be:
echo "probe timer.s(1) {exit()}" | stap -v -
For more information about stap, refer to man stap.
To run SystemTap instrumentation (i.e. the kernel module built from SystemTap scripts during a cross-instrumentation), use staprun instead. For more information about staprun and cross-instrumentation, refer to Section 2.2, "Generating Instrumentation for Other Computers".

Note

The stap options -v and -o also work for staprun. For more information about staprun, refer to man staprun.

2.3.1. SystemTap Flight Recorder Mode

SystemTap's flight recorder mode allows a SystemTap script to be run for long periods while keeping only the most recent output. The flight recorder mode (the -F option) limits the amount of output generated. There are two variations of the flight recorder mode: in-memory and file mode. In both cases the SystemTap script runs as a background process.

2.3.1.1. In-memory Flight Recorder

When flight recorder mode (the -F option) is used without a file name, SystemTap uses a buffer in kernel memory to store the output of the script. The SystemTap instrumentation module then loads and the probes start running, after which the instrumentation detaches and is put in the background. When the interesting event occurs, the instrumentation can be reattached to view the recent output in the memory buffer along with any continuing output. The following command starts a script using the flight recorder in-memory mode:
stap -F /usr/share/doc/systemtap-version/examples/io/iotime.stp
Once the script starts, a message that provides the command to reconnect to the running script will appear:
Disconnecting from systemtap module.
To reconnect, type "staprun -A stap_5dd0073edcb1f13f7565d8c343063e68_19556"
When the interesting event occurs, reattach to the currently running script and output the recent data in the memory buffer, then get the continuing output with the following command:
staprun -A stap_5dd0073edcb1f13f7565d8c343063e68_19556
By default, the kernel buffer is 1MB in size, but it can be increased with the -s option, which specifies the buffer size in megabytes (rounded up to the next power of 2). For example, -s2 on the SystemTap command line would specify 2MB for the buffer.
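The rounding mentioned above can be illustrated with a short calculation. This sketch assumes that the rounding picks the smallest power of 2 that is at least the requested size; round_up_pow2 is a hypothetical helper and does not query SystemTap itself.

```shell
#!/bin/bash
# Hypothetical helper: print the smallest power of 2 that is greater than
# or equal to the requested size in megabytes.
round_up_pow2() {
    local want="$1" size=1
    while [ "$size" -lt "$want" ]; do
        size=$((size * 2))
    done
    echo "$size"
}

# Show the assumed effective buffer size for a few requested values.
for mb in 1 2 3 5; do
    echo "-s$mb -> $(round_up_pow2 "$mb")MB"
done
```

Under that assumption, requesting 3MB or 5MB would result in 4MB and 8MB buffers respectively, while 1MB and 2MB are already powers of 2 and stay unchanged.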

2.3.1.2. File Flight Recorder

The flight recorder mode can also store data to files. The number and size of the files kept is controlled by the -S option followed by two numerical arguments separated by a comma. The first argument is the maximum size in megabytes for each output file. The second argument is the number of recent files to keep. The file name is specified by the -o option followed by the name. SystemTap adds a number suffix to the file name to indicate the order of the files. The following command starts SystemTap in file flight recorder mode, with the output going to files named /tmp/pfaults.log.[0-9]+, each file 1MB or smaller, and only the latest two files kept:
stap -F -o /tmp/pfaults.log -S 1,2  pfaults.stp
The number printed by the command is the process ID. Sending a SIGTERM to the process will shut down the SystemTap script and stop the data collection. For example, if the previous command listed 7590 as the process ID, the following command would shut down the SystemTap script:
kill -s SIGTERM 7590
Only the two most recent files generated by the script are kept; older files are removed. Thus, ls -sh /tmp/pfaults.log.* shows only two files:
1020K /tmp/pfaults.log.5 44K /tmp/pfaults.log.6
One can look at the highest number file for the latest data, in this case /tmp/pfaults.log.6.
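Picking the highest-numbered file can be scripted. The sketch below compares the suffixes numerically, because a plain alphabetical sort would place a file ending in .10 before one ending in .9. The helper latest_flight_log is hypothetical, and it assumes the suffix is a plain decimal number as in the listing above.

```shell
#!/bin/bash
# Hypothetical helper: print the file named <prefix>.<number> with the
# highest number, or fail if no such file exists.
latest_flight_log() {
    local prefix="$1" f n best=-1 latest=""
    for f in "$prefix".*; do
        [ -e "$f" ] || continue
        n="${f##*.}"                                   # text after the last dot
        case "$n" in ''|*[!0-9]*) continue ;; esac     # skip non-numeric suffixes
        if [ "$n" -gt "$best" ]; then
            best="$n"
            latest="$f"
        fi
    done
    [ -n "$latest" ] && echo "$latest"
}

# Usage with the file name from the example above.
latest_flight_log /tmp/pfaults.log || echo "no flight recorder files found"
```

The printed path can then be passed to a pager or to tail to read the most recent data.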