Diffstat (limited to 'stap.1.in')
-rw-r--r-- stap.1.in 615
1 files changed, 615 insertions, 0 deletions
diff --git a/stap.1.in b/stap.1.in
new file mode 100644
index 00000000..dc421cdd
--- /dev/null
+++ b/stap.1.in
@@ -0,0 +1,615 @@
+.TH STAP 1 @DATE@ "Red Hat"
+.SH NAME
+stap \- systemtap script translator/driver
+.SH SYNOPSIS
+
+.br
+.B stap
+[
+.I OPTIONS
+]
+.I FILENAME
+.br
+.B stap
+[
+.I OPTIONS
+]
+.B \-
+.br
+.B stap
+[
+.I OPTIONS
+]
+.BI \-e " SCRIPT"
+
+.SH DESCRIPTION
+
+The
+.IR stap
+program is the front-end to the Systemtap tool. It accepts probing
+instructions (written in a simple scripting language), translates
+those instructions into C code, compiles this C code, and loads the
+resulting kernel module into a running Linux kernel to perform the
+requested system trace/probe functions. You can supply the script in
+a named file, from standard input, or from the command line.
+.PP
+The language, which is described in a later section, is strictly typed,
+declaration free, procedural, and inspired by
+.IR dtrace
+and
+.IR awk .
+It allows source code points or events in the kernel to be associated
+with handlers, which are subroutines that are executed synchronously. It is
+somewhat similar conceptually to "breakpoint command lists" in the
+.IR gdb
+debugger.
+.PP
+This manual corresponds to version @VERSION@.
+
+.SH OPTIONS
+The systemtap translator supports the following options. Any other option
+prints a list of supported options.
+.\" undocumented for now:
+.\" -t test mode
+.\" -r RELEASE
+.TP
+.B \-v
+Verbose mode. Produces more informative output.
+.TP
+.B \-k
+Keep the temporary directory after all processing. This may be useful
+in order to examine the generated C code, or to reuse the compiled
+kernel object.
+.TP
+.B \-g
+Guru mode. Enables parsing of unsafe expert-level constructs like
+embedded C.
+.TP
+.BI \-p " NUM"
+Stop after pass NUM. The passes are numbered 1-5: parse, elaborate,
+translate, compile, run. See the
+.B PROCESSING
+section for details.
+.TP
+.BI \-I " DIR"
+Add the given directory to the tapset search path. See the
+description of pass 1 for details.
+.TP
+.BI \-R " DIR"
+Look for the systemtap runtime sources in the given directory.
+.TP
+.BI \-m " MODULE"
+Use the given name for the generated kernel object module, instead
+of a unique randomized name.
+.TP
+.BI \-o " FILE"
+Send standard output to the named file.
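+.PP
+As an illustrative sketch, the following invocation combines several of
+these options: it prints verbose output, keeps the temporary directory,
+names the generated module, and stops after pass 4. The file name
+.I example.stp
+and the module name
+.I example_mod
+are hypothetical.
+.RS
+.nf
+stap -v -k -p4 -m example_mod example.stp
+.fi
+.RE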
+
+.SH SCRIPT LANGUAGE
+
+The systemtap script language resembles
+.IR awk .
+There are two main outermost constructs: probes and functions. Within
+these, statements and expressions use C-like operator syntax and
+precedence.
+
+.SS GENERAL SYNTAX
+Whitespace is ignored. Three forms of comments are supported:
+.RS
+.br
+.BR # " ... shell style, to the end of line"
+.br
+.BR // " ... C++ style, to the end of line"
+.br
+.BR /* " ... C style ... " */
+.RE
+Literals are either strings enclosed in double-quotes (soon supporting
+the usual C escape codes with backslashes), or integers (in decimal,
+hexadecimal, or octal, using the same notation as in C). All strings
+are limited in length to some reasonable value (a few hundred bytes).
+Integers are 64-bit signed quantities, although the parser also accepts
+(and wraps around) values above positive 2**63.
+
+.SS VARIABLES
+Identifiers for variables and functions are an alphanumeric sequence,
+and may include "_" and "$" characters. They may not start with a
+plain digit, as in C. Each variable is by default local to the probe
+or function statement block within which it is mentioned, and therefore
+its scope and lifetime are limited to a particular probe or function
+invocation.
+.\" XXX add statistics type here once it's supported
+.PP
+Scalar variables are implicitly typed as either string or integer.
+Associative arrays also have a string or integer value, and
+a tuple of strings and/or integers serving as a key.
+The translator performs
+.I type inference
+on all identifiers, including array indexes and function parameters.
+Inconsistent type-related use of identifiers signals an error.
+.PP
+Variables may be declared global, so that they are shared amongst all
+probes and live as long as the entire systemtap session. There is one
+namespace for all global variables, regardless of which script file
+they are found within. A global declaration may be written anywhere
+at the outermost level, but not within a block of code. The following
+declaration marks "var1" and "var2" as global. The translator will
+infer for each its value type, and if it is used as an array, its key
+types.
+.RS
+.BR global " var1" , " var2"
+.RE
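+.PP
+For instance, in the following sketch (the name "calls" is only
+illustrative), the translator would be expected to infer that "calls"
+is an array with a string key and an integer value:
+.RS
+.nf
+global calls
+probe kernel.function("sys_read") {
+  calls["read"] ++
+}
+.fi
+.RE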
+.\" XXX add statistics type here once it's supported
+
+.SS STATEMENTS
+Statements enable procedural control flow. They may occur within
+functions and probe handlers. The total number of statements executed
+in response to any single probe event is limited to some number
+defined by a macro in the translated C code, and is in the
+neighbourhood of 1000.
+.TP
+EXP
+Execute the string- or integer-valued expression and throw away
+the value.
+.TP
+.BR { " STMT1 STMT2 ... " }
+Execute each statement in sequence in this block. Note that
+separators or terminators are generally not necessary between statements.
+.TP
+.BR ;
+Null statement, do nothing. It is useful as an optional separator between
+statements to improve syntax-error detection and to handle certain
+grammar ambiguities.
+.TP
+.BR if " (EXP) STMT1 [ " else " STMT2 ]"
+Compare integer-valued EXP to zero. Execute STMT1 if EXP is non-zero;
+otherwise execute STMT2, if it is given.
+.TP
+.BR while " (EXP) STMT"
+While integer-valued EXP evaluates to non-zero, execute STMT.
+.TP
+.BR for " (EXP1; EXP2; EXP3) STMT"
+Execute EXP1 as initialization. While EXP2 is non-zero, execute
+STMT, then the iteration expression EXP3.
+.TP
+.BR foreach " (VAR " in " ARRAY) STMT"
+Loop over each element of the named global array, assigning the current
+key to VAR. The array may not be modified within the statement.
+.TP
+.BR foreach " ([VAR1, VAR2, ...] " in " ARRAY) STMT"
+Same as above, used when the array is indexed with a tuple of keys.
+.TP
+.BR break ", " continue
+Exit, or skip to the next iteration of, the innermost enclosing loop
+.RB ( while " or " for " or " foreach )
+statement.
+.TP
+.BR return " EXP"
+Return EXP value from enclosing function. A return value is mandatory,
+since void functions are not supported.
+.TP
+.BR next
+Return now from enclosing probe handler.
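+.PP
+As a small sketch of these statements working together, the following
+hypothetical function sums the integers from 1 to n. No separators are
+needed between the statements.
+.RS
+.nf
+function sum_to (n) {
+  total = 0
+  for (i = 1; i <= n; i++)
+    total += i
+  return total
+}
+.fi
+.RE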
+
+.SS EXPRESSIONS
+Systemtap supports a number of operators that have the same general syntax,
+semantics, and precedence as in C and awk. Arithmetic is performed as per
+C rules. Division by zero is detected and results in an error.
+.TP
+binary numeric operators
+.B * / % + - >> << & ^ | && ||
+.TP
+binary string operators
+.B .
+(string concatenation)
+.TP
+numeric assignment operators
+.B = *= /= %= += -= >>= <<= &= ^= |=
+.TP
+string assignment operators
+.B = .=
+.TP
+unary numeric operators
+.B - ! ~ ++ --
+.TP
+binary numeric or string comparison operators
+.B < > <= >= == !=
+.TP
+ternary operator
+.RB cond " ? " exp1 " : " exp2
+.TP
+grouping operator
+.BR ( " exp " )
+.TP
+function call
+.RB "fn " ( "[ arg1, arg2, ... ]" )
+
+.SS PROBES
+The main construct in the scripting language identifies probes.
+Probes associate abstract events with a statement block ("probe
+handler") that is to be executed when those events occur. The
+general syntax is as follows:
+.RS
+.br
+.nh
+.nf
+.BR probe " PROBEPOINT [" , " PROBEPOINT] " { " [STMT ...] " }
+.hy
+.fi
+.RE
+.PP
+Events are specified in a special syntax called "probe points". One
+family refers to specific points in a kernel, which are identified by
+module, source file, line number, function name, C label name, or some
+combination of these. This kind of "synchronous" event is deemed to
+occur when any processor executes an instruction matched by the
+specification. Other families of probe points refer to "asynchronous"
+events such as timers/counters rolling over, where there is no fixed
+execution point that is related. Each probe point specification may
+match multiple physical locations, all of which are then probed. A
+probe declaration may also contain several comma-separated
+specifications, all of which are probed.
+.PP
+Here is a list of probe point families currently supported. The
+.B .function
+variant places a probe near the beginning of the named function, so that
+parameters are available as context variables. The
+.B .return
+variant places a probe at the moment of return from the named function, so
+the return value is available as the "$retvalue" context variable.
+The
+.B .statement
+variant places a probe at the exact spot, exposing those local variables
+that are visible there.
+.RS
+.nf
+.br
+kernel.function(PATTERN)
+.br
+kernel.function(PATTERN).return
+.br
+module(MPATTERN).function(PATTERN)
+.br
+module(MPATTERN).function(PATTERN).return
+.br
+kernel.statement(PATTERN)
+.br
+module(MPATTERN).statement(PATTERN)
+.fi
+.RE
+.PP
+In the above list, MPATTERN stands for a string literal that aims to
+identify the loaded kernel module of interest. It may include "*" and
+"?" wildcards. PATTERN stands for a string literal that aims to
+identify a point in the program. It is made up of three parts. The
+first part is the name of a function, as would appear in the
+.I nm
+program's output. This part may use the "*" and "?" wildcarding
+operators to match multiple names. The second part is optional, and
+begins with the "@" character. It is followed by a source file name
+wildcard pattern, such as
+.IR mm/slab* .
+Finally, the third part is optional if the file name part was given,
+and identifies the line number in the source file, preceded by a ":".
+As an alternative, PATTERN may be a numeric constant, indicating an
+(module-relative or kernel-absolute) address.
+.PP
+Here are some example probe points:
+.TP
+kernel.function("*init*"), kernel.function("*exit*")
+refers to all kernel functions with "init" or "exit" in the name.
+.TP
+kernel.function("*@kernel/sched.c:240")
+refers to any functions within the "kernel/sched.c" file that span
+line 240.
+.TP
+module("usb*").function("*sync*").return
+refers to the moment of return from all functions with "sync" in the
+name in any of the USB drivers.
+.TP
+kernel.statement(0xc0044852)
+refers to the first byte of the statement whose compiled instructions
+include the given address in the kernel.
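+.PP
+A complete probe pairs one or more such probe points with a handler.
+As a sketch, this probe uses a comma-separated list so that one handler
+runs for either pattern; the message text is arbitrary:
+.RS
+.nf
+probe kernel.function("*init*"), kernel.function("*exit*") {
+  log ("init or exit function entered")
+}
+.fi
+.RE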
+
+.PP
+When any matching event occurs, the probe handler is run within that
+context. For events that are defined by execution of specific parts
+of code, this context may include variables defined in the source code
+at that spot. These "target variables" are presented to the script as
+variables whose names are prefixed with "$". They may be read/written
+only if the kernel's compiler preserved them despite optimization.
+This is the same constraint that a debugger user faces when working
+with optimized code. Asynchronous probes have very little context.
+.PP
+In addition, "probe aliases" may be defined. Probe aliases look
+similar to probe definitions, but instead of activating a probe at the
+given point, an alias defines a new probe point name for an existing
+one. This is identified by the "=" assignment operator. In addition,
+the probe handler defined with an alias is implicitly added as a
+prologue to any probe that refers to the alias. For example:
+.RS
+.nf
+.nh
+probe syscall("read") = kernel.function("sys_read") {
+ fildes = $fd
+}
+.hy
+.fi
+.RE
+defines a new probe point
+.nh
+.IR syscall("read") ,
+.hy
+which expands to
+.nh
+.IR kernel.function("sys_read") ,
+.hy
+with the given assignment as a prologue. Another probe definition
+may use the alias like this:
+.RS
+.nf
+probe syscall("read") {
+ printk ("reading fd=" . string (fildes))
+}
+.fi
+.RE
+
+.SS FUNCTIONS
+Systemtap scripts may define subroutines to factor out common work.
+Functions take any number of scalar (integer or string) arguments, and
+must return a single scalar (integer or string). An example function
+declaration looks like this:
+.RS
+.nf
+function thisfn (arg1, arg2) {
+ return arg1 + arg2
+}
+.fi
+.RE
+Note the usual absence of type declarations, which are instead
+inferred by the translator. Because a return value type is required,
+each function must contain at least one
+.I return
+statement. Functions may call others or themselves recursively, up to
+a fixed nesting limit. This limit is defined by a macro in the
+translated C code and is in the neighbourhood of 30.
+
+.SS EMBEDDED C
+When in guru mode, the translator accepts embedded code in the
+script. Such code is enclosed between
+.IR %{
+and
+.IR %}
+markers, and is transcribed verbatim, without analysis, in some
+sequence, into the generated C code. At the outermost level, this may
+be useful to add
+.IR #include
+instructions, and any auxiliary definitions for use by other embedded
+code.
+.PP
+The other place where embedded code is permitted is as a function body.
+In this case, the script language body is replaced entirely by a piece
+of C code enclosed again between
+.IR %{ " and " %}
+markers.
+This C code may do anything reasonable and safe. There are a number
+of undocumented but complex safety constraints on concurrency,
+resource consumption, and runtime limits, so this is an advanced
+technique.
+.PP
+The memory locations set aside for input and output values
+are made available to it using a macro
+.IR THIS .
+Here are some examples:
+.RS
+.br
+.nf
+function add_one (val) %{
+ THIS->__retvalue = THIS->val + 1;
+%}
+function add_one_str (val) %{
+ strlcpy (THIS->__retvalue, THIS->val, MAXSTRINGLEN);
+ strlcat (THIS->__retvalue, "one", MAXSTRINGLEN);
+%}
+.fi
+.RE
+The function argument and return value types have to be inferred by
+the translator from the call sites in order for this to work. The
+user should examine C code generated for ordinary script-language
+functions in order to write compatible embedded-C ones.
+
+.SS BUILT-INS
+A set of builtin functions and probe aliases is provided by the
+scripts installed under the
+.nh
+.IR /usr/share/systemtap/tapset
+.hy
+directory.
+
+.SH PROCESSING
+The translator begins pass 1 by parsing the given input script,
+and all scripts (files named
+.IR *.stp )
+found in a tapset directory. The directories listed
+with
+.BR \-I
+are processed in sequence. For each directory, a number of subdirectories
+are also searched. These subdirectories are derived from the selected
+kernel version (the
+.BR \-r
+option),
+in order to allow more kernel-version-specific scripts to override less
+specific ones. For example, for a kernel version
+.IR 2.6.12-23.FC3
+the following patterns would be searched, in sequence:
+.IR 2.6.12-23.FC3/*.stp ,
+.IR 2.6.12/*.stp ,
+.IR 2.6/*.stp ,
+and finally
+.IR *.stp .
+Stopping the translator after pass 1 causes it to print the parse trees.
+
+.PP
+In pass 2, the translator analyzes the input script to resolve symbols
+and types. References to variables, functions, and probe aliases that
+are unresolved internally are satisfied by searching through the
+parsed tapset scripts. If any tapset script is selected because it
+defines an unresolved symbol, then the entirety of that script is
+added to the translator's resolution queue. This process iterates
+until all symbols are resolved and a subset of tapset scripts is
+selected.
+.PP
+Next, all probe point descriptions are validated
+against the wide variety supported by the translator. Probe points that
+refer to code locations ("synchronous probe points") require the
+appropriate kernel debugging information to be installed. In the
+associated probe handlers, target-side variables (whose names begin
+with "$") are found and have their run-time locations decoded.
+.PP
+Finally, all variable, function, parameter, array, and index types are
+inferred from context (literals and operators). Stopping the
+translator after pass 2 causes it to list all the probes, functions,
+and variables, along with all inferred types. Any inconsistent or
+unresolved types cause an error.
+
+.PP
+In pass 3, the translator writes C code that represents the actions
+of all selected script files, and creates a
+.IR Makefile
+to build that into a kernel object. These files are placed into a
+temporary directory. Stopping the translator at this point causes
+it to print the contents of the C file.
+
+.PP
+In pass 4, the translator invokes the Linux kernel build system to
+create the actual kernel object file. This involves running
+.IR make
+in the temporary directory, and requires a kernel module build
+system (headers, config and Makefiles) to be installed in the usual
+spot
+.IR /lib/modules/VERSION/build .
+Stopping the translator after pass 4 is the last chance before
+running the kernel object. This may be useful if you want to
+archive the file.
+
+.PP
+In pass 5, the translator invokes the systemtap auxiliary program
+.I stpd
+for the given kernel object. This program arranges to load
+the module and then communicates with it, copying trace data from the
+kernel into temporary files, until the user sends an interrupt signal.
+Any run-time error encountered by the probe handlers, such as running
+out of memory, division by zero, exceeding nesting or runtime limits,
+results in an error condition that prevents further probes from
+running. Finally, stpd unloads the module, and cleans up.
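+.PP
+The
+.B \-p
+option can be used to inspect the output of each pass described above.
+For example, assuming a script file named
+.I example.stp
+(a hypothetical name):
+.RS
+.nf
+stap -p1 example.stp    # print the parse trees
+stap -p2 example.stp    # list probes, functions, variables, and types
+stap -p3 example.stp    # print the generated C code
+stap -p4 example.stp    # build the kernel object, but do not run it
+.fi
+.RE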
+
+.SH EXAMPLES
+To trace entry and exit from a function, use a pair of probes:
+.RS
+.br
+.nf
+probe kernel.function("foo") { log ("enter") }
+probe kernel.function("foo").return { log ("exit") }
+.fi
+.RE
+
+To list the probeable functions in the kernel, use
+.RS
+.br
+.nf
+stap -p2 -e 'probe kernel.function("*") {}'
+.fi
+.RE
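+
+To report the value returned by a function, probe its return point and
+read "$retvalue". This is a sketch: it assumes that "sys_read" exists
+in the running kernel, that its debugging information is installed, and
+that the return value is an integer that string() can convert.
+.RS
+.br
+.nf
+probe kernel.function("sys_read").return {
+  log ("sys_read returned " . string ($retvalue))
+}
+.fi
+.RE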
+
+.SH SAFETY AND SECURITY
+Systemtap is an administrative tool. It exposes kernel internal data
+structures and potentially private user information. It acquires root
+privileges to actually run the kernel objects it builds using the
+.IR sudo
+command applied to the
+.IR stpd
+program. The latter is a part of the Systemtap package, dedicated to
+module loading and unloading (but only in the white zone), and
+kernel-to-user data transfer. Since
+.IR stpd
+does not perform any additional security checks on the kernel objects
+it is given, it would be unwise for a system administrator to give
+even targeted
+.IR sudo
+privileges to untrusted users.
+.PP
+The translator asserts certain safety constraints. It aims to ensure
+that no handler routine can run for very long, allocate memory,
+perform unsafe operations, or unintentionally interfere with the
+kernel. Use of guru mode constructs such as embedded C can violate
+these constraints, leading to a kernel crash or data corruption.
+
+.SH FILES
+.\" consider autoconf-substituting these directories
+.TP
+/tmp/stapXXXXXX
+Temporary directory for systemtap files, including translated C code
+and kernel object.
+.TP
+/usr/share/systemtap/tapset
+The automatic tapset search directory, unless overridden by
+the
+.I SYSTEMTAP_TAPSET
+environment variable.
+.TP
+/usr/share/systemtap/runtime
+The runtime sources, unless overridden by the
+.I SYSTEMTAP_RUNTIME
+environment variable.
+.TP
+/lib/modules/VERSION/build
+The location of kernel module building infrastructure.
+.TP
+/usr/lib/debug/lib/modules/VERSION
+The location of kernel debugging information when packaged into the
+.IR kernel-debuginfo
+RPM.
+.TP
+/usr/libexec/systemtap/stpd
+The auxiliary program supervising module loading, interaction, and
+unloading.
+
+.SH SEE ALSO
+.IR dtrace (1),
+.IR dprobes (1),
+.IR awk (1),
+.IR sudo (8),
+.IR elfutils (3),
+.IR gdb (1)
+
+.SH BUGS
+There are numerous missing features and possibly numerous bugs. Report
+them using the Bugzilla link on the project web page:
+.nh
+.BR http://sources.redhat.com/systemtap/ .
+.hy
+
+.SH AUTHORS
+The
+.IR stap
+translator was written by Frank Ch. Eigler and Graydon Hoare. The
+kernel-side runtime library and the user-level
+.IR stpd
+daemon were written by Martin Hunt and Tom Zanussi. Contact them
+using the public mailing list:
+.nh
+.BR <systemtap@sources.redhat.com> .
+.hy
+
+.SH ACKNOWLEDGEMENTS
+The script language design was inspired by Sun's
+.IR dtrace .
+The primary probing mechanism uses IBM's
+.IR kprobes
+and
+.IR relayfs
+packages, which were improved and ported by IBM and Intel staff.
+The elfutils library from Ulrich Drepper and Roland McGrath is used
+to process DWARF debugging information. Many project members contributed
+to the overall design and priorities of the system, including Will Cohen,
+Jim Keniston, Vara Prasad, and Brad Chen.
+