.\" t
.TH STAP 1 "July 28 2005" "Red Hat"
.SH NAME
stap \- systemtap script translator/driver
.SH SYNOPSIS

.br
.B stap
[
.IR OPTIONS
]
.RI FILENAME
.br
.B stap
[
.IR OPTIONS
]
.BI -
.br
.B stap
[
.IR OPTIONS
]
.BI -e " SCRIPT "

.SH DESCRIPTION

The
.IR stap
program is the front-end to the Systemtap tool.  
It accepts probing instructions (written in a simple scripting language), translates
those instructions into C code, compiles this C code, and loads the
resulting kernel module into a running Linux kernel to perform the
requested system trace/probe functions.  
You can supply the script in a named file, from standard input, or from the command line.
.PP
The language, which is described in a later section, is strictly typed,
declaration free, procedural, and inspired by
.IR dtrace 
and
.IR awk .
It allows source code points or events in the kernel to be associated
with handlers, which are subroutines that are executed synchronously.  It is
somewhat similar conceptually to "breakpoint command lists" in the
.IR gdb
debugger.

.SH OPTIONS


.SH SCRIPT LANGUAGE

The systemtap script language resembles 
.IR awk .
There are two main outermost constructs: probes and functions.  Within
these, statements and expressions use C-like operator syntax and
precedence.

.SS GENERAL SYNTAX
Whitespace is ignored.  Three forms of comments are supported:
.RS
.br
# ... shell style, to the end of line
.br
// ... C++ style, to the end of line 
.br
/* ... C style ... */
.RE
Literals are either strings enclosed in double-quotes (soon supporting
the usual C escape codes with backslashes), or integers (in decimal,
hexadecimal, or octal, using the same notation as in C).  All strings
are limited in length to some reasonable value (a few hundred bytes).
Integers are 64-bit signed quantities, although the parser also accepts
(and wraps around) values above positive 2**63.  
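.PP
As a rough illustration of the comment and literal forms described
above, a script fragment might look like the following sketch.  The
probe point
.IR kernel.function("foo")
and the variable names are placeholders chosen only for this example.
.RS
.nf
# shell-style comment, to the end of line
// C++-style comment, to the end of line
/* C-style comment */
probe kernel.function("foo") {
  s = "a string literal"
  d = 42      # decimal integer
  h = 0x2a    # hexadecimal integer, same value
  o = 052     # octal integer, same value
}
.fi
.RE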

.SS VARIABLES
Identifiers for variables and functions are an alphanumeric sequence,
and may include "_" and "$" characters.  They may not start with a
plain digit, as in C.  Each variable is by default local to the probe
or function statement block within which it is mentioned, and therefore
its scope and lifetime are limited to a particular probe or function
invocation.  Variables may be declared global using a top-level
declaration, in which case they are shared amongst all probes and live
as long as the entire systemtap session.
.PP
Scalar variables are implicitly typed as either string or integer.
Associative arrays, which must be declared global, may have a string
or integer value, and a tuple of strings and/or integers serving as a
key.
.\" XXX add statistics type here once it's supported
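.PP
The scoping rules above can be sketched as follows.  This is only an
illustration: the
.IR global
keyword is assumed to be the top-level declaration form, and the probe
point and variable names are hypothetical.
.RS
.nf
global hits, state

probe kernel.function("foo") {
  n = 1                     # local integer, gone when the handler ends
  hits = hits + n           # global integer, shared by all probes
  state["foo", 0] = "seen"  # global array: (string, integer) key,
                            # string value
}
.fi
.RE
Here the types of
.IR n ,
.IR hits ,
and
.IR state
are not declared anywhere; they would be inferred from the literals
and operators used with them.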

.SS STATEMENTS
Statements enable procedural control flow.  They may occur within
functions and probe handlers.

.TP
EXP
Execute the string- or integer-valued expression and throw away
the value.
.TP
.BR { " STMT1 STMT2 ... " }
Execute each statement in sequence in this block.  Note that no
separators or terminators are necessary between statements.
.TP
.BR ;
Null statement, do nothing.  It is useful as an optional separator between
statements to improve the display of syntax-error reports.
.TP
.BR if " (EXP) STMT1 [ " else " STMT2 ]"
Compare integer-valued EXP to zero.  Execute the first (non-zero)
or second STMT (zero).
.TP
.BR while " (EXP) STMT"
While integer-valued EXP evaluates to non-zero, execute STMT.
.TP
.BR for " (EXP1; EXP2; EXP3) STMT"
Execute EXP1 as initialization.  While EXP2 is non-zero, execute
STMT, then the iteration expression EXP3.
.TP
.BR foreach " (VAR " in " ARRAY) STMT"
Loop over each element of the named global array, assigning the current
key to VAR.  The array may not be modified within the statement.
.TP
.BR foreach " ((VAR1, VAR2, ...) " in " ARRAY) STMT"
Same as above, used when the array is indexed with a tuple of keys.
.TP
.BR break ", " continue
Exit or iterate the innermost nesting loop
.RB ( while " or " for " or " foreach )
statement.
.TP
.BR return " EXP"
Return EXP value from enclosing function.  A return value is mandatory,
since void functions are not supported.
.TP
.BR next
Return from enclosing probe handler.
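.PP
The statement forms listed above might be combined as in the following
sketch.  The function definition syntax shown is an assumption (this
page does not describe it), and all names are hypothetical.
.RS
.nf
global count

function capped (n) {
  if (n > 100) return 100
  return n
}

probe kernel.function("foo") {
  # the array is modified here, outside any foreach
  for (i = 0; i < 3; i = i + 1)
    count[i] = capped (count[i] + 1)

  total = 0
  foreach (k in count) {
    if (count[k] == 0) continue
    total = total + count[k]
  }

  if (total == 0) next   # leave the probe handler early
  log ("foo was hit")
}
.fi
.RE
Note that statements are separated only by whitespace, and that the
array is read but not modified inside the
.IR foreach
body.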

.SS EXPRESSIONS
Systemtap supports a number of operators that have the same general syntax,
semantics, and precedence as in C and awk. 


.SS PROBES
The main construct in the scripting language identifies probes.
Probes associate abstract events with a statement block ("probe
handler") that is to be executed when those events occur.
.PP
Events are specified in a special syntax called "probe points".  One
family refers to specific points in a kernel, which are identified by module,
source file, line number, function name, C label name, or some
combination of these.  This kind of "synchronous" event is deemed to
occur when any processor executes an instruction matched by the
specification.  Other families of probe points refer to "asynchronous"
events such as timers/counters rolling over, where there is no fixed
execution point that is related.
.PP
When any matching event occurs, the probe handler is run within that
context.  For events that are defined by execution of specific parts
of code, this context may include variables defined in the source code
at that spot.  These "target variables" are presented to the script as
variables whose names are prefixed with "$".  They may be read/written
only if the kernel's compiler preserved them despite optimization.
This is the same constraint that a debugger user faces when working
with optimized code.  Asynchronous probes have very little context.
.PP
In addition, "probe aliases" may be defined.  A probe alias looks
similar to a probe definition, but instead of activating a probe at the
given point, it defines a new probe point name as an alias for an
existing one.  An alias definition is identified by the "=" assignment
operator.  The probe handler defined with an alias is implicitly added
as a prologue to any probe that refers to the alias.  For example:
.RS
.nf
probe syscall("read") = kernel.function("sys_read") {
  fildes = $fd
}
.fi
.RE
defines a new probe point
.IR syscall("read") ,
which expands to
.IR kernel.function("sys_read") ,
with the given assignment as a prologue.  Another probe definition
may use the alias like this:
.RS
.nf
probe syscall("read") {
  printk ("reading fd=" . decimal (fildes))
}
.fi
.RE

.SS FUNCTIONS

.SS GLOBALS

.SS EMBEDDED C
When in guru mode, the translator accepts embedded code in the
script.  Such code is enclosed between
.IR %{
and
.IR %}
markers, and is transcribed verbatim, without analysis, in some
sequence, into the generated C code.  At the outermost level, this may
be useful to add
.IR #include
instructions, and any auxiliary definitions for use by other embedded
code.  The other place where embedded code is permitted is as a
function body.
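.PP
The two permitted placements might look like the following sketch.
The header name and the function name are hypothetical, and the way an
embedded body receives its arguments or hands back a result is not
covered by this page, so the body is left as a placeholder.
.RS
.nf
%{
#include <linux/sched.h>  /* auxiliary declarations for later use */
%}

function example () %{
  /* C statements forming the function body go here */
%}
.fi
.RE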

.SS BUILT-INS
A set of builtin functions and probe aliases is provided by the
scripts installed under the
.IR /usr/share/systemtap/tapset
directory.

.SH PROCESSING
The translator begins pass 1 by parsing the given input script,
and all scripts (files named
.IR *.stp )
found in a tapset directory.  The directories listed
with
.BR -I
are processed in sequence.  For each directory, a number of subdirectories
are also searched.  These subdirectories are derived from the selected
kernel version (the
.BR -R
option),
in order to allow more kernel-version-specific scripts to override less
specific ones.  For example, for a kernel version
.IR 2.6.12-23.FC3
the following patterns would be searched, in sequence:
.IR 2.6.12-23.FC3/*.stp ,
.IR 2.6.12/*.stp ,
.IR 2.6/*.stp ,
and finally
.IR *.stp .
Stopping the translator after pass 1 causes it to print the parse trees. 

.PP
In pass 2, the translator analyzes the input script to resolve symbols
and types.  References to variables, functions, and probe aliases that
are unresolved internally are satisfied by searching through the
parsed tapset scripts.  If any tapset script is selected because it
defines an unresolved symbol, then the entirety of that script is
added to the translator's resolution queue.  This process iterates
until all symbols are resolved and a subset of tapset scripts is
selected.
.PP
Next, all probe point descriptions are validated 
against the wide variety supported by the translator.  Probe points that
refer to code locations ("synchronous probe points") require the
appropriate kernel debugging information to be installed.  In the
associated probe handlers, target-side variables (whose names begin
with "$") are found and have their run-time locations decoded.
.PP
Finally, all variable, function, parameter, array, and
index types are inferred from context (literals and operators).
Stopping the translator after pass 2 causes it to list all the probes,
functions, and variables, along with all types.  Any conflicting,
inconsistent, or unresolved types cause an error.

.PP
In pass 3, the translator writes C code that represents the actions
of all selected script files, and creates a
.IR Makefile
to build that into a kernel object.  These files are placed into a
temporary directory.  Stopping the translator at this point causes
it to print the contents of the C file.

.PP
In pass 4, the translator invokes the Linux kernel build system to
create the actual kernel object file.  This involves running
.IR make
in the temporary directory, and requires a kernel module build
system (headers, config and Makefiles) to be installed in the usual
spot
.IR /lib/modules/VERSION/build .
Stopping the translator after pass 4 is the last chance before
running the kernel object.  This may be useful if you want to
archive the file.

.PP
In pass 5, the translator invokes the systemtap "daemon"
.IR stpd
program for the given kernel object.  This program arranges to load
the module then communicates with it, copying trace data from the
kernel into temporary files, until the user sends an interrupt signal.
Finally, it unloads the module, and cleans up.
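.PP
As an illustration of stopping after a given pass, the commands below
reuse the
.BR \-p2
form shown under EXAMPLES.  The
.BR \-p1 ,
.BR \-p3 ,
and
.BR \-p4
spellings are assumed by analogy, and the probed function name is a
placeholder.
.RS
.nf
stap -p1 -e 'probe kernel.function("foo") {}'   # print parse trees
stap -p2 -e 'probe kernel.function("foo") {}'   # list probes and types
stap -p3 -e 'probe kernel.function("foo") {}'   # print generated C code
stap -p4 -e 'probe kernel.function("foo") {}'   # build, but do not run
.fi
.RE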

.SH EXAMPLES
To trace entry and exit from a function, use a pair of probes:
.RS
.nf
probe kernel.function("foo") { log ("enter") }
probe kernel.function("foo").return { log ("exit") }
.fi
.RE

To list the probeable functions in the kernel, use
.RS
.br
stap -p2 -e 'probe kernel.function("*") {}'
.RE



.SH SAFETY AND SECURITY
Systemtap is an administrative tool at this time.  It exposes kernel
internal data structures and potentially private user information.
It acquires root privileges to actually run the kernel objects it
builds using the
.IR sudo
command applied to the
.IR stpd
program.  The latter is a part of the Systemtap package, dedicated to
module loading and unloading (but only in the white zone), and
kernel-to-user data transfer.  Since 
.IR stpd
does not perform any additional security checks on the kernel objects
it is given, it would be unwise for a system administrator to give
even targeted
.IR sudo
privileges to untrusted users.
.PP
The translator asserts certain safety constraints.  It aims to ensure
that no handler routine can run for very long, allocate memory,
perform unsafe operations, or unintentionally interfere with the
kernel.

.SH ENVIRONMENT VARIABLES
The
.B SYSTEMTAP_RUNTIME
environment variable provides a default for the
.B \-R
option.  Similarly, the
.B SYSTEMTAP_TAPSET
environment variable provides a default for the
.B \-I
option.

.SH SEE ALSO
.IR dtrace (1),
.IR dprobes (1),
.IR awk (1),
.IR sudo (8),
.IR elfutils (3),
.IR gdb (1)

.SH BUGS
There are numerous missing features and possibly numerous bugs.  Report
them through the Bugzilla link on the project web page
.BR http://sources.redhat.com/systemtap/ ,
or the mailing list
.BR systemtap@sources.redhat.com .

.SH AUTHORS
The
.IR stap
translator was written by Frank Ch. Eigler and Graydon Hoare.  The
kernel-side runtime library and the user-level
.IR stpd
daemon were written by Martin Hunt and Tom Zanussi.

.SH ACKNOWLEDGEMENTS
The script language design was inspired by Sun's 
.IR dtrace ,
and refined by numerous participants on the project mailing list.
The current probing mechanism uses IBM's
.IR kprobes
and
.IR relayfs
packages, which were improved and ported by IBM and Intel staff.  Many
project members contributed to the overall design and priorities of
the system, including Will Cohen, Jim Keniston, Vara Prasad, and Brad
Chen.