* What's new
- A serious problem associated with user-space probing in shared libraries
was corrected, making it now possible to experiment with probing shared
libraries. Assuming dwarf debugging information is installed, use this
twist on the normal syntax:
probe process("/lib64/libc-2.8.so").function("....") { ... }
This would probe all threads that call into that library. Running
"stap -c CMD" or "stap -x PID" naturally restricts this to the target
command+descendants only.
- For scripts that sometimes terminate with excessive "skipped" probes,
rerunning the script with "-t" (timing) will print more details about
the skippage reasons.
- Symbol tables and unwind (backtracing) data support were formerly
compiled in for all probed modules as identified by the script
(kernel; module("name"); process("file")) plus those listed by the
stap "-d BINARY" option. Now, this data is included only if the systemtap
script uses tapset functions like probefunc() or backtrace() that require
such information. This shrinks the probe modules considerably for the rest.
- Per-pass verbosity control is available with the new "--vp {N}+" option.
"stap --vp 040" adds 4 units of -v verbosity only to pass 2. This is useful
for diagnosing errors from one pass without excessive verbosity from others.
- Most probe handlers now run with interrupts enabled, for improved
system responsiveness and less probing overhead. This may result
in more skipped probes, for example if a reentrant probe handler
is attempted from within an interrupt handler. It may also make the
systemtap overload detection facility more likely to be triggered, as
interrupt handlers' run time would be included in the self-assessed
overhead of running probe handlers.
* What's new in version 0.8
- Cache limiting is now available. If the compiled module cache size is
over a limit specified in the $SYSTEMTAP_DIR/cache/cache_mb_limit file,
some old cache entries will be unlinked. See man stap(1) for more.
- Error and warning messages are now followed by source context displaying
the erroneous line/s and a handy '^' in the following line pointing to the
appropriate column.
- A bug reporting tool "stap-report" is now available which will quickly
retrieve much of the information requested here:
http://sourceware.org/systemtap/wiki/HowToReportBugs
- The translator can resolve members of anonymous structs / unions:
given struct { int foo; struct { int bar; }; } *p;
this now works: $p->bar
- The stap "-F" flag activates "flight recorder" mode, which consists of
translating the given script as usual, but implicitly launching it into
the background with staprun's existing "-L" (launch) option. A user
can later reattach to the module with "staprun -A MODULENAME".
- Additional context variables are available on user-space syscall probes.
- $argN ($arg1, $arg2, ... $arg6) in process(PATH_OR_PID).syscall
gives you the argument of the system call.
- $return in process(PATH_OR_PID).syscall.return gives you the return
value of the system call.
- Target process mode (stap -c CMD or -x PID) now implicitly restricts all
"process.*" probes to the given child process. (It does not affect
kernel.* or other probe types.) The CMD string is normally run directly,
rather than via a /bin/sh -c subshell, since then utrace/uprobe probes
receive a fairly "clean" event stream. If metacharacters like
redirection operators were present in CMD, then "sh -c CMD" is still
used, and utrace/uprobe probes will receive events from the shell.
% stap -e 'probe process.syscall, process.end {
printf("%s %d %s\n", execname(), pid(), pp())}'\
-c ls
ls 2323 process.syscall
ls 2323 process.syscall
ls 2323 process.end
- Probe listing mode is improved: "-L" lists available script-level variables
% stap -L 'syscall.*open*'
syscall.mq_open name:string name_uaddr:long filename:string mode:long u_attr_uaddr:long oflag:long argstr:string
syscall.open name:string filename:string flags:long mode:long argstr:string
syscall.openat name:string filename:string flags:long mode:long argstr:string
- All user-space-related probes support $PATH-resolved executable
names, so
probe process("ls").syscall {}
probe process("./a.out").syscall {}
work now, instead of just
probe process("/bin/ls").syscall {}
probe process("/my/directory/a.out").syscall {}
- Prototype symbolic user-space probing support:
# stap -e 'probe process("ls").function("*").call {
log (probefunc()." ".$$parms)
}' \
-c 'ls -l'
This requires:
- debugging information for the named program
- a version of utrace in the kernel that is compatible with the "uprobes"
kernel module prototype. This includes RHEL5 and older Fedora, but not
yet current lkml-track utrace; a "pass 4a"-time build failure means
your system cannot use this yet.
- Prototype systemtap client and compile server are now available.
These allow you to compile a systemtap module on a host other than
the one on which it will be run, provided the client and server
are compatible. Other than using a server for passes 1 through
4, the client behaves like the 'stap' front end itself. This
means, among other things, that the client will automatically
load the resulting module on the local host unless -p[1234]
was specified.
This client/server implementation is a prototype. It provides
NO NETWORK SECURITY OF ANY KIND and should be used only
among trusted hosts on a trusted network.
See stap-server(8) for more details.
- Global variables which are written to but never read are now
automatically displayed when the session does a shutdown. For example:
global running_tasks
probe timer.profile {running_tasks[pid(),tid()] = execname()}
probe timer.ms(8000) {exit()}
- A formatted string representation of the variables, parameters, or local
variables at a probe point is now supported via the special $$vars,
$$parms, and $$locals context variables, which expand to a string
containing a list "var1=0xdead var2=0xbeef var3=?". (Here, var3 exists
but is for some reason unavailable.) In return probes only, $$return
expands to an empty string for a void function, or "return=0xf00".
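A minimal sketch of these context variables in use. The choice of
vfs_read is only an illustrative assumption; any function with debugging
information available should behave the same way:

```systemtap
# Hypothetical target: vfs_read is used purely for illustration.
probe kernel.function("vfs_read") {
  # $$parms expands to a "name=value ..." string of the parameters
  printf("vfs_read(%s)\n", $$parms)
}
probe kernel.function("vfs_read").return {
  # $$return expands to "return=0x..." (empty for a void function)
  printf("vfs_read -> %s\n", $$return)
  exit()
}
```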
* What's new in version 0.7
- .statement("func@file:*") and .statement("func@file:M-N") probes are now
supported to allow matching a range of lines in a function. This allows
tracing the execution of a function.
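As a hedged sketch, a line-by-line trace might look like the following.
The function and file names are assumptions chosen for illustration, and
kernel debugging information must be installed:

```systemtap
# Hypothetical target: vfs_read in fs/read_write.c, for illustration only.
probe kernel.statement("vfs_read@fs/read_write.c:*") {
  # pp() reports the probe point that fired, including its source line
  printf("%s\n", pp())
}
```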
- Scripts relying on probe point wildcards like "syscall.*" that expand
to distinct kprobes are processed significantly faster than before.
- The vector of script command line arguments is available in a
tapset-provided global array argv[]. It is indexed 1 ... argc,
another global. This can substitute for preprocessor
directives @NNN that fail at parse time if there are not
enough arguments.
printf("argv: %s %s %s", argv[1], argv[2], argv[3])
- .statement("func@file+line") probes are now supported to allow a
match relative to the entry of the function incremented by line
number. This allows using the same systemtap script if the rest
of the file.c source only changes slightly.
- A probe listing mode is available.
% stap -l vm.*
vm.brk
vm.mmap
vm.munmap
vm.oom_kill
vm.pagefault
vm.write_shared
- More user-space probe types are added:
probe process(PID).begin { }
probe process("PATH").begin { }
probe process(PID).thread.begin { }
probe process("PATH").thread.begin { }
probe process(PID).end { }
probe process("PATH").end { }
probe process(PID).thread.end { }
probe process("PATH").thread.end { }
probe process(PID).syscall { }
probe process("PATH").syscall { }
probe process(PID).syscall.return { }
probe process("PATH").syscall.return { }
- Globals now accept ; terminators
global odds, evens;
global little[10], big[5];
* What's new in version 0.6
- Copies of the systemtap tutorial and language reference guide
are now included.
- There is a new format specifier, %m, for the printf family of
functions. It functions like %s, except that it does not stop when
a nul ('\0') byte is encountered. The number of bytes output is
determined by the precision specifier. The default precision is 1.
For example:
printf ("%m", "My String") // prints one character: M
printf ("%.5m", myString) // prints 5 bytes beginning at the start
// of myString
- The %b format specifier for the printf family of functions has been enhanced
as follows:
1) When the width and precision are both unspecified, the default is %8.8b.
2) When only one of the width or precision is specified, the other defaults
to the same value. For example, %4b == %.4b == %4.4b
3) Nul ('\0') bytes are used for field width padding. For example,
printf ("%b", 0x1111deadbeef2222) // prints all eight bytes
printf ("%4.2b", 0xdeadbeef) // prints \0\0\xbe\xef
- Dynamic width and precision are now supported for all printf family format
specifiers. For example:
four = 4
two = 2
printf ("%*.*b", four, two, 0xdeadbeef) // prints \0\0\xbe\xef
printf ("%*d", four, two) // prints <space><space><space>2
- Preprocessor conditional expressions can now include wildcard style
matches on kernel versions.
%( kernel_vr != "*xen" %? foo %: bar %)
- Prototype support for user-space probing is showing some progress.
No symbolic notations are supported yet (so no probing by function names,
file names, process names, and no access to $context variables), but at
least it's something:
probe process(PID).statement(ADDRESS).absolute { }
This will set a uprobe on the given process-id and given virtual address.
The probe handler runs in kernel-space as usual, and can generally use
existing tapset functions.
- The crash utility can retrieve systemtap's relay buffer from a kernel dump
image by using staplog, a crash extension module. To use this
feature, type the following commands at crash(8)'s command line:
crash> extend staplog.so
crash> help systemtaplog
You can then see a more detailed help message.
- You can share a relay buffer among several scripts and merge outputs from
several scripts by using "-DRELAY_HOST" and "-DRELAY_GUEST" options.
For example:
# run a host script
% stap -ve 'probe begin{}' -o merged.out -DRELAY_HOST &
# wait until the host script has started
% stap -ve 'probe begin{print("hello ");exit()}' -DRELAY_GUEST
% stap -ve 'probe begin{print("world\n");exit()}' -DRELAY_GUEST
Then, you'll see "hello world" in merged.out.
- You can add a conditional statement for each probe point or alias, which
is evaluated when the probe point is hit. If the condition is false, the
whole probe body (including aliases) is skipped. For example:
global switch = 0;
probe syscall.* if (switch) { ... }
probe procfs.write {switch = strtol($value,10)} /* enable/disable ctrl */
- Systemtap will warn you if your script contains unused variables or
functions. This is helpful in case of misspelled variables. If it
doth protest too much, turn it off with "stap -w ...".
- You can add error-handling probes to a script, which are run if a
script was stopped due to errors. In such a case, "end" probes are
not run, but "error" ones are.
probe error { println ("oops, errors encountered; here's a report anyway")
foreach (coin in mint) { println (coin) } }
- In a related twist, one may list probe points in order of preference,
and mark any of them as "sufficient" beyond just "optional". Probe
point sequence expansion stops if a sufficient-marked probe point has a hit.
This is useful for probes on functions that may be in a module (CONFIG_FOO=m)
or may have been compiled into the kernel (CONFIG_FOO=y), but we don't know
which. Instead of
probe module("sd").function("sd_init_command") ? ,
kernel.function("sd_init_command") ? { ... }
which might match neither, now one can write this:
probe module("sd").function("sd_init_command") ! , /* <-- note excl. mark */
kernel.function("sd_init_command") { ... }
- New security model. To install a systemtap kernel module, a user
must be one of the following: the root user; a member of the
'stapdev' group; or a member of the 'stapusr' group. Members of the
stapusr group can only use modules located in the
/lib/modules/VERSION/systemtap directory (where VERSION is the
output of "uname -r").
- .statement("...@file:line") probes now apply heuristics to allow an
approximate match for the line number. This works similarly to gdb,
where a breakpoint placed on an empty source line is automatically
moved to the next statement. A silly bug that made many $target
variables inaccessible to .statement() probes was also fixed.
- LKET has been retired. Please let us know on <systemtap@sourceware.org>
if you have been a user of the tapset/tools, so we can help you find
another way.
- New families of printing functions println() and printd() have been added.
println() is like print() but adds a newline at the end;
printd() is like a sequence of print()s, with a specified field delimiter.
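A small sketch of the two functions, assuming nothing beyond a begin
probe; the strings are arbitrary illustrations:

```systemtap
probe begin {
  println("hello")             // like print("hello") plus a newline
  printd(", ", "a", "b", "c")  // values joined by the ", " delimiter
  println("")                  // printd() adds no newline of its own
  exit()
}
```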
* What's new since version 0.5.14?
- The way in which command line arguments for scripts are substituted has
changed. Previously, $1 etc. would interpret the corresponding command
line argument as a numeric literal, and @1 as a string literal. Now,
the command line arguments are pasted uninterpreted wherever $1 etc.
appears at the beginning of a token. @1 is similar, but is quoted as
a string. This change does not modify old scripts, but has the effect
of permitting substitution of arbitrary token sequences.
# This worked before, and still does:
% stap -e 'probe timer.s($1) {}' 5
# Now this also works:
% stap -e 'probe syscall.$1 {log(@1)}' open
# This won't crash, just signal a recursion error:
% stap -e '$1' '$1'
# As before, $1... is recognized only at the beginning of a token
% stap -e 'probe begin {foo$1=5}'
* What's new since version 0.5.13?
- The way in which systemtap resolves function/inline probes has changed:
.function(...) - now refers to all functions, inlined or not
.inline(...) - is deprecated, use instead:
.function(...).inline - filters function() to only inlined instances
.function(...).call - filters function() to only non-inlined instances
.function(...).return - as before, but now pairs best with .function().call
.statement() is unchanged.
* What's new since version 0.5.12?
- When running in -p4 (compile-only) mode, the compiled .ko file name
is printed on standard output.
- An array element with a null value such as zero or an empty string
is now preserved, and will show up in a "foreach" loop or "in" test.
To delete such an element, the script needs to use an explicit
"delete array[idx]" statement rather than something like "array[idx]=0".
- The new "-P" option controls whether prologue searching heuristics
will be activated for function probes. This was needed to get correct
debugging information (dwarf location list) data for $target variables.
Modern compilers (gcc 4.1+) tend not to need this heuristic, so it is
no longer default. A new configure flag (--enable-prologues) restores
it as a default setting, and is appropriate for older compilers (gcc 3.*).
- Each systemtap module prints a one-line message to the kernel informational
log when it starts. This line identifies the translator version, base
address of the probe module, a broken-down memory consumption estimate, and
the total number of probes. This is meant as a debugging / auditing aid.
- Begin/end probes are run with interrupts enabled (but with
preemption disabled). This will allow begin/end probes to be
longer, to support generating longer reports.
- The numeric forms of kernel.statement() and kernel.function() probe points
are now interpreted as relocatable values - treated as relative to the
_stext symbol in that kernel binary. Since some modern kernel images
are relocated to a different virtual address at startup, such addresses
may shift up or down when actually inserted into a running kernel.
kernel.statement(0xdeadbeef): validated, interpreted relative to _stext,
may map to 0xceadbeef at run time.
In order to specify unrelocated addresses, use the new ".absolute"
probe point suffix for such numeric addresses. These are only
allowed in guru mode, and provide access to no $target variables.
They don't use debugging information at all, actually.
kernel.statement(0xfeedface).absolute: raw, unvalidated, guru mode only
* What's new since version 0.5.10?
- Offline processing of debugging information, enabling general
cross-compilation of probe scripts to remote hosts, without
requiring identical module/memory layout. This slows down
compilation/translation somewhat.
- Kernel symbol table data is loaded by staprun at startup time
rather than compiled into the module.
- Support the "limit" keyword for foreach iterations:
foreach ([x,y] in ary limit 5) { ... }
This implicitly exits after the fifth iteration. It also enables
more efficient key/value sorting.
- Support the "maxactive" keyword for return probes:
probe kernel.function("sdfsdf").maxactive(848) { ... }
This allows up to 848 concurrently outstanding entries to
the sdfsdf function before one returns. The default maxactive
number is smaller, and can result in missed return probes.
- Support accessing of saved function arguments from within
return probes. These values are saved by a synthesized
function-entry probe.
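A sketch of what this permits. vfs_read and its "count" parameter are
assumptions used only for illustration; the script runs until
interrupted:

```systemtap
# Hypothetical target: vfs_read, whose "count" parameter is read here
# from within the return probe (the value was saved at function entry).
probe kernel.function("vfs_read").return {
  printf("%s asked for %d bytes, got %d\n", execname(), $count, $return)
}
```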
- Add substantial version/architecture checking in compiled probes to
assert correct installation of debugging information and correct
execution on a compatible kernel.
- Add probe-time checking for sufficient free stack space when probe
handlers are invoked, as a safety improvement.
- Add an optional numeric parameter for begin/end probe specifications,
to order their execution.
probe begin(10) { } /* comes after */ probe begin(-10) {}
- Add an optional array size declaration, which is handy for very small
or very large ones.
global little[5], big[20000]
- Include some example scripts along with the documentation.
- Change the start-time allocation of probe memory to avoid causing OOM
situations, and to abort cleanly if free kernel memory is short.
- Automatically use the kernel DWARF unwinder, if present, for stack
tracebacks.
- Many minor bug fixes, performance, tapset, and error message
improvements.