README

/** @mainpage SystemTap Runtime

@section intro_sec Introduction

This document describes the implementation of the SystemTap Runtime. It is intended for developers
of the SystemTap Language translator and, possibly, TapSet authors. These functions are not directly
available from the SystemTap Language.

The SystemTap Runtime Library consists of all the functions
and code fragments that the compiler/translator needs to
include when building a kernel module using kprobes. It
also includes I/O code to transmit its output from the kernel to userspace.

In addition to the library, the runtime includes a SystemTap user-space daemon
(stpd).  Stpd grabs data sent from the I/O code in the runtime and displays it
and/or saves it to files. Stpd (or a script invoking it) will also handle other tasks such as
inserting and removing modules.

Stpd and the I/O code make use of both relayfs and netlink for communication.  For
kernels without relayfs built in, it is provided as a standalone module under the runtime directory.

@section design_sec Design
@subsection impl_sec Implementation
The library is written in C and is not really a library but a collection of code
that can be conditionally included in a module. It may become a library later, but for now
there are some advantages to being able to change the sizes of static items with simple #defines.

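As a rough illustration of sizing static items with simple #defines, the sketch below lets a module override two tunables before the code is included. The names DEMO_MAX_STRINGS, DEMO_STRING_LEN, and the demo_ helpers are hypothetical and are not taken from the runtime.

```c
#include <stddef.h>

// Hypothetical tunables: a module would define these before including
// the code below in order to override the defaults.
#ifndef DEMO_MAX_STRINGS
#define DEMO_MAX_STRINGS 4
#endif
#ifndef DEMO_STRING_LEN
#define DEMO_STRING_LEN 64
#endif

// Storage is sized at compile time, so nothing is allocated at probe time.
static char demo_string_pool[DEMO_MAX_STRINGS][DEMO_STRING_LEN];

// Total bytes reserved for the pool.
static size_t demo_string_pool_bytes(void)
{
    return sizeof demo_string_pool;
}

// Return slot i of the pool, or NULL if i is out of range.
static char *demo_string_slot(size_t i)
{
    return i < DEMO_MAX_STRINGS ? demo_string_pool[i] : NULL;
}
```

A module that needs longer strings would put its own definition of DEMO_STRING_LEN ahead of the include; no other change is required.
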
@subsection map_sec Maps (Associative Arrays)
Maps are implemented as hash lists. It is not expected that users will
attempt to collect so much data in kernel space that performance problems will require
more complex solutions such as AVL trees.

Maps are created with _stp_map_new().  Each map can hold only one type of
data: int64, string, or statistics.  Each element belonging to a map can have up to 2 keys
and a value.  Implemented key types are strings and longs.

To simplify the implementation, the functions to set the key and the functions to set the data are separated.
That means we need only 4 functions to set the key and 3 functions to set the value.

For example:
\code
/* create a map with a max of 100 elements */
MAP mymap = _stp_map_new(100, INT64);

/* mymap["birth year"] = 2000 */
_stp_map_key_str (mymap, "birth year");
_stp_map_set_int64 (mymap, 2000);
\endcode

All elements have a default value of 0 (or NULL).  Elements are only saved to the map when their value is set
to something nonzero.  This means that querying for the existence of a key is inexpensive because
no element is created; it is just a hash table lookup.

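The key-then-value pattern and the rule that only nonzero values are stored can be illustrated with a minimal user-space sketch. Every name here (demo_map, demo_map_key_str, and so on) is hypothetical; this is not the runtime's code, it handles only a single string key with int64 values, and removing an element when it is set back to zero is an assumption of the sketch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS 16
#define DEMO_KEY_LEN 32

struct demo_node {
    char key[DEMO_KEY_LEN];
    int64_t val;
    struct demo_node *next;      // chain of nodes sharing one bucket
};

struct demo_map {
    struct demo_node *bucket[DEMO_BUCKETS];
    char cur_key[DEMO_KEY_LEN];  // key chosen by the last demo_map_key_str()
    unsigned count, max;
};

static unsigned demo_hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % DEMO_BUCKETS;
}

static struct demo_map *demo_map_new(unsigned max)
{
    struct demo_map *m = calloc(1, sizeof *m);
    if (m)
        m->max = max;
    return m;
}

// Step 1: select a key. Nothing is allocated or inserted here.
static void demo_map_key_str(struct demo_map *m, const char *key)
{
    strncpy(m->cur_key, key, DEMO_KEY_LEN - 1);
    m->cur_key[DEMO_KEY_LEN - 1] = '\0';
}

// Return the link that points at the current key's node, or the
// terminating NULL link of its bucket if the key is absent.
static struct demo_node **demo_find(struct demo_map *m)
{
    struct demo_node **pp = &m->bucket[demo_hash(m->cur_key)];
    while (*pp && strcmp((*pp)->key, m->cur_key) != 0)
        pp = &(*pp)->next;
    return pp;
}

// Step 2: set the value for the selected key. A nonzero value creates or
// updates the element; zero removes it (assumption of this sketch).
// Returns 0 on success, -1 if the map is full or memory runs out.
static int demo_map_set_int64(struct demo_map *m, int64_t val)
{
    struct demo_node **pp = demo_find(m);
    struct demo_node *n = *pp;

    if (val == 0) {
        if (n) {
            *pp = n->next;
            free(n);
            m->count--;
        }
        return 0;
    }
    if (!n) {
        if (m->count >= m->max)
            return -1;
        n = calloc(1, sizeof *n);
        if (!n)
            return -1;
        strcpy(n->key, m->cur_key);  // cur_key is NUL-terminated and fits
        *pp = n;
        m->count++;
    }
    n->val = val;
    return 0;
}

// Lookup only: a missing key reads as 0 and no element is created.
static int64_t demo_map_get_int64(struct demo_map *m)
{
    struct demo_node *n = *demo_find(m);
    return n ? n->val : 0;
}
```

In this sketch, calling demo_map_key_str() followed by demo_map_get_int64() on an unknown key returns 0 and leaves the element count unchanged, which is the inexpensive existence check described above.
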
@subsection list_sec Lists
A list is a special map which has internally ascending long integer keys.  Adding a value to
a list does not require setting a key first. Create a list with _stp_list_new(). Add to it
with _stp_list_add_str() and _stp_list_add_int64().  Clear it with _stp_list_clear().

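The idea of internally ascending keys can be sketched as follows. This is a simplified stand-in that uses a fixed array rather than a map, and the demo_list names are hypothetical, not the runtime's _stp_list_ functions.

```c
#include <stdint.h>

#define DEMO_LIST_MAX 8

struct demo_list {
    long next_key;                 // next key to hand out; grows until cleared
    int64_t vals[DEMO_LIST_MAX];   // vals[k] holds the value stored under key k
};

// Append a value. The caller never supplies a key; the list assigns the
// next ascending one. Returns the key used, or -1 if the list is full.
static long demo_list_add_int64(struct demo_list *l, int64_t val)
{
    if (l->next_key >= DEMO_LIST_MAX)
        return -1;
    l->vals[l->next_key] = val;
    return l->next_key++;
}

// Forget all elements; keys start again from 0.
static void demo_list_clear(struct demo_list *l)
{
    l->next_key = 0;
}
```

Two successive adds in this sketch are stored under keys 0 and 1, and after a clear the next add is stored under key 0 again.
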
@subsection string_sec Strings
One of the biggest restrictions the library has is that it cannot allocate things like strings off the stack.
It is also not a good idea to dynamically allocate space for strings with kmalloc().  That leaves us with
statically allocated space for strings. This is what is implemented in the String module.  Strings use
preallocated per-cpu buffers and are safe to use (unlike C strings).

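A minimal user-space sketch of preallocated per-cpu string buffers with bounded appends is shown below. The names and sizes (demo_string, DEMO_NR_CPUS, DEMO_STRLEN) are hypothetical, and the CPU number is passed in explicitly because this sketch does not run in the kernel.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NR_CPUS 4
#define DEMO_STRLEN  16

struct demo_string {
    size_t len;               // bytes in use, not counting the terminator
    char buf[DEMO_STRLEN];
};

// One statically allocated string per CPU: no stack or heap space is needed.
static struct demo_string demo_strings[DEMO_NR_CPUS];

// Return the given CPU's string, reset to empty.
static struct demo_string *demo_string_init(unsigned cpu)
{
    struct demo_string *s = &demo_strings[cpu % DEMO_NR_CPUS];
    s->len = 0;
    s->buf[0] = '\0';
    return s;
}

// Append src, truncating instead of overflowing the buffer.
// Returns the number of bytes actually copied.
static size_t demo_string_cat(struct demo_string *s, const char *src)
{
    size_t room = DEMO_STRLEN - 1 - s->len;
    size_t n = strlen(src);

    if (n > room)
        n = room;
    memcpy(s->buf + s->len, src, n);
    s->len += n;
    s->buf[s->len] = '\0';
    return n;
}
```

Because every append is clamped to the space that remains, an oversized input is truncated rather than written past the end of the buffer, which is the sense in which such strings are safer than plain C strings.
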
@subsection io_sec I/O
Generally things are written to a "print buffer" using the internal
functions _stp_print_xxx().
\code
_stp_print ("Output is: ");
_stp_printf ("pid is %d ", current->pid);
_stp_printf ("name is %s", current->comm);
\endcode
Before the probe returns, it must call _stp_print_flush().  This
timestamps the accumulated print buffer and sends it to relayfs.
When relayfs fills an internal buffer, the user-space daemon is notified
that data is ready and reads a big per-cpu chunk, which contains a line like:
\verbatim
[123456.000002] Output is: pid is 1234 name is bash
\endverbatim

The user-space daemon (stpd) saves this data to a file named something like
"stpd_cpu2".  When the user hits ^C, a timer expires, or the probe
module notifies stpd (through a netlink command channel) that it wants
to terminate, stpd does "system(rmmod)" and then collects the last output
before exiting.
As an option, if we don't need bulk per-cpu data, we can put
\code
#define STP_NETLINK_ONLY
\endcode
at the top of the module and all output will go over a netlink channel.
In the SystemTap language, we will provide some simple functions to control the buffering policy, which
will control the use of netlink and parameters to relayfs and stpd.

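The accumulate-then-flush behavior of the print buffer described in this subsection can be sketched in user space as follows. The demo_ names are hypothetical, a plain character array stands in for the relayfs channel, and the timestamp is passed in by the caller rather than read from a clock.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_PRINT_BUF 128

static char demo_pbuf[DEMO_PRINT_BUF];       // accumulates until flushed
static size_t demo_plen;                     // bytes used in demo_pbuf
static char demo_out[DEMO_PRINT_BUF + 64];   // stands in for the relayfs channel

// Append formatted text to the print buffer, truncating if it is full.
static void demo_printf(const char *fmt, ...)
{
    va_list ap;
    size_t room = sizeof demo_pbuf - demo_plen;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(demo_pbuf + demo_plen, room, fmt, ap);
    va_end(ap);
    if (n < 0)
        return;
    // vsnprintf reports the untruncated length; count only what fit.
    demo_plen += ((size_t)n < room) ? (size_t)n : room - 1;
}

// Prefix the accumulated text with a timestamp, hand it to the output
// channel as one line, and reset the buffer for the next probe hit.
static void demo_print_flush(unsigned long sec, unsigned long usec)
{
    if (demo_plen == 0)
        return;
    snprintf(demo_out, sizeof demo_out, "[%lu.%06lu] %s\n", sec, usec, demo_pbuf);
    demo_plen = 0;
    demo_pbuf[0] = '\0';
}
```

Nothing reaches the output channel until the flush is called, so several print calls made in one probe end up together on a single timestamped line.
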
@section status_sec Status
@li Maps are implemented and tested. Histograms are not yet finished.
@li Copy_From_User functions are done.
@li If maps overflow or memory runs out for some reason, globals are set but nothing is done yet.
I expect to implement a function to tell the system to either ignore it or unload the module and quit.
@li Stack functions need much improvement.

@section probe_sec Example Probes

Working sample probe code using the runtime is in runtime/probes.
<a href="dir_000000.html"> Browse probes.</a>

@section todo_sec ToDo
\link todo Click Here for Complete List \endlink

@section links Links
<a href="http://sources.redhat.com/systemtap/">SystemTap Project Page</a>
 */