Diffstat (limited to 'runtime/relayfs/relayfs.txt')
-rw-r--r-- | runtime/relayfs/relayfs.txt | 206 |
1 files changed, 206 insertions, 0 deletions
diff --git a/runtime/relayfs/relayfs.txt b/runtime/relayfs/relayfs.txt
new file mode 100644
index 00000000..ada829bb
--- /dev/null
+++ b/runtime/relayfs/relayfs.txt
@@ -0,0 +1,206 @@
+
+relayfs - a high-speed data relay filesystem
+============================================
+
+relayfs is a filesystem designed to provide an efficient mechanism for
+tools and facilities to relay large and potentially sustained streams
+of data from kernel space to user space.
+
+The main abstraction of relayfs is the 'channel'. A channel consists
+of a set of per-cpu kernel buffers each represented by a file in the
+relayfs filesystem. Kernel clients write into a channel using
+efficient write functions which automatically log to the current cpu's
+channel buffer. User space applications mmap() the per-cpu files and
+retrieve the data as it becomes available.
+
+The format of the data logged into the channel buffers is completely
+up to the relayfs client; relayfs does however provide hooks which
+allow clients to impose some structure on the buffer data. Nor does
+relayfs implement any form of data filtering - this also is left to
+the client. The purpose is to keep relayfs as simple as possible.
+
+This document provides an overview of the relayfs API. The details of
+the function parameters are documented along with the functions in the
+filesystem code - please see that for details.
+
+
+The relayfs user space API
+==========================
+
+relayfs implements basic file operations for user space access to
+relayfs channel buffer data. Here are the file operations that are
+available and some comments regarding their behavior:
+
+open()   enables user to open an _existing_ buffer.
+
+mmap()   results in channel buffer being mapped into the caller's
+         memory space.
+
+poll()   POLLIN/POLLRDNORM/POLLERR supported. User applications are
+         notified when sub-buffer boundaries are crossed.
+
+close()  decrements the channel buffer's refcount. When the refcount
+         reaches 0, i.e. when no process or kernel client has the buffer
+         open, the channel buffer is freed.
+
+
+In order for a user application to make use of relayfs files, the
+relayfs filesystem must be mounted. For example,
+
+        mount -t relayfs relayfs /mnt/relay
+
+NOTE: relayfs doesn't need to be mounted for kernel clients to create
+      or use channels - it only needs to be mounted when user space
+      applications need access to the buffer data.
+
+
+The relayfs kernel API
+======================
+
+Here's a summary of the API relayfs provides to in-kernel clients:
+
+
+  channel management functions:
+
+    relay_open(base_filename, parent, subbuf_size, n_subbufs,
+               overwrite, callbacks)
+    relay_close(chan)
+    relay_flush(chan)
+    relay_reset(chan)
+    relayfs_create_dir(name, parent)
+    relayfs_remove_dir(dentry)
+    relay_commit(buf, reserved, count)
+    relay_subbufs_consumed(chan, cpu, subbufs_consumed)
+
+  write functions:
+
+    relay_write(chan, data, length)
+    __relay_write(chan, data, length)
+    relay_reserve(chan, length)
+
+  callbacks:
+
+    subbuf_start(buf, subbuf, prev_subbuf_idx, prev_subbuf)
+    deliver(buf, subbuf_idx, subbuf)
+    buf_mapped(buf, filp)
+    buf_unmapped(buf, filp)
+    buf_full(buf, subbuf_idx)
+
+
+A relayfs channel is made up of one or more per-cpu channel buffers,
+each implemented as a circular buffer subdivided into one or more
+sub-buffers.
+
+relay_open() is used to create a channel, along with its per-cpu
+channel buffers. Each channel buffer will have an associated file
+created for it in the relayfs filesystem, which can be opened and
+mmapped from user space if desired. The files are named
+basename0...basenameN-1 where N is the number of online cpus, and by
+default will be created in the root of the filesystem. If you want a
+directory structure to contain your relayfs files, you can create it
+with relayfs_create_dir() and pass the parent directory to
+relay_open(). Clients are responsible for cleaning up any directory
+structure they create when the channel is closed - use
+relayfs_remove_dir() for that.
+
+The total size of each per-cpu buffer is calculated by multiplying the
+number of sub-buffers by the sub-buffer size passed into relay_open().
+The idea behind sub-buffers is that they're basically an extension of
+double-buffering to N buffers, and they also allow applications to
+easily implement random-access-on-buffer-boundary schemes, which can
+be important for some high-volume applications. The number and size
+of sub-buffers is completely dependent on the application and even for
+the same application, different conditions will warrant different
+values for these parameters at different times. Typically, the right
+values to use are best decided after some experimentation; in general,
+though, it's safe to assume that having only 1 sub-buffer is a bad
+idea - you're guaranteed to either overwrite data or lose events
+depending on the channel mode being used.
+
+relayfs channels can be opened in either of two modes - 'overwrite' or
+'no-overwrite'. In overwrite mode, writes continuously cycle around
+the buffer and will never fail, but will unconditionally overwrite old
+data regardless of whether it's actually been consumed. In
+no-overwrite mode, writes will fail, i.e. data will be lost, if the
+number of unconsumed sub-buffers equals the total number of
+sub-buffers in the channel. In this mode, the client is responsible
+for notifying relayfs when sub-buffers have been consumed via
+relay_subbufs_consumed(). A full buffer will become 'unfull' and
+logging will continue once the client calls relay_subbufs_consumed()
+again. When a buffer becomes full, the buf_full() callback is invoked
+to notify the client. In both modes, the subbuf_start() callback will
+notify the client whenever a sub-buffer boundary is crossed. This can
+be used to write header information into the new sub-buffer or fill in
+header information reserved in the previous sub-buffer. One piece of
+information that's useful to save in a reserved header slot is the
+number of bytes of 'padding' for a sub-buffer, which is the amount of
+unused space at the end of a sub-buffer. The padding count for each
+sub-buffer is contained in an array in the rchan_buf struct passed
+into the subbuf_start() callback: rchan_buf->padding[prev_subbuf_idx]
+can be used to get the padding for the just-finished sub-buffer.
+subbuf_start() is also called for the first sub-buffer in each channel
+buffer when the channel is created. The mode is specified to
+relay_open() using the overwrite parameter.
+
+kernel clients write data into the current cpu's channel buffer using
+relay_write() or __relay_write(). relay_write() is the main logging
+function - it uses local_irq_save() to protect the buffer and should be
+used if you might be logging from interrupt context. If you know
+you'll never be logging from interrupt context, you can use
+__relay_write(), which only disables preemption. These functions
+don't return a value, so you can't determine whether or not they
+failed - the assumption is that you wouldn't want to check a return
+value in the fast logging path anyway, and that they'll always succeed
+unless the buffer is full and in no-overwrite mode, in which case
+you'll be notified via the buf_full() callback.
+
+relay_reserve() is used to reserve a slot in a channel buffer which
+can be written to later. This would typically be used in applications
+that need to write directly into a channel buffer without having to
+stage data in a temporary buffer beforehand. Because the actual write
+may not happen immediately after the slot is reserved, applications
+using relay_reserve() can call relay_commit() to notify relayfs when
+the slot has actually been written. When all the reserved slots have
+been committed, the deliver() callback is invoked to notify the client
+that a guaranteed full sub-buffer has been produced. Because the
+write is under control of the client and is separated from the
+reserve, relay_reserve() doesn't protect the buffer at all - it's up
+to the client to provide the appropriate synchronization when using
+relay_reserve().
+
+The client calls relay_close() when it's finished using the channel.
+The channel and its associated buffers are destroyed when there are no
+longer any references to any of the channel buffers. relay_flush()
+forces a sub-buffer switch on all the channel buffers, and can be used
+to finalize and process the last sub-buffers before the channel is
+closed.
+
+Some applications may want to keep a channel around and re-use it
+rather than open and close a new channel for each use. relay_reset()
+can be used for this purpose - it resets a channel to its initial
+state without reallocating channel buffer memory or destroying
+existing mappings. It should however only be called when it's safe to
+do so, i.e. when the channel isn't currently being written to.
+
+Finally, there are a couple of utility callbacks that can be used for
+different purposes. buf_mapped() is called whenever a channel buffer
+is mmapped from user space and buf_unmapped() is called when it's
+unmapped. The client can use this notification to trigger actions
+within the kernel application, such as enabling/disabling logging to
+the channel.
+
+
+Credits
+=======
+
+The ideas and specs for relayfs came about as a result of discussions
+on tracing involving the following:
+
+Michel Dagenais <michel.dagenais@polymtl.ca>
+Richard Moore <richardj_moore@uk.ibm.com>
+Bob Wisniewski <bob@watson.ibm.com>
+Karim Yaghmour <karim@opersys.com>
+Tom Zanussi <zanussi@us.ibm.com>
+
+Also thanks to Hubertus Franke for a lot of useful suggestions and bug
+reports.
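
Reviewer note on the channel modes described in relayfs.txt: the
difference between 'overwrite' and 'no-overwrite' may be easier to follow
with a small user-space model of the sub-buffer accounting. The sketch
below is not relayfs code and calls no relayfs function. The struct
model_buf type and the model_* helpers are hypothetical names invented
for illustration, and the model assumes the rule stated in the text: in
no-overwrite mode a switch to a new sub-buffer is refused once the number
of unconsumed sub-buffers equals the total number of sub-buffers.

```c
#include <stdio.h>

/* Hypothetical user-space model of one channel buffer; NOT relayfs code. */
struct model_buf {
	unsigned n_subbufs;	/* total sub-buffers in this buffer */
	unsigned produced;	/* sub-buffers filled by the writer so far */
	unsigned consumed;	/* sub-buffers released by the reader so far */
	int overwrite;		/* nonzero selects overwrite mode */
	unsigned lost;		/* sub-buffer switches refused because full */
};

/*
 * Model of crossing a sub-buffer boundary.
 * Returns 1 if the writer may continue into a new sub-buffer, 0 if the
 * buffer is full (only possible in no-overwrite mode).
 */
static int model_switch_subbuf(struct model_buf *b)
{
	if (!b->overwrite && b->produced - b->consumed >= b->n_subbufs) {
		b->lost++;	/* data would be dropped at this point */
		return 0;
	}
	b->produced++;		/* overwrite mode always lands here */
	return 1;
}

/* Model of the reader reporting that n sub-buffers were consumed. */
static void model_subbufs_consumed(struct model_buf *b, unsigned n)
{
	unsigned unconsumed = b->produced - b->consumed;

	if (n > unconsumed)	/* never release more than was produced */
		n = unconsumed;
	b->consumed += n;
}

/* Print the model state; handy when experimenting with sizes. */
static void model_print(const struct model_buf *b)
{
	printf("produced=%u consumed=%u lost=%u\n",
	       b->produced, b->consumed, b->lost);
}
```

With n_subbufs set to 2 and overwrite set to 0, two switches succeed and
the third is refused until model_subbufs_consumed() releases a sub-buffer,
which mirrors the 'full' and 'unfull' behaviour the document describes.
With overwrite set to 1 every switch succeeds and nothing is counted as
lost, at the price of old data being overwritten.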