.TH LKET 5 @DATE@ "IBM"
.SH NAME
LKET \- Linux Kernel Event Trace tool based on SystemTap
.\" macros
.de SAMPLE
.br
.RS
.nf
.nh
..
.de ESAMPLE
.hy
.fi
.RE
..
.SH DESCRIPTION
The Linux Kernel Event Trace (LKET) tool is an extension to the tapset
library available in SystemTap. It uses the dynamic probing capabilities
provided by SystemTap to create a set of standard hooks that probe
predefined places in the kernel. The collected information can serve as
a starting point for analyzing a performance problem in the system.
The LKET tapsets are designed to trace only the events selected by the
user. Once the data has been collected, it is post-processed according
to the needs of the user. Trace data can be processed in various ways to
generate simple to complex reports.
.SH BINARY TRACING
By default, LKET logs the trace data in binary format.
For better binary tracing performance, pass the "\-b" option to stap.
The \-M option must then also be added to stop staprun from merging the
per-cpu files.
You can use the command
.I lket\-b2a
to convert the binary trace data generated by LKET into readable ASCII
data.
.I lket\-b2a
uses the per-cpu binary trace data files (stpd_cpu*) as inputs and
either generates an output file named
.I lket.out
or dumps the trace data into a MySQL database. See the
.IR lket\-b2a (1)
manual page for more detail.
If you want LKET to log trace data in ASCII format directly, use:
.SAMPLE
stap \-D ASCII_TRACE ...
.ESAMPLE
.B Note
that for
.I LKET
to work in binary tracing mode, all strings logged by
.I LKET
must be NULL-terminated. This means you have to use "%0s" instead of
"%s" both in user-appended extra print statements and in _lket_trace(),
which is called in
.I LKET
tapsets.
.SH EVENT REGISTER
LKET provides a way to log the metadata of the trace data by registering events.
Two functions are provided:
.P
.IP
.SB void _register_sys_event (char *event_desc, int grpid, int hookid, char *fmt, char *field_name)
.IP
.SB register_user_event(grpid:long, hookid:long, fmt:string, names:string)
.P
.I event_desc
is a string representation of the event, e.g. syscall.entry or
scsi.iocompleted.
.I grpid
and
.I hookid
are the group ID and hook ID of the event to be registered.
.I fmt
contains a set of format tokens separated by ":". The valid format
tokens are:
.B INT8,
.B INT16,
.B INT32,
.B INT64
and
.B STRING
which represent 8-bit, 16-bit, 32-bit, 64-bit binary data and
NULL-terminated strings respectively.
.I names
contains a set of names separated by ":". The names in
.I names
should match the format tokens in
.IR fmt .
.P
.B _register_sys_event
is a C function used to register newly added trace hooks in the LKET
tapsets. For example, suppose you want to add a new event hook that
traces the entry of sys_open and logs its filename, flags and mode
parameters. You should add:
.SAMPLE
_register_sys_event("iosyscall.open.entry", _GROUP_IOSYSCALL,
                    _HOOKID_IOSYSCALL_OPEN_ENTRY,
                    "STRING:INT32:INT32", "filename:flags:mode");
.ESAMPLE
to the function
.B register_sys_events
in LKET/register_event.stp.
.P
.B register_user_event
is a SystemTap script function that lets users add extra trace data to
an event hook. See the section
.B CUSTOMIZED TRACE DATA
for more detail.
.SH CUSTOMIZED TRACE DATA
LKET provides a set of event hooks that log predefined trace data, but
it also lets you log extra trace data for an event without modifying the
tapset of that event hook. You can simply use printf to trace extra data.
For example, suppose you want to trace sk_buff\->mac_len and
sk_buff\->priority in addition to sk_buff\->len, sk_buff\->protocol and
sk_buff\->truesize for the
.B netdev
event hooks:
.SAMPLE
probe register_event {
        register_user_event(GROUP_NETDEV, HOOKID_NETDEV_TRANSMIT,
                            "INT32:INT32", "mac_len:priority")
}
probe addevent.netdev.transmit {
        printf("%4b%4b", $skb\->mac_len, $skb\->priority)
}
.ESAMPLE
.SH EXAMPLES
Here are some examples of using LKET:
.TP
Trace all events provided by LKET:
stap \-e "probe addevent.* {}" \-bM
.TP
Trace all available events, skipping those unavailable on the current system:
stap \-e "probe addevent.* ? {}" \-bM
.TP
Trace all system calls:
stap \-e "probe addevent.syscall {}" \-bM
.TP
Trace the entry of all system calls:
stap \-e "probe addevent.syscall.entry {}" \-bM
.TP
Trace netdev transmission and log the extra data mac_len and priority:
stap \-e "probe addevent.netdev.transmit { printf(\\"%4b%4b\\", $skb\->mac_len, $skb\->priority) }" \-bM
.P
You can press "Ctrl+c" to stop tracing. You will then find one or more
per-cpu data files (stpd_cpu*) in the current directory. You can use
.I lket\-b2a
to convert these binary trace data files into readable ASCII format or
dump them into a database. See the
.IR lket\-b2a (1)
man page for more detail.
.SH EVENT HOOKS AND TRACE DATA FORMAT
The following sections enumerate the event hooks implemented in LKET and
their trace data formats. The trace data generated by different event
hooks contains common data as well as data specific to each event hook.
The INT8, INT16, INT32, INT64 and STRING tokens that appear in the trace
data formats represent 8-bit, 16-bit, 32-bit, 64-bit binary data and
NULL-terminated strings respectively.
The data common (i.e.
.I common_data
in the following subsections) to all event hooks is:
.RS
.B timestamp(INT64),(tid<<32|pid)(INT64),(ppid<<32|groupID<<24|hookID<<16|cpu_id<<8)(INT64)
.RE
Each event hook group is a collection of hooks that trace similar
things. The ID of each event hook (HookID) is defined in the context of
its group.
.SS EVENT REGISTER
Event register is not actually an event. It is used to log the metadata
of the trace data, including the extra trace data appended by the user. See
.B EVENT REGISTER
and
.B CUSTOMIZED TRACE DATA
for more details.
.P
.TP
.B register_sys_event
This function registers the event hooks available in LKET. It should be
called from register_event.stp:register_sys_events().
.TP
.B register_user_event
This function logs the metadata of the extra trace data appended by the
user for a specific event. It should be called in the probe
.IR register_event .
.SS SYSTEM CALLS
You could use
.I addevent.syscall
to trace the entry and return of all system calls. It contains two sub
event hooks:
.P
.TP
.B addevent.syscall.entry
Trace entry of all system calls.
Data format is:
.I common_data, syscall_name(STRING)
.TP
.B addevent.syscall.return
Trace return of all system calls.
Data format is:
.I common_data, syscall_name(STRING)
.SS PROCESS CREATION
This group contains three sub event hooks. All of them are turned on by
default. You can use the flags stoptrace_fork and stoptrace_exec to stop
tracing fork/execve in your script, e.g.:
.SAMPLE
probe begin {
        stoptrace_fork = 1
        stoptrace_exec = 1
}
\&...
.ESAMPLE
.P
.TP
.B process_snapshot()
This event hook isn't a probe definition but a function. It is called by
LKET silently to take a snapshot of all running processes.
Data format is:
.I common_data, tid(INT32), pid(INT32), ppid(INT32), process_name(STRING)
.P
.TP
.B lket_internal.process.fork
Trace fork of processes.
Data format is:
.I common_data, new_tid(INT32), new_pid(INT32), ppid(INT32)
.TP
.B lket_internal.process.execve
Trace execve of new processes.
Data format is:
.I common_data, tid(INT32), pid(INT32), ppid(INT32), new_process_name(STRING)
.SS SIGNAL
You could use
.I addevent.signal
to trace signal activities. It contains the following events:
.P
.TP
.B addevent.signal.send.entry
Trace when a signal is sent to a process.
Data format is:
.I common_data, sig(INT8), shared(INT8), send2queue(INT8), pid(INT32)
.TP
.B addevent.signal.send.return
Trace when returning from sending a signal.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.syskill.entry
Trace when sys_kill is called to send a signal to a process.
Data format is:
.I common_data, pid(INT32), sig(INT8)
.TP
.B addevent.signal.syskill.return
Trace when returning from sys_kill.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.systgkill.entry
Trace when sys_tgkill is called to send a signal to one specific thread.
Data format is:
.I common_data, tid(INT32), pid(INT32), sig(INT8)
.TP
.B addevent.signal.systgkill.return
Trace when returning from sys_tgkill.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.systkill.entry
Trace when sys_tkill is called to send a signal to a single process.
Data format is:
.I common_data, pid(INT32), sig(INT8)
.TP
.B addevent.signal.systkill.return
Trace when returning from sys_tkill.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.pending.entry
Trace when examining the set of signals that are pending for delivery.
Data format is:
.I common_data, sigset_addr(INT32), setsize(INT32)
.TP
.B addevent.signal.pending.return
Trace when returning from signal.pending.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.do_action.entry
Trace when a thread is about to examine and change a signal action.
Data format is:
.I common_data, sig(INT8), handler(INT64)
.TP
.B addevent.signal.do_action.return
Trace when returning from signal.do_action.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.procmask.entry
Trace when a thread is about to examine and change blocked signals.
Data format is:
.I common_data, how(INT8), sigset(INT64)
.TP
.B addevent.signal.procmask.return
Trace when returning from signal.procmask.
Data format is:
.I common_data, return(INT8)
.TP
.B addevent.signal.flush.entry
Trace when flushing all pending signals for a task.
Data format is:
.I common_data, pid(INT32)
.SS IO SCHEDULER ACTIVITIES
You could use
.I addevent.ioscheduler
to trace the IO scheduler activities. It contains four sub event hooks:
.P
.TP
.B addevent.ioscheduler.elv_add_request
Trace when a request is added to the request queue.
Data format is:
.I common_data, elevator_name(STRING), disk_major(INT8), disk_minor(INT8),
.I request_addr(INT64), request_flags(INT64)
.TP
.B addevent.ioscheduler.elv_next_request.entry
Trace when trying to retrieve a request from the request queue.
Data format is:
.I common_data, elevator_name(STRING)
.TP
.B addevent.ioscheduler.elv_next_request.return
Trace when returning from retrieving a request from the request queue.
Data format is:
.I common_data, disk_major(INT8), disk_minor(INT8),
.I request_addr(INT64), request_flags(INT64)
.TP
.B addevent.ioscheduler.elv_completed_request
Trace when a request is completed.
Data format is:
.I common_data, elevator_name(STRING), disk_major(INT8), disk_minor(INT8),
.I request_addr(INT64), request_flags(INT64)
.SS TASK SCHEDULE ACTIVITIES
You could use
.I addevent.tskdispatch
to trace the task scheduler activities.
It contains two sub event hooks:
.P
.TP
.B addevent.tskdispatch.ctxswitch
Trace the process context switch.
Data format is:
.I common_data, prev_pid(INT32), next_pid(INT32), prev_state(INT8)
.TP
.B addevent.tskdispatch.cpuidle
Trace when a cpu goes idle.
Data format is:
.I common_data, current_pid(INT32)
.SS SCSI ACTIVITIES
You could use
.I addevent.scsi
to trace the SCSI layer activities. It contains four sub event hooks:
.P
.TP
.B addevent.scsi.ioentry
The mid-layer prepares an IO request.
Data format is:
.I common_data, disk_major(INT8), disk_minor(INT8), device_state(INT8), request_addr(INT64)
.TP
.B addevent.scsi.iodispatching
A command is dispatched to the low-level driver.
Data format is:
.I common_data, host(INT8), channel(INT8), lun(INT8), dev_id(INT8),
.I device_state(INT8), data_direction(INT8), reqbuf_addr(INT64),
.I reqbuf_len(INT32), request_addr(INT64)
.TP
.B addevent.scsi.iodone
I/O is done by the low-level driver.
Data format is:
.I common_data, host(INT8), channel(INT8), lun(INT8), dev_id(INT8),
.I device_state(INT8), data_direction(INT8), request_addr(INT64)
.TP
.B addevent.scsi.iocompleted
The mid-layer has processed the completed IO.
Data format is:
.I common_data, host(INT8), channel(INT8), lun(INT8), dev_id(INT8),
.I device_state(INT8), data_direction(INT8), request_addr(INT64),
.I bytes_done(INT32)
.SS PAGE FAULT
You could use
.I addevent.pagefault
to trace page fault events. It contains only one sub event hook:
.P
.TP
.B addevent.pagefault
Data format is:
.I common_data, memory_address(INT64), write_access(INT8)
.SS NETWORK DEVICE ACTIVITIES
You could use
.I addevent.netdev
to trace the network device activities.
It contains two sub event hooks:
.P
.TP
.B addevent.netdev.receive
The network device receives a packet.
Data format is:
.I common_data, netdev_name(STRING), data_length(INT32), protocol(INT16),
.I buffer_length(INT32)
.TP
.B addevent.netdev.transmit
A packet will be sent out by the network device.
Data format is:
.I common_data, netdev_name(STRING), data_length(INT32), protocol(INT16),
.I buffer_length(INT32)
.SS IO SYSCALLS
You could use
.I addevent.iosyscall
to trace the detailed activities of I/O-related system calls. It
contains 16 entry hooks and 16 corresponding return hooks. All the
return hooks log only the common_data and the return value, so only the
entry hooks are listed below:
.P
.TP
.B addevent.iosyscall.open.entry
The entry of sys_open.
Data format is:
.I common_data, filename(STRING), flags(INT32), mode(INT32)
.TP
.B addevent.iosyscall.close.entry
The entry of sys_close.
Data format is:
.I common_data, fd(INT64)
.TP
.B addevent.iosyscall.read.entry
The entry of sys_read.
Data format is:
.I common_data, fd(INT64), buf_addr(INT64), count(INT64)
.TP
.B addevent.iosyscall.write.entry
The entry of sys_write.
Data format is:
.I common_data, fd(INT64), buf_addr(INT64), count(INT64)
.TP
.B addevent.iosyscall.readv.entry
The entry of sys_readv.
Data format is:
.I common_data, fd(INT64), vector_addr(INT64), count(INT64)
.TP
.B addevent.iosyscall.writev.entry
The entry of sys_writev.
Data format is:
.I common_data, fd(INT64), vector_addr(INT64), count(INT64)
.TP
.B addevent.iosyscall.pread64.entry
The entry of sys_pread64.
Data format is:
.I common_data, fd(INT64), buff_addr(INT64), count(INT64), offset(INT64)
.TP
.B addevent.iosyscall.pwrite64.entry
The entry of sys_pwrite64.
Data format is:
.I common_data, fd(INT64), buff_addr(INT64), count(INT64), offset(INT64)
.TP
.B addevent.iosyscall.readahead.entry
The entry of sys_readahead.
Data format is:
.I common_data, fd(INT64), offset(INT64), count(INT64)
.TP
.B addevent.iosyscall.sendfile.entry
The
entry of sys_sendfile and sys_sendfile64.
Data format is:
.I common_data, out_fd(INT64), in_fd(INT64), offset_uaddr(INT64), count(INT64)
.TP
.B addevent.iosyscall.lseek.entry
The entry of sys_lseek.
Data format is:
.I common_data, fd(INT64), offset(INT64), whence(INT8)
.TP
.B addevent.iosyscall.llseek.entry
The entry of sys_llseek.
Data format is:
.I common_data, fd(INT64), offset_high(INT64), offset_low(INT64),
.I result_addr(INT64), whence(INT8)
.TP
.B addevent.iosyscall.sync.entry
The entry of sys_sync.
Data format is:
.I common_data
.TP
.B addevent.iosyscall.fsync.entry
The entry of sys_fsync.
Data format is:
.I common_data, fd(INT64)
.TP
.B addevent.iosyscall.fdatasync.entry
The entry of sys_fdatasync.
Data format is:
.I common_data, fd(INT64)
.TP
.B addevent.iosyscall.flock.entry
The entry of sys_flock.
Data format is:
.I common_data, fd(INT64), operation(INT32)
.SS ASYNCHRONOUS IO
You could use
.I addevent.aio
to trace the detailed activities of AIO-related calls (most of them are
AIO system calls). It contains 6 entry hooks and 6 corresponding return
hooks. All the return hooks log only the common_data and the return
value, so only the entry hooks are listed below:
.P
.TP
.B addevent.aio.io_setup.entry
Fired by calling io_setup from user space. The corresponding system call
is sys_io_setup, which will create an aio_context capable of receiving
at least maxevents.
Data format is:
.I common_data, nr_events(INT32), ctxp_uaddr(INT64)
.TP
.B addevent.aio.io_submit.entry
Fired by calling io_submit from user space. The corresponding system
call is sys_io_submit, which will queue the nr iocbs pointed to by
iocbpp_uaddr for processing.
Data format is:
.I common_data, ctx_id(INT64), nr(INT32), iocbpp_uaddr(INT64)
.TP
.B addevent.aio.io_submit_one.entry
Called by sys_io_submit.
It iterates over iocbpp and processes the iocbs one by one.
Data format is:
.I common_data, ctx(INT64), user_iocb_uaddr(INT64), aio_lio_opcode(INT16),
.I aio_reqprio(INT16), aio_fildes(INT32), aio_buf(INT64), aio_nbytes(INT64),
.I aio_offset(INT64)
.TP
.B addevent.aio.io_getevents.entry
Fired by calling io_getevents from user space. The corresponding system
call is sys_io_getevents, which will attempt to read at least min_nr
events and up to nr events from the completion queue for the aio_context
specified by ctx_id.
Data format is:
.I common_data, ctx_id(INT64), min_nr(INT32), nr(INT32), events_uaddr(INT64),
.I tv_sec(INT32), tv_nsec(INT32)
.TP
.B addevent.aio.io_destroy.entry
Fired by calling io_destroy from user space. The corresponding system
call is sys_io_destroy, which will destroy the aio_context specified.
Data format is:
.I common_data, ctx(INT64)
.TP
.B addevent.aio.io_cancel.entry
Fired by calling io_cancel from user space. The corresponding system
call is sys_io_cancel, which will attempt to cancel an iocb previously
passed to io_submit.
Data format is:
.I common_data, ctx_id(INT64), iocb_uaddr(INT64), result_uaddr(INT64)
.SS SUNRPC
You could use
.I addevent.sunrpc
to trace the details of SUNRPC activities. It is divided into three
groups: high-level client operation event hooks (addevent.sunrpc.clnt),
high-level server operation event hooks (addevent.sunrpc.svc) and RPC
scheduler operation event hooks (addevent.sunrpc.sched). It contains 19
entry hooks and 19 corresponding return hooks. All the return hooks will
only log the common_data and the return value.
So in the following subsections, only the entry hooks will be listed:
.P
.TP
.B addevent.sunrpc.clnt.create_client.entry
Fires when an RPC client is to be created.
Data format is:
.I common_data, servername(STRING), prog(INT64), vers(INT8),
.I prot(INT16), port(INT16), authflavor(INT8)
.TP
.B addevent.sunrpc.clnt.clone_client.entry
Fires when the RPC client structure is to be cloned.
Data format is:
.I common_data, servername(STRING), prog(INT64), vers(INT8),
.I prot(INT16), port(INT16), authflavor(INT8)
.TP
.B addevent.sunrpc.clnt.shutdown_client.entry
Fires when an RPC client is to be shut down.
Data format is:
.I common_data, servername(STRING), prog(INT64), clones(INT16),
.I tasks(INT16), rpccnt(INT32)
.TP
.B addevent.sunrpc.clnt.bind_new_program.entry
Fires when a new RPC program is to be bound to an existing client.
Data format is:
.I common_data, servername(STRING), old_prog(INT64), old_vers(INT8),
.I prog(INT64), vers(INT8)
.TP
.B addevent.sunrpc.clnt.call_sync.entry
Fires when an RPC procedure is to be called synchronously.
Data format is:
.I common_data, servername(STRING), prog(INT64), vers(INT8),
.I proc(INT64), flags(INT64)
.TP
.B addevent.sunrpc.clnt.call_async.entry
Fires when an RPC procedure is to be called asynchronously.
Data format is:
.I common_data, servername(STRING), prog(INT64), vers(INT8),
.I proc(INT64), flags(INT64)
.TP
.B addevent.sunrpc.clnt.restart_call.entry
Fires when a task is to be restarted.
Data format is:
.I common_data, tk_pid(INT64), tk_flags(INT64)
.TP
.B addevent.sunrpc.svc.register.entry
Fires when an RPC service is to be registered with the local portmapper.
Data format is:
.I common_data, sv_name(STRING), prog(INT64), prot(INT16),
.I port(INT32)
.TP
.B addevent.sunrpc.svc.create.entry
Fires when an RPC service is to be created.
Data format is:
.I common_data, prog(INT64), pg_nvers(INT8), bufsize(INT32)
.TP
.B addevent.sunrpc.svc.destroy.entry
Fires when an RPC service is to be destroyed.
Data format is:
.I common_data, sv_name(STRING),
.I sv_prog(INT64), sv_nrthreads(INT32)
.TP
.B addevent.sunrpc.svc.process.entry
Fires when an RPC request is to be processed.
Data format is:
.I common_data, sv_name(STRING), sv_prog(INT64), peer_ip(INT64),
.I rq_xid(INT64), rq_prog(INT64), rq_vers(INT8), rq_proc(INT8)
.TP
.B addevent.sunrpc.svc.authorise.entry
Fires when an RPC request is to be authorised.
Data format is:
.I common_data, sv_name(STRING), peer_ip(INT64), rq_xid(INT64),
.I rq_prog(INT64), rq_vers(INT8), rq_proc(INT64)
.TP
.B addevent.sunrpc.svc.recv.entry
Fires when receiving the next request on any socket.
Data format is:
.I common_data, sv_name(STRING), timeout(INT64)
.TP
.B addevent.sunrpc.svc.send.entry
Fires when a reply is to be returned to the client.
Data format is:
.I sv_name(STRING), peer_ip(INT64), rq_xid(INT64), rq_prog(INT64),
.I rq_vers(INT8), rq_proc(INT64)
.TP
.B addevent.sunrpc.svc.drop.entry
Fires when a request is to be dropped.
Data format is:
.I common_data, sv_name(STRING), peer_ip(INT64), rq_xid(INT64),
.I rq_prog(INT64), rq_vers(INT8), rq_proc(INT64)
.TP
.B addevent.sunrpc.sched.new_task.entry
Fires when creating a new task for the specified client.
Data format is:
.I common_data, xid(INT64), prog(INT64), vers(INT8), prot(INT64),
.I flags(INT64)
.TP
.B addevent.sunrpc.sched.release_task.entry
Fires when releasing a task.
Data format is:
.I common_data, xid(INT64), prog(INT64), vers(INT8), prot(INT64),
.I flags(INT64)
.TP
.B addevent.sunrpc.sched.execute.entry
Fires when an RPC request is to be executed.
Data format is:
.I common_data, xid(INT64), prog(INT64), vers(INT8), prot(INT64),
.I tk_pid(INT64), tk_flags(INT64)
.TP
.B addevent.sunrpc.sched.delay.entry
Fires when an RPC request is to be delayed.
Data format is:
.I common_data, xid(INT64), prog(INT64), tk_pid(INT64),
.I tk_flags(INT64), delay(INT64)
.SS NFS
You could use
.I addevent.nfs
to trace the detailed activities of NFS on the client side.
It is divided into three groups: NFS file operation event hooks
(addevent.nfs.fop), NFS address space operation event hooks
(addevent.nfs.aop) and NFS proc event hooks (addevent.nfs.proc). It
contains 36 entry hooks and 33 corresponding return hooks. All the
return hooks log only the common_data and the return value, so only the
entry hooks are listed below:
.P
.TP
.B addevent.nfs.fop.llseek.entry
The entry of nfs_file_llseek.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I offset(INT32), origin(INT8)
.TP
.B addevent.nfs.fop.read.entry
The entry of do_sync_read.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I buf_addr(INT64), count(INT64), offset(INT64)
.TP
.B addevent.nfs.fop.write.entry
The entry of do_sync_write.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I buf_addr(INT64), count(INT64), offset(INT64)
.TP
.B addevent.nfs.fop.aio_read.entry
The entry of nfs_file_read.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I buf_addr(INT64), count(INT64), offset(INT64)
.TP
.B addevent.nfs.fop.aio_write.entry
The entry of nfs_file_write.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I buf_addr(INT64), count(INT64), offset(INT64)
.TP
.B addevent.nfs.fop.mmap.entry
The entry of nfs_file_mmap.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I vm_start(INT64), vm_end(INT64), vm_flags(INT32)
.TP
.B addevent.nfs.fop.open.entry
The entry of nfs_file_open.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I flag(INT32), filename(STRING)
.TP
.B addevent.nfs.fop.flush.entry
The entry of nfs_file_flush.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I ndirty(INT32)
.TP
.B addevent.nfs.fop.release.entry
The entry of nfs_file_release.
Data
format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I mode(INT16)
.TP
.B addevent.nfs.fop.fsync.entry
The entry of nfs_fsync.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I ndirty(INT32)
.TP
.B addevent.nfs.fop.lock.entry
The entry of nfs_lock.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I fl_start(INT64), fl_end(INT64), fl_type(INT8), fl_flag(INT8), cmd(INT32)
.TP
.B addevent.nfs.fop.sendfile.entry
The entry of nfs_file_sendfile.
Data format is:
.I common_data, major_device(INT8), minor_device(INT8), fileid(INT32),
.I count(INT64), ppos(INT64)
.TP
.B addevent.nfs.fop.checkflags.entry
The entry of nfs_check_flags.
Data format is:
.I flag(INT32)
.TP
.B addevent.nfs.aop.readpage.entry
The entry of nfs_readpage.
Data format is:
.I fileid(INT64), rsize(INT32), page_address(INT64), page_index(INT64)
.TP
.B addevent.nfs.aop.readpages.entry
The entry of nfs_readpages.
Data format is:
.I fileid(INT64), rpages(INT32), nr_pages(INT32)
.TP
.B addevent.nfs.aop.writepage.entry
The entry of nfs_writepage.
Data format is:
.I fileid(INT64), wsize(INT32), page_address(INT64), page_index(INT64)
.TP
.B addevent.nfs.aop.writepages.entry
The entry of nfs_writepages.
Data format is:
.I fileid(INT64), wpages(INT32), nr_to_write(INT64)
.TP
.B addevent.nfs.aop.prepare_write.entry
The entry of nfs_prepare_write.
Data format is:
.I fileid(INT64), page_address(INT64), page_index(INT64)
.TP
.B addevent.nfs.aop.commit_write.entry
The entry of nfs_commit_write.
Data format is:
.I fileid(INT64), page_address(INT64), page_index(INT64), offset(INT32), count(INT32)
.TP
.B addevent.nfs.aop.set_page_dirty.entry
The entry of __set_page_dirty_nobuffers.
Data format is:
.I page_address(INT64), page_flag(INT8)
.TP
.B addevent.nfs.aop.release_page.entry
The entry of nfs_release_page.
Data format is:
.I page_address(INT64), page_index(INT64)
.TP
.B addevent.nfs.proc.lookup.entry
The entry of
nfs_proc_lookup, nfs3_proc_lookup and nfs4_proc_lookup.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I filename(STRING)
.TP
.B addevent.nfs.proc.read.entry
The entry of nfs_proc_read, nfs3_proc_read and nfs4_proc_read.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I count(INT32), offset(INT64)
.TP
.B addevent.nfs.proc.write.entry
The entry of nfs_proc_write, nfs3_proc_write and nfs4_proc_write.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I count(INT32), offset(INT64)
.TP
.B addevent.nfs.proc.commit.entry
Fires when the client commits buffered data to disk; the buffered data
was previously written asynchronously by the client. The commit function
works synchronously and does not exist in NFSv2. The entry of
nfs3_proc_commit and nfs4_proc_commit.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I count(INT32), offset(INT64)
.TP
.B addevent.nfs.proc.read_setup.entry
The read_setup function sets up a read RPC task; it does not perform a
real read operation. The entry of nfs_proc_read_setup,
nfs3_proc_read_setup and nfs4_proc_read_setup.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I count(INT32), offset(INT64)
.TP
.B addevent.nfs.proc.write_setup.entry
The write_setup function sets up a write RPC task; it does not perform a
real write operation. The entry of nfs_proc_write_setup,
nfs3_proc_write_setup and nfs4_proc_write_setup.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I how(INT8), count(INT32), offset(INT64)
.TP
.B addevent.nfs.proc.commit_setup.entry
The commit_setup function sets up a commit RPC task; it does not perform
a real commit operation. It does not exist in NFSv2. The entry of
nfs3_proc_commit_setup and nfs4_proc_commit_setup.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I how(INT8), count(INT32), offset(INT64)
.TP
.B
addevent.nfs.proc.read_done.entry
Fires when a read reply is received or a read error occurs (timeout or
socket shutdown). The entry of nfs_read_done, nfs3_read_done and
nfs4_read_done.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I status(INT32), count(INT32)
.TP
.B addevent.nfs.proc.write_done.entry
Fires when a write reply is received or a write error occurs (timeout or
socket shutdown). The entry of nfs_write_done, nfs3_write_done and
nfs4_write_done.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I status(INT32), count(INT32)
.TP
.B addevent.nfs.proc.commit_done.entry
Fires when a commit reply is received or a commit error occurs (timeout
or socket shutdown). The entry of nfs_commit_done, nfs3_commit_done and
nfs4_commit_done.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I status(INT32), count(INT32)
.TP
.B addevent.nfs.proc.open.entry
The entry of nfs_open.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I filename(STRING), flag(INT32), mode(INT32)
.TP
.B addevent.nfs.proc.release.entry
The entry of nfs_release.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I filename(STRING), flag(INT32), mode(INT32)
.TP
.B addevent.nfs.proc.create.entry
The entry of nfs_proc_create, nfs3_proc_create and nfs4_proc_create.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64), version(INT8),
.I filename(STRING), mode(INT32)
.TP
.B addevent.nfs.proc.rename.entry
The entry of nfs_proc_rename, nfs3_proc_rename and nfs4_proc_rename.
Data format is:
.I version(INT8), major_old(INT8), minor_old(INT8), old_fileid(INT64), old_name(STRING),
.I major_new(INT8), minor_new(INT8), new_fileid(INT64), new_name(STRING)
.TP
.B addevent.nfs.proc.remove.entry
The entry of nfs_proc_remove, nfs3_proc_remove and nfs4_proc_remove.
Data format is:
.I major_dev(INT8), minor_dev(INT8), fileid(INT64),
.I version(INT8),
.I filename(STRING)
.SS NFSD
You could use
.I addevent.nfsd
to trace the detailed activities of NFS on the server side. It is
divided into two groups: nfsd operation event hooks (addevent.nfsd.op)
and nfsd proc event hooks (addevent.nfsd.proc). It contains 19 entry
hooks and 19 corresponding return hooks. All the return hooks log only
the common_data and the return value, so only the entry hooks are listed
below:
.P
.TP
.B addevent.nfsd.dispatch.entry
Fires when the server receives an NFS operation from a client. The entry
of nfsd_dispatch.
Data format is:
.I proto(INT8), version(INT8), xid(INT32), proc(INT32), client_ip(INT32)
.TP
.B addevent.nfsd.open.entry
The entry of nfsd_open.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I type(INT32), access(INT32)
.TP
.B addevent.nfsd.read.entry
The entry of nfsd_read.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I count(INT64), offset(INT64), iov_len(INT64), vlen(INT64)
.TP
.B addevent.nfsd.write.entry
The entry of nfsd_write.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I count(INT64), offset(INT64), iov_len(INT64), vlen(INT64)
.TP
.B addevent.nfsd.lookup.entry
The entry of nfsd_lookup.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I filename(STRING)
.TP
.B addevent.nfsd.commit.entry
The entry of nfsd_commit.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I count(INT64), offset(INT64)
.TP
.B addevent.nfsd.create.entry
Fires when a client creates a file (regular, dir, device, fifo) on the
server side; sometimes nfsd will call nfsd_create_v3 instead of this
function. The entry of nfsd_create.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I filename(STRING), type(INT32), iap_valid(INT16), iap_mode(INT32)
.TP
.B addevent.nfsd.createv3.entry
Fires when a client creates a regular file
or sets file attributes on the server side; only called by
nfsd3_proc_create and nfsd4_open (when op_claim_type is
NFS4_OPEN_CLAIM_NULL). The entry of nfsd_create_v3.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I filename(STRING), createmode(INT8), iap_valid(INT16), iap_mode(INT32)
.TP
.B addevent.nfsd.unlink.entry
The entry of nfsd_unlink.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64),
.I filename(STRING), type(INT32)
.TP
.B addevent.nfsd.rename.entry
The entry of nfsd_rename.
Data format is:
.I old_fhsize(INT8), old_fh0(INT64), old_fh1(INT64), old_fh2(INT64), old_name(STRING),
.I new_fhsize(INT8), new_fh0(INT64), new_fh1(INT64), new_fh2(INT64), new_name(STRING)
.TP
.B addevent.nfsd.close.entry
The entry of nfsd_close.
Data format is:
.I filename(STRING)
.TP
.B addevent.nfsd.proc.lookup.entry
The entry of nfsd_proc_lookup and nfsd3_proc_lookup.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64), version(INT8),
.I filename(STRING)
.TP
.B addevent.nfsd.proc.read.entry
The entry of nfsd_proc_read and nfsd3_proc_read.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64), version(INT8),
.I count(INT64), offset(INT64), iov_len(INT64), vlen(INT64)
.TP
.B addevent.nfsd.proc.write.entry
The entry of nfsd_proc_write and nfsd3_proc_write.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64), version(INT8),
.I count(INT64), offset(INT64), iov_len(INT64), vlen(INT64)
.TP
.B addevent.nfsd.proc.commit.entry
The entry of nfsd_proc_commit and nfsd3_proc_commit.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64), version(INT8),
.I count(INT64), offset(INT64)
.TP
.B addevent.nfsd.proc.compound.entry
The entry of nfsd4_proc_compound.
Data format is:
.I number(INT32)
.TP
.B addevent.nfsd.proc.remove.entry
The entry of nfsd_proc_remove and nfsd3_proc_remove.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64),
.I fhandle2(INT64), version(INT8),
.I filename(STRING)
.TP
.B addevent.nfsd.proc.rename.entry
The entry of nfsd_proc_rename and nfsd3_proc_rename.
Data format is:
.I old_fhsize(INT8), old_fh0(INT64), old_fh1(INT64), old_fh2(INT64), old_name(STRING),
.I new_fhsize(INT8), new_fh0(INT64), new_fh1(INT64), new_fh2(INT64), new_name(STRING)
.TP
.B addevent.nfsd.proc.create.entry
The entry of nfsd_proc_create and nfsd3_proc_create.
Data format is:
.I fh_size(INT8), fhandle0(INT64), fhandle1(INT64), fhandle2(INT64), version(INT8),
.I filename(STRING)
.SH SEE ALSO
.IR stap (1),
.IR lket\-b2a (1)