From 70d3a915fa7b375e4b3c4198ae7c7c9687927942 Mon Sep 17 00:00:00 2001
From: Nikola Pajkovsky
Date: Wed, 21 Jul 2010 09:40:12 +0200
Subject: rename and lower-case doc files

Signed-off-by: Nikola Pajkovsky
---
 doc/design | 147 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 147 insertions(+)
 create mode 100644 doc/design

diff --git a/doc/design b/doc/design
new file mode 100644
index 00000000..6074457b
--- /dev/null
+++ b/doc/design
@@ -0,0 +1,147 @@
+        Design goals
+
+We want to catch kernel oopses, binary program crashes (coredumps)
+and interpreted-language crashes (Python exceptions, maybe more
+in the future).
+
+We want to support the following use cases:
+
+* Home/office user with minimal administration
+
+In this scenario, the user expects that abrt will work "out of the box"
+with minimal configuration. It will be sufficient if crashes
+just show a GUI notification, and the user can invoke a GUI tool
+to process the crash and report it to bugzilla etc.
+
+The configuration (like bugzilla address, username, password)
+needs to be done via GUI dialogs from the same GUI tool.
+
+* Standalone server
+
+The server is installed by an admin. It may lack a GUI.
+The admin is willing to do somewhat more complex configuration.
+Crashes should be recorded, and either processed at once
+or reported to the admin by email etc. The admin may log in
+and manually request crash(es) to be processed and reported,
+using GUI or CLI tools.
+
+* Mission critical servers, server farms etc.
+
+Admins are expected to be competent and willing to set up complex
+configurations. They might want to avoid any complex crash processing
+on the servers - for example, it does not make much sense and/or
+can be considered insecure to download debuginfo packages
+to such servers. Admins may want to send "raw" crash dumps
+to dedicated server(s) for processing (backtrace, etc).
+
+
+        Design
+
+Abrt design should be flexible enough to accommodate all
+of the above usage scenarios.
+
+The description below is not what abrt does now.
+These are (currently incomplete) design notes on how we want
+it to achieve the design goals.
+
+Since currently we do not know how to dump oops on demand,
+we can only poll for it. There is a small daemon which polls
+the kernel message buffer and dumps oopses when it sees them.
+The dump is written into /var/spool/abrt/DIR.
+After this, the daemon spawns "abrt-process -d /var/spool/abrt/DIR"
+which processes it according to configuration in /etc/abrt/*.conf.
+
+In order to catch binary crashes, we install a handler for them
+in /proc/sys/kernel/core_pattern (by setting it to
+"|/usr/libexec/abrt-hook-ccpp /var/spool/abrt %p %s %u").
+When a process dumps core, the dump is written into /var/spool/abrt/DIR.
+After this, abrt-hook-ccpp spawns "abrt-process -d /var/spool/abrt/DIR"
+and terminates.
+
+When a python program crashes, it invokes an internal python subroutine
+which dumps crash info into ~/abrt/spool/DIR.
+[this is a tentative plan, currently we dump in /var/spool/abrt/DIR]
+After this, it spawns "abrt-process -d ~/abrt/spool/DIR"
+and terminates.
+
+[Problem: dumping to /var/spool/abrt/DIR needs world-writable
+/var/spool/abrt and allows user to go way over his
+disk quota. Dumping to ~/abrt/spool/DIR makes it difficult
+to present a list of all crashes which happened on the machine -
+for example, root-owned processes cannot even access user data
+in ~user/* if /home is on NFS4...
+]
+
+When user (admin) wants to see the list of dumped crashes and
+process them, he runs abrt-gui or abrt-cli. These programs
+perform a dbus call to "com.redhat.abrt" on the system dbus.
+If there is no program with this name on it, dbus autostart
+will invoke "abrt-process", which registers "com.redhat.abrt"
+and processes the call(s).
+
+abrt-process will terminate after a timeout (a few minutes)
+if no new dbus calls arrive.
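The idle-timeout behaviour described above can be sketched as follows. This is only an illustration in Python, not abrt code: the names IdleTimer, serve, next_call, handle and poll_sec are hypothetical, and the dbus transport is replaced by a plain callable so the logic can be read on its own.

```python
import time


class IdleTimer:
    """Tracks how long it has been since the last served call (hypothetical helper)."""

    def __init__(self, timeout_sec, clock=time.monotonic):
        self.timeout_sec = timeout_sec
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        """Record that a call was just served."""
        self._last_activity = self._clock()

    def expired(self):
        """True once no call has been served for timeout_sec seconds."""
        return self._clock() - self._last_activity >= self.timeout_sec


def serve(next_call, handle, timer, poll_sec=1.0, sleep=time.sleep):
    """Serve calls until the idle timer expires; return the number handled.

    next_call() stands in for the dbus layer (an assumption): it returns a
    pending call, or None when nothing is waiting.
    """
    handled = 0
    while not timer.expired():
        call = next_call()
        if call is None:
            sleep(poll_sec)  # nothing to do, wait before polling again
            continue
        handle(call)
        timer.touch()  # every served call restarts the idle countdown
        handled += 1
    return handled
```

The clock and sleep functions are passed in so that the countdown can be exercised without real waiting. A real service would block on the bus instead of polling.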
+
+The key dbus calls served by abrt-process are:
+
+- GetCrashInfos(): returns a vector_map_crash_data_t (vector_map_vector_string_t)
+  of crashes for given uid
+  v[N]["executable"/"uid"/"kernel"/"backtrace"][N] = "contents"
+[see above the problem with producing this list]
+- CreateReport(UUID): starts creating a report for /var/spool/abrt/DIR with this UUID.
+  Returns job id (uint64).
+  After it returns, when report creation thread has finished,
+  JobDone(client_dbus_ID,UUID) dbus signal is emitted.
+  [Problem: how to do privileged plugin-specific actions?]
+  Solution: if a plugin needs access to a directory only root can access, then
+  abrt should be run by root anyway
+  - debuginfo gets installed using pk-debuginfo-install, which takes care of
+    privileges itself, so no problem here
+- GetJobResult(UUID): returns map_crash_data_t (map_vector_string_t)
+- Report(map_crash_data_t (map_vector_string_t)):
+  "Please report this crash": calls Report() of all registered reporter plugins
+  Returns report_status_t (map_vector_string_t) - the status of each call
+- DeleteDebugDump(UUID): deletes the corresponding /var/spool/abrt/DIR. Returns bool
+
+
+        Development plan
+
+Since current code does not match the planned design, we need to gradually
+change the code to "morph" it into the desired shape.
+
+Done:
+
+* Make abrtd dbus startable.
+* Add -t TIMEOUT_SEC option to abrtd. {done}
+* Make abrt-gui start abrtd on demand, so that abrt-gui can be started
+  even if abrtd does not run at the moment. (doesn't work in some cases!)
+
+Planned steps:
+
+* make kerneloops plugin into a separate daemon (convert it to a hook
+  and get rid of "cron plugins", which were a wrong idea from the beginning)
+  - and make it a service (write an initscript)
+* make C/C++ hook to be started by init script
+  - init script would run ccpp-hook --init, which should just set the core_pattern, which is now done by the C analyzer plugin
+* hooks will start the daemon on-demand using dbus
+  - I'm not sure this is a good idea, but since dbus is becoming
+    "un-installable" on Fedora, it's probably ok
+* simplify abrt.conf:
+  - move all plugin related info to plugins/.conf
+    - enabled, action association, etc ...
+  - make abrtd parse plugins/*.conf and set the config options
+    that it understands
+  - this will fix the case when this is in abrt.conf
+
+    [Cron]
+    KerneloopsScanner = 120
+
+    because this should be in plugins/kerneloops.conf
+    and thus shouldn't exist if kerneloops-addon is
+    not installed
+* ???
+* ???
+* ???
+* ???
+* ???
+* Take over the world
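The "simplify abrt.conf" step in the plan above could be sketched roughly as below. This is a hedged illustration in Python, not the abrtd implementation: load_plugin_settings is a hypothetical name, and the files are assumed to use INI-style syntax like the [Cron] fragment quoted in the plan, which may not match the real abrt config format.

```python
import configparser
import glob
import os


def load_plugin_settings(plugins_dir):
    """Return {plugin_name: {section: {key: value}}} for every *.conf found.

    A plugin that is not installed ships no .conf file, so none of its
    settings can appear in the result.
    """
    settings = {}
    for path in sorted(glob.glob(os.path.join(plugins_dir, "*.conf"))):
        # "kerneloops.conf" -> "kerneloops"
        name = os.path.splitext(os.path.basename(path))[0]
        parser = configparser.ConfigParser()
        parser.optionxform = str  # keep key case, e.g. "KerneloopsScanner"
        parser.read(path)
        settings[name] = {
            section: dict(parser[section]) for section in parser.sections()
        }
    return settings
```

With a plugins directory holding only kerneloops.conf, the KerneloopsScanner entry is present. Remove that file and the entry disappears with it, which is the property the plan asks for.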