Diffstat (limited to 'doc/abrt-retrace-server.texi')
-rw-r--r-- | doc/abrt-retrace-server.texi | 799 |
1 files changed, 0 insertions, 799 deletions
diff --git a/doc/abrt-retrace-server.texi b/doc/abrt-retrace-server.texi deleted file mode 100644 index 9eb02ca8..00000000 --- a/doc/abrt-retrace-server.texi +++ /dev/null @@ -1,799 +0,0 @@ -\input texinfo -@c abrt-retrace-server.texi - Retrace Server Documentation -@c -@c .texi extension is recommended in GNU Automake manual -@setfilename abrt-retrace-server.info -@include version.texi - -@settitle Retrace server for ABRT @value{VERSION} Manual - -@dircategory Retrace server -@direntry -* Retrace server: (retrace-server). Remote coredump analysis via HTTP. -@end direntry - -@titlepage -@title Retrace server -@subtitle for ABRT version @value{VERSION}, @value{UPDATED} -@author Karel Klic (@email{kklic@@redhat.com}) -@page -@vskip 0pt plus 1filll -@end titlepage - -@contents - -@ifnottex -@node Top -@top Retrace server - -This manual is for retrace server for ABRT version @value{VERSION}, -@value{UPDATED}. The retrace server provides coredump analysis and -backtrace generation service over a network using HTTP protocol. -@end ifnottex - -@menu -* Overview:: -* HTTP interface:: -* Retrace worker:: -* Task cleanup:: -* Package repository:: -* Traffic and load estimation:: -* Security:: -* Future work:: -@end menu - -@node Overview -@chapter Overview - -Analyzing a program crash from a coredump is a difficult task. The GNU -Debugger (GDB), which is commonly used to analyze coredumps on free -operating systems, expects that the system analyzing the coredump is -identical to the system where the program crashed. Software updates -often break this assumption even on the system where the crash occurred, -making the coredump analyzable only with significant -effort. Furthermore, older versions of software packages are often -removed from software repositories, including the packages with -debugging symbols, so the package with debugging symbols is often not -available when a user needs to install it for coredump analysis. 
Packages -with the debugging symbols are large, requiring a lot of free space and -causing problems with downloading them via unreliable internet -connection. - -Retrace server solves these problems for Fedora 14+ and RHEL 6+ -operating systems, and allows developers to analyze coredumps without -having access to the machine where the crash occurred. - -Retrace server is usually run as a service on a local network, or on -Internet. A user sends a coredump together with some additional -information to a retrace server. The server reads the coredump and -depending on its contents it installs necessary software dependencies to -create a software environment which is, from the GDB point of view, -identical to the environment where the crash happened. Then the server -runs GDB to generate a backtrace from the coredump and provides it back -to the user. - -Core dumps generated on i386 and x86_64 architectures are supported -within a single x86_64 retrace server instance. - -The retrace server consists of the following major parts: -@enumerate -@item -a HTTP interface, consisting of a set of scripts handling communication -with clients -@item -a retrace worker, doing the coredump processing, environment -preparation, and running the debugger to generate a backtrace -@item -a cleanup script, handling stalled retracing tasks and removing old data -@item -a package repository, providing the application binaries, libraries, and -debuginfo necessary for generating backtraces from coredumps -@end enumerate - -@node HTTP interface -@chapter HTTP interface - -@menu -* Creating a new task:: -* Task status:: -* Requesting a backtrace:: -* Requesting a log:: -* Limiting traffic:: -@end menu - -The client-server communication proceeds as follows: -@enumerate -@item -Client uploads a coredump to a retrace server. Retrace server creates a -task for processing the coredump, and sends the task ID and task -password in response to the client. 
-@item -Client asks server for the task status using the task ID and password. -Server responds with the status information (task finished successfully, -task failed, task is still running). -@item -Client asks server for the backtrace from a successfully finished task -using the task ID and password. Server sends the backtrace in response. -@item -Client asks server for a log from the finished task using the task ID -and password, and server sends the log in response. -@end enumerate - -The HTTP interface application is a set of scripts written in Python, -using the @uref{http://www.python.org/dev/peps/pep-0333/, Python Web -Server Gateway Interface} (WSGI) to interact with a web server. The only -supported and tested configuration is the Apache HTTPD Server with -@uref{http://code.google.com/p/modwsgi/, mod_wsgi}. - -Only secure (HTTPS) communication is allowed for communicating with a -public instance of retrace server, because coredumps and backtraces are -private data. Users may decide to publish their backtraces in a bug -tracker after reviewing them, but the retrace server doesn't do -that. The server is supposed to use HTTP persistent connections to -avoid frequent SSL renegotiations. - -@node Creating a new task -@section Creating a new task - -A client might create a new task by sending a HTTP request to the -@indicateurl{https://server/create} URL, and providing an archive as the -request content. The archive contains crash data files. The crash data -files are a subset of some local @file{/var/spool/abrt/ccpp-time-pid} -directory contents, so the client must only pack and upload them. - -The server supports uncompressed tar archives, and tar archives -compressed with gzip and xz. Uncompressed archives are the most -efficient way for local network delivery, and gzip can be used there as -well because of its good compression speed. 
- -The xz compression file format is well suited for public server setup -(slow network), as it provides good compression ratio, which is -important for compressing large coredumps, and it provides reasonable -compress/decompress speed and memory consumption. See @ref{Traffic and -load estimation} for the measurements. The @uref{http://tukaani.org/xz/, -XZ Utils} implementation with the compression level 2 is used to -compress the data. - -The HTTP request for a new task must use the POST method. It must -contain proper @var{Content-Length} and @var{Content-Type} fields. If -the method is not POST, the server returns the @code{405 Method Not -Allowed} HTTP error code. If the @var{Content-Length} field is missing, -the server returns the @code{411 Length Required} HTTP error code. If a -@var{Content-Type} other than @samp{application/x-tar}, -@samp{application/x-gzip}, @samp{application/x-xz} is used, the server -returns the @code{415 Unsupported Media Type} HTTP error code. If the -@var{Content-Length} value is greater than the limit set by the -@var{MaxPackedSize} option in the server configuration file (50 MB by -default), or the real HTTP request size gets larger than the limit + 10 -KB for headers, then the server returns the @code{413 Request Entity Too -Large} HTTP error code, and provides an explanation, including the -limit, in the response body. The limit is changeable from the server -configuration file. - -If unpacking the archive would result in having the free disk space -under a certain limit in the @file{/var/spool/abrt-retrace} directory, the -server returns the @code{507 Insufficient Storage} HTTP error code. The -limit is specified by the @var{MinStorageLeft} option in the server -configuration file, and it is set to 1024 MB by default. 
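The request checks described above can be summarized in a minimal Python sketch. It is not the server's actual code: the function name `check_create_request`, its parameters, and the order in which the checks run are assumptions made for illustration, and the "limit + 10 KB" check on the real request size is not modeled. Only the status codes and default limits come from the text.

```python
# Hypothetical sketch of the /create request checks; names are illustrative.
ALLOWED_TYPES = {"application/x-tar", "application/x-gzip", "application/x-xz"}
MAX_PACKED_SIZE = 50 * 1024 * 1024      # MaxPackedSize default (50 MB)
MIN_STORAGE_LEFT = 1024 * 1024 * 1024   # MinStorageLeft default (1024 MB)


def check_create_request(method, headers, free_bytes, unpacked_size):
    """Return an HTTP error code for a rejected request, or None if it passes.

    headers is a plain dict of request header fields; free_bytes is the free
    space in the spool directory; unpacked_size is the archive's uncompressed
    size. Content-Length is assumed to hold a valid integer.
    """
    if method != "POST":
        return 405  # Method Not Allowed
    if "Content-Length" not in headers:
        return 411  # Length Required
    if headers.get("Content-Type") not in ALLOWED_TYPES:
        return 415  # Unsupported Media Type
    if int(headers["Content-Length"]) > MAX_PACKED_SIZE:
        return 413  # Request Entity Too Large
    if free_bytes - unpacked_size < MIN_STORAGE_LEFT:
        return 507  # Insufficient Storage
    return None


ok_headers = {"Content-Length": "1000", "Content-Type": "application/x-xz"}
print(check_create_request("POST", ok_headers, 10 * 1024**3, 200 * 1024**2))
```

A request that passes these checks still has to pass the archive content checks before the server answers with a success code.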
- -If the data from the received archive would take more than 1024 MB of -disk space when uncompressed, the server returns the @code{413 Request -Entity Too Large} HTTP error code, and provides an explanation, -including the limit, in the response body. The size limit is changeable -by the @var{MaxUnpackedSize} option in the server configuration file. It -can be set pretty high because coredumps, that take most disk space, are -stored on the server only temporarily until the backtrace is -generated. When the backtrace is generated the coredump is deleted by -the @command{abrt-retrace-worker}, so most disk space is released. - -The uncompressed data size for xz archives is obtained by calling -@code{`xz --list file.tar.xz`}. The @option{--list} option has been -implemented only recently, so updating @command{xz} on your server might -be necessary. Likewise, the uncompressed data size for gzip archives is -obtained by calling @code{`gzip --list file.tar.gz`}. - -If an upload from a client succeeds, the server creates a new directory -@file{/var/spool/abrt-retrace/@var{id}} and extracts the -received archive into it. Then it checks that the directory contains all -the required files, checks their sizes, and then sends a HTTP -response. After that it spawns a subprocess with -@command{abrt-retrace-worker} on that directory. - -The following files from the local crash directory are required to be -present in the archive: @file{coredump}, @file{executable}, -@file{package}. If one or more files are not present in the archive, or -some other file is present in the archive, the server returns the -@code{403 Forbidden} HTTP error code. - -If the file check succeeds, the server HTTP response has the @code{201 -Created} HTTP code. 
The response includes the following HTTP header -fields: -@itemize -@item -@var{X-Task-Id} containing a new server-unique numerical -task id -@item -@var{X-Task-Password} containing a newly generated -password, required to access the result -@end itemize - -The @var{X-Task-Password} is a random alphanumeric (@samp{[a-zA-Z0-9]}) -sequence 22 characters long. The password is stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file, and passwords -sent by a client in subsequent requests are verified by comparing with -this file. - -The task id is intentionally not used as a password, because it is -desirable to keep the id readable and memorable for -humans. Password-like ids would be a loss when an user authentication -mechanism is added, and server-generated password will no longer be -necessary. - -@node Task status -@section Task status - -A client might request a task status by sending a HTTP GET request to -the @indicateurl{https://someserver/@var{id}} URL, where @var{id} is the -numerical task id returned in the @var{X-Task-Id} field by -@indicateurl{https://someserver/create}. If the @var{id} is not in the -valid format, or the task @var{id} does not exist, the server returns -the @code{404 Not Found} HTTP error code. - -The client request must contain the @var{X-Task-Password} field, and its -content must match the password stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password is -not valid, the server returns the @code{403 Forbidden} HTTP error code. - -If the checks pass, the server returns the @code{200 OK} HTTP code, and -includes a field @var{X-Task-Status} containing one of the following -values: @samp{FINISHED_SUCCESS}, @samp{FINISHED_FAILURE}, -@samp{PENDING}. - -The field contains @samp{FINISHED_SUCCESS} if the file -@file{/var/spool/abrt-retrace/@var{id}/backtrace} exists. The client might -get the backtrace on the @indicateurl{https://someserver/@var{id}/backtrace} -URL. 
The log might be obtained on the -@indicateurl{https://someserver/@var{id}/log} URL, and it might contain -warnings about some missing debuginfos etc. - -The field contains @samp{FINISHED_FAILURE} if the file -@file{/var/spool/abrt-retrace/@var{id}/backtrace} does not exist, and file -@file{/var/spool/abrt-retrace/@var{id}/retrace-log} exists. The retrace-log -file containing error messages can be downloaded by the client from the -@indicateurl{https://someserver/@var{id}/log} URL. - -The field contains @samp{PENDING} if neither file exists. The client -should ask again after 10 seconds or later. - -@node Requesting a backtrace -@section Requesting a backtrace - -A client might request a backtrace by sending a HTTP GET request to the -@indicateurl{https://someserver/@var{id}/backtrace} URL, where @var{id} -is the numerical task id returned in the @var{X-Task-Id} field by -@indicateurl{https://someserver/create}. If the @var{id} is not in the -valid format, or the task @var{id} does not exist, the server returns -the @code{404 Not Found} HTTP error code. - -The client request must contain the @var{X-Task-Password} field, and its -content must match the password stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password -is not valid, the server returns the @code{403 Forbidden} HTTP error -code. - -If the file @file{/var/spool/abrt-retrace/@var{id}/backtrace} does not -exist, the server returns the @code{404 Not Found} HTTP error code. -Otherwise it returns the file contents, and the @var{Content-Type} -header is set to @samp{text/plain}. - -@node Requesting a log -@section Requesting a log - -A client might request a task log by sending a HTTP GET request to the -@indicateurl{https://someserver/@var{id}/log} URL, where @var{id} is the -numerical task id returned in the @var{X-Task-Id} field by -@indicateurl{https://someserver/create}. 
If the @var{id} is not in the -valid format, or the task @var{id} does not exist, the server returns -the @code{404 Not Found} HTTP error code. - -The client request must contain the @var{X-Task-Password} field, and its -content must match the password stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password -is not valid, the server returns the @code{403 Forbidden} HTTP error -code. - -If the file @file{/var/spool/abrt-retrace/@var{id}/retrace-log} does not -exist, the server returns the @code{404 Not Found} HTTP error code. -Otherwise it returns the file contents, and the @var{Content-Type} -header is set to @samp{text/plain}. - -@node Limiting traffic -@section Limiting traffic - -The maximum number of simultaneously running tasks is limited to 5 by -the server. The limit is changeable by the @var{MaxParallelTasks} option -in the server configuration file. If a new request comes when the server -is fully occupied, the server returns the @code{503 Service Unavailable} -HTTP error code. - -The archive extraction, chroot preparation, and gdb analysis are -mostly limited by the hard drive size and speed. - -@node Retrace worker -@chapter Retrace worker - -Retrace worker is a program (usually residing in -@command{/usr/bin/abrt-retrace-worker}), which: -@enumerate -@item -takes a task id as a parameter, and turns it into a directory containing -a coredump -@item -determines which packages need to be installed from the coredump -@item -installs the packages in a newly created chroot environment together -with @command{gdb} -@item -copies the coredump to the chroot environment -@item -runs @command{gdb} from inside the environment to generate a backtrace -from the coredump -@item -copies the resulting backtrace from the environment to the directory -@end enumerate - -The tasks reside in @file{/var/spool/abrt-retrace/@var{taskid}} -directories. 
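The task directory is also what the HTTP interface inspects: as the Task status section describes, the reported status depends only on which files the worker has left in the directory. A minimal Python sketch of that rule follows. The function name `task_status` and the `spool_dir` parameter are hypothetical; the file names and status strings are the ones given in the text.

```python
import os

SPOOL_DIR = "/var/spool/abrt-retrace"


def task_status(task_id, spool_dir=SPOOL_DIR):
    """Derive the X-Task-Status value from the files in the task directory."""
    task_dir = os.path.join(spool_dir, str(task_id))
    # A backtrace file means the worker finished successfully.
    if os.path.exists(os.path.join(task_dir, "backtrace")):
        return "FINISHED_SUCCESS"
    # No backtrace but a log means the worker finished with a failure.
    if os.path.exists(os.path.join(task_dir, "retrace-log")):
        return "FINISHED_FAILURE"
    # Neither file exists yet: the task is still being processed.
    return "PENDING"
```

Checking for the backtrace first matters, because a successful task may also have a log containing warnings.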
- -To determine which packages need to be installed, -@command{abrt-retrace-worker} runs the @command{coredump2packages} tool. -The tool reads build-ids from the coredump, and tries to find the best -set of packages (epoch, name, version, release) matching the -build-ids. Local yum repositories are used as the source of -packages. GDB requirements are strict, and this is the reason why proper -backtraces cannot be directly and reliably generated on systems whose -software is updated: -@itemize -@item -The exact binary which crashed needs to be available to GDB. -@item -All libraries which are linked to the binary need to be available in the -same exact versions from the time of the crash. -@item -The binary plugins loaded by the binary or libraries via @code{dlopen} -need to be present in proper versions. -@item -The files containing the debugging symbols for the binary and libraries -(build-ids are used to find the pairs) need to be available to GDB. -@end itemize - -The chroot environments are created and managed by @command{mock}, and -they reside in @file{/var/lib/mock/@var{taskid}}. The retrace worker -generates a mock configuration file and then invokes @command{mock} to -create the chroot, and to run programs from inside the chroot. - -The chroot environment is populated by installing packages using -@command{yum}. Package installation cannot be avoided, as GDB expects to -operate on an installed system, and on crashes from that system. GDB -uses plugins written in Python, that are shipped with packages (for -example see @command{rpm -ql libstdc++}). - -Coredumps might be affected by @command{prelink}, which is used on -Fedora to speed up dynamic linking by caching its results directly in -binaries. The system installed by @command{mock} for the purpose of -retracing doesn't use @command{prelink}, so the binaries differ between -the system of origin and the mock environment. 
It has been tested that -this is not an issue, but in case some issue -@uref{http://sourceware.org/ml/gdb/2009-05/msg00175.html, occurs} -(GDB fails to work with a binary even if it's the right one), a bug -should be filed on @code{prelink}, as its operation should not affect -the area GDB operates on. - -No special care is taken to avoid the possibility that GDB will not run -with the set of packages (fixed versions) as provided by coredump. It is -expected that any combination of packages a user might use in a released -system satisfies the needs of some version of GDB. Yum selects the -newest possible version which has its requirements satisfied. - -@node Task cleanup -@chapter Task cleanup - -It is necessary to watch and limit the resource usage of tasks for a -retrace server to remain operational. This is performed by the -@command{abrt-retrace-cleanup} tool. It is supposed that the server -administrator sets @command{cron} to run the tool every hour. - -Tasks that were created more than 120 hours (5 days) ago are -deleted. The limit can be changed by the @var{DeleteTaskAfter} option in -the server configuration file. Coredumps are deleted when the retrace -process is finished, and only backtraces, logs, and configuration remain -available for every task until the cleanup. The -@command{abrt-retrace-cleanup} checks the creation time and deletes the -directories in @file{/var/spool/abrt-retrace/}. - -Tasks running for more than 1 hour are terminated and removed from the -system. Tasks for which the @command{abrt-retrace-worker} crashed for -some reason without marking the task as finished are also removed. - -@node Package repository -@chapter Package repository - -Retrace server is able to support every Fedora release with all packages -that ever made it to the updates and updates-testing repositories. In -order to provide all those packages, a local repository needs to be -maintained for every supported operating system. 
- -A repository with Fedora packages must be maintained locally on the -server to provide good performance and to provide data from older -packages already removed from the official repositories. Retrace server -contains a tool @command{abrt-retrace-reposync}, which is a package -downloader scanning Fedora servers for new packages, and downloading -them so they are immediately available. - -Older versions of packages are regularly deleted from the updates and -updates-testing repositories. Retrace server supports older versions of -packages, as this is one of the major pain points that the retrace server is -supposed to solve. - -The @command{abrt-retrace-reposync} downloads packages from Fedora -repositories, and it does not delete older versions of the packages. The -retrace server administrator is supposed to call this script using cron -approximately every 6 hours. The script uses @command{rsync} to get the -packages and @command{createrepo} to generate repository metadata. - -The packages are downloaded to a local repository in -@file{/var/cache/abrt-retrace/}. The location can be changed via the -@var{RepoDir} option in the server configuration file. - -@node Traffic and load estimation -@chapter Traffic and load estimation - -2500 bugs are reported from ABRT every month. Approximately 7.3% -of those are Python exceptions, which don't need a retrace -server. That means that 2315 bugs need a retrace server. That is 77 -bugs per day, or 3.2 bugs every hour on average. Occasional spikes -might be much higher (imagine a user that decided to report all his 8 -crashes from last month). - -We should probably not try to predict if the monthly bug count goes up -or down. New, untested versions of software are added to Fedora, but -on the other hand most software matures and becomes less crashy. So -let's assume that the bug count stays approximately the same. 
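The rate estimate above is simple arithmetic and can be reproduced with a short Python sketch. The 30-day month is an assumption, and because the 7.3% share is itself approximate, the monthly result lands within a few reports of the 2315 quoted in the text, at roughly 77 reports per day and a little over 3 per hour.

```python
MONTHLY_REPORTS = 2500   # bugs reported from ABRT per month (from the text)
PYTHON_SHARE = 0.073     # approximate share of Python exceptions (from the text)
DAYS_PER_MONTH = 30      # assumption: an average 30-day month


def retrace_load(monthly=MONTHLY_REPORTS, python_share=PYTHON_SHARE,
                 days=DAYS_PER_MONTH):
    """Return (per month, per day, per hour) counts of reports needing a retrace."""
    needing_retrace = monthly * (1 - python_share)
    per_day = needing_retrace / days
    per_hour = per_day / 24
    return needing_retrace, per_day, per_hour


per_month, per_day, per_hour = retrace_load()
print(f"about {per_month:.0f} per month, {per_day:.1f} per day, "
      f"{per_hour:.1f} per hour")
```

These are averages only; the spikes mentioned above are what the server's parallel task limit has to absorb.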
- -Test crashes (see why we use @code{`xz -2`} to compress coredumps): -@itemize -@item -firefox with 7 tabs (random pages opened), coredump size 172 MB -@itemize -@item -xz compression -@itemize -@item -compression level 6 (default): compression took 32.5 sec, compressed -size 5.4 MB, decompression took 2.7 sec -@item -compression level 3: compression took 23.4 sec, compressed size 5.6 MB, -decompression took 1.6 sec -@item -compression level 2: compression took 6.8 sec, compressed size 6.1 MB, -decompression took 3.7 sec -@item -compression level 1: compression took 5.1 sec, compressed size 6.4 MB, -decompression took 2.4 sec -@end itemize -@item -gzip compression -@itemize -@item -compression level 9 (highest): compression took 7.6 sec, compressed size -7.9 MB, decompression took 1.5 sec -@item -compression level 6 (default): compression took 2.6 sec, compressed size -8 MB, decompression took 2.3 sec -@item -compression level 3: compression took 1.7 sec, compressed size 8.9 MB, -decompression took 1.7 sec -@end itemize -@end itemize -@item -thunderbird with thousands of emails opened, coredump size 218 MB -@itemize -@item -xz compression -@itemize -@item -compression level 6 (default): compression took 60 sec, compressed size -12 MB, decompression took 3.6 sec -@item -compression level 3: compression took 42 sec, compressed size 13 MB, -decompression took 3.0 sec -@item -compression level 2: compression took 10 sec, compressed size 14 MB, -decompression took 3.0 sec -@item -compression level 1: compression took 8.3 sec, compressed size 15 MB, -decompression took 3.2 sec -@end itemize -@item -gzip compression -@itemize -@item -compression level 9 (highest): compression took 14.9 sec, compressed -size 18 MB, decompression took 2.4 sec -@item -compression level 6 (default): compression took 4.4 sec, compressed size -18 MB, decompression took 2.2 sec -@item -compression level 3: compression took 2.7 sec, compressed size 20 MB, -decompression took 3 sec -@end itemize 
-@end itemize -@item -evince with 2 pdfs (1 and 42 pages) opened, coredump size 73 MB -@itemize -@item -xz compression -@itemize -@item -compression level 2: compression took 2.9 sec, compressed size 3.6 MB, -decompression took 0.7 sec -@item -compression level 1: compression took 2.5 sec, compressed size 3.9 MB, -decompression took 0.7 sec -@end itemize -@end itemize -@item -OpenOffice.org Impress with 25 pages presentation, coredump size 116 MB -@itemize -@item -xz compression -@itemize -@item -compression level 2: compression took 7.1 sec, compressed size 12 MB, -decompression took 2.3 sec -@end itemize -@end itemize -@end itemize - -So let's imagine there are some users that want to report their -crashes approximately at the same time. Here is what the retrace -server must handle: -@enumerate -@item -2 OpenOffice crashes -@item -2 evince crashes -@item -2 thunderbird crashes -@item -2 firefox crashes -@end enumerate - -We will use the xz archiver with the compression level 2 on ABRT's -side to compress the coredumps. So the users spend 53.6 seconds in -total packaging the coredumps. - -The packaged coredumps take up 71.4 MB, and the retrace server must -receive that data. - -The server unpacks the coredumps (perhaps at the same time), so they -need 1158 MB of disk space on the server. The decompression will take -19.4 seconds. - -Several hundred megabytes will be needed to install all the -required packages and debuginfos for every chroot (8 chroots 1 GB each -= 8 GB, but this seems like an extreme, maximal case). Some space will -be saved by using a debuginfofs. - -Note that most applications are not as heavyweight as OpenOffice and -Firefox. - -@node Security -@chapter Security - -The retrace server communicates with two other entities: it accepts -coredumps from users, and it downloads debuginfos and packages from -distribution repositories. 
- -@menu -* Clients:: -* Packages and debuginfo:: -@end menu - -General security from GDB flaws and malicious data is provided by -chroot. The GDB accesses the debuginfos, packages, and the coredump from -within the chroot under a non-root user, unable to access the retrace -server's environment. - -@c We should consider setting a disk quota to every chroot directory, -@c and limit the GDB access to resources using cgroups. - -SELinux policy exists for both the retrace server's HTTP interface, and -for the retrace worker. - -@node Clients -@section Clients - -It is expected that the clients, which are using the retrace server and -sending coredumps to it, trust the retrace server administrator. The -server administrator must not try to get sensitive data from client -coredumps. This is a major bottleneck of the retrace server. However, -users of an operating system already trust the operating system provider -in various important matters. So when the retrace server is operated by -the OS provider, that might be acceptable for users. - -Sending clients' coredumps to the retrace server cannot be avoided if we -want to generate good backtraces containing the values of -variables. Minidumps lower the quality of the resulting backtraces, -while not improving user security. - -A malicious client can craft a nonstandard coredump, which will be -processed by server's GDB. GDB handles malformed coredumps well. - -Users can never be allowed to provide custom packages/debuginfo together -with a coredump. Packages need to be installed to the environment, and -installing untrusted programs is insecure. - -As for attacker trying to steal users' backtraces from the retrace -server, the passwords protecting the backtraces in the -@var{X-Task-Password} header are random alphanumeric -(@samp{[a-zA-Z0-9]}) sequences 22 characters long. 22 alphanumeric -characters corresponds to 128 bit password, because @samp{[a-zA-Z0-9]} -is 62 characters, and @math{2^{128}} < @math{62^{22}}. 
The source of -randomness is @file{/dev/urandom}. - -@node Packages and debuginfo -@section Packages and debuginfo - -Packages and debuginfo are safely downloaded from the distribution -repositories, as the packages are signed by the distribution, and the -package origin is verified. - -When the debuginfo filesystem server is done, the retrace server can -safely use it, as the data will also be signed. - -@node Future work -@chapter Future work - -@section Coredump stripping -Jan Kratochvil: With my test of OpenOffice.org presentation kernel core -file has 181MB, xz -2 of it has 65MB. According to `set target debug 1' -GDB reads only 131406 bytes of it (incl. the NOTE segment). - -@section Supporting other architectures -Three approaches: -@itemize -@item -Use GDB builds with various target architectures: gdb-i386, gdb-ppc64, -gdb-s390. -@item -Run -@uref{http://wiki.qemu.org/download/qemu-doc.html#QEMU-User-space-emulator, -QEMU user space emulation} on the server -@item -Run @code{abrt-retrace-worker} on a machine with right -architecture. Introduce worker machines and tasks, similarly to Koji. -@end itemize - -@section Use gdbserver instead of uploading whole coredump -GDB's gdbserver cannot process coredumps, but Jan Kratochvil's can: -@verbatim -git://git.fedorahosted.org/git/elfutils.git -branch: jankratochvil/gdbserver - src/gdbserver.c - * Currently threading is not supported. - * Currently only x86_64 is supported (the NOTE registers layout). -@end verbatim - -@section User management for the HTTP interface -Multiple authentication sources (x509 for RHEL). - -@section Make all files except coredump optional on the input -Make @file{architecture}, @file{release}, @file{packages} files, which -must be included in the package when creating a task, optional. Allow -uploading a coredump without involving tar: just coredump, coredump.gz, -or coredump.xz. 
- -@section Handle non-standard packages (provided by user) -This would make retrace server very vulnerable to attacks, so it can never -be enabled in a public instance. - -@section Support vmcores -See @uref{https://fedorahosted.org/cas/, Core analysis system}, its -features etc. - -@section Do not refuse new tasks on a fully loaded server -Consider using @uref{http://git.fedorahosted.org/git/?p=kobo.git, kobo} -for task management and worker handling (master/slaves arch). - -@section Support synchronous operation -Client sends a coredump, and keeps receiving the server response -message. The server response HTTP body is generated and sent gradually -as the task is performed. Client can choose to stop receiving the -response body after getting all headers and ask the server for status -and backtrace asynchronously. - -The server re-sends the output of abrt-retrace-worker (its stdout and -stderr) to the response body. In addition, a line with the task -status is added in the form @code{X-Task-Status: PENDING} to the body -every 5 seconds. When the worker process ends, either -@samp{FINISHED_SUCCESS} or @samp{FINISHED_FAILURE} status line is -sent. If it's @samp{FINISHED_SUCCESS}, the backtrace is attached after -this line. Then the response body is closed. - -@section Provide task estimation time -The response to the @code{/create} action should contain a header -@var{X-Task-Est-Time}, that contains a number of seconds the server -estimates it will take to generate the backtrace. - -The algorithm for the @var{X-Task-Est-Time} time estimation -should take the previous analyses of coredumps with the same -corresponding package name into account. The server should store -a simple history in a SQLite database to know how long it takes to -generate a backtrace for a certain package. 
It could be as simple as -this: -@itemize -@item - initialization step one: @code{CREATE TABLE package_time (id INTEGER - PRIMARY KEY AUTOINCREMENT, package, release, time)}; we need the - @var{id} for the database cleanup - to know the insertion order of - rows, so the @code{AUTOINCREMENT} is important here; the @var{package} - is the package name without the version and release numbers, the - @var{release} column stores the operating system, and the @var{time} - is the number of seconds it took to generate the backtrace -@item - initialization step two: @code{CREATE INDEX package_release ON - package_time (package, release)}; we compute the time only for single - package on single supported OS release per query, so it makes sense to - create an index to speed it up -@item - when a task is finished: @code{INSERT INTO package_time (package, - release, time) VALUES ('??', '??', '??')} -@item - to get the average time: @code{SELECT AVG(time) FROM package_time - WHERE package == '??' AND release == '??'}; the arithmetic mean seems - to be sufficient here -@end itemize - -So the server knows that crashes from an OpenOffice.org package -take 5 minutes to process in average, and it can return the value 300 -(seconds) in the field. The client does not waste time asking about -that task every 20 seconds, but the first status request comes after -300 seconds. And even when the package changes (rebases etc.), the -database provides good estimations after some time anyway -(@ref{Task cleanup} chapter describes how the -data are pruned). - -@section Keep the database with statistics small -The database containing packages and processing times should also be -regularly pruned to remain small and provide data quickly. 
The cleanup -script should delete some rows for packages with too many entries: -@enumerate -@item -get a list of packages from the database: @code{SELECT DISTINCT package, -release FROM package_time} -@item -for every package, get the row count: @code{SELECT COUNT(*) FROM -package_time WHERE package == '??' AND release == '??'} -@item -for every package with the row count larger than 100, some rows must be -removed so that only the newest 100 rows remain in the database: -@itemize -@item -to get the highest row id which should be deleted, execute @code{SELECT id -FROM package_time WHERE package == '??' AND release == '??' ORDER BY id -LIMIT 1 OFFSET ??}, where the @code{OFFSET} is the total number of rows -for that single package minus 101 -@item -then all the old rows can be deleted by executing @code{DELETE FROM -package_time WHERE package == '??' AND release == '??' AND id <= ??} -@end itemize -@end enumerate - -@section Support Fedora Rawhide -When the @command{abrt-retrace-reposync} is used to sync with the -Rawhide repository, unneeded packages (where a newer version exists) -must be removed after residing one week with the newer package in the -same repository. - -@bye