Diffstat (limited to 'doc/abrt-retrace-server.texi')
-rw-r--r--  doc/abrt-retrace-server.texi  |  799 ----------
1 file changed, 0 insertions(+), 799 deletions(-)
diff --git a/doc/abrt-retrace-server.texi b/doc/abrt-retrace-server.texi
deleted file mode 100644
index 9eb02ca8..00000000
--- a/doc/abrt-retrace-server.texi
+++ /dev/null
@@ -1,799 +0,0 @@
-\input texinfo
-@c abrt-retrace-server.texi - Retrace Server Documentation
-@c
-@c .texi extension is recommended in GNU Automake manual
-@setfilename abrt-retrace-server.info
-@include version.texi
-
-@settitle Retrace server for ABRT @value{VERSION} Manual
-
-@dircategory Retrace server
-@direntry
-* Retrace server: (retrace-server). Remote coredump analysis via HTTP.
-@end direntry
-
-@titlepage
-@title Retrace server
-@subtitle for ABRT version @value{VERSION}, @value{UPDATED}
-@author Karel Klic (@email{kklic@@redhat.com})
-@page
-@vskip 0pt plus 1filll
-@end titlepage
-
-@contents
-
-@ifnottex
-@node Top
-@top Retrace server
-
-This manual is for the retrace server for ABRT version @value{VERSION},
-@value{UPDATED}. The retrace server provides a coredump analysis and
-backtrace generation service over a network using the HTTP protocol.
-@end ifnottex
-
-@menu
-* Overview::
-* HTTP interface::
-* Retrace worker::
-* Task cleanup::
-* Package repository::
-* Traffic and load estimation::
-* Security::
-* Future work::
-@end menu
-
-@node Overview
-@chapter Overview
-
-Analyzing a program crash from a coredump is a difficult task. The GNU
-Debugger (GDB), which is commonly used to analyze coredumps on free
-operating systems, expects that the system analyzing the coredump is
-identical to the system where the program crashed. Software updates
-often break this assumption even on the system where the crash occurred,
-making the coredump analyzable only with significant
-effort. Furthermore, older versions of software packages are often
-removed from software repositories, including the packages with
-debugging symbols, so those packages are often no longer available
-when a user needs them for coredump analysis. Packages with debugging
-symbols are also large, requiring a lot of free space and causing
-problems when downloaded over an unreliable internet connection.
-
-Retrace server solves these problems for Fedora 14+ and RHEL 6+
-operating systems, and allows developers to analyze coredumps without
-having access to the machine where the crash occurred.
-
-Retrace server is usually run as a service on a local network or on the
-Internet. A user sends a coredump together with some additional
-information to a retrace server. The server reads the coredump and,
-depending on its contents, installs the necessary software dependencies
-to create a software environment which is, from the GDB point of view,
-identical to the environment where the crash happened. Then the server
-runs GDB to generate a backtrace from the coredump and provides it back
-to the user.
-
-Core dumps generated on i386 and x86_64 architectures are supported
-within a single x86_64 retrace server instance.
-
-The retrace server consists of the following major parts:
-@enumerate
-@item
-an HTTP interface, consisting of a set of scripts handling communication
-with clients
-@item
-a retrace worker, doing the coredump processing, environment
-preparation, and running the debugger to generate a backtrace
-@item
-a cleanup script, handling stalled retracing tasks and removing old data
-@item
-a package repository, providing the application binaries, libraries, and
-debuginfo necessary for generating backtraces from coredumps
-@end enumerate
-
-@node HTTP interface
-@chapter HTTP interface
-
-@menu
-* Creating a new task::
-* Task status::
-* Requesting a backtrace::
-* Requesting a log::
-* Limiting traffic::
-@end menu
-
-The client-server communication proceeds as follows:
-@enumerate
-@item
-Client uploads a coredump to a retrace server. Retrace server creates a
-task for processing the coredump, and sends the task ID and task
-password in response to the client.
-@item
-Client asks server for the task status using the task ID and password.
-Server responds with the status information (task finished successfully,
-task failed, task is still running).
-@item
-Client asks server for the backtrace from a successfully finished task
-using the task ID and password. Server sends the backtrace in response.
-@item
-Client asks server for a log from the finished task using the task ID
-and password, and server sends the log in response.
-@end enumerate
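-
-The following sketch illustrates this conversation from the client's
-point of view. It is a minimal example only, not part of ABRT: the
-server address and file names are placeholders, and a real client
-should also handle the error responses described below.
-
-@verbatim
-import time
-import urllib.request
-
-SERVER = "https://someserver"  # placeholder address
-
-# 1. Upload the packed crash data; remember the task credentials.
-with open("crash.tar.xz", "rb") as f:
-    request = urllib.request.Request(
-        SERVER + "/create", data=f.read(),
-        headers={"Content-Type": "application/x-xz"})
-response = urllib.request.urlopen(request)
-task_id = response.headers["X-Task-Id"]
-auth = {"X-Task-Password": response.headers["X-Task-Password"]}
-
-# 2. Poll the task status until the server reports a final state.
-while True:
-    request = urllib.request.Request("%s/%s" % (SERVER, task_id),
-                                     headers=auth)
-    status = urllib.request.urlopen(request).headers["X-Task-Status"]
-    if status != "PENDING":
-        break
-    time.sleep(10)  # ask again after 10 seconds, as recommended below
-
-# 3. Fetch the backtrace on success, the error log otherwise.
-suffix = "backtrace" if status == "FINISHED_SUCCESS" else "log"
-request = urllib.request.Request("%s/%s/%s" % (SERVER, task_id, suffix),
-                                 headers=auth)
-print(urllib.request.urlopen(request).read().decode())
-@end verbatim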
-
-The HTTP interface application is a set of scripts written in Python,
-using the @uref{http://www.python.org/dev/peps/pep-0333/, Python Web
-Server Gateway Interface} (WSGI) to interact with a web server. The only
-supported and tested configuration is the Apache HTTPD Server with
-@uref{http://code.google.com/p/modwsgi/, mod_wsgi}.
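-
-As a rough illustration of the WSGI model, an entry point of such a
-script looks like the following sketch; the actual scripts shipped with
-the retrace server are more involved, and the header values shown here
-are placeholders.
-
-@verbatim
-def application(environ, start_response):
-    # mod_wsgi calls this function once per HTTP request.
-    if environ["REQUEST_METHOD"] != "POST":
-        start_response("405 Method Not Allowed",
-                       [("Content-Type", "text/plain")])
-        return [b"Use POST to create a new task.\n"]
-    # ... check headers, store the archive, spawn abrt-retrace-worker ...
-    start_response("201 Created",
-                   [("X-Task-Id", "1"),
-                    ("X-Task-Password", "0123456789abcdefghijkl")])
-    return [b""]
-@end verbatim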
-
-Only secure (HTTPS) communication is allowed for communicating with a
-public instance of retrace server, because coredumps and backtraces are
-private data. Users may decide to publish their backtraces in a bug
-tracker after reviewing them, but the retrace server doesn't do
-that. The server is supposed to use HTTP persistent connections to
-avoid frequent SSL renegotiations.
-
-@node Creating a new task
-@section Creating a new task
-
-A client might create a new task by sending an HTTP request to the
-@indicateurl{https://server/create} URL, and providing an archive as the
-request content. The archive contains crash data files. The crash data
-files are a subset of some local @file{/var/spool/abrt/ccpp-time-pid}
-directory contents, so the client must only pack and upload them.
-
-The server supports uncompressed tar archives, and tar archives
-compressed with gzip and xz. Uncompressed archives are the most
-efficient way for local network delivery, and gzip can be used there as
-well because of its good compression speed.
-
-The xz compression file format is well suited for public server setup
-(slow network), as it provides good compression ratio, which is
-important for compressing large coredumps, and it provides reasonable
-compress/decompress speed and memory consumption. See @ref{Traffic and
-load estimation} for the measurements. The @uref{http://tukaani.org/xz/,
-XZ Utils} implementation with the compression level 2 is used to
-compress the data.
-
-The HTTP request for a new task must use the POST method. It must
-contain proper @var{Content-Length} and @var{Content-Type} fields. If
-the method is not POST, the server returns the @code{405 Method Not
-Allowed} HTTP error code. If the @var{Content-Length} field is missing,
-the server returns the @code{411 Length Required} HTTP error code. If a
-@var{Content-Type} other than @samp{application/x-tar},
-@samp{application/x-gzip}, or @samp{application/x-xz} is used, the server
-returns the @code{415 Unsupported Media Type} HTTP error code. If the
-@var{Content-Length} value is greater than the limit set by the
-@var{MaxPackedSize} option in the server configuration file (50 MB by
-default), or the real HTTP request size gets larger than the limit + 10
-KB for headers, then the server returns the @code{413 Request Entity Too
-Large} HTTP error code, and provides an explanation, including the
-limit, in the response body.
-
-If unpacking the archive would leave less than a certain amount of free
-disk space in the @file{/var/spool/abrt-retrace} directory, the
-server returns the @code{507 Insufficient Storage} HTTP error code. The
-limit is specified by the @var{MinStorageLeft} option in the server
-configuration file, and it is set to 1024 MB by default.
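-
-Such a check can be implemented with @code{os.statvfs}; the following is
-an illustrative sketch, with the limit hard-coded instead of read from
-the configuration file.
-
-@verbatim
-import os
-
-MIN_STORAGE_LEFT_MB = 1024  # the MinStorageLeft default
-
-def enough_storage_left(path, unpacked_mb):
-    # Free space available to unprivileged processes, in megabytes.
-    st = os.statvfs(path)
-    free_mb = st.f_bavail * st.f_frsize // (1024 * 1024)
-    return free_mb - unpacked_mb >= MIN_STORAGE_LEFT_MB
-
-# Refuse the upload with 507 Insufficient Storage when this is False:
-# enough_storage_left("/var/spool/abrt-retrace", unpacked_size_mb)
-@end verbatim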
-
-If the data from the received archive would take more than 1024 MB of
-disk space when uncompressed, the server returns the @code{413 Request
-Entity Too Large} HTTP error code, and provides an explanation,
-including the limit, in the response body. The size limit is changeable
-by the @var{MaxUnpackedSize} option in the server configuration file. It
-can be set pretty high because coredumps, which take most of the disk
-space, are stored on the server only temporarily until the backtrace is
-generated. Once the backtrace is generated, the coredump is deleted by
-@command{abrt-retrace-worker}, so most of the disk space is released.
-
-The uncompressed data size for xz archives is obtained by calling
-@code{`xz --list file.tar.xz`}. The @option{--list} option has been
-implemented only recently, so updating @command{xz} on your server might
-be necessary. Likewise, the uncompressed data size for gzip archives is
-obtained by calling @code{`gzip --list file.tar.gz`}.
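-
-For illustration, the size can be obtained from Python roughly as
-follows. The sketch parses the machine-readable @code{--robot} variant
-of the listing; the column layout is an assumption here and should be
-verified against the installed XZ Utils version.
-
-@verbatim
-import subprocess
-
-def xz_uncompressed_size(path):
-    # "xz --robot --list" prints tab-separated rows; on the "totals"
-    # row the fifth column is assumed to be the uncompressed size
-    # in bytes.
-    output = subprocess.check_output(["xz", "--robot", "--list", path])
-    for line in output.decode().splitlines():
-        fields = line.split("\t")
-        if fields[0] == "totals":
-            return int(fields[4])
-    raise ValueError("no totals row in xz --list output")
-@end verbatim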
-
-If an upload from a client succeeds, the server creates a new directory
-@file{/var/spool/abrt-retrace/@var{id}} and extracts the
-received archive into it. Then it checks that the directory contains all
-the required files, checks their sizes, and then sends an HTTP
-response. After that it spawns a subprocess with
-@command{abrt-retrace-worker} on that directory.
-
-The following files from the local crash directory are required to be
-present in the archive: @file{coredump}, @file{executable},
-@file{package}. If one or more of these files are missing, or
-some other file is present in the archive, the server returns the
-@code{403 Forbidden} HTTP error code.
-
-If the file check succeeds, the server HTTP response has the @code{201
-Created} HTTP code. The response includes the following HTTP header
-fields:
-@itemize
-@item
-@var{X-Task-Id} containing a new server-unique numerical
-task id
-@item
-@var{X-Task-Password} containing a newly generated
-password, required to access the result
-@end itemize
-
-The @var{X-Task-Password} is a random alphanumeric (@samp{[a-zA-Z0-9]})
-sequence 22 characters long. The password is stored in the
-@file{/var/spool/abrt-retrace/@var{id}/password} file, and passwords
-sent by a client in subsequent requests are verified by comparing with
-this file.
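-
-A password of this form can be generated, for example, with Python's
-@code{random.SystemRandom}, which draws its randomness from
-@file{/dev/urandom}; this is an illustrative sketch, not the server's
-actual code.
-
-@verbatim
-import random
-import string
-
-def generate_task_password(length=22):
-    # SystemRandom reads from /dev/urandom, so the result is not
-    # predictable from the generator state.
-    alphabet = string.ascii_letters + string.digits  # [a-zA-Z0-9]
-    rng = random.SystemRandom()
-    return "".join(rng.choice(alphabet) for _ in range(length))
-@end verbatim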
-
-The task id is intentionally not used as a password, because it is
-desirable to keep the id readable and memorable for
-humans. Password-like ids would be a loss when a user authentication
-mechanism is added and server-generated passwords are no longer
-necessary.
-
-@node Task status
-@section Task status
-
-A client might request a task status by sending an HTTP GET request to
-the @indicateurl{https://someserver/@var{id}} URL, where @var{id} is the
-numerical task id returned in the @var{X-Task-Id} field by
-@indicateurl{https://someserver/create}. If the @var{id} is not in the
-valid format, or the task @var{id} does not exist, the server returns
-the @code{404 Not Found} HTTP error code.
-
-The client request must contain the @var{X-Task-Password} field, and its
-content must match the password stored in the
-@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password is
-not valid, the server returns the @code{403 Forbidden} HTTP error code.
-
-If the checks pass, the server returns the @code{200 OK} HTTP code, and
-includes a field @var{X-Task-Status} containing one of the following
-values: @samp{FINISHED_SUCCESS}, @samp{FINISHED_FAILURE},
-@samp{PENDING}.
-
-The field contains @samp{FINISHED_SUCCESS} if the file
-@file{/var/spool/abrt-retrace/@var{id}/backtrace} exists. The client can
-get the backtrace at the @indicateurl{https://someserver/@var{id}/backtrace}
-URL. The log can be obtained at the
-@indicateurl{https://someserver/@var{id}/log} URL, and it might contain
-warnings about missing debuginfos etc.
-
-The field contains @samp{FINISHED_FAILURE} if the file
-@file{/var/spool/abrt-retrace/@var{id}/backtrace} does not exist, and file
-@file{/var/spool/abrt-retrace/@var{id}/retrace-log} exists. The retrace-log
-file containing error messages can be downloaded by the client from the
-@indicateurl{https://someserver/@var{id}/log} URL.
-
-The field contains @samp{PENDING} if neither file exists. The client
-should ask again after 10 seconds or later.
-
-@node Requesting a backtrace
-@section Requesting a backtrace
-
-A client might request a backtrace by sending an HTTP GET request to the
-@indicateurl{https://someserver/@var{id}/backtrace} URL, where @var{id}
-is the numerical task id returned in the @var{X-Task-Id} field by
-@indicateurl{https://someserver/create}. If the @var{id} is not in the
-valid format, or the task @var{id} does not exist, the server returns
-the @code{404 Not Found} HTTP error code.
-
-The client request must contain the @var{X-Task-Password} field, and its
-content must match the password stored in the
-@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password
-is not valid, the server returns the @code{403 Forbidden} HTTP error
-code.
-
-If the file @file{/var/spool/abrt-retrace/@var{id}/backtrace} does not
-exist, the server returns the @code{404 Not Found} HTTP error code.
-Otherwise it returns the file contents, and the @var{Content-Type}
-header is set to @samp{text/plain}.
-
-@node Requesting a log
-@section Requesting a log
-
-A client might request a task log by sending an HTTP GET request to the
-@indicateurl{https://someserver/@var{id}/log} URL, where @var{id} is the
-numerical task id returned in the @var{X-Task-Id} field by
-@indicateurl{https://someserver/create}. If the @var{id} is not in the
-valid format, or the task @var{id} does not exist, the server returns
-the @code{404 Not Found} HTTP error code.
-
-The client request must contain the @var{X-Task-Password} field, and its
-content must match the password stored in the
-@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password
-is not valid, the server returns the @code{403 Forbidden} HTTP error
-code.
-
-If the file @file{/var/spool/abrt-retrace/@var{id}/retrace-log} does not
-exist, the server returns the @code{404 Not Found} HTTP error code.
-Otherwise it returns the file contents, and the @var{Content-Type}
-header is set to @samp{text/plain}.
-
-@node Limiting traffic
-@section Limiting traffic
-
-The maximum number of simultaneously running tasks is limited to 5 by
-the server. The limit is changeable by the @var{MaxParallelTasks} option
-in the server configuration file. If a new request comes when the server
-is fully occupied, the server returns the @code{503 Service Unavailable}
-HTTP error code.
-
-The archive extraction, chroot preparation, and gdb analysis are
-mostly limited by the hard drive size and speed.
-
-@node Retrace worker
-@chapter Retrace worker
-
-The retrace worker is a program (usually installed as
-@file{/usr/bin/abrt-retrace-worker}) which:
-@enumerate
-@item
-takes a task id as a parameter and resolves it to the task directory
-containing a coredump
-@item
-determines which packages need to be installed from the coredump
-@item
-installs the packages in a newly created chroot environment together
-with @command{gdb}
-@item
-copies the coredump to the chroot environment
-@item
-runs @command{gdb} from inside the environment to generate a backtrace
-from the coredump
-@item
-copies the resulting backtrace from the environment to the directory
-@end enumerate
-
-The tasks reside in @file{/var/spool/abrt-retrace/@var{taskid}}
-directories.
-
-To determine which packages need to be installed,
-@command{abrt-retrace-worker} runs the @command{coredump2packages} tool.
-The tool reads build-ids from the coredump, and tries to find the best
-set of packages (epoch, name, version, release) matching the
-build-ids. Local yum repositories are used as the source of
-packages. GDB's requirements are strict, and this is the reason why
-proper backtraces cannot be directly and reliably generated on systems
-whose software has been updated:
-@itemize
-@item
-The exact binary which crashed needs to be available to GDB.
-@item
-All libraries which are linked to the binary need to be available in the
-same exact versions from the time of the crash.
-@item
-The binary plugins loaded by the binary or libraries via @code{dlopen}
-need to be present in proper versions.
-@item
-The files containing the debugging symbols for the binary and libraries
-(build-ids are used to find the pairs) need to be available to GDB.
-@end itemize
-
-The chroot environments are created and managed by @command{mock}, and
-they reside in @file{/var/lib/mock/@var{taskid}}. The retrace worker
-generates a mock configuration file and then invokes @command{mock} to
-create the chroot, and to run programs from inside the chroot.
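-
-The sequence of @command{mock} calls looks roughly like the following
-sketch. The paths, the package set, and the name of the crashed binary
-are placeholders, and the real worker generates the mock configuration
-file and handles errors and logging.
-
-@verbatim
-import subprocess
-
-def run_retrace(taskid, packages, executable):
-    task_dir = "/var/spool/abrt-retrace/%s" % taskid
-
-    def mock(*args):
-        # Assumes /etc/mock/<taskid>.cfg was generated beforehand.
-        subprocess.check_call(["mock", "-r", taskid] + list(args))
-
-    mock("--init")                       # create the chroot
-    mock("--install", "gdb", *packages)  # crashed binary, libs, debuginfo
-    mock("--copyin", task_dir + "/coredump", "/var/spool/coredump")
-    # Run GDB inside the chroot and store its output as the backtrace.
-    backtrace = subprocess.check_output(
-        ["mock", "-r", taskid, "--chroot",
-         "gdb -batch -ex 'thread apply all backtrace full' "
-         + executable + " /var/spool/coredump"])
-    with open(task_dir + "/backtrace", "wb") as f:
-        f.write(backtrace)
-@end verbatim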
-
-The chroot environment is populated by installing packages using
-@command{yum}. Package installation cannot be avoided, as GDB expects to
-operate on an installed system, and on crashes from that system. GDB
-uses plugins written in Python, which are shipped with packages (see
-for example @command{rpm -ql libstdc++}).
-
-Coredumps might be affected by @command{prelink}, which is used on
-Fedora to speed up dynamic linking by caching its results directly in
-binaries. The system installed by @command{mock} for the purpose of
-retracing doesn't use @command{prelink}, so the binaries differ between
-the system of origin and the mock environment. Testing has shown that
-this is not an issue, but should some issue
-@uref{http://sourceware.org/ml/gdb/2009-05/msg00175.html, occur}
-(GDB fails to work with a binary even if it's the right one), a bug
-should be filed against @code{prelink}, as its operation should not
-affect the area GDB operates on.
-
-No special care is taken to avoid the possibility that GDB will not run
-with the set of packages (fixed versions) provided by the coredump. It
-is expected that any combination of packages a user might run on a
-released system satisfies the needs of some version of GDB. Yum selects
-the newest possible version which has its requirements satisfied.
-
-@node Task cleanup
-@chapter Task cleanup
-
-It is necessary to watch and limit the resource usage of tasks for a
-retrace server to remain operational. This is performed by the
-@command{abrt-retrace-cleanup} tool. It is supposed that the server
-administrator sets @command{cron} to run the tool every hour.
-
-Tasks that were created more than 120 hours (5 days) ago are
-deleted. The limit can be changed by the @var{DeleteTaskAfter} option in
-the server configuration file. Coredumps are deleted when the retrace
-process is finished, and only backtraces, logs, and configuration remain
-available for every task until the cleanup. The
-@command{abrt-retrace-cleanup} tool checks the creation times and
-deletes the expired directories in @file{/var/spool/abrt-retrace/}.
-
-Tasks running for more than 1 hour are terminated and removed from the
-system. Tasks for which the @command{abrt-retrace-worker} crashed for
-some reason without marking the task as finished are also removed.
-
-@node Package repository
-@chapter Package repository
-
-Retrace server is able to support every Fedora release with all packages
-that ever made it to the updates and updates-testing repositories. In
-order to provide all those packages, a local repository needs to be
-maintained for every supported operating system.
-
-A repository with Fedora packages must be maintained locally on the
-server to provide good performance and to provide data from older
-packages already removed from the official repositories. Retrace server
-contains a tool, @command{abrt-retrace-reposync}, a package downloader
-that scans Fedora servers for new packages and downloads them so they
-are immediately available.
-
-Older versions of packages are regularly deleted from the updates and
-updates-testing repositories. Retrace server supports older versions of
-packages, as this is one of the major pain points that the retrace
-server is supposed to solve.
-
-The @command{abrt-retrace-reposync} tool downloads packages from Fedora
-repositories, and it does not delete older versions of the packages. The
-retrace server administrator is supposed to call this script using cron
-approximately every 6 hours. The script uses @command{rsync} to get the
-packages and @command{createrepo} to generate repository metadata.
-
-The packages are downloaded to a local repository in
-@file{/var/cache/abrt-retrace/}. The location can be changed via the
-@var{RepoDir} option in the server configuration file.
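-
-A drastically simplified version of one synchronization step might look
-like this; the mirror URL and repository path are placeholders, and the
-real tool iterates over all supported releases and architectures.
-
-@verbatim
-import subprocess
-
-MIRROR = "rsync://mirror.example.org/fedora/updates/14/x86_64/"
-REPO_DIR = "/var/cache/abrt-retrace/fedora-14-x86_64"
-
-# Fetch new packages; rsync without --delete keeps old versions too.
-subprocess.check_call(["rsync", "-a", MIRROR, REPO_DIR])
-# Rebuild the repository metadata so yum can see the new packages.
-subprocess.check_call(["createrepo", REPO_DIR])
-@end verbatim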
-
-@node Traffic and load estimation
-@chapter Traffic and load estimation
-
-2500 bugs are reported from ABRT every month. Approximately 7.3%
-of those are Python exceptions, which don't need a retrace
-server. That means that roughly 2315 bugs need a retrace server: 77
-bugs per day, or 3.3 bugs every hour on average. Occasional spikes
-might be much higher (imagine a user who decided to report all of his 8
-crashes from the last month).
-
-We should probably not try to predict if the monthly bug count goes up
-or down. New, untested versions of software are added to Fedora, but
-on the other hand most software matures and becomes less crashy. So
-let's assume that the bug count stays approximately the same.
-
-Test crashes (showing why we use @code{`xz -2`} to compress coredumps):
-@itemize
-@item
-firefox with 7 tabs (random pages opened), coredump size 172 MB
-@itemize
-@item
-xz compression
-@itemize
-@item
-compression level 6 (default): compression took 32.5 sec, compressed
-size 5.4 MB, decompression took 2.7 sec
-@item
-compression level 3: compression took 23.4 sec, compressed size 5.6 MB,
-decompression took 1.6 sec
-@item
-compression level 2: compression took 6.8 sec, compressed size 6.1 MB,
-decompression took 3.7 sec
-@item
-compression level 1: compression took 5.1 sec, compressed size 6.4 MB,
-decompression took 2.4 sec
-@end itemize
-@item
-gzip compression
-@itemize
-@item
-compression level 9 (highest): compression took 7.6 sec, compressed size
-7.9 MB, decompression took 1.5 sec
-@item
-compression level 6 (default): compression took 2.6 sec, compressed size
-8 MB, decompression took 2.3 sec
-@item
-compression level 3: compression took 1.7 sec, compressed size 8.9 MB,
-decompression took 1.7 sec
-@end itemize
-@end itemize
-@item
-thunderbird with thousands of emails opened, coredump size 218 MB
-@itemize
-@item
-xz compression
-@itemize
-@item
-compression level 6 (default): compression took 60 sec, compressed size
-12 MB, decompression took 3.6 sec
-@item
-compression level 3: compression took 42 sec, compressed size 13 MB,
-decompression took 3.0 sec
-@item
-compression level 2: compression took 10 sec, compressed size 14 MB,
-decompression took 3.0 sec
-@item
-compression level 1: compression took 8.3 sec, compressed size 15 MB,
-decompression took 3.2 sec
-@end itemize
-@item
-gzip compression
-@itemize
-@item
-compression level 9 (highest): compression took 14.9 sec, compressed
-size 18 MB, decompression took 2.4 sec
-@item
-compression level 6 (default): compression took 4.4 sec, compressed size
-18 MB, decompression took 2.2 sec
-@item
-compression level 3: compression took 2.7 sec, compressed size 20 MB,
-decompression took 3 sec
-@end itemize
-@end itemize
-@item
-evince with 2 pdfs (1 and 42 pages) opened, coredump size 73 MB
-@itemize
-@item
-xz compression
-@itemize
-@item
-compression level 2: compression took 2.9 sec, compressed size 3.6 MB,
-decompression took 0.7 sec
-@item
-compression level 1: compression took 2.5 sec, compressed size 3.9 MB,
-decompression took 0.7 sec
-@end itemize
-@end itemize
-@item
-OpenOffice.org Impress with 25 pages presentation, coredump size 116 MB
-@itemize
-@item
-xz compression
-@itemize
-@item
-compression level 2: compression took 7.1 sec, compressed size 12 MB,
-decompression took 2.3 sec
-@end itemize
-@end itemize
-@end itemize
-
-So let's imagine there are some users who want to report their
-crashes at approximately the same time. Here is what the retrace
-server must handle:
-@enumerate
-@item
-2 OpenOffice crashes
-@item
-2 evince crashes
-@item
-2 thunderbird crashes
-@item
-2 firefox crashes
-@end enumerate
-
-We will use the xz archiver with compression level 2 on ABRT's
-side to compress the coredumps. So the users spend 53.6 seconds in
-total packaging the coredumps.
-
-The packaged coredumps take 71.4 MB, and the retrace server must
-receive that data.
-
-The server unpacks the coredumps (perhaps all at the same time), so they
-need 1158 MB of disk space on the server. The decompression will take
-19.4 seconds.
-
-Several hundred megabytes will be needed to install all the
-required packages and debuginfos for every chroot (8 chroots of 1 GB
-each = 8 GB, but this seems like an extreme, maximal case). Some space
-will be saved by using a debuginfofs.
-
-Note that most applications are not as heavyweight as OpenOffice and
-Firefox.
-
-@node Security
-@chapter Security
-
-The retrace server communicates with two other entities: it accepts
-coredumps from users, and it downloads debuginfos and packages from
-distribution repositories.
-
-@menu
-* Clients::
-* Packages and debuginfo::
-@end menu
-
-General security from GDB flaws and malicious data is provided by the
-chroot. GDB accesses the debuginfos, packages, and the coredump from
-within the chroot as a non-root user, unable to access the retrace
-server's environment.
-
-@c We should consider setting a disk quota to every chroot directory,
-@c and limit the GDB access to resources using cgroups.
-
-An SELinux policy exists for both the retrace server's HTTP interface
-and the retrace worker.
-
-@node Clients
-@section Clients
-
-It is expected that the clients using the retrace server and
-sending coredumps to it trust the retrace server administrator. The
-server administrator must not try to get sensitive data from client
-coredumps. This is a major limitation of the retrace server. However,
-users of an operating system already trust the operating system provider
-in various important matters. So when the retrace server is operated by
-the OS provider, that might be acceptable for users.
-
-Sending clients' coredumps to the retrace server cannot be avoided if we
-want to generate good backtraces containing the values of
-variables. Minidumps lower the quality of the resulting backtraces,
-while not improving user security.
-
-A malicious client can craft a nonstandard coredump, which will be
-processed by the server's GDB. GDB handles malformed coredumps well.
-
-Users can never be allowed to provide custom packages/debuginfo together
-with a coredump. Packages need to be installed to the environment, and
-installing untrusted programs is insecure.
-
-As for an attacker trying to steal users' backtraces from the retrace
-server, the passwords protecting the backtraces in the
-@var{X-Task-Password} header are random alphanumeric
-(@samp{[a-zA-Z0-9]}) sequences 22 characters long. 22 alphanumeric
-characters correspond to a 128-bit password, because @samp{[a-zA-Z0-9]}
-is 62 characters, and @math{2^{128}} < @math{62^{22}}. The source of
-randomness is @file{/dev/urandom}.
-
-@node Packages and debuginfo
-@section Packages and debuginfo
-
-Packages and debuginfo are safely downloaded from the distribution
-repositories, as the packages are signed by the distribution, and the
-package origin is verified.
-
-When the debuginfo filesystem server is done, the retrace server can
-safely use it, as the data will also be signed.
-
-@node Future work
-@chapter Future work
-
-@section Coredump stripping
-Jan Kratochvil: In my test with an OpenOffice.org presentation, the
-kernel core file has 181 MB, and @code{xz -2} of it has 65 MB. According
-to @code{set target debug 1}, GDB reads only 131406 bytes of it
-(incl. the NOTE segment).
-
-@section Supporting other architectures
-Three approaches:
-@itemize
-@item
-Use GDB builds with various target architectures: gdb-i386, gdb-ppc64,
-gdb-s390.
-@item
-Run the
-@uref{http://wiki.qemu.org/download/qemu-doc.html#QEMU-User-space-emulator,
-QEMU user space emulation} on the server.
-@item
-Run @code{abrt-retrace-worker} on a machine with the right
-architecture. Introduce worker machines and tasks, similarly to Koji.
-@end itemize
-
-@section Use gdbserver instead of uploading whole coredump
-GDB's gdbserver cannot process coredumps, but Jan Kratochvil's can:
-@verbatim
-git://git.fedorahosted.org/git/elfutils.git
-branch: jankratochvil/gdbserver
- src/gdbserver.c
- * Currently threading is not supported.
- * Currently only x86_64 is supported (the NOTE registers layout).
-@end verbatim
-
-@section User management for the HTTP interface
-Multiple authentication sources (x509 for RHEL).
-
-@section Make all files except coredump optional on the input
-Make the @file{architecture}, @file{release}, and @file{packages} files,
-which must be included in the archive when creating a task,
-optional. Allow uploading a coredump without involving tar: just
-coredump, coredump.gz, or coredump.xz.
-
-@section Handle non-standard packages (provided by user)
-This would make the retrace server very vulnerable to attacks, so it
-can never be enabled in a public instance.
-
-@section Support vmcores
-See @uref{https://fedorahosted.org/cas/, Core analysis system}, its
-features etc.
-
-@section Do not refuse new tasks on a fully loaded server
-Consider using @uref{http://git.fedorahosted.org/git/?p=kobo.git, kobo}
-for task management and worker handling (master/slave architecture).
-
-@section Support synchronous operation
-Client sends a coredump, and keeps receiving the server response
-message. The server response HTTP body is generated and sent gradually
-as the task is performed. Client can choose to stop receiving the
-response body after getting all headers and ask the server for status
-and backtrace asynchronously.
-
-The server re-sends the output of abrt-retrace-worker (its stdout and
-stderr) to the response body. In addition, a line with the task
-status is added in the form @code{X-Task-Status: PENDING} to the body
-every 5 seconds. When the worker process ends, either a
-@samp{FINISHED_SUCCESS} or @samp{FINISHED_FAILURE} status line is
-sent. If it's @samp{FINISHED_SUCCESS}, the backtrace is attached after
-this line. Then the response body is closed.
-
-@section Provide task estimation time
-The response to the @code{/create} action should contain a header
-@var{X-Task-Est-Time}, containing the number of seconds the server
-estimates it will take to generate the backtrace.
-
-The algorithm for the @var{X-Task-Est-Time} time estimation
-should take the previous analyses of coredumps with the same
-corresponding package name into account. The server should store a
-simple history in an SQLite database to know how long it takes to
-generate a backtrace for a certain package. It could be as simple as
-this:
-@itemize
-@item
- initialization step one: @code{CREATE TABLE package_time (id INTEGER
- PRIMARY KEY AUTOINCREMENT, package, release, time)}; we need the
- @var{id} for the database cleanup - to know the insertion order of
- rows, so the @code{AUTOINCREMENT} is important here; the @var{package}
- is the package name without the version and release numbers, the
- @var{release} column stores the operating system, and the @var{time}
- is the number of seconds it took to generate the backtrace
-@item
- initialization step two: @code{CREATE INDEX package_release ON
- package_time (package, release)}; we compute the time only for a
- single package on a single supported OS release per query, so it makes
- sense to create an index to speed it up
-@item
- when a task is finished: @code{INSERT INTO package_time (package,
- release, time) VALUES ('??', '??', '??')}
-@item
- to get the average time: @code{SELECT AVG(time) FROM package_time
- WHERE package == '??' AND release == '??'}; the arithmetic mean seems
- to be sufficient here
-@end itemize
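-
-Put together with Python's @code{sqlite3} module, the bookkeeping could
-look like the following sketch (the database path is illustrative):
-
-@verbatim
-import sqlite3
-
-db = sqlite3.connect("/var/spool/abrt-retrace/stats.db")
-db.execute("CREATE TABLE IF NOT EXISTS package_time (id INTEGER"
-           " PRIMARY KEY AUTOINCREMENT, package, release, time)")
-db.execute("CREATE INDEX IF NOT EXISTS package_release"
-           " ON package_time (package, release)")
-
-def task_finished(package, release, seconds):
-    db.execute("INSERT INTO package_time (package, release, time)"
-               " VALUES (?, ?, ?)", (package, release, seconds))
-    db.commit()
-
-def estimated_seconds(package, release):
-    row = db.execute("SELECT AVG(time) FROM package_time"
-                     " WHERE package == ? AND release == ?",
-                     (package, release)).fetchone()
-    return row[0]  # None when there is no history yet
-@end verbatim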
-
-So the server knows that crashes from an OpenOffice.org package
-take 5 minutes to process on average, and it can return the value 300
-(seconds) in the field. The client then does not waste time asking about
-that task every 20 seconds, but sends the first status request after
-300 seconds. And even when the package changes (rebases etc.), the
-database provides good estimations after some time anyway
-(the @ref{Task cleanup} chapter describes how the
-data are pruned).
-
-@section Keep the database with statistics small
-The database containing packages and processing times should also be
-regularly pruned to remain small and provide data quickly. The cleanup
-script should delete some rows for packages with too many entries:
-@enumerate
-@item
-get a list of packages from the database: @code{SELECT DISTINCT package,
-release FROM package_time}
-@item
-for every package, get the row count: @code{SELECT COUNT(*) FROM
-package_time WHERE package == '??' AND release == '??'}
-@item
-for every package with a row count larger than 100, some rows must be
-removed so that only the newest 100 rows remain in the database:
-@itemize
-@item
-to get the lowest row id which should be kept, execute @code{SELECT id
-FROM package_time WHERE package == '??' AND release == '??' ORDER BY id
-LIMIT 1 OFFSET ??}, where the @code{OFFSET} is the total number of rows
-for that single package minus 100
-@item
-then all the older rows can be deleted by executing @code{DELETE FROM
-package_time WHERE package == '??' AND release == '??' AND id < ??}
-@end itemize
-@end enumerate
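-
-The corresponding pruning routine, again as an @code{sqlite3} sketch
-building on the database from the previous section:
-
-@verbatim
-def prune_package_time(db, keep=100):
-    pairs = db.execute("SELECT DISTINCT package, release"
-                       " FROM package_time").fetchall()
-    for package, release in pairs:
-        (count,) = db.execute("SELECT COUNT(*) FROM package_time"
-                              " WHERE package == ? AND release == ?",
-                              (package, release)).fetchone()
-        if count <= keep:
-            continue
-        # Lowest id that belongs to the newest `keep' rows.
-        (cutoff,) = db.execute("SELECT id FROM package_time"
-                               " WHERE package == ? AND release == ?"
-                               " ORDER BY id LIMIT 1 OFFSET ?",
-                               (package, release, count - keep)).fetchone()
-        db.execute("DELETE FROM package_time WHERE package == ?"
-                   " AND release == ? AND id < ?",
-                   (package, release, cutoff))
-    db.commit()
-@end verbatim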
-
-@section Support Fedora Rawhide
-When @command{abrt-retrace-reposync} is used to sync with the
-Rawhide repository, unneeded packages (ones where a newer version
-exists) must be removed after residing for one week alongside the newer
-package in the same repository.
-
-@bye