From 6ec12db137f2d0fe18f059fcef2390512d0b2c3f Mon Sep 17 00:00:00 2001 From: Karel Klic Date: Wed, 9 Mar 2011 16:49:09 +0100 Subject: retrace-server: ship documentation in retrace-server rpm, use abrt-retrace-server.info instead of retrace-server.info --- doc/Makefile.am | 2 +- doc/abrt-retrace-server.texi | 800 +++++++++++++++++++++++++++++++++++++++++++ doc/retrace-server.texi | 800 ------------------------------------------- 3 files changed, 801 insertions(+), 801 deletions(-) create mode 100644 doc/abrt-retrace-server.texi delete mode 100644 doc/retrace-server.texi (limited to 'doc') diff --git a/doc/Makefile.am b/doc/Makefile.am index 43c0bc6d..4896aa1a 100644 --- a/doc/Makefile.am +++ b/doc/Makefile.am @@ -1 +1 @@ -info_TEXINFOS = retrace-server.texi +info_TEXINFOS = abrt-retrace-server.texi diff --git a/doc/abrt-retrace-server.texi b/doc/abrt-retrace-server.texi new file mode 100644 index 00000000..43f05627 --- /dev/null +++ b/doc/abrt-retrace-server.texi @@ -0,0 +1,800 @@ +\input texinfo +@c retrace-server.texi - Retrace Server Documentation +@c +@c .texi extension is recommended in GNU Automake manual +@setfilename retrace-server.info +@include version.texi + +@settitle Retrace server for ABRT @value{VERSION} Manual + +@dircategory Retrace server +@direntry +* Retrace server: (retrace-server). Remote coredump analysis via HTTP. +@end direntry + +@titlepage +@title Retrace server +@subtitle for ABRT version @value{VERSION}, @value{UPDATED} +@author Karel Klic (@email{kklic@@redhat.com}) +@page +@vskip 0pt plus 1filll +@end titlepage + +@contents + +@ifnottex +@node Top +@top Retrace server + +This manual is for retrace server for ABRT version @value{VERSION}, +@value{UPDATED}. The retrace server provides a coredump analysis and +backtrace generation service over a network using HTTP protocol. +@end ifnottex + +@menu +* Overview:: +* HTTP interface:: +* Retrace worker:: +* Package repository:: +* Traffic and load estimation:: +* Security:: +* Future work:: +@end menu + +@node Overview +@chapter Overview + +A client sends a coredump (created by Linux kernel) together with +some additional information to the server, and gets a backtrace +generation task ID in response. Then the client, after some time, asks +the server for the task status, and when the task is done (backtrace +has been generated from the coredump), the client downloads the +backtrace. If the backtrace generation fails, the client gets an error +code and downloads a log indicating what happened. Alternatively, the +client sends a coredump, and keeps receiving the server response +message. Server then, via the response's body, periodically sends +status of the task, and delivers the resulting backtrace as soon as +it's ready. + +The retrace server must be able to support multiple operating +systems and their releases (Fedora N-1, N, Rawhide, Branched Rawhide, +RHEL), and multiple architectures within a single installation. 
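+
+The exchange can be illustrated by the following client-side sketch.
+It is an illustration only, not the real ABRT client: the Python 3
+@code{http.client} module, the server name, and the archive name are
+assumptions, while the URLs and header fields are those defined in
+@ref{HTTP interface}.
+
+@example
+import http.client
+import time
+
+SERVER = "retrace.example.com"          # assumed server name
+
+# Upload a compressed crash archive and create a new task.
+data = open("crash.tar.xz", "rb").read()
+conn = http.client.HTTPSConnection(SERVER)
+conn.putrequest("POST", "/create")
+conn.putheader("Content-Type", "application/x-xz")
+conn.putheader("Content-Length", str(len(data)))
+conn.endheaders()
+conn.send(data)
+response = conn.getresponse()
+task_id = response.getheader("X-Task-Id")
+password = response.getheader("X-Task-Password")
+wait = int(response.getheader("X-Task-Est-Time", "60"))
+
+# Poll the task status until the backtrace generation finishes.
+status = "PENDING"
+while status == "PENDING":
+    time.sleep(wait)
+    wait = 10
+    conn = http.client.HTTPSConnection(SERVER)
+    conn.putrequest("GET", "/" + task_id)
+    conn.putheader("X-Task-Password", password)
+    conn.endheaders()
+    status = conn.getresponse().getheader("X-Task-Status")
+
+# Download the backtrace on success, the log otherwise.
+path = "/backtrace" if status == "FINISHED_SUCCESS" else "/log"
+conn = http.client.HTTPSConnection(SERVER)
+conn.putrequest("GET", "/" + task_id + path)
+conn.putheader("X-Task-Password", password)
+conn.endheaders()
+print(conn.getresponse().read().decode())
+@end example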
+ +The retrace server consists of the following parts: +@enumerate +@item +abrt-retrace-server: a HTTP interface script handling the +communication with clients, task creation and management +@item +abrt-retrace-worker: a program doing the environment preparation +and coredump processing +@item +package repository: a repository placed on the server containing +all the application binaries, libraries, and debuginfo necessary for +backtrace generation +@end enumerate + +@node HTTP interface +@chapter HTTP interface + +@menu +* Creating a new task:: +* Task status:: +* Requesting a backtrace:: +* Requesting a log:: +* Task cleanup:: +* Limiting traffic:: +@end menu + +The HTTP interface application is a script written in Python. The +script is named @file{abrt-retrace-server}, and it uses the +@uref{http://www.python.org/dev/peps/pep-0333/, Python Web Server +Gateway Interface} (WSGI) to interact with the web server. +Administrators may use +@uref{http://code.google.com/p/modwsgi/, mod_wsgi} to run +@command{abrt-retrace-server} on Apache. The mod_wsgi is a part of +both Fedora 12 and RHEL 6. The Python language is a good choice for +this application, because it supports HTTP handling well, and it is +already used in ABRT. + +Only secure (HTTPS) communication must be allowed for the communication +with @command{abrt-retrace-server}, because coredumps and backtraces are +private data. Users may decide to publish their backtraces in a bug +tracker after reviewing them, but the retrace server doesn't do +that. The HTTPS requirement must be specified in the server's man +page. The server must support HTTP persistent connections to to avoid +frequent SSL renegotiations. The server's manual page should include a +recommendation for administrator to check that the persistent +connections are enabled. + +@node Creating a new task +@section Creating a new task + +A client might create a new task by sending a HTTP request to the +@indicateurl{https://server/create} URL, and providing an archive as the +request content. The archive must contain crash data files. The crash +data files are a subset of the local +@file{/var/spool/abrt/ccpp-time-pid/} directory contents, so the client +must only pack and upload them. + +The server must support uncompressed tar archives, and tar archives +compressed with gzip and xz. Uncompressed archives are the most +efficient way for local network delivery, and gzip can be used there +as well because of its good compression speed. + +The xz compression file format is well suited for public server setup +(slow network), as it provides good compression ratio, which is +important for compressing large coredumps, and it provides reasonable +compress/decompress speed and memory consumption. See @ref{Traffic and +load estimation} for the measurements. The @uref{http://tukaani.org/xz/, XZ Utils} +implementation with the compression level 2 should be used to compress +the data. + +The HTTP request for a new task must use the POST method. It must +contain a proper @var{Content-Length} and @var{Content-Type} fields. If +the method is not POST, the server must return the "405 Method Not +Allowed" HTTP error code. If the @var{Content-Length} field is missing, +the server must return the "411 Length Required" HTTP error code. If an +@var{Content-Type} other than @samp{application/x-tar}, +@samp{application/x-gzip}, @samp{application/x-xz} is used, the server +must return the "415 unsupported Media Type" HTTP error code. 
If the +@var{Content-Length} value is greater than a limit set in the server +configuration file (50 MB by default), or the real HTTP request size +gets larger than the limit + 10 KB for headers, then the server must +return the "413 Request Entity Too Large" HTTP error code, and provide +an explanation, including the limit, in the response body. The limit +must be changeable from the server configuration file. + +If there is less than 20 GB of free disk space in the +@file{/var/spool/abrt-retrace} directory, the server must return +the "507 Insufficient Storage" HTTP error code. The server must return +the same HTTP error code if decompressing the received archive would +cause the free disk space to become less than 20 GB. The 20 GB limit +must be changeable from the server configuration file. + +If the data from the received archive would take more than 500 +MB of disk space when uncompressed, the server must return the "413 +Request Entity Too Large" HTTP error code, and provide an explanation, +including the limit, in the response body. The size limit must be +changeable from the server configuration file. It can be set pretty +high because coredumps, that take most disk space, are stored on the +server only temporarily until the backtrace is generated. When the +backtrace is generated the coredump is deleted by the +@command{abrt-retrace-worker}, so most disk space is released. + +The uncompressed data size for xz archives can be obtained by calling +@code{`xz --list file.tar.xz`}. The @option{--list} option has been +implemented only recently, so it might be necessary to implement a +method to get the uncompressed data size by extracting the archive to +the stdout, and counting the extracted bytes, and call this method if +the @option{--list} doesn't work on the server. Likewise, the +uncompressed data size for gzip archives can be obtained by calling +@code{`gzip --list file.tar.gz`}. + +If an upload from a client succeeds, the server creates a new directory +@file{/var/spool/abrt-retrace/@var{id}} and extracts the +received archive into it. Then it checks that the directory contains all +the required files, checks their sizes, and then sends a HTTP +response. After that it spawns a subprocess with +@command{abrt-retrace-worker} on that directory. + +To support multiple architectures, the retrace server needs a GDB +package compiled separately for every supported target architecture +(see the avr-gdb package in Fedora for an example). This is +technically and economically better solution than using a standalone +machine for every supported architecture and resending coredumps +depending on client's architecture. However, GDB's support for using a +target architecture different from the host architecture seems to be +fragile. If it doesn't work, the QEMU user mode emulation should be +tried as an alternative approach. + +The following files from the local crash directory are required to be +present in the archive: @file{coredump}, @file{architecture}, +@file{release}, @file{packages} (this one does not exist yet). If one or +more files are not present in the archive, or some other file is present +in the archive, the server must return the "403 Forbidden" HTTP error +code. If the size of any file except the coredump exceeds 100 KB, the +server must return the "413 Request Entity Too Large" HTTP error code, +and provide an explanation, including the limit, in the response +body. The 100 KB limit must be changeable from the server configuration +file. 
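+
+The checks described above might be sketched as follows in the WSGI
+application. This is an illustration only: the constant names, the
+default values, and the fallback behaviour are assumptions, and the
+real limits come from the server configuration file.
+
+@example
+import os
+
+MAX_ARCHIVE_SIZE = 50 * 1024 * 1024          # 50 MB default limit
+ALLOWED_TYPES = ("application/x-tar",
+                 "application/x-gzip",
+                 "application/x-xz")
+REQUIRED_FILES = ("coredump", "architecture", "release", "packages")
+
+def check_create_request(environ):
+    # Return an HTTP status string, or None if the upload may proceed.
+    if environ.get("REQUEST_METHOD") != "POST":
+        return "405 Method Not Allowed"
+    length = environ.get("CONTENT_LENGTH")
+    if not length:
+        return "411 Length Required"
+    if environ.get("CONTENT_TYPE") not in ALLOWED_TYPES:
+        return "415 Unsupported Media Type"
+    if int(length) > MAX_ARCHIVE_SIZE:
+        return "413 Request Entity Too Large"
+    return None
+
+def check_crash_dir(task_dir):
+    # The extracted archive must contain exactly the required files,
+    # and every file except the coredump must stay below 100 KB.
+    present = sorted(os.listdir(task_dir))
+    if present != sorted(REQUIRED_FILES):
+        return "403 Forbidden"
+    for name in present:
+        if name == "coredump":
+            continue
+        if os.path.getsize(os.path.join(task_dir, name)) > 100 * 1024:
+            return "413 Request Entity Too Large"
+    return None
+@end example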
+ +If the file check succeeds, the server HTTP response must have the "201 +Created" HTTP code. The response must include the following HTTP header +fields: +@itemize +@item +@var{X-Task-Id} containing a new server-unique numerical +task id +@item +@var{X-Task-Password} containing a newly generated +password, required to access the result +@item +@var{X-Task-Est-Time} containing a number of seconds the +server estimates it will take to generate the backtrace +@end itemize + +The @var{X-Task-Password} is a random alphanumeric (@samp{[a-zA-Z0-9]}) +sequence 22 characters long. 22 alphanumeric characters corresponds to +128 bit password, because @samp{[a-zA-Z0-9]} = 62 characters, and +@math{2^128} < @math{62^22}. The source of randomness must be, +directly or indirectly, @file{/dev/urandom}. The @code{rand()} function +from glibc and similar functions from other libraries cannot be used +because of their poor characteristics (in several aspects). The password +must be stored to the @file{/var/spool/abrt-retrace/@var{id}/password} file, +so passwords sent by a client in subsequent requests can be verified. + +The task id is intentionally not used as a password, because it is +desirable to keep the id readable and memorable for +humans. Password-like ids would be a loss when an user authentication +mechanism is added, and server-generated password will no longer be +necessary. + +The algorithm for the @var{X-Task-Est-Time} time estimation +should take the previous analyses of coredumps with the same +corresponding package name into account. The server should store +simple history in a SQLite database to know how long it takes to +generate a backtrace for certain package. It could be as simple as +this: +@itemize +@item + initialization step one: @code{CREATE TABLE package_time (id INTEGER + PRIMARY KEY AUTOINCREMENT, package, release, time)}; we need the + @var{id} for the database cleanup - to know the insertion order of + rows, so the @code{AUTOINCREMENT} is important here; the @var{package} + is the package name without the version and release numbers, the + @var{release} column stores the operating system, and the @var{time} + is the number of seconds it took to generate the backtrace +@item + initialization step two: @code{CREATE INDEX package_release ON + package_time (package, release)}; we compute the time only for single + package on single supported OS release per query, so it makes sense to + create an index to speed it up +@item + when a task is finished: @code{INSERT INTO package_time (package, + release, time) VALUES ('??', '??', '??')} +@item + to get the average time: @code{SELECT AVG(time) FROM package_time + WHERE package == '??' AND release == '??'}; the arithmetic mean seems + to be sufficient here +@end itemize + +So the server knows that crashes from an OpenOffice.org package +take 5 minutes to process in average, and it can return the value 300 +(seconds) in the field. The client does not waste time asking about +that task every 20 seconds, but the first status request comes after +300 seconds. And even when the package changes (rebases etc.), the +database provides good estimations after some time anyway +(@ref{Task cleanup} chapter describes how the +data are pruned). + +The server response HTTP body is generated and sent +gradually as the task is performed. Client chooses either to receive +the body, or terminate after getting all headers and ask the server +for status and backtrace asynchronously. 
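+
+A minimal sketch of the password generation and of the
+@var{X-Task-Est-Time} estimation follows. The fallback value of 60
+seconds and the database path handling are assumptions;
+@code{random.SystemRandom} draws, indirectly, from
+@file{/dev/urandom}.
+
+@example
+import random
+import sqlite3
+
+ALPHABET = ("abcdefghijklmnopqrstuvwxyz"
+            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+            "0123456789")
+
+def generate_password():
+    # 22 characters from a 62-character alphabet give slightly more
+    # than 128 bits of entropy.
+    rng = random.SystemRandom()
+    return "".join(rng.choice(ALPHABET) for i in range(22))
+
+def estimate_time(db_path, package, release):
+    # Arithmetic mean of the recorded backtrace generation times;
+    # fall back to an assumed default for unknown packages.
+    connection = sqlite3.connect(db_path)
+    row = connection.execute(
+        "SELECT AVG(time) FROM package_time"
+        " WHERE package == ? AND release == ?",
+        (package, release)).fetchone()
+    connection.close()
+    return int(row[0]) if row and row[0] else 60
+@end example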
+ +The server re-sends the output of abrt-retrace-worker (its stdout and +stderr) to the response the body. In addition, a line with the task +status is added in the form @code{X-Task-Status: PENDING} to the body +every 5 seconds. When the worker process ends, either +@samp{FINISHED_SUCCESS} or @samp{FINISHED_FAILURE} status line is +sent. If it's @samp{FINISHED_SUCCESS}, the backtrace is attached after +this line. Then the response body is closed. + +@node Task status +@section Task status + +A client might request a task status by sending a HTTP GET request to +the @indicateurl{https://someserver/@var{id}} URL, where @var{id} is the +numerical task id returned in the @var{X-Task-Id} field by +@indicateurl{https://someserver/create}. If the @var{id} is not in the +valid format, or the task @var{id} does not exist, the server must +return the "404 Not Found" HTTP error code. + +The client request must contain the @var{X-Task-Password} field, and its +content must match the password stored in the +@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password is +not valid, the server must return the "403 Forbidden" HTTP error code. + +If the checks pass, the server returns the "200 OK" HTTP code, and +includes a field @var{X-Task-Status} containing one of the following +values: @samp{FINISHED_SUCCESS}, @samp{FINISHED_FAILURE}, +@samp{PENDING}. + +The field contains @samp{FINISHED_SUCCESS} if the file +@file{/var/spool/abrt-retrace/@var{id}/backtrace} exists. The client might +get the backtrace on the @indicateurl{https://someserver/@var{id}/backtrace} +URL. The log might be obtained on the +@indicateurl{https://someserver/@var{id}/log} URL, and it might contain +warnings about some missing debuginfos etc. + +The field contains @samp{FINISHED_FAILURE} if the file +@file{/var/spool/abrt-retrace/@var{id}/backtrace} does not exist, and file +@file{/var/spool/abrt-retrace/@var{id}/retrace-log} exists. The retrace-log +file containing error messages can be downloaded by the client from the +@indicateurl{https://someserver/@var{id}/log} URL. + +The field contains @samp{PENDING} if neither file exists. The client +should ask again after 10 seconds or later. + +@node Requesting a backtrace +@section Requesting a backtrace + +A client might request a backtrace by sending a HTTP GET request to the +@indicateurl{https://someserver/@var{id}/backtrace} URL, where @var{id} +is the numerical task id returned in the @var{X-Task-Id} field by +@indicateurl{https://someserver/create}. If the @var{id} is not in the +valid format, or the task @var{id} does not exist, the server must +return the "404 Not Found" HTTP error code. + +The client request must contain the @var{X-Task-Password} field, and its +content must match the password stored in the +@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password +is not valid, the server must return the "403 Forbidden" HTTP error +code. + +If the file @file{/var/spool/abrt-retrace/@var{id}/backtrace} does not +exist, the server must return the "404 Not Found" HTTP error code. +Otherwise it returns the file contents, and the @var{Content-Type} field +must contain @samp{text/plain}. + +@node Requesting a log +@section Requesting a log + +A client might request a task log by sending a HTTP GET request to the +@indicateurl{https://someserver/@var{id}/log} URL, where @var{id} is the +numerical task id returned in the @var{X-Task-Id} field by +@indicateurl{https://someserver/create}. 
If the @var{id} is not in the +valid format, or the task @var{id} does not exist, the server must +return the "404 Not Found" HTTP error code. + +The client request must contain the @var{X-Task-Password} field, and its +content must match the password stored in the +@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password is +not valid, the server must return the "403 Forbidden" HTTP error code. + +If the file @file{/var/spool/abrt-retrace/@var{id}/retrace-log} does not +exist, the server must return the "404 Not Found" HTTP error code. +Otherwise it returns the file contents, and the "Content-Type" must +contain "text/plain". + +@node Task cleanup +@section Task cleanup + +Tasks that were created more than 5 days ago must be deleted, because +tasks occupy disk space (not so much space, as the coredumps are deleted +after the retrace, and only backtraces and configuration remain). A +shell script @command{abrt-retrace-clean} must check the creation time +and delete the directories in @file{/var/spool/abrt-retrace/}. It is +supposed that the server administrator sets @command{cron} to call the +script once a day. This assumption must be mentioned in the +@command{abrt-retrace-clean} manual page. + +The database containing packages and processing times should also +be regularly pruned to remain small and provide data quickly. The +cleanup script should delete some rows for packages with too many +entries: +@enumerate +@item +get a list of packages from the database: @code{SELECT DISTINCT +package, release FROM package_time} +@item +for every package, get the row count: @code{SELECT COUNT(*) FROM +package_time WHERE package == '??' AND release == '??'} +@item +for every package with the row count larger than 100, some rows +most be removed so that only the newest 100 rows remain in the +database: +@itemize + @item + to get highest row id which should be deleted, + execute @code{SELECT id FROM package_time WHERE package == '??' AND + release == '??' ORDER BY id LIMIT 1 OFFSET ??}, where the + @code{OFFSET} is the total number of rows for that single + package minus 100 + @item + then all the old rows can be deleted by executing @code{DELETE + FROM package_time WHERE package == '??' AND release == '??' AND id + <= ??} +@end itemize +@end enumerate + +@node Limiting traffic +@section Limiting traffic + +The maximum number of simultaneously running tasks must be limited +to 20 by the server. The limit must be changeable from the server +configuration file. If a new request comes when the server is fully +occupied, the server must return the "503 Service Unavailable" HTTP +error code. + +The archive extraction, chroot preparation, and gdb analysis is +mostly limited by the hard drive size and speed. + +@node Retrace worker +@chapter Retrace worker + +The worker (@command{abrt-retrace-worker} binary) gets a +@file{/var/spool/abrt-retrace/@var{id}} directory as an input. The worker +reads the operating system name and version, the coredump, and the list +of packages needed for retracing (a package containing the binary which +crashed, and packages with the libraries that are used by the binary). + +The worker prepares a new @file{chroot} subdirectory with the packages, +their debuginfo, and gdb installed. In other words, a new directory +@file{/var/spool/abrt-retrace/@var{id}/chroot} is created and +the packages are unpacked or installed into this directory, so for +example the gdb ends up as +@file{/var/.../@var{id}/chroot/usr/bin/gdb}. 
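+
+The @command{abrt-retrace-worker} itself is a separate program; purely
+as an illustration of the chroot-and-gdb step described in the next
+paragraphs, a Python sketch might look as follows. The task id, the
+crashed binary name, and the gdb command list are assumptions.
+
+@example
+import os
+import subprocess
+
+task_dir = "/var/spool/abrt-retrace/1"       # example task directory
+chroot_dir = os.path.join(task_dir, "chroot")
+
+pid = os.fork()
+if pid == 0:
+    # Child: enter the prepared chroot and run gdb on the coredump,
+    # which is assumed to have been moved to /coredump beforehand.
+    os.chroot(chroot_dir)
+    os.chdir("/")
+    output = subprocess.check_output(
+        ["gdb", "-batch",
+         "-ex", "file /usr/bin/crashed-binary",     # assumed binary
+         "-ex", "core-file /coredump",
+         "-ex", "thread apply all backtrace full"])
+    open("/backtrace", "wb").write(output)
+    os._exit(0)
+os.waitpid(pid, 0)
+
+# Parent: move the result out of the chroot before it is removed.
+os.rename(os.path.join(chroot_dir, "backtrace"),
+          os.path.join(task_dir, "backtrace"))
+@end example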
+ +After the @file{chroot} subdirectory is prepared, the worker moves the +coredump there and changes root (using the chroot system function) of a +child script there. The child script runs the gdb on the coredump, and +the gdb sees the corresponding crashy binary, all the debuginfo and all +the proper versions of libraries on right places. + +When the gdb run is finished, the worker copies the resulting backtrace +to the @file{/var/spool/abrt-retrace/@var{id}/backtrace} file and stores a +log from the whole chroot process to the @file{retrace-log} file in the +same directory. Then it removes the @file{chroot} directory. + +The GDB installed into the chroot must: +@itemize +@item +run on the server (same architecture, or we can use +@uref{http://wiki.qemu.org/download/qemu-doc.html#QEMU-User-space-emulator, QEMU +user space emulation}) +@item +process the coredump (possibly from another architecture): that +means we need a special GDB for every supported architecture +@item +be able to handle coredumps created in an environment with prelink +enabled +(@uref{http://sourceware.org/ml/gdb/2009-05/msg00175.html, should +not} be a problem) +@item +use libc, zlib, readline, ncurses, expat and Python packages, +while the version numbers required by the coredump might be different +from what is required by the GDB +@end itemize + +The gdb might fail to run with certain combinations of package +dependencies. Nevertheless, we need to provide the libc/Python/* +package versions which are required by the coredump. If we would not +do that, the backtraces generated from such an environment would be of +lower quality. Consider a coredump which was caused by a crash of +Python application on a client, and which we analyze on the retrace +server with completely different version of Python because the +client's Python version is not compatible with our GDB. + +We can solve the issue by installing the GDB package dependencies first, +move their binaries to some safe place (@file{/lib/gdb} in the chroot), +and create the @file{/etc/ld.so.preload} file pointing to that place, or +set @env{LD_LIBRARY_PATH}. Then we can unpack libc binaries and +other packages and their versions as required by the coredump to the +common paths, and the GDB would run happily, using the libraries from +@file{/lib/gdb} and not those from @file{/lib} and @file{/usr/lib}. This +approach can use standard GDB builds with various target architectures: +gdb, gdb-i386, gdb-ppc64, gdb-s390 (nonexistent in Fedora/EPEL at the +time of writing this). + +The GDB and its dependencies are stored separately from the packages +used as data for coredump processing. A single combination of GDB and +its dependencies can be used across all supported OS to generate +backtraces. + +The retrace worker must be able to prepare a chroot-ready environment +for certain supported operating system, which is different from the +retrace server's operating system. It needs to fake the @file{/dev} +directory and create some basic files in @file{/etc} like @file{passwd} +and @file{hosts}. We can use the @uref{https://fedorahosted.org/mock/, +mock} library to do that, as it does almost what we need (but not +exactly as it has a strong focus on preparing the environment for +rpmbuild and running it), or we can come up with our own solution, while +stealing some code from the mock library. The @file{/usr/bin/mock} +executable is entirely unuseful for the retrace server, but the +underlying Python library can be used. 
So if would like to use mock, an +ABRT-specific interface to the mock library must be written or the +retrace worker must be written in Python and use the mock Python library +directly. + +We should save some time and disk space by extracting only binaries +and dynamic libraries from the packages for the coredump analysis, and +omit all other files. We can save even more time and disk space by +extracting only the libraries and binaries really referenced by the +coredump (eu-unstrip tells us). Packages should not be +@emph{installed} to the chroot, they should be @emph{extracted} +only, because we use them as a data source, and we never run them. + +Another idea to be considered is that we can avoid the package +extraction if we can teach GDB to read the dynamic libraries, the +binary, and the debuginfo directly from the RPM packages. We would +provide a backend to GDB which can do that, and provide tiny front-end +program which tells the backend which RPMs it should use and then run +the GDB command loop. The result would be a GDB wrapper/extension we +need to maintain, but it should end up pretty small. We would use +Python to write our extension, as we do not want to (inelegantly) +maintain a patch against GDB core. We need to ask GDB people if the +Python interface is capable of handling this idea, and how much work +it would be to implement it. + +@node Package repository +@chapter Package repository + +We should support every Fedora release with all packages that ever +made it to the updates and updates-testing repositories. In order to +provide all that packages, a local repository is maintained for every +supported operating system. The debuginfos might be provided by a +debuginfo server in future (it will save the server disk space). We +should support the usage of local debuginfo first, and add the +debuginfofs support later. + +A repository with Fedora packages must be maintained locally on the +server to provide good performance and to provide data from older +packages already removed from the official repositories. We need a +package downloader, which scans Fedora servers for new packages, and +downloads them so they are immediately available. + +Older versions of packages are regularly deleted from the updates +and updates-testing repositories. We must support older versions of +packages, because that is one of two major pain-points that the +retrace server is supposed to solve (the other one is the slowness of +debuginfo download and debuginfo disk space requirements). + +A script abrt-reposync must download packages from Fedora +repositories, but it must not delete older versions of the +packages. The retrace server administrator is supposed to call this +script using cron every ~6 hours. This expectation must be documented +in the abrt-reposync manual page. The script can use use wget, rsync, +or reposync tool to get the packages. The remote yum source +repositories must be configured from a configuration file or files +(@file{/etc/yum.repos.d} might be used). + +When the abrt-reposync is used to sync with the Rawhide repository, +unneeded packages (where a newer version exists) must be removed after +residing one week with the newer package in the same repository. + +All the unneeded content from the newly downloaded packages should be +removed to save disk space and speed up chroot creation. We need just +the binaries and dynamic libraries, and that is a tiny part of package +contents. 
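+
+One way to implement that extraction step, sketched below, is to pipe
+@command{rpm2cpio} into @command{cpio} and keep only files matching a
+few patterns. The patterns and the helper name are illustrative and
+would need tuning.
+
+@example
+import os
+import subprocess
+
+def extract_binaries(rpm_path, dest_dir):
+    # Unpack only executables and shared libraries from a package;
+    # nothing else is needed for backtrace generation.
+    if not os.path.isdir(dest_dir):
+        os.makedirs(dest_dir)
+    rpm2cpio = subprocess.Popen(["rpm2cpio", rpm_path],
+                                stdout=subprocess.PIPE)
+    subprocess.check_call(
+        ["cpio", "-idm",
+         "./usr/bin/*", "./usr/sbin/*",
+         "./usr/lib*/*.so*", "./lib*/*.so*"],
+        stdin=rpm2cpio.stdout, cwd=dest_dir)
+    rpm2cpio.stdout.close()
+    rpm2cpio.wait()
+@end example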
+ +The packages should be downloaded to a local repository in +@file{/var/cache/abrt-repo/@{fedora12,fedora12-debuginfo,...@}}. + +@node Traffic and load estimation +@chapter Traffic and load estimation + +2500 bugs are reported from ABRT every month. Approximately 7.3% +from that are Python exceptions, which don't need a retrace +server. That means that 2315 bugs need a retrace server. That is 77 +bugs per day, or 3.3 bugs every hour on average. Occasional spikes +might be much higher (imagine a user that decided to report all his 8 +crashes from last month). + +We should probably not try to predict if the monthly bug count goes up +or down. New, untested versions of software are added to Fedora, but +on the other side most software matures and becomes less crashy. So +let's assume that the bug count stays approximately the same. + +Test crashes (see that we should probably use @code{`xz -2`} +to compress coredumps): +@itemize +@item +firefox with 7 tabs (random pages opened), coredump size 172 MB +@itemize +@item +xz compression +@itemize +@item +compression level 6 (default): compression took 32.5 sec, compressed +size 5.4 MB, decompression took 2.7 sec +@item +compression level 3: compression took 23.4 sec, compressed size 5.6 MB, +decompression took 1.6 sec +@item +compression level 2: compression took 6.8 sec, compressed size 6.1 MB, +decompression took 3.7 sec +@item +compression level 1: compression took 5.1 sec, compressed size 6.4 MB, +decompression took 2.4 sec +@end itemize +@item +gzip compression +@itemize +@item +compression level 9 (highest): compression took 7.6 sec, compressed size +7.9 MB, decompression took 1.5 sec +@item +compression level 6 (default): compression took 2.6 sec, compressed size +8 MB, decompression took 2.3 sec +@item +compression level 3: compression took 1.7 sec, compressed size 8.9 MB, +decompression took 1.7 sec +@end itemize +@end itemize +@item +thunderbird with thousands of emails opened, coredump size 218 MB +@itemize +@item +xz compression +@itemize +@item +compression level 6 (default): compression took 60 sec, compressed size +12 MB, decompression took 3.6 sec +@item +compression level 3: compression took 42 sec, compressed size 13 MB, +decompression took 3.0 sec +@item +compression level 2: compression took 10 sec, compressed size 14 MB, +decompression took 3.0 sec +@item +compression level 1: compression took 8.3 sec, compressed size 15 MB, +decompression took 3.2 sec +@end itemize +@item +gzip compression +@itemize +@item +compression level 9 (highest): compression took 14.9 sec, compressed +size 18 MB, decompression took 2.4 sec +@item +compression level 6 (default): compression took 4.4 sec, compressed size +18 MB, decompression took 2.2 sec +@item +compression level 3: compression took 2.7 sec, compressed size 20 MB, +decompression took 3 sec +@end itemize +@end itemize +@item +evince with 2 pdfs (1 and 42 pages) opened, coredump size 73 MB +@itemize +@item +xz compression +@itemize +@item +compression level 2: compression took 2.9 sec, compressed size 3.6 MB, +decompression took 0.7 sec +@item +compression level 1: compression took 2.5 sec, compressed size 3.9 MB, +decompression took 0.7 sec +@end itemize +@end itemize +@item +OpenOffice.org Impress with 25 pages presentation, coredump size 116 MB +@itemize +@item +xz compression +@itemize +@item +compression level 2: compression took 7.1 sec, compressed size 12 MB, +decompression took 2.3 sec +@end itemize +@end itemize +@end itemize + +So let's imagine there are some users that want 
to report their +crashes approximately at the same time. Here is what the retrace +server must handle: +@enumerate +@item +2 OpenOffice crashes +@item +2 evince crashes +@item +2 thunderbird crashes +@item +2 firefox crashes +@end enumerate + +We will use the xz archiver with the compression level 2 on the ABRT's +side to compress the coredumps. So the users spend 53.6 seconds in +total packaging the coredumps. + +The packaged coredumps have 71.4 MB, and the retrace server must +receive that data. + +The server unpacks the coredumps (perhaps in the same time), so they +need 1158 MB of disk space on the server. The decompression will take +19.4 seconds. + +Several hundred megabytes will be needed to install all the +required packages and debuginfos for every chroot (8 chroots 1 GB each += 8 GB, but this seems like an extreme, maximal case). Some space will +be saved by using a debuginfofs. + +Note that most applications are not as heavyweight as OpenOffice and +Firefox. + +@node Security +@chapter Security + +The retrace server communicates with two other entities: it accepts +coredumps form users, and it downloads debuginfos and packages from +distribution repositories. + +@menu +* Clients:: +* Packages and debuginfo:: +@end menu + +General security from GDB flaws and malicious data is provided by +chroot. The GDB accesses the debuginfos, packages, and the coredump +from within the chroot, unable to access the retrace server's +environment. We should consider setting a disk quota to every chroot +directory, and limit the GDB access to resources using cgroups. + +SELinux policy should be written for both the retrace server's HTTP +interface, and for the retrace worker. + +@node Clients +@section Clients + +The clients, which are using the retrace server and sending coredumps +to it, must fully trust the retrace server administrator. The server +administrator must not try to get sensitive data from client +coredumps. That seems to be a major bottleneck of the retrace server +idea. However, users of an operating system already trust the OS +provider in various important matters. So when the retrace server is +operated by the operating system provider, that might be acceptable by +users. + +We cannot avoid sending clients' coredumps to the retrace server, if +we want to generate quality backtraces containing the values of +variables. Minidumps are not acceptable solution, as they lower the +quality of the resulting backtraces, while not improving user +security. + +Can the retrace server trust clients? We must know what can a +malicious client achieve by crafting a nonstandard coredump, which +will be processed by server's GDB. We should ask GDB experts about +this. + +Another question is whether we can allow users providing some packages +and debuginfo together with a coredump. That might be useful for +users, who run the operating system only with some minor +modifications, and they still want to use the retrace server. So they +send a coredump together with a few nonstandard packages. The retrace +server uses the nonstandard packages together with the OS packages to +generate the backtrace. Is it safe? We must know what can a malicious +client achieve by crafting a special binary and debuginfo, which will +be processed by server's GDB. + +@node Packages and debuginfo +@section Packages and debuginfo + +We can safely download packages and debuginfo from the distribution, +as the packages are signed by the distribution, and the package origin +can be verified. 
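+
+For illustration, such a check could be performed with @command{rpm}
+before a downloaded package is placed into the local repository,
+assuming the distribution GPG keys have been imported into the
+server's rpm database:
+
+@example
+import subprocess
+
+def package_is_trusted(rpm_path):
+    # "rpm --checksig" verifies the package digests and its GPG
+    # signature; a non-zero exit status means the package must not
+    # be used.
+    return subprocess.call(["rpm", "--checksig", rpm_path]) == 0
+@end example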
+
+When the debuginfo server is done, the retrace server can safely use
+it, as the data will also be signed.
+
+@node Future work
+@chapter Future work
+
+1. Coredump stripping. Jan Kratochvil: in a test with an
+OpenOffice.org presentation, the kernel core file had 181 MB, and
+@code{xz -2} of it had 65 MB. According to @code{set target debug 1},
+GDB reads only 131406 bytes of it (including the NOTE segment).
+
+2. Use gdbserver instead of uploading the whole coredump. GDB's
+gdbserver cannot process coredumps, but Jan Kratochvil's can:
  git://git.fedorahosted.org/git/elfutils.git
+  branch: jankratochvil/gdbserver
+  src/gdbserver.c
+   * Currently threading is not supported.
+   * Currently only x86_64 is supported (the NOTE registers layout).
+
+ +3. User management for the HTTP interface. We need multiple +authentication sources (x509 for RHEL). + +4. Make @file{architecture}, @file{release}, +@file{packages} files, which must be included in the package +when creating a task, optional. Allow uploading a coredump without +involving tar: just coredump, coredump.gz, or coredump.xz. + +5. Handle non-standard packages (provided by user) + +@bye diff --git a/doc/retrace-server.texi b/doc/retrace-server.texi deleted file mode 100644 index 43f05627..00000000 --- a/doc/retrace-server.texi +++ /dev/null @@ -1,800 +0,0 @@ -\input texinfo -@c retrace-server.texi - Retrace Server Documentation -@c -@c .texi extension is recommended in GNU Automake manual -@setfilename retrace-server.info -@include version.texi - -@settitle Retrace server for ABRT @value{VERSION} Manual - -@dircategory Retrace server -@direntry -* Retrace server: (retrace-server). Remote coredump analysis via HTTP. -@end direntry - -@titlepage -@title Retrace server -@subtitle for ABRT version @value{VERSION}, @value{UPDATED} -@author Karel Klic (@email{kklic@@redhat.com}) -@page -@vskip 0pt plus 1filll -@end titlepage - -@contents - -@ifnottex -@node Top -@top Retrace server - -This manual is for retrace server for ABRT version @value{VERSION}, -@value{UPDATED}. The retrace server provides a coredump analysis and -backtrace generation service over a network using HTTP protocol. -@end ifnottex - -@menu -* Overview:: -* HTTP interface:: -* Retrace worker:: -* Package repository:: -* Traffic and load estimation:: -* Security:: -* Future work:: -@end menu - -@node Overview -@chapter Overview - -A client sends a coredump (created by Linux kernel) together with -some additional information to the server, and gets a backtrace -generation task ID in response. Then the client, after some time, asks -the server for the task status, and when the task is done (backtrace -has been generated from the coredump), the client downloads the -backtrace. If the backtrace generation fails, the client gets an error -code and downloads a log indicating what happened. Alternatively, the -client sends a coredump, and keeps receiving the server response -message. Server then, via the response's body, periodically sends -status of the task, and delivers the resulting backtrace as soon as -it's ready. - -The retrace server must be able to support multiple operating -systems and their releases (Fedora N-1, N, Rawhide, Branched Rawhide, -RHEL), and multiple architectures within a single installation. - -The retrace server consists of the following parts: -@enumerate -@item -abrt-retrace-server: a HTTP interface script handling the -communication with clients, task creation and management -@item -abrt-retrace-worker: a program doing the environment preparation -and coredump processing -@item -package repository: a repository placed on the server containing -all the application binaries, libraries, and debuginfo necessary for -backtrace generation -@end enumerate - -@node HTTP interface -@chapter HTTP interface - -@menu -* Creating a new task:: -* Task status:: -* Requesting a backtrace:: -* Requesting a log:: -* Task cleanup:: -* Limiting traffic:: -@end menu - -The HTTP interface application is a script written in Python. The -script is named @file{abrt-retrace-server}, and it uses the -@uref{http://www.python.org/dev/peps/pep-0333/, Python Web Server -Gateway Interface} (WSGI) to interact with the web server. 
-Administrators may use -@uref{http://code.google.com/p/modwsgi/, mod_wsgi} to run -@command{abrt-retrace-server} on Apache. The mod_wsgi is a part of -both Fedora 12 and RHEL 6. The Python language is a good choice for -this application, because it supports HTTP handling well, and it is -already used in ABRT. - -Only secure (HTTPS) communication must be allowed for the communication -with @command{abrt-retrace-server}, because coredumps and backtraces are -private data. Users may decide to publish their backtraces in a bug -tracker after reviewing them, but the retrace server doesn't do -that. The HTTPS requirement must be specified in the server's man -page. The server must support HTTP persistent connections to to avoid -frequent SSL renegotiations. The server's manual page should include a -recommendation for administrator to check that the persistent -connections are enabled. - -@node Creating a new task -@section Creating a new task - -A client might create a new task by sending a HTTP request to the -@indicateurl{https://server/create} URL, and providing an archive as the -request content. The archive must contain crash data files. The crash -data files are a subset of the local -@file{/var/spool/abrt/ccpp-time-pid/} directory contents, so the client -must only pack and upload them. - -The server must support uncompressed tar archives, and tar archives -compressed with gzip and xz. Uncompressed archives are the most -efficient way for local network delivery, and gzip can be used there -as well because of its good compression speed. - -The xz compression file format is well suited for public server setup -(slow network), as it provides good compression ratio, which is -important for compressing large coredumps, and it provides reasonable -compress/decompress speed and memory consumption. See @ref{Traffic and -load estimation} for the measurements. The @uref{http://tukaani.org/xz/, XZ Utils} -implementation with the compression level 2 should be used to compress -the data. - -The HTTP request for a new task must use the POST method. It must -contain a proper @var{Content-Length} and @var{Content-Type} fields. If -the method is not POST, the server must return the "405 Method Not -Allowed" HTTP error code. If the @var{Content-Length} field is missing, -the server must return the "411 Length Required" HTTP error code. If an -@var{Content-Type} other than @samp{application/x-tar}, -@samp{application/x-gzip}, @samp{application/x-xz} is used, the server -must return the "415 unsupported Media Type" HTTP error code. If the -@var{Content-Length} value is greater than a limit set in the server -configuration file (50 MB by default), or the real HTTP request size -gets larger than the limit + 10 KB for headers, then the server must -return the "413 Request Entity Too Large" HTTP error code, and provide -an explanation, including the limit, in the response body. The limit -must be changeable from the server configuration file. - -If there is less than 20 GB of free disk space in the -@file{/var/spool/abrt-retrace} directory, the server must return -the "507 Insufficient Storage" HTTP error code. The server must return -the same HTTP error code if decompressing the received archive would -cause the free disk space to become less than 20 GB. The 20 GB limit -must be changeable from the server configuration file. 
- -If the data from the received archive would take more than 500 -MB of disk space when uncompressed, the server must return the "413 -Request Entity Too Large" HTTP error code, and provide an explanation, -including the limit, in the response body. The size limit must be -changeable from the server configuration file. It can be set pretty -high because coredumps, that take most disk space, are stored on the -server only temporarily until the backtrace is generated. When the -backtrace is generated the coredump is deleted by the -@command{abrt-retrace-worker}, so most disk space is released. - -The uncompressed data size for xz archives can be obtained by calling -@code{`xz --list file.tar.xz`}. The @option{--list} option has been -implemented only recently, so it might be necessary to implement a -method to get the uncompressed data size by extracting the archive to -the stdout, and counting the extracted bytes, and call this method if -the @option{--list} doesn't work on the server. Likewise, the -uncompressed data size for gzip archives can be obtained by calling -@code{`gzip --list file.tar.gz`}. - -If an upload from a client succeeds, the server creates a new directory -@file{/var/spool/abrt-retrace/@var{id}} and extracts the -received archive into it. Then it checks that the directory contains all -the required files, checks their sizes, and then sends a HTTP -response. After that it spawns a subprocess with -@command{abrt-retrace-worker} on that directory. - -To support multiple architectures, the retrace server needs a GDB -package compiled separately for every supported target architecture -(see the avr-gdb package in Fedora for an example). This is -technically and economically better solution than using a standalone -machine for every supported architecture and resending coredumps -depending on client's architecture. However, GDB's support for using a -target architecture different from the host architecture seems to be -fragile. If it doesn't work, the QEMU user mode emulation should be -tried as an alternative approach. - -The following files from the local crash directory are required to be -present in the archive: @file{coredump}, @file{architecture}, -@file{release}, @file{packages} (this one does not exist yet). If one or -more files are not present in the archive, or some other file is present -in the archive, the server must return the "403 Forbidden" HTTP error -code. If the size of any file except the coredump exceeds 100 KB, the -server must return the "413 Request Entity Too Large" HTTP error code, -and provide an explanation, including the limit, in the response -body. The 100 KB limit must be changeable from the server configuration -file. - -If the file check succeeds, the server HTTP response must have the "201 -Created" HTTP code. The response must include the following HTTP header -fields: -@itemize -@item -@var{X-Task-Id} containing a new server-unique numerical -task id -@item -@var{X-Task-Password} containing a newly generated -password, required to access the result -@item -@var{X-Task-Est-Time} containing a number of seconds the -server estimates it will take to generate the backtrace -@end itemize - -The @var{X-Task-Password} is a random alphanumeric (@samp{[a-zA-Z0-9]}) -sequence 22 characters long. 22 alphanumeric characters corresponds to -128 bit password, because @samp{[a-zA-Z0-9]} = 62 characters, and -@math{2^128} < @math{62^22}. The source of randomness must be, -directly or indirectly, @file{/dev/urandom}. 
The @code{rand()} function -from glibc and similar functions from other libraries cannot be used -because of their poor characteristics (in several aspects). The password -must be stored to the @file{/var/spool/abrt-retrace/@var{id}/password} file, -so passwords sent by a client in subsequent requests can be verified. - -The task id is intentionally not used as a password, because it is -desirable to keep the id readable and memorable for -humans. Password-like ids would be a loss when an user authentication -mechanism is added, and server-generated password will no longer be -necessary. - -The algorithm for the @var{X-Task-Est-Time} time estimation -should take the previous analyses of coredumps with the same -corresponding package name into account. The server should store -simple history in a SQLite database to know how long it takes to -generate a backtrace for certain package. It could be as simple as -this: -@itemize -@item - initialization step one: @code{CREATE TABLE package_time (id INTEGER - PRIMARY KEY AUTOINCREMENT, package, release, time)}; we need the - @var{id} for the database cleanup - to know the insertion order of - rows, so the @code{AUTOINCREMENT} is important here; the @var{package} - is the package name without the version and release numbers, the - @var{release} column stores the operating system, and the @var{time} - is the number of seconds it took to generate the backtrace -@item - initialization step two: @code{CREATE INDEX package_release ON - package_time (package, release)}; we compute the time only for single - package on single supported OS release per query, so it makes sense to - create an index to speed it up -@item - when a task is finished: @code{INSERT INTO package_time (package, - release, time) VALUES ('??', '??', '??')} -@item - to get the average time: @code{SELECT AVG(time) FROM package_time - WHERE package == '??' AND release == '??'}; the arithmetic mean seems - to be sufficient here -@end itemize - -So the server knows that crashes from an OpenOffice.org package -take 5 minutes to process in average, and it can return the value 300 -(seconds) in the field. The client does not waste time asking about -that task every 20 seconds, but the first status request comes after -300 seconds. And even when the package changes (rebases etc.), the -database provides good estimations after some time anyway -(@ref{Task cleanup} chapter describes how the -data are pruned). - -The server response HTTP body is generated and sent -gradually as the task is performed. Client chooses either to receive -the body, or terminate after getting all headers and ask the server -for status and backtrace asynchronously. - -The server re-sends the output of abrt-retrace-worker (its stdout and -stderr) to the response the body. In addition, a line with the task -status is added in the form @code{X-Task-Status: PENDING} to the body -every 5 seconds. When the worker process ends, either -@samp{FINISHED_SUCCESS} or @samp{FINISHED_FAILURE} status line is -sent. If it's @samp{FINISHED_SUCCESS}, the backtrace is attached after -this line. Then the response body is closed. - -@node Task status -@section Task status - -A client might request a task status by sending a HTTP GET request to -the @indicateurl{https://someserver/@var{id}} URL, where @var{id} is the -numerical task id returned in the @var{X-Task-Id} field by -@indicateurl{https://someserver/create}. 
If the @var{id} is not in the -valid format, or the task @var{id} does not exist, the server must -return the "404 Not Found" HTTP error code. - -The client request must contain the @var{X-Task-Password} field, and its -content must match the password stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password is -not valid, the server must return the "403 Forbidden" HTTP error code. - -If the checks pass, the server returns the "200 OK" HTTP code, and -includes a field @var{X-Task-Status} containing one of the following -values: @samp{FINISHED_SUCCESS}, @samp{FINISHED_FAILURE}, -@samp{PENDING}. - -The field contains @samp{FINISHED_SUCCESS} if the file -@file{/var/spool/abrt-retrace/@var{id}/backtrace} exists. The client might -get the backtrace on the @indicateurl{https://someserver/@var{id}/backtrace} -URL. The log might be obtained on the -@indicateurl{https://someserver/@var{id}/log} URL, and it might contain -warnings about some missing debuginfos etc. - -The field contains @samp{FINISHED_FAILURE} if the file -@file{/var/spool/abrt-retrace/@var{id}/backtrace} does not exist, and file -@file{/var/spool/abrt-retrace/@var{id}/retrace-log} exists. The retrace-log -file containing error messages can be downloaded by the client from the -@indicateurl{https://someserver/@var{id}/log} URL. - -The field contains @samp{PENDING} if neither file exists. The client -should ask again after 10 seconds or later. - -@node Requesting a backtrace -@section Requesting a backtrace - -A client might request a backtrace by sending a HTTP GET request to the -@indicateurl{https://someserver/@var{id}/backtrace} URL, where @var{id} -is the numerical task id returned in the @var{X-Task-Id} field by -@indicateurl{https://someserver/create}. If the @var{id} is not in the -valid format, or the task @var{id} does not exist, the server must -return the "404 Not Found" HTTP error code. - -The client request must contain the @var{X-Task-Password} field, and its -content must match the password stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password -is not valid, the server must return the "403 Forbidden" HTTP error -code. - -If the file @file{/var/spool/abrt-retrace/@var{id}/backtrace} does not -exist, the server must return the "404 Not Found" HTTP error code. -Otherwise it returns the file contents, and the @var{Content-Type} field -must contain @samp{text/plain}. - -@node Requesting a log -@section Requesting a log - -A client might request a task log by sending a HTTP GET request to the -@indicateurl{https://someserver/@var{id}/log} URL, where @var{id} is the -numerical task id returned in the @var{X-Task-Id} field by -@indicateurl{https://someserver/create}. If the @var{id} is not in the -valid format, or the task @var{id} does not exist, the server must -return the "404 Not Found" HTTP error code. - -The client request must contain the @var{X-Task-Password} field, and its -content must match the password stored in the -@file{/var/spool/abrt-retrace/@var{id}/password} file. If the password is -not valid, the server must return the "403 Forbidden" HTTP error code. - -If the file @file{/var/spool/abrt-retrace/@var{id}/retrace-log} does not -exist, the server must return the "404 Not Found" HTTP error code. -Otherwise it returns the file contents, and the "Content-Type" must -contain "text/plain". 
- -@node Task cleanup -@section Task cleanup - -Tasks that were created more than 5 days ago must be deleted, because -tasks occupy disk space (not so much space, as the coredumps are deleted -after the retrace, and only backtraces and configuration remain). A -shell script @command{abrt-retrace-clean} must check the creation time -and delete the directories in @file{/var/spool/abrt-retrace/}. It is -supposed that the server administrator sets @command{cron} to call the -script once a day. This assumption must be mentioned in the -@command{abrt-retrace-clean} manual page. - -The database containing packages and processing times should also -be regularly pruned to remain small and provide data quickly. The -cleanup script should delete some rows for packages with too many -entries: -@enumerate -@item -get a list of packages from the database: @code{SELECT DISTINCT -package, release FROM package_time} -@item -for every package, get the row count: @code{SELECT COUNT(*) FROM -package_time WHERE package == '??' AND release == '??'} -@item -for every package with the row count larger than 100, some rows -most be removed so that only the newest 100 rows remain in the -database: -@itemize - @item - to get highest row id which should be deleted, - execute @code{SELECT id FROM package_time WHERE package == '??' AND - release == '??' ORDER BY id LIMIT 1 OFFSET ??}, where the - @code{OFFSET} is the total number of rows for that single - package minus 100 - @item - then all the old rows can be deleted by executing @code{DELETE - FROM package_time WHERE package == '??' AND release == '??' AND id - <= ??} -@end itemize -@end enumerate - -@node Limiting traffic -@section Limiting traffic - -The maximum number of simultaneously running tasks must be limited -to 20 by the server. The limit must be changeable from the server -configuration file. If a new request comes when the server is fully -occupied, the server must return the "503 Service Unavailable" HTTP -error code. - -The archive extraction, chroot preparation, and gdb analysis is -mostly limited by the hard drive size and speed. - -@node Retrace worker -@chapter Retrace worker - -The worker (@command{abrt-retrace-worker} binary) gets a -@file{/var/spool/abrt-retrace/@var{id}} directory as an input. The worker -reads the operating system name and version, the coredump, and the list -of packages needed for retracing (a package containing the binary which -crashed, and packages with the libraries that are used by the binary). - -The worker prepares a new @file{chroot} subdirectory with the packages, -their debuginfo, and gdb installed. In other words, a new directory -@file{/var/spool/abrt-retrace/@var{id}/chroot} is created and -the packages are unpacked or installed into this directory, so for -example the gdb ends up as -@file{/var/.../@var{id}/chroot/usr/bin/gdb}. - -After the @file{chroot} subdirectory is prepared, the worker moves the -coredump there and changes root (using the chroot system function) of a -child script there. The child script runs the gdb on the coredump, and -the gdb sees the corresponding crashy binary, all the debuginfo and all -the proper versions of libraries on right places. - -When the gdb run is finished, the worker copies the resulting backtrace -to the @file{/var/spool/abrt-retrace/@var{id}/backtrace} file and stores a -log from the whole chroot process to the @file{retrace-log} file in the -same directory. Then it removes the @file{chroot} directory. 
- -The GDB installed into the chroot must: -@itemize -@item -run on the server (same architecture, or we can use -@uref{http://wiki.qemu.org/download/qemu-doc.html#QEMU-User-space-emulator, QEMU -user space emulation}) -@item -process the coredump (possibly from another architecture): that -means we need a special GDB for every supported architecture -@item -be able to handle coredumps created in an environment with prelink -enabled -(@uref{http://sourceware.org/ml/gdb/2009-05/msg00175.html, should -not} be a problem) -@item -use libc, zlib, readline, ncurses, expat and Python packages, -while the version numbers required by the coredump might be different -from what is required by the GDB -@end itemize - -The gdb might fail to run with certain combinations of package -dependencies. Nevertheless, we need to provide the libc/Python/* -package versions which are required by the coredump. If we would not -do that, the backtraces generated from such an environment would be of -lower quality. Consider a coredump which was caused by a crash of -Python application on a client, and which we analyze on the retrace -server with completely different version of Python because the -client's Python version is not compatible with our GDB. - -We can solve the issue by installing the GDB package dependencies first, -move their binaries to some safe place (@file{/lib/gdb} in the chroot), -and create the @file{/etc/ld.so.preload} file pointing to that place, or -set @env{LD_LIBRARY_PATH}. Then we can unpack libc binaries and -other packages and their versions as required by the coredump to the -common paths, and the GDB would run happily, using the libraries from -@file{/lib/gdb} and not those from @file{/lib} and @file{/usr/lib}. This -approach can use standard GDB builds with various target architectures: -gdb, gdb-i386, gdb-ppc64, gdb-s390 (nonexistent in Fedora/EPEL at the -time of writing this). - -The GDB and its dependencies are stored separately from the packages -used as data for coredump processing. A single combination of GDB and -its dependencies can be used across all supported OS to generate -backtraces. - -The retrace worker must be able to prepare a chroot-ready environment -for certain supported operating system, which is different from the -retrace server's operating system. It needs to fake the @file{/dev} -directory and create some basic files in @file{/etc} like @file{passwd} -and @file{hosts}. We can use the @uref{https://fedorahosted.org/mock/, -mock} library to do that, as it does almost what we need (but not -exactly as it has a strong focus on preparing the environment for -rpmbuild and running it), or we can come up with our own solution, while -stealing some code from the mock library. The @file{/usr/bin/mock} -executable is entirely unuseful for the retrace server, but the -underlying Python library can be used. So if would like to use mock, an -ABRT-specific interface to the mock library must be written or the -retrace worker must be written in Python and use the mock Python library -directly. - -We should save some time and disk space by extracting only binaries -and dynamic libraries from the packages for the coredump analysis, and -omit all other files. We can save even more time and disk space by -extracting only the libraries and binaries really referenced by the -coredump (eu-unstrip tells us). Packages should not be -@emph{installed} to the chroot, they should be @emph{extracted} -only, because we use them as a data source, and we never run them. 

Another idea to be considered is that we could avoid the package
extraction if we taught GDB to read the dynamic libraries, the
binary, and the debuginfo directly from the RPM packages. We would
provide a backend to GDB which can do that, and provide a tiny
front-end program which tells the backend which RPMs it should use
and then runs the GDB command loop. The result would be a GDB
wrapper/extension we need to maintain, but it should end up pretty
small. We would use Python to write our extension, as we do not want
to (inelegantly) maintain a patch against the GDB core. We need to
ask the GDB developers whether the Python interface is capable of
handling this idea, and how much work it would be to implement it.

@node Package repository
@chapter Package repository

We should support every Fedora release with all packages that ever
made it to the updates and updates-testing repositories. In order to
provide all those packages, a local repository is maintained for
every supported operating system. The debuginfos might be provided by
a debuginfo server in the future (that would save disk space on the
server). We should support the usage of local debuginfo first, and
add the debuginfofs support later.

A repository with Fedora packages must be maintained locally on the
server to provide good performance and to provide data from older
packages already removed from the official repositories. We need a
package downloader, which scans Fedora servers for new packages and
downloads them so they are immediately available.

Older versions of packages are regularly deleted from the updates and
updates-testing repositories. We must support older versions of
packages, because that is one of the two major pain points that the
retrace server is supposed to solve (the other one is the slowness of
debuginfo download and the debuginfo disk space requirements).

A script, @command{abrt-reposync}, must download packages from Fedora
repositories, but it must not delete older versions of the
packages. The retrace server administrator is supposed to call this
script from cron every ~6 hours. This expectation must be documented
in the abrt-reposync manual page. The script can use wget, rsync, or
the reposync tool to get the packages. The remote yum source
repositories must be configured in a configuration file or files
(@file{/etc/yum.repos.d} might be used).

When abrt-reposync is used to sync with the Rawhide repository,
unneeded packages (those where a newer version exists) must be
removed after residing for one week alongside the newer package in
the same repository.

All the unneeded content from the newly downloaded packages should be
removed to save disk space and speed up chroot creation. We need just
the binaries and dynamic libraries, and that is a tiny part of the
package contents.

The packages should be downloaded to a local repository in
@file{/var/cache/abrt-repo/@{fedora12,fedora12-debuginfo,...@}}.

@node Traffic and load estimation
@chapter Traffic and load estimation

2500 bugs are reported from ABRT every month. Approximately 7.3% of
them are Python exceptions, which don't need a retrace server. That
means that 2315 bugs need a retrace server. That is 77 bugs per day,
or 3.3 bugs every hour on average. Occasional spikes might be much
higher (imagine a user who decided to report all of his 8 crashes
from last month).

We should probably not try to predict whether the monthly bug count
goes up or down.
New, untested versions of software are added to Fedora, but on the
other hand most software matures and becomes less crashy. So let's
assume that the bug count stays approximately the same.

Test crashes (the numbers below suggest that we should use
@code{xz -2} to compress coredumps):
@itemize
@item
firefox with 7 tabs (random pages opened), coredump size 172 MB
@itemize
@item
xz compression
@itemize
@item
compression level 6 (default): compression took 32.5 sec, compressed
size 5.4 MB, decompression took 2.7 sec
@item
compression level 3: compression took 23.4 sec, compressed size 5.6 MB,
decompression took 1.6 sec
@item
compression level 2: compression took 6.8 sec, compressed size 6.1 MB,
decompression took 3.7 sec
@item
compression level 1: compression took 5.1 sec, compressed size 6.4 MB,
decompression took 2.4 sec
@end itemize
@item
gzip compression
@itemize
@item
compression level 9 (highest): compression took 7.6 sec, compressed size
7.9 MB, decompression took 1.5 sec
@item
compression level 6 (default): compression took 2.6 sec, compressed size
8 MB, decompression took 2.3 sec
@item
compression level 3: compression took 1.7 sec, compressed size 8.9 MB,
decompression took 1.7 sec
@end itemize
@end itemize
@item
thunderbird with thousands of emails opened, coredump size 218 MB
@itemize
@item
xz compression
@itemize
@item
compression level 6 (default): compression took 60 sec, compressed size
12 MB, decompression took 3.6 sec
@item
compression level 3: compression took 42 sec, compressed size 13 MB,
decompression took 3.0 sec
@item
compression level 2: compression took 10 sec, compressed size 14 MB,
decompression took 3.0 sec
@item
compression level 1: compression took 8.3 sec, compressed size 15 MB,
decompression took 3.2 sec
@end itemize
@item
gzip compression
@itemize
@item
compression level 9 (highest): compression took 14.9 sec, compressed
size 18 MB, decompression took 2.4 sec
@item
compression level 6 (default): compression took 4.4 sec, compressed size
18 MB, decompression took 2.2 sec
@item
compression level 3: compression took 2.7 sec, compressed size 20 MB,
decompression took 3 sec
@end itemize
@end itemize
@item
evince with 2 PDFs (1 and 42 pages) opened, coredump size 73 MB
@itemize
@item
xz compression
@itemize
@item
compression level 2: compression took 2.9 sec, compressed size 3.6 MB,
decompression took 0.7 sec
@item
compression level 1: compression took 2.5 sec, compressed size 3.9 MB,
decompression took 0.7 sec
@end itemize
@end itemize
@item
OpenOffice.org Impress with a 25-page presentation, coredump size 116 MB
@itemize
@item
xz compression
@itemize
@item
compression level 2: compression took 7.1 sec, compressed size 12 MB,
decompression took 2.3 sec
@end itemize
@end itemize
@end itemize

So let's imagine there are some users who want to report their
crashes at approximately the same time. Here is what the retrace
server must handle:
@enumerate
@item
2 OpenOffice crashes
@item
2 evince crashes
@item
2 thunderbird crashes
@item
2 firefox crashes
@end enumerate

We will use the xz archiver with compression level 2 on the ABRT side
to compress the coredumps. So the users spend 53.6 seconds in total
packaging the coredumps (twice the sum of the level 2 compression
times above: 2 x (7.1 + 2.9 + 10 + 6.8) sec).

The packaged coredumps take up 71.4 MB (twice the sum of the level 2
compressed sizes: 2 x (12 + 3.6 + 14 + 6.1) MB), and the retrace
server must receive that data.

The server unpacks the coredumps (perhaps at the same time), so they
need 1158 MB of disk space on the server.
The decompression will take 19.4 seconds.

Several hundred megabytes will be needed to install all the required
packages and debuginfos for every chroot (8 chroots at 1 GB each
= 8 GB, but this seems like an extreme, maximal case). Some space
will be saved by using a debuginfofs.

Note that most applications are not as heavyweight as OpenOffice and
Firefox.

@node Security
@chapter Security

The retrace server communicates with two other entities: it accepts
coredumps from users, and it downloads debuginfos and packages from
distribution repositories.

@menu
* Clients::
* Packages and debuginfo::
@end menu

General protection from GDB flaws and malicious data is provided by
the chroot. The GDB accesses the debuginfos, packages, and the
coredump from within the chroot, unable to access the retrace
server's environment. We should consider setting a disk quota for
every chroot directory, and limiting GDB's access to resources using
cgroups.

An SELinux policy should be written for both the retrace server's
HTTP interface and the retrace worker.

@node Clients
@section Clients

The clients, which are using the retrace server and sending coredumps
to it, must fully trust the retrace server administrator. The server
administrator must not try to get sensitive data from client
coredumps. That seems to be a major weak point of the retrace server
idea. However, users of an operating system already trust the OS
provider in various important matters. So when the retrace server is
operated by the operating system provider, that might be acceptable
to users.

We cannot avoid sending clients' coredumps to the retrace server if
we want to generate quality backtraces containing the values of
variables. Minidumps are not an acceptable solution, as they lower
the quality of the resulting backtraces, while not improving user
security.

Can the retrace server trust clients? We must know what a malicious
client can achieve by crafting a nonstandard coredump which will be
processed by the server's GDB. We should ask GDB experts about this.

Another question is whether we can allow users to provide some
packages and debuginfo together with a coredump. That might be useful
for users who run the operating system with only some minor
modifications and still want to use the retrace server. So they send
a coredump together with a few nonstandard packages, and the retrace
server uses the nonstandard packages together with the OS packages to
generate the backtrace. Is it safe? We must know what a malicious
client can achieve by crafting a special binary and debuginfo which
will be processed by the server's GDB.

@node Packages and debuginfo
@section Packages and debuginfo

We can safely download packages and debuginfo from the distribution,
as the packages are signed by the distribution, and the package
origin can be verified.

When the debuginfo server is done, the retrace server can safely use
it, as the data will also be signed.

@node Future work
@chapter Future work

1. Coredump stripping. Jan Kratochvil: in my test with an
OpenOffice.org presentation, the kernel core file has 181 MB, and
@code{xz -2} of it has 65 MB. According to @code{set target debug 1},
GDB reads only 131406 bytes of it (incl. the NOTE segment).

2. Use gdbserver instead of uploading the whole coredump. GDB's
gdbserver cannot process coredumps, but Jan Kratochvil's can:
@example
  git://git.fedorahosted.org/git/elfutils.git
  branch: jankratochvil/gdbserver
  src/gdbserver.c
   * Currently threading is not supported.
   * Currently only x86_64 is supported (the NOTE registers layout).
@end example

3. User management for the HTTP interface. We need multiple
authentication sources (x509 for RHEL).

4. Make the @file{architecture}, @file{release}, and @file{packages}
files, which must currently be included in the uploaded archive when
creating a task, optional. Allow uploading a coredump without
involving tar: just coredump, coredump.gz, or coredump.xz.

5. Handle non-standard packages (provided by the user).

@bye