From acda58b561b92d21df685d03cc703b5792d9d72b Mon Sep 17 00:00:00 2001
From: Rainer Gerhards
Date: Thu, 28 Jan 2010 12:28:35 +0100
Subject: added some information on how to help troubleshoot rsyslog
---
 doc/debug.html | 2 +-
 doc/troubleshoot.html | 54 +++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/doc/debug.html b/doc/debug.html
index 46759986..6aeb7975 100644
--- a/doc/debug.html
+++ b/doc/debug.html
@@ -138,7 +138,7 @@
instance of rsyslogd can be aborted by pressing ctl-c.

[manual index] [rsyslog site]

This documentation is part of the rsyslog project.
-Copyright © 2008, 2009 by Rainer Gerhards and
+Copyright © 2008-2010 by Rainer Gerhards and
Adiscon. Released under the GNU GPL version 3 or higher.

diff --git a/doc/troubleshoot.html b/doc/troubleshoot.html
index a8855fd4..16b2754b 100644
--- a/doc/troubleshoot.html
+++ b/doc/troubleshoot.html
@@ -102,13 +102,63 @@
comes without any guarantees, including no guarantee on confidentiality [aka "we don't want to be sued for work we are not even paid for" ;)]. So if you submit debug logs, do so at your sole risk. By submitting them, you accept this policy. +

Segmentation Faults +

Rsyslog has a very rapid development process and complex capabilities, and it is gradually getting +more and more exposure. While we are happy about this, it also has some bad effects: some +deployment scenarios have probably never been tested, and the development team may be unable to test +them because of the resources needed. So while we try to avoid this, +you may see a serious problem during deployments in demanding, non-standard environments +(hopefully not with a stable version, but chances are good you'll run into trouble with +the development versions). +

Active support from the user base is very important to help us track down those things. +Most often, serious problems are the result of some memory misaddressing. During development, +we routinely use valgrind, a very good and capable memory debugger. This helps us to create +pretty clean code. But valgrind cannot detect everything, most importantly not problems in code paths +that are never executed during testing. So of most use for us is information about aborts and abort locations. +

Unfortunately, faults rooted in addressing errors typically show up only later, so the +actual abort location is in an unrelated spot. To help track down the original spot, +libc +later than 5.4.23 offers support for finding such errors, and possibly temporary relief from them, +by means of the MALLOC_CHECK_ environment variable. Setting it to 2 is a useful troubleshooting +aid for us. It will make the program abort as soon as the check routines detect anything +suspicious (unfortunately, this may still not be the root cause, but hopefully closer to it). +Setting it to 0 may even make some problems disappear (but it will NOT fix them!). +With functionality comes cost, and so exporting MALLOC_CHECK_ without need comes at +a performance penalty. However, we strongly recommend adding this instrumentation to your +test environment should you see any serious problems. Chances are good it will help us +interpret a dump better, and thus be able to craft a fix more quickly. +
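As a minimal sketch of the MALLOC_CHECK_ setup described above, the variable can be set for a single command rather than exported globally, which limits the performance penalty to the process under test. The helper name `run_with_malloc_check`, the rsyslogd path, and the use of `-n` to keep rsyslogd in the foreground are assumptions for illustration; adjust them to your installation.

```shell
# Assumed location of the binary; override by setting RSYSLOGD beforehand.
RSYSLOGD=${RSYSLOGD:-/usr/sbin/rsyslogd}

# Hypothetical helper: run the given command with MALLOC_CHECK_=2, so the
# malloc check routines abort the program as soon as they detect a problem.
# The variable is set for that one command only, not for the whole shell.
run_with_malloc_check() {
    MALLOC_CHECK_=2 "$@"
}

# Usage on a test system (not executed here):
#   run_with_malloc_check "$RSYSLOGD" -n
```

Setting the variable per command, instead of exporting it in a login profile, avoids slowing down unrelated programs started from the same shell.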

In order to get useful information, we need a backtrace of the abort. First, you need +to make sure that a core file is created. Under Fedora, for example, that means you need +to have a "ulimit -c unlimited" in place. +
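The core file limit applies to the shell where it is set and to the processes started from it, so it has to be in place before rsyslogd is launched. A small sketch of setting and checking it follows; the function name `enable_core_dumps` is made up for this example, and whether the limit can actually be raised depends on the hard limit configured on your system.

```shell
# Hypothetical helper: try to lift the core size limit for this shell,
# then print the soft limit that is now in effect.
enable_core_dumps() {
    # Raising the soft limit fails if the hard limit is lower;
    # report that instead of silently continuing.
    if ! ulimit -c unlimited 2>/dev/null; then
        echo "could not raise core size limit; hard limit is $(ulimit -H -c)" >&2
    fi
    ulimit -c
}

enable_core_dumps
```

If the printed value is 0, no core file will be written and the limit needs to be changed elsewhere first, for example in the script that starts rsyslogd.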

Now let's assume you got a core file (e.g. in /core.1234). So what to do next? Sending a +core file to us is most often pointless - we need to have the exact same system configuration in +order to interpret it correctly. Obviously, chances are extremely slim for this to be the case. So we would +appreciate it if you could extract the most important information. This is done as follows: +

+
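The original command list is not reproduced in this text. As a rough sketch only, and not necessarily the exact steps the authors listed, a backtrace for every thread can usually be pulled out of a core file with gdb in batch mode. The function name `collect_backtraces` and the rsyslogd path are assumptions; /core.1234 is the example file name used above.

```shell
# Assumed paths; RSYSLOGD must be the same binary that produced the core file.
RSYSLOGD=${RSYSLOGD:-/usr/sbin/rsyslogd}
CORE=${CORE:-/core.1234}

# Hypothetical helper: list all threads, then print a backtrace for each.
#   $1 = path to the rsyslogd binary, $2 = path to the core file
collect_backtraces() {
    gdb -batch -ex "info threads" -ex "thread apply all bt" "$1" "$2"
}

# Usage (not executed here): save everything gdb prints to a file.
#   collect_backtraces "$RSYSLOGD" "$CORE" > rsyslog-backtrace.txt 2>&1
```

The same commands ("info threads", "thread apply all bt") can also be typed at an interactive gdb prompt after starting gdb with the binary and the core file as arguments.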

Then please send all information that gdb spit out to the development team. It is best to first +ask on the forum or mailing list how to do that. The developers will keep in contact with you +and, I fear, will probably ask for other things as well ;) +

Note that we strive for highest reliability of the engine even in unusual deployment scenarios. +Unfortunately, this is hard to achieve, especially with limited resources. So we are depending on +cooperation from users. This is your chance to make a big contribution to the project without the +need to program or do anything else except get a problem solved ;)

[manual index] [rsyslog site]

This documentation is part of the rsyslog project.
-Copyright © 2008 by Rainer Gerhards and
+Copyright © 2008-2010 by Rainer Gerhards and
Adiscon. Released under the GNU GPL
-version 2 or higher.

+version 3 or higher.

-- cgit