author: Rainer Gerhards <rgerhards@adiscon.com> 2008-01-28 10:13:29 +0000
committer: Rainer Gerhards <rgerhards@adiscon.com> 2008-01-28 10:13:29 +0000
commit: 28f30ca25beb1c7dc8f7d642648856ef5ad0218d
tree: d704a8c81f15b506ef8791c9cbe08b2b855a3ad8 /doc/rsyslog_high_database_rate.html
parent: 1fe9d38f8490430dcb7917c3f760061fe4d78d62
added paper "writing massive amounts of syslog messages to a database"
Diffstat (limited to 'doc/rsyslog_high_database_rate.html')
-rw-r--r-- doc/rsyslog_high_database_rate.html | 128
1 files changed, 128 insertions, 0 deletions
diff --git a/doc/rsyslog_high_database_rate.html b/doc/rsyslog_high_database_rate.html
new file mode 100644
index 00000000..213050e6
--- /dev/null
+++ b/doc/rsyslog_high_database_rate.html
@@ -0,0 +1,128 @@
+<html><head>
+
+<title>Handling a massive syslog database insert rate with Rsyslog</title>
+
+<meta name="KEYWORDS" content="syslog, rsyslog, reliable, howto, database, postgresql, mysql, buffering, disk, queue">
+
+</head>
+
+<body>
+
+<h1>Handling a massive syslog database insert rate with Rsyslog</h1>
+
+ <P><small><i>Written by
+
+ <a href="http://www.adiscon.com/en/people/rainer-gerhards.php">Rainer
+
+ Gerhards</a> (2008-01-28)</i></small></P>
+
+<h2>Abstract</h2>
+
+<p><i><b>In this paper, I describe how to log massive amounts of
+<a href="http://www.monitorware.com/en/topics/syslog/">syslog</a>
+
+messages to a database. </b>This HOWTO is currently under development and thus a
+bit brief. Updates are promised ;).</i></p>
+
+<h2>The Intention</h2>
+
+<p>Database updates are inherently slow when it comes to storing syslog
+messages. However, there are a number of applications where it is handy to have
+the message inside a database. Rsyslog supports native database writing via
+output plugins. As of this writing, there are plugins available for MySQL and
+PostgreSQL. Maybe additional plugins have become available by the time you read
+this. Be sure to check.</p>
+<p>In order to successfully write messages to a database backend, the backend
+must be capable of recording messages at the expected average arrival rate. This
+is the number of messages that can arrive within a day, divided by 86,400 (the
+number of seconds per day). Let's say you expect 43,200,000 messages
+per day. That's an average rate of 500 messages per second (mps). Your database
+server MUST be able to handle that number of messages per second at a sustained
+rate. If it can't, you either need to add an additional server, lower the
+number of messages - or forget about it.</p>
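+<p>The arithmetic above can be sketched in a few lines of Python. This is only
+an illustration; the function name <code>average_rate_mps</code> is hypothetical
+and not part of rsyslog.</p>

```python
SECONDS_PER_DAY = 86_400


def average_rate_mps(messages_per_day: int) -> float:
    """Average arrival rate in messages per second (mps) over a full day."""
    return messages_per_day / SECONDS_PER_DAY


# The example from the text: 43,200,000 messages per day.
print(average_rate_mps(43_200_000))  # 500.0
```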
+<p>However, this is probably not your peak rate. Let's simply assume your
+systems work only half a day, that's 12 hours (and, yes, I know this is
+unrealistic, but you'll get the point soon). So your average rate is actually
+1,000 mps during work hours and 0 mps during non-work hours. To make matters
+worse, workload is not divided evenly during the day. So you may have peaks of
+up to 10,000mps while at other times the load may go down to maybe just 100mps.
+Peaks may stay well above 2,000mps for a few minutes.</p>
+<p>So how the heck will you be able to handle all of this traffic (including the
+peaks) with a database server that is just capable of inserting a maximum of
+500mps?</p>
+<p>The key here is buffering. Messages that the database server cannot yet
+handle will be buffered until it can. Of course, that means database inserts
+are NOT real-time. If you need real-time inserts, you need to make sure your
+database server can handle traffic at the actual peak rate. But let's assume you
+are OK with some delay.</p>
+<p>Buffering is fine. But how about these massive amounts of data? That can't be
+held in memory, so aren't we out of luck with buffering? The key here is that
+rsyslog can not only buffer in memory but also buffer to disk (this may remind
+you of &quot;spooling&quot;, which gives you the right idea). There are several queuing
+modes available, offering different throughput. In general, the idea is to
+buffer in memory until the memory buffer is exhausted and switch to
+disk-buffering when needed (and only as long as needed). All of this is handled
+automatically and transparently by rsyslog.</p>
+<p>With our above scenario, the disk buffer would build up during the day and
+rsyslog would use the night to drain it. Obviously, this is an extreme example,
+but it shows what can be done. Please note that queue content survives rsyslogd
+restarts, so even a reboot of the system will not cause any message loss.</p>
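+<p>To get a feeling for how large such a disk buffer may become, the scenario
+can be modeled roughly as below. This is a simplified sketch, not rsyslog code:
+the function <code>simulate_backlog</code> is hypothetical, it works at hourly
+granularity, and it assumes a steady 1,000 mps during 12 work hours and a
+database that sustains 500 mps.</p>

```python
def simulate_backlog(arrivals_per_hour, drain_mps):
    """Return the queue backlog (in messages) at the end of each hour.

    arrivals_per_hour: average arrival rate (mps) for each hour.
    drain_mps: sustained insert rate the database can handle (mps).
    """
    backlog = 0
    history = []
    for arrival_mps in arrivals_per_hour:
        # Net change over one hour; the backlog can never go negative.
        backlog = max(0, backlog + (arrival_mps - drain_mps) * 3600)
        history.append(backlog)
    return history


# Assumed workload from the text: 1,000 mps for 12 work hours, then 0 mps.
day = [1000] * 12 + [0] * 12
history = simulate_backlog(day, drain_mps=500)
print(max(history))  # peak backlog, reached at the end of the work period
print(history[-1])   # backlog left after the night
```

+<p>Under these assumptions the backlog peaks at 21,600,000 messages and is fully
+drained by the end of the night, so the spool directory needs enough disk space
+to hold that peak.</p>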
+<h2>How To Set Up</h2>
+<p>Frankly, it's quite easy. All you need to do is instruct rsyslog to use a
+disk queue and then configure your action. There is nothing else to do. With the
+following simple config file, you log anything you receive to a MySQL database
+and have buffering applied automatically.</p>
+<textarea rows="11" cols="80">
+$ModLoad ommysql.so # load the output driver (use ompgsql.so for PostgreSQL)
+$ModLoad imudp.so # network reception
+$ModLoad imuxsock.so # local message reception
+$MainMessage
+
+*.* :ommysql:
+</textarea>
+<p>&nbsp;</p>
+
+<h3>Feedback requested</h3>
+
+<P>I would appreciate feedback on this tutorial. If you have additional ideas,
+
+comments or find bugs (I *do* bugs - no way... ;)), please
+
+<a href="mailto:rgerhards@adiscon.com">let me know</a>.</P>
+
+<h2>Revision History</h2>
+
+<ul>
+
+ <li>2008-01-28 *
+
+ <a href="http://www.adiscon.com/en/people/rainer-gerhards.php">Rainer Gerhards</a> * Initial Version created</li>
+
+</ul>
+<h2>Copyright</h2>
+
+<p>Copyright (c) 2008
+
+<a href="http://www.adiscon.com/en/people/rainer-gerhards.php">Rainer Gerhards</a> and
+
+<a href="http://www.adiscon.com/en/">Adiscon</a>.</p>
+
+<p> Permission is granted to copy, distribute and/or modify this document
+
+ under the terms of the GNU Free Documentation License, Version 1.2
+
+ or any later version published by the Free Software Foundation;
+
+ with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+
+ Texts. A copy of the license can be viewed at
+
+<a href="http://www.gnu.org/copyleft/fdl.html">
+
+http://www.gnu.org/copyleft/fdl.html</a>.</p>
+
+
+
+</body>
+
+</html>
\ No newline at end of file