1 files changed, 331 insertions, 0 deletions
diff --git a/doc/dev_oplugins.html b/doc/dev_oplugins.html
new file mode 100644
index 00000000..63c186a3
--- /dev/null
+++ b/doc/dev_oplugins.html
@@ -0,0 +1,331 @@
+<html>
+<head>
+<title>writing rsyslog output plugins (developer's guide)</title>
+</head>
+<body>
+<h1>Writing Rsyslog Output Plugins</h1>
+<p>This page is the beginning of some developer documentation for writing output
+plugins. Doing so is quite easy (and that was a design goal), but there is currently
+only sparse documentation available on the process. I was tempted NOT to
+write this guide because I know I will most probably not be able to
+write a complete one.
+<p>However, I finally concluded that it may be better to have some information
+and pointers than to have nothing.
+<h2>Getting Started and Samples</h2>
+<p>The best way to get started with rsyslog plugin development is by looking at
+existing plugins. All that start with "om" are <b>o</b>utput <b>m</b>odules. That
+means they are primarily meant to be message sinks. In theory, however, output
+plugins may aggregate other functionality, too. Nobody has taken this route so far,
+so if you would like to do that, it is highly suggested to post your plan on the
+rsyslog mailing list first (so that we can offer advice).
+<p>The rsyslog distribution tarball contains two plugins that are extremely well
+suited for getting started:
+<ul>
+<li>omtemplate
+<li>omstdout
+</ul>
+Plugin omtemplate was specifically created to provide a copy template for new output
+plugins. It is bare of real functionality but has ample comments. Even if you decide
+to start from another plugin (or even from scratch), be sure to read the omtemplate source
+and comments first. The omstdout plugin is primarily a testing aid, but offers support for
+the two different parameter-passing conventions plugins can use (plus the way to
+differentiate between the two). It also is not bare of functionality, only mostly
+bare of it ;). But you can actually execute it and play with it.
+<p>In any case, you should also read the comments in ./runtime/module-template.h. 
+Output plugins are built on a large set of code-generating macros. These
+macros handle most of the plumbing needed by the interface. As long as no
+special callback to rsyslog is needed (it typically is not), an output plugin does
+not really need to be aware that it is executed by rsyslog. As a plug-in programmer,
+you can (in most cases) "code as usual". However, all macros and entry points need to be
+provided and thus reading the code comments in the files mentioned is highly suggested.
+<p>In short, the best idea is to start with a template. Let's assume you start by
+copying omtemplate. Then, the basic steps you need to do are:
+<ul>
+<li>cp -r ./plugins/omtemplate ./plugins/your-plugin
+<li>cd ./plugins/your-plugin
+<li>vi Makefile.am, adjust to your-plugin
+<li>mv omtemplate.c your-plugin.c
+<li>cd ../..
+<li>vi Makefile.am configure.ac
+<br>search for omtemplate, copy and modify (follow comments)
+</ul>
+<p>Basically, this is all you need to do ... Well, except, of course, coding
+your plugin ;). For testing, you need rsyslog's debugging support. Some useful
+information is given in "<a href="troubleshoot.html">troubleshooting rsyslog</a>"
+from the doc set.
+<h2>Special Topics</h2>
+<h3>Threading</h3>
+<p>Rsyslog uses massively parallel processing and multithreading. However, a plugin's entry
+points are guaranteed never to be called concurrently <b>for the same action</b>.
+That means your plugin must be able to be called concurrently by two or more 
+threads, but you can be sure that for the same instance no concurrent calls
+happen. This is guaranteed by the interface specification and the rsyslog core
+guards against multiple concurrent calls. An instance, in simple words, is one
+that shares a single instanceData structure.
+<p>So as long as you do not mess around with global data, you do not need
+to think about multithreading (and can apply a purely sequential programming
+methodology).
+<p>Please note that during the configuration parsing stage of execution, access to
+global variables for the configuration system is safe. In that stage, the core will
+only call sequentially into the plugin.
+<h3>Getting Message Data</h3>
+<p>The doAction() entry point of your plugin is provided with messages to be processed.
+It will only be activated after filtering and all other conditions, so you do not need
+to apply any other conditional but can simply process the message.
+<p>Note that you do NOT receive the full internal representation of the message
+object. There are various (including historical) reasons for this and, among
+others, this is a design decision based on security.
+<p>Your plugin will only receive what the end user has configured in a $template
+statement. However, starting with 4.1.6, there are two ways of receiving the
+template content. The default mode, and in most cases sufficient and optimal,
+is to receive a single string with the expanded template. This is usually
+optimal: think about writing things to files, emailing content or forwarding it.
+<p>The important philosophy is that a plugin should <b>never</b> reformat any
+of such strings - that would either remove the user's ability to fully control 
+message formats or it would lead to duplicating code that is already present in the
+core. If you need some formatting that is not yet present in the core, suggest it
+to the rsyslog project, best done by sending a patch ;), and we will try hard to
+get it into the core (so far, we could accept all such suggestions - no promise, though).
+<p>If a single string does not seem suitable for your application, the plugin can also
+request access to the template components. The typical use case seems to be databases, where
+you would like to access properties via specific fields. With that mode, you receive a
+char ** array, where each array element points to one field from the template (from
+left to right). Fields start at array index 0 and a NULL pointer means you have
+reached the end of the array (the typical Unix "poor man's linked list in an array"
+design). Note, however, that each of the individual components is a string. It is
+not a date stamp, number or whatever, but a string. This is because rsyslog processes
+strings (from a high-level design point of view) and so this is the natural data type.
+Feel free to convert to whatever you need, but keep in mind that malformed packets
+may have led to field contents you'd never expect...
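+<p>As a rough illustration of this array convention, here is a small standalone C
+sketch (not rsyslog code) that walks a NULL-terminated array of strings, which is how
+a plugin could iterate over the template fields. The helper name
+<code>printFields()</code> is made up for this example; how the array actually
+reaches a plugin is best looked up in omstdout.
+<pre><code>
+#include &lt;stddef.h&gt;
+#include &lt;stdio.h&gt;
+
+/* Hypothetical helper, not part of the rsyslog API: print every field of a
+ * NULL-terminated string array and return the number of fields seen. */
+static int printFields(char **fields)
+{
+	int i;
+	for(i = 0 ; fields[i] != NULL ; ++i)
+		printf("field %d: '%s'\n", i, fields[i]);
+	return i;
+}
+</code></pre>
+<p>Called with an array such as <code>{ "host1", "hello", NULL }</code>, the sketch
+prints two lines and returns 2. Remember that every element is a plain string, so any
+conversion to numbers or dates is up to the plugin.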
+<p>If you would like to use the array-based parameter passing method, keep in mind that it
+is only available in rsyslog 4.1.6 and above. If you can accept that your plugin
+will not work with previous versions, you do not need to handle pre-4.1.6 cases.
+However, it would be "nice" if you shut yourself down in these cases - otherwise the
+older rsyslog core engine will pass you a string where you expect the array of pointers,
+which most probably results in a segfault. To check whether or not the core supports the
+functionality, you can use this code sequence:
+<pre>
+<code>
+BEGINmodInit()
+	rsRetVal localRet;
+	rsRetVal (*pomsrGetSupportedTplOpts)(unsigned long *pOpts);
+	unsigned long opts;
+	int bArrayPassingSupported;		/* does core support template passing as an array? */
+CODESTARTmodInit
+	*ipIFVersProvided = CURR_MOD_IF_VERSION; /* we only support the current interface specification */
+CODEmodInit_QueryRegCFSLineHdlr
+	/* check if the rsyslog core supports parameter passing code */
+	bArrayPassingSupported = 0;
+	localRet = pHostQueryEtryPt((uchar*)"OMSRgetSupportedTplOpts", &pomsrGetSupportedTplOpts);
+	if(localRet == RS_RET_OK) {
+		/* found entry point, so let's see if core supports array passing */
+		CHKiRet((*pomsrGetSupportedTplOpts)(&opts));
+		if(opts & OMSR_TPL_AS_ARRAY)
+			bArrayPassingSupported = 1;
+	} else if(localRet != RS_RET_ENTRY_POINT_NOT_FOUND) {
+		ABORT_FINALIZE(localRet); /* Something else went wrong, which is not acceptable */
+	}
+	DBGPRINTF("omstdout: array-passing is %ssupported by rsyslog core.\n", bArrayPassingSupported ? "" : "not ");
+
+	if(!bArrayPassingSupported) {
+		DBGPRINTF("rsyslog core too old, shutting down this plug-in\n");
+		ABORT_FINALIZE(RS_RET_ERR);
+	}
+
+</code>
+</pre>
+<p>The code first checks if the core supports the OMSRgetSupportedTplOpts() API (which is
+also not present in all versions!) and, if so, queries the core if the OMSR_TPL_AS_ARRAY mode
+is supported. If either does not exist, the core is too old for this functionality. The sample
+snippet above then shuts down, but a plugin may instead just do things differently. In
+omstdout, you can see how a plugin may deal with the situation.
+<p><b>In any case, it is recommended that at least a graceful shutdown is made and the
+array-passing capability not be used blindly.</b> In such cases, we cannot guard the
+plugin from segfaulting and if the plugin (as is currently always the case) is run within
+rsyslog's process space, that results in a segfault for rsyslog. So do not do this.
+<h3>Batching of Messages</h3>
+<p>Starting with rsyslog 4.3.x, batching of output messages is supported. Previously, only
+a single-message interface was supported.
+<p>With the <b>single message</b> plugin interface, each message is passed via a separate call to the plugin.
+Most importantly, the rsyslog engine assumes that each call to the plugin is a complete transaction
+and as such assumes that messages are properly committed after the plugin returns to the engine.
+<p>With the <b>batching</b> interface, rsyslog employs something along the lines of
+&quot;transactions&quot;. Obviously, the rsyslog core cannot make non-transactional outputs
+fully transactional. But what it can do is let the output tell the core which
+messages have been committed by the output and which not yet. The core can then take care
+of those uncommitted messages when problems occur. For example, if a plugin has received
+50 messages but not yet told the core that it committed them, and then returns an error state, the
+core assumes that all these 50 messages were <b>not</b> written to the output. The core then
+requeues all 50 messages and does the usual retry processing. Once the output plugin tells the
+core that it is ready again to accept messages, the rsyslog core will provide it with these 50
+not yet committed messages again (actually, at this point, the rsyslog core no longer knows that
+it is re-submitting the messages). If, on the contrary, the plugin had told rsyslog that 40 of these 50
+messages were committed (before it failed), then only 10 would have been requeued and resubmitted.
+<p>In order to provide an efficient implementation, there are some (mild) constraints in that
+transactional model: first of all, rsyslog itself specifies the ultimate transaction boundaries.
+That is, it tells the plugin when a transaction begins and when it must finish. The plugin
+is free to commit messages in between, but it <b>must</b> commit all work done when the core
+tells it that the transaction ends. All messages passed in between a begin and end transaction
+notification are called a batch of messages. They are passed in one by one, just as without
+transaction support. Note that batch sizes are variable within the range of 1 to a user-configured
+maximum limit. Most importantly, that means that plugins may receive batches of single messages,
+so they are required to commit each message individually. If the plugin tries to be &quot;smarter&quot;
+than the rsyslog engine and does not commit messages in those cases (for example), the plugin
+puts message stream integrity at risk: once rsyslog has notified the plugin of transaction end,
+it discards all messages as it considers them committed and safe. If something goes wrong now,
+the rsyslog core does not try to recover lost messages (and keep in mind that &quot;goes wrong&quot;
+includes such uncontrollable things like connection loss to a database server). So it is
+highly recommended to fully abide by the plugin interface details, even though you may
+think you can do it better. The second reason for that is that the core engine will 
+have configuration settings that enable the user to tune commit rate to their use-case
+specific needs. And, as a relief: why would rsyslog ever decide to use batches of one?
+There is a trivial case and that is when we have very low activity so that no queue of
+messages builds up, in which case it makes sense to commit work as it arrives.
+(As a side-note, there are some valid cases where a timeout-based commit feature makes sense.
+This is also under evaluation and, once decided, the core will offer an interface plus a way
+to preserve message stream integrity for properly-crafted plugins).
+<p>The second restriction is that if a plugin makes commits in between (which is perfectly
+legal) those commits must be in-order. So if a commit is made for message ten out of 50,
+this means that messages one to nine are also committed. It would be possible to remove
+this restriction, but we have decided to deliberately introduce it to simplify things.
+<h3>Output Plugin Transaction Interface</h3>
+<p>In order to keep compatible with existing output plugins (and because it introduces
+no complexity), the transactional plugin interface is built on the traditional
+non-transactional one. Well... actually the traditional interface was transactional
+since its introduction, in the sense that each message was processed in its own
+transaction.
+<p>So the current <code>doAction()</code> entry point can be considered to have this
+structure (from the transactional interface point of view):
+<p><pre><code>
+doAction()
+    {
+    beginTransaction()
+    ProcessMessage()
+    endTransaction()
+    }
+ </code></pre>
+<p>For the <b>transactional interface</b>, we now move these implicit <code>beginTransaction()</code>
+and <code>endTransaction()</code> calls out of the message processing body, resulting in
+this structure:
+<p><pre><code>
+beginTransaction()
+    {
+    /* prepare for transaction */
+    }
+
+doAction()
+    {
+    ProcessMessage()
+    /* maybe do partial commits */
+    }
+
+endTransaction()
+    {
+    /* commit (rest of) batch */
+    }
+</code></pre>
+<p>And this calling structure actually is the transactional interface! It is as simple as this.
+For the new interface, the core calls a <code>beginTransaction()</code> entry point inside the
+plugin at the start of the batch. Similarly, the core calls <code>endTransaction()</code> at the
+end of the batch. The plugin must implement these entry points according to its needs.
+<p>But how does the core know when to use the old or the new calling interface? This is rather
+easy: when loading a plugin, the core queries the plugin for the <code>beginTransaction()</code>
+and <code>endTransaction()</code> entry points. If the plugin supports these, the new interface is
+used. If the plugin does not support them, the old interface is used and rsyslog assumes that
+a commit is done after each message. Note that there is no special "downlevel" handling
+necessary to support this. In the case of the non-transactional interface, rsyslog considers
+each completed call to <code>doAction</code> as a partial commit up to the current message.
+So implementation inside the core is very straightforward.
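+<p>The selection logic can be pictured with the following toy model in plain C. It
+is only a sketch under the assumption that entry points are stored as function pointers
+and that optional ones are NULL when the plugin does not provide them; the type and
+function names are invented here and are not the rsyslog core's actual data structures.
+<pre><code>
+#include &lt;stddef.h&gt;
+
+/* invented for illustration: one entry point taking opaque plugin data */
+typedef int (*toyEntryPoint_t)(void *pData);
+
+struct toyPlugin {
+	toyEntryPoint_t doAction;          /* always provided */
+	toyEntryPoint_t beginTransaction;  /* optional, NULL if not provided */
+	toyEntryPoint_t endTransaction;    /* optional, NULL if not provided */
+};
+
+/* stand-in entry point that does nothing and reports success */
+static int toyNoop(void *pData)
+{
+	(void) pData;
+	return 0;
+}
+
+/* returns 1 if the transactional interface would be used, 0 for the old one */
+static int toyUsesTransactions(const struct toyPlugin *p)
+{
+	if(p->beginTransaction == NULL)
+		return 0;
+	if(p->endTransaction == NULL)
+		return 0;
+	return 1;
+}
+</code></pre>
+<p>In this model, a plugin that only fills in <code>doAction</code> yields 0 (old
+interface, implicit commit after each message), while one that also sets both
+transaction entry points yields 1.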
+<p>Actually, <b>we recommend that the transactional entry points only be defined by those
+plugins that actually need them</b>. All others should not define them in which case
+the default commit behaviour inside rsyslog will apply (thus removing complexity from the
+plugin).
+<p>In order to support partial commits, special return codes must be defined for
+<code>doAction</code>. All those return codes mean that processing completed successfully.
+But they convey additional information about the commit status as follows:
+<p>
+<table border="0">
+<tr>
+<td valign="top"><i>RS_RET_OK</i></td>
+<td>The record and all previous ones inside the batch have been committed.
+<i>Note:</i> this definition is what makes integrating plugins without the
+transaction begin/end calls so easy - this is the traditional "success" return
+state and if every call returns it, there is no need for actually calling
+<code>endTransaction()</code>, because there is no transaction open.</td>
+</tr>
+<tr>
+<td valign="top"><i>RS_RET_DEFER_COMMIT</i></td>
+<td>The record has been processed, but is not yet committed. This is the
+expected state for transaction-aware plugins.</td>
+</tr>
+<tr>
+<td valign="top"><i>RS_RET_PREVIOUS_COMMITTED</i></td>
+<td>The <b>previous</b> record inside the batch has been committed, but the
+current one not yet. This state is introduced to support sources that fill up
+buffers and commit once a buffer is completely filled. That may occur halfway
+in the next record, so it may be important to be able to tell the
+engine that everything up to the previous record is committed.</td>
+</tr>
+</table>
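+<p>To make the meaning of these three states more tangible, the following standalone
+C sketch computes how many leading messages of a batch count as committed, given
+the sequence of <code>doAction</code> results. It is a toy model only: the
+<code>TOY_*</code> constants are hypothetical stand-ins for the return codes in the
+table (the real values live in the rsyslog headers), and the function is not how the
+core is actually implemented.
+<pre><code>
+/* hypothetical stand-ins for RS_RET_OK, RS_RET_DEFER_COMMIT and
+ * RS_RET_PREVIOUS_COMMITTED; NOT the real rsyslog definitions */
+enum toyRet { TOY_OK, TOY_DEFER_COMMIT, TOY_PREVIOUS_COMMITTED };
+
+/* rets[i] is the result of the doAction() call for message i of the batch.
+ * Returns the number of messages, counted from the start of the batch,
+ * that are committed after nCalls calls. */
+static int toyCommittedCount(const enum toyRet *rets, int nCalls)
+{
+	int committed = 0;
+	int i;
+	for(i = 0 ; i != nCalls ; ++i) {
+		if(rets[i] == TOY_OK)
+			committed = i + 1;	/* this message and all before it */
+		else if(rets[i] == TOY_PREVIOUS_COMMITTED)
+			committed = i;		/* everything before this message */
+		/* TOY_DEFER_COMMIT: nothing new is committed */
+	}
+	return committed;
+}
+</code></pre>
+<p>For the sequence DEFER, DEFER, OK, DEFER the function returns 3, so in this model only
+the fourth message would still be uncommitted if an error occurred at that point.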
+<p>Note that the typical <b>calling cycle</b> is <code>beginTransaction()</code>,
+followed by <i>n</i> times
+<code>doAction()</code>, followed by <code>endTransaction()</code>. However, if either
+<code>beginTransaction()</code> or <code>doAction()</code> returns an error state
+(including RS_RET_SUSPENDED), then the transaction is considered aborted. As a result, the
+remaining calls in this cycle (e.g. <code>endTransaction()</code>) are never made and a
+new cycle (starting with <code>beginTransaction()</code>) is begun when processing resumes.
+So an output plugin must expect and handle those partial cycles gracefully.
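+<p>One possible way to cope with such partial cycles is sketched below in standalone C.
+It is a toy model, not rsyslog code: the <code>toyState</code> structure and the
+<code>toy*</code> functions are invented for this example, and the sketch assumes that
+uncommitted messages of an aborted cycle are resubmitted by the core (as described in
+the batching section), so the plugin can simply drop what it had buffered.
+<pre><code>
+/* invented per-action state; a real plugin would keep this in instanceData */
+struct toyState {
+	int nPending;	/* messages buffered but not yet committed */
+	int nCommitted;	/* messages written out for good */
+};
+
+static void toyBeginTransaction(struct toyState *s)
+{
+	/* A previous cycle may have ended without endTransaction();
+	 * leftovers from it are dropped, as they will be resubmitted. */
+	s->nPending = 0;
+}
+
+static void toyDoAction(struct toyState *s)
+{
+	++s->nPending;		/* buffer the message, defer the commit */
+}
+
+static void toyEndTransaction(struct toyState *s)
+{
+	s->nCommitted += s->nPending;	/* commit the rest of the batch */
+	s->nPending = 0;
+}
+</code></pre>
+<p>The design choice here is to reset state in the begin handler rather than to rely on
+the end handler always running, because the end handler is exactly the call that is
+skipped when a cycle is aborted.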
+<p><b>The question remains: how can a plugin know if the core supports batching?</b>
+First of all, even if the engine did not know about it, the plugin would return with RS_RET_DEFER_COMMIT,
+which would then be treated as an error by the engine. This would effectively disable the
+output, but cause no further harm (though that may be harm enough in itself).
+<p>The real solution is to enable the plugin to query the rsyslog core if this feature is
+supported or not. Before the introduction of batching, no such query interface
+existed, so it is introduced with that release. What this means is that if an rsyslog core
+cannot provide this query interface, it is a core that was built before batching support
+was available. So the absence of a query interface indicates that the transactional
+interface is not available. One might now be tempted to think there is no need to do
+the actual check, but it is recommended to ask the rsyslog engine explicitly if
+the transactional interface is present and will be honored. This enables us to
+create versions in the future which have, for whatever reason we do not yet know, no
+support for this interface.
+<p>The logic to do these checks is contained in the <code>INITChkCoreFeature</code> macro,
+which can be used as follows:
+<p><pre><code>
+INITChkCoreFeature(bCoreSupportsBatching, CORE_FEATURE_BATCHING);
+</code></pre>
+<p>Here, bCoreSupportsBatching is a plugin-defined integer which after execution is
+1 if batches (and thus the transactional interface) are supported and 0 otherwise.
+CORE_FEATURE_BATCHING is the feature we are interested in. Future versions of rsyslog
+may contain additional feature-test-macros (you can see all of them in
+./runtime/rsyslog.h).
+<p>Note that the ompsql output plugin supports transactional mode in a hybrid way and
+thus can be considered good example code.
+
+<h2>Open Issues</h2>
+<ul>
+<li>Handling of processing errors
+<li>reliable re-queue during error handling and queue termination
+</ul>
+
+
+
+<h2>Licensing</h2>
+<p>From the rsyslog point of view, plugins constitute separate projects. As such,
+we think plugins are not required to be compatible with GPLv3. However, this is
+not legal advice. If you intend to release something under a non-GPLv3-compatible license
+it is probably best to consult with your lawyer.
+<p>Most importantly, and this is definite, the rsyslog team does not expect
+or require you to contribute your plugin to the rsyslog project (but of course
+we are happy if you do).
+<h2>Copyright</h2>
+<p>Copyright (c) 2009 <a href="http://www.gerhards.net/rainer">Rainer Gerhards</a> 
+and <a href="http://www.adiscon.com/en/">Adiscon</a>.</p>
+<p>Permission is granted to copy, distribute and/or modify this document under 
+the terms of the GNU Free Documentation License, Version 1.2 or any later 
+version published by the Free Software Foundation; with no Invariant Sections, 
+no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be 
+viewed at <a href="http://www.gnu.org/copyleft/fdl.html">
+http://www.gnu.org/copyleft/fdl.html</a>.</p>
+</body>
+</html>