diff options
Diffstat (limited to 'doc/dev_oplugins.html')
-rw-r--r-- | doc/dev_oplugins.html | 331 |
1 files changed, 331 insertions, 0 deletions
diff --git a/doc/dev_oplugins.html b/doc/dev_oplugins.html new file mode 100644 index 00000000..63c186a3 --- /dev/null +++ b/doc/dev_oplugins.html @@ -0,0 +1,331 @@ +<html> +<head> +<title>writing rsyslog output plugins (developer's guide)</title> +</head> +<body> +<h1>Writing Rsyslog Output Plugins</h1> +<p>This page is the begin of some developer documentation for writing output +plugins. Doing so is quite easy (and that was a design goal), but there currently +is only sparse documentation on the process available. I was tempted NOT to +write this guide here because I know I will most probably not be able to +write a complete guide. +<p>However, I finally concluded that it may be better to have same information +and pointers than to have nothing. +<h2>Getting Started and Samples</h2> +<p>The best to get started with rsyslog plugin development is by looking at +existing plugins. All that start with "om" are <b>o</b>utput <b>m</b>odules. That +means they are primarily thought of being message sinks. In theory, however, output +plugins may aggergate other functionality, too. Nobody has taken this route so far +so if you would like to do that, it is highly suggested to post your plan on the +rsyslog mailing list, first (so that we can offer advise). +<p>The rsyslog distribution tarball contains two plugins that are extremely well +targeted for getting started: +<ul> +<li>omtemplate +<li>omstdout +</ul> +Plugin omtemplate was specifically created to provide a copy template for new output +plugins. It is bare of real functionality but has ample comments. Even if you decide +to start from another plugin (or even from scratch), be sure to read omtemplate source +and comments first. The omstdout is primarily a testing aide, but offers support for +the two different parameter-passing conventions plugins can use (plus the way to +differentiate between the two). It also is not bare of functionaly, only mostly +bare of it ;). But you can actually execute it and play with it. +<p>In any case, you should also read the comments in ./runtime/module-template.h. +Output plugins are build based on a large set of code-generating macros. These +macros handle most of the plumbing needed by the interface. As long as no +special callback to rsyslog is needed (it typically is not), an output plugin does +not really need to be aware that it is executed by rsyslog. As a plug-in programmer, +you can (in most cases) "code as usual". However, all macros and entry points need to be +provided and thus reading the code comments in the files mentioned is highly suggested. +<p>In short, the best idea is to start with a template. Let's assume you start by +copying omtemplate. Then, the basic steps you need to do are: +<ul> +<li>cp ./plugins/omtemplate ./plugins/your-plugin +<li>mv cd ./plugins/your-plugin +<li>vi Makefile.am, adjust to your-plugin +<li>mv omtemplate.c your-plugin.c +<li>cd ../.. +<li>vi Makefile.am configure.ac +<br>search for omtemplate, copy and modify (follow comments) +</ul> +<p>Basically, this is all you need to do ... Well, except, of course, coding +your plugin ;). For testing, you need rsyslog's debugging support. Some useful +information is given in "<a href="troubleshoot.html">troubleshooting rsyslog</a> +from the doc set. +<h2>Special Topics</h2> +<h3>Threading</h3> +<p>Rsyslog uses massive parallel processing and multithreading. However, a plugin's entry +points are guaranteed to be never called concurrently <b>for the same action</b>. +That means your plugin must be able to be called concurrently by two or more +threads, but you can be sure that for the same instance no concurrent calls +happen. This is guaranteed by the interface specification and the rsyslog core +guards against multiple concurrent calls. An instance, in simple words, is one +that shares a single instanceData structure. +<p>So as long as you do not mess around with global data, you do not need +to think about multithreading (and can apply a purely sequential programming +methodology). +<p>Please note that duringt the configuraton parsing stage of execution, access to +global variables for the configuration system is safe. In that stage, the core will +only call sequentially into the plugin. +<h3>Getting Message Data</h3> +<p>The doAction() entry point of your plugin is provided with messages to be processed. +It will only be activated after filtering and all other conditions, so you do not need +to apply any other conditional but can simply process the message. +<p>Note that you do NOT receive the full internal representation of the message +object. There are various (including historical) reasons for this and, among +others, this is a design decision based on security. +<p>Your plugin will only receive what the end user has configured in a $template +statement. However, starting with 4.1.6, there are two ways of receiving the +template content. The default mode, and in most cases sufficient and optimal, +is to receive a single string with the expanded template. As I said, this is usually +optimal, think about writing things to files, emailing content or forwarding it. +<p>The important philosophy is that a plugin should <b>never</b> reformat any +of such strings - that would either remove the user's ability to fully control +message formats or it would lead to duplicating code that is already present in the +core. If you need some formatting that is not yet present in the core, suggest it +to the rsyslog project, best done by sending a patch ;), and we will try hard to +get it into the core (so far, we could accept all such suggestions - no promise, though). +<p>If a single string seems not suitable for your application, the plugin can also +request access to the template components. The typical use case seems to be databases, where +you would like to access properties via specific fields. With that mode, you receive a +char ** array, where each array element points to one field from the template (from +left to right). Fields start at arrray index 0 and a NULL pointer means you have +reached the end of the array (the typical Unix "poor man's linked list in an array" +design). Note, however, that each of the individual components is a string. It is +not a date stamp, number or whatever, but a string. This is because rsyslog processes +strings (from a high-level design look at it) and so this is the natural data type. +Feel free to convert to whatever you need, but keep in mind that malformed packets +may have lead to field contents you'd never expected... +<p>If you like to use the array-based parameter passing method, think that it +is only available in rsyslog 4.1.6 and above. If you can accept that your plugin +will not be working with previous versions, you do not need to handle pre 4.1.6 cases. +However, it would be "nice" if you shut down yourself in these cases - otherwise the +older rsyslog core engine will pass you a string where you expect the array of pointers, +what most probably results in a segfault. To check whether or not the core supports the +functionality, you can use this code sequence: +<pre> +<code> +BEGINmodInit() + rsRetVal localRet; + rsRetVal (*pomsrGetSupportedTplOpts)(unsigned long *pOpts); + unsigned long opts; + int bArrayPassingSupported; /* does core support template passing as an array? */ +CODESTARTmodInit + *ipIFVersProvided = CURR_MOD_IF_VERSION; /* we only support the current interface specification */ +CODEmodInit_QueryRegCFSLineHdlr + /* check if the rsyslog core supports parameter passing code */ + bArrayPassingSupported = 0; + localRet = pHostQueryEtryPt((uchar*)"OMSRgetSupportedTplOpts", &pomsrGetSupportedTplOpts); + if(localRet == RS_RET_OK) { + /* found entry point, so let's see if core supports array passing */ + CHKiRet((*pomsrGetSupportedTplOpts)(&opts)); + if(opts & OMSR_TPL_AS_ARRAY) + bArrayPassingSupported = 1; + } else if(localRet != RS_RET_ENTRY_POINT_NOT_FOUND) { + ABORT_FINALIZE(localRet); /* Something else went wrong, what is not acceptable */ + } + DBGPRINTF("omstdout: array-passing is %ssupported by rsyslog core.\n", bArrayPassingSupported ? "" : "not "); + + if(!bArrayPassingSupported) { + DBGPRINTF("rsyslog core too old, shutting down this plug-in\n"); + ABORT_FINALIZE(RS_RET_ERR); + } + +</code> +</pre> +<p>The code first checks if the core supports the OMSRgetSupportedTplOpts() API (which is +also not present in all versions!) and, if so, queries the core if the OMSR_TPL_AS_ARRAY mode +is supported. If either does not exits, the core is too old for this functionality. The sample +snippet above then shuts down, but a plugin may instead just do things different. In +omstdout, you can see how a plugin may deal with the situation. +<p><b>In any case, it is recommended that at least a graceful shutdown is made and the +array-passing capability not blindly be used.</b> In such cases, we can not guard the +plugin from segfaulting and if the plugin (as currently always) is run within +rsyslog's process space, that results in a segfault for rsyslog. So do not do this. +<h3>Batching of Messages</h3> +<p>Starting with rsyslog 4.3.x, batching of output messages is supported. Previously, only +a single-message interface was supported. +<p>With the <b>single message</b> plugin interface, each message is passed via a separate call to the plugin. +Most importantly, the rsyslog engine assumes that each call to the plugin is a complete transaction +and as such assumes that messages be properly commited after the plugin returns to the engine. +<p>With the <b>batching</b> interface, rsyslog employs something along the line of +"transactions". Obviously, the rsyslog core can not make non-transactional outputs +to be fully transactional. But what it can is support that the output tells the core which +messages have been commited by the output and which not yet. The core can than take care +of those uncommited messages when problems occur. For example, if a plugin has received +50 messages but not yet told the core that it commited them, and then returns an error state, the +core assumes that all these 50 messages were <b>not</b> written to the output. The core then +requeues all 50 messages and does the usual retry processing. Once the output plugin tells the +core that it is ready again to accept messages, the rsyslog core will provide it with these 50 +not yet commited messages again (actually, at this point, the rsyslog core no longer knows that +it is re-submiting the messages). If, in contrary, the plugin had told rsyslog that 40 of these 50 +messages were commited (before it failed), then only 10 would have been requeued and resubmitted. +<p>In order to provide an efficient implementation, there are some (mild) constraints in that +transactional model: first of all, rsyslog itself specifies the ultimate transaction boundaries. +That is, it tells the plugin when a transaction begins and when it must finish. The plugin +is free to commit messages in between, but it <b>must</b> commit all work done when the core +tells it that the transaction ends. All messages passed in between a begin and end transaction +notification are called a batch of messages. They are passed in one by one, just as without +transaction support. Note that batch sizes are variable within the range of 1 to a user configured +maximum limit. Most importantly, that means that plugins may receive batches of single messages, +so they are required to commit each message individually. If the plugin tries to be "smarter" +than the rsyslog engine and does not commit messages in those cases (for example), the plugin +puts message stream integrity at risk: once rsyslog has notified the plugin of transacton end, +it discards all messages as it considers them committed and save. If now something goes wrong, +the rsyslog core does not try to recover lost messages (and keep in mind that "goes wrong" +includes such uncontrollable things like connection loss to a database server). So it is +highly recommended to fully abide to the plugin interface details, even though you may +think you can do it better. The second reason for that is that the core engine will +have configuration settings that enable the user to tune commit rate to their use-case +specific needs. And, as a relief: why would rsyslog ever decide to use batches of one? +There is a trivial case and that is when we have very low activity so that no queue of +messages builds up, in which case it makes sense to commit work as it arrives. +(As a side-note, there are some valid cases where a timeout-based commit feature makes sense. +This is also under evaluation and, once decided, the core will offer an interface plus a way +to preserve message stream integrity for properly-crafted plugins). +<p>The second restriction is that if a plugin makes commits in between (what is perfectly +legal) those commits must be in-order. So if a commit is made for message ten out of 50, +this means that messages one to nine are also commited. It would be possible to remove +this restriction, but we have decided to deliberately introduce it to simpify things. +<h3>Output Plugin Transaction Interface</h3> +<p>In order to keep compatible with existing output plugins (and because it introduces +no complexity), the transactional plugin interface is build on the traditional +non-transactional one. Well... actually the traditional interface was transactional +since its introduction, in the sense that each message was processed in its own +transaction. +<p>So the current <code>doAction()</b> entry point can be considered to have this +structure (from the transactional interface point of view): +<p><pre><code> +doAction() + { + beginTransaction() + ProcessMessage() + endTransaction() + } + </code></pre> +<p>For the <b>transactional interface</b>, we now move these implicit <code>beginTransaction()</code> +and <code>endTransaction(()</code> call out of the message processing body, resulting is such +a structure: +<p><pre><code> +beginTransaction() + { + /* prepare for transaction */ + } + +doAction() + { + ProcessMessage() + /* maybe do partial commits */ + } + +endTransaction() + { + /* commit (rest of) batch */ + } +</code></pre> +<p>And this calling structure actually is the transactional interface! It is as simple as this. +For the new interface, the core calls a <code>beginTransaction()</code> entry point inside the +plugin at the start of the batch. Similarly, the core call <code>endTransaction()</code> at the +end of the batch. The plugin must implement these entry points according to its needs. +<p>But how does the core know when to use the old or the new calling interface? This is rather +easy: when loading a plugin, the core queries the plugin for the <code>beginTransaction()</code> +and <code>endTransaction()</code> entry points. If the plugin supports these, the new interface is +used. If the plugin does not support them, the old interface is used and rsyslog implies that +a commit is done after each message. Note that there is no special "downlevel" handling +necessary to support this. In the case of the non-transactional interface, rsyslog considers +each completed call to <code>doAction</code> as partial commit up to the current message. +So implementation inside the core is very straightforward. +<p>Actually, <b>we recommend that the transactional entry points only be defined by those +plugins that actually need them</b>. All others should not define them in which case +the default commit behaviour inside rsyslog will apply (thus removing complexity from the +plugin). +<p>In order to support partial commits, special return codes must be defined for +<code>doAction</code>. All those return codes mean that processing completed successfully. +But they convey additional information about the commit status as follows: +<p> +<table border="0"> +<tr> +<td valign="top"><i>RS_RET_OK</i></td> +<td>The record and all previous inside the batch has been commited. +<i>Note:</i> this definition is what makes integrating plugins without the +transaction being/end calls so easy - this is the traditional "success" return +state and if every call returns it, there is no need for actually calling +<code>endTransaction()</code>, because there is no transaction open).</td> +</tr> +<tr> +<td valign="top"><i>RS_RET_DEFER_COMMIT</i></td> +<td>The record has been processed, but is not yet commited. This is the +expected state for transactional-aware plugins.</td> +</tr> +<tr> +<td valign="top"><i>RS_RET_PREVIOUS_COMMITTED</i></td> +<td>The <b>previous</b> record inside the batch has been committed, but the +current one not yet. This state is introduced to support sources that fill up +buffers and commit once a buffer is completely filled. That may occur halfway +in the next record, so it may be important to be able to tell the +engine the everything up to the previouos record is commited</td> +</tr> +</table> +<p>Note that the typical <b>calling cycle</b> is <code>beginTransaction()</code>, +followed by <i>n</i> times +<code>doAction()</code></n> followed by <code>endTransaction()</code>. However, if either +<code>beginTransaction()</code> or <code>doAction()</code> return back an error state +(including RS_RET_SUSPENDED), then the transaction is considered aborted. In result, the +remaining calls in this cycle (e.g. <code>endTransaction()</code>) are never made and a +new cycle (starting with <code>beginTransaction()</code> is begun when processing resumes. +So an output plugin must expect and handle those partial cycles gracefully. +<p><b>The question remains how can a plugin know if the core supports batching?</b> +First of all, even if the engine would not know it, the plugin would return with RS_RET_DEFER_COMMIT, +what then would be treated as an error by the engine. This would effectively disable the +output, but cause no further harm (but may be harm enough in itself). +<p>The real solution is to enable the plugin to query the rsyslog core if this feature is +supported or not. At the time of the introduction of batching, no such query-interface +exists. So we introduce it with that release. What the means is if a rsyslog core can +not provide this query interface, it is a core that was build before batching support +was available. So the absence of a query interface indicates that the transactional +interface is not available. One might now be tempted the think there is no need to do +the actual check, but is is recommended to ask the rsyslog engine explicitely if +the transactional interface is present and will be honored. This enables us to +create versions in the future which have, for whatever reason we do not yet know, no +support for this interface. +<p>The logic to do these checks is contained in the <code>INITChkCoreFeature</code> macro, +which can be used as follows: +<p><pre><code> +INITChkCoreFeature(bCoreSupportsBatching, CORE_FEATURE_BATCHING); +</code></pre> +<p>Here, bCoreSupportsBatching is a plugin-defined integer which after execution is +1 if batches (and thus the transational interface) is supported and 0 otherwise. +CORE_FEATURE_BATCHING is the feature we are interested in. Future versions of rsyslog +may contain additional feature-test-macros (you can see all of them in +./runtime/rsyslog.h). +<p>Note that the ompsql output plugin supports transactional mode in a hybrid way and +thus can be considered good example code. + +<h2>Open Issues</h2> +<ul> +<li>Processing errors handling +<li>reliable re-queue during error handling and queue termination +</ul> + + + +<h3>Licensing</h3> +<p>From the rsyslog point of view, plugins constitute separate projects. As such, +we think plugins are not required to be compatible with GPLv3. However, this is +no legal advise. If you intend to release something under a non-GPLV3 compatible license +it is probably best to consult with your lawyer. +<p>Most importantly, and this is definite, the rsyslog team does not expect +or require you to contribute your plugin to the rsyslog project (but of course +we are happy if you do). +<h2>Copyright</h2> +<p>Copyright (c) 2009 <a href="http://www.gerhards.net/rainer">Rainer Gerhards</a> +and <a href="http://www.adiscon.com/en/">Adiscon</a>.</p> +<p>Permission is granted to copy, distribute and/or modify this document under +the terms of the GNU Free Documentation License, Version 1.2 or any later +version published by the Free Software Foundation; with no Invariant Sections, +no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be +viewed at <a href="http://www.gnu.org/copyleft/fdl.html"> +http://www.gnu.org/copyleft/fdl.html</a>.</p> +</body> +</html> |