<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>The Rsyslogd Property Replacer</title></head>
<body>
<a href="rsyslog_conf_templates.html">back</a>
<h1>The Property Replacer</h1>
<p><b>The property replacer is a core component in
rsyslogd's output system.</b> A syslog message has a number of
well-defined properties (see below). Each of these properties can be
accessed <b>and</b> manipulated by the property replacer.
It makes it easy to use only part of a property value or to manipulate
the value, e.g. by converting all characters to lower case.</p>
<h1>Accessing Properties</h1>
<p>Syslog message properties are used inside templates. They are
accessed by putting them between percent signs. Properties can be
modified by the property replacer. The full syntax is as follows:</p>
<blockquote><b><code>%propname:fromChar:toChar:options:fieldname%</code></b></blockquote>
<h2>Available Properties</h2>
<p><b><code>propname</code></b> is the
name of the property to access. It is case-insensitive (prior to 3.17.0, it was case-sensitive).
Currently supported are:</p>
<table>
<tbody>
<tr>
<td><b>msg</b></td>
<td>the MSG part of the message (aka "the message" ;))</td>
</tr>
<tr>
<td><b>rawmsg</b></td>
<td>the message exactly as it was received from the
socket. Should be useful for debugging.</td>
</tr>
<tr>
<td><b>hostname</b></td>
<td>hostname from the message</td>
</tr>
<tr>
<td><b>source</b></td>
<td>alias for HOSTNAME</td>
</tr>
<tr>
<td><b>fromhost</b></td>
<td>hostname of the system the message was received from
(in a relay chain, this is the system immediately in front of us and
not necessarily the original sender). This is a DNS-resolved name, except
if that is not possible or DNS resolution has been disabled.</td>
</tr>
<tr>
<td><b>fromhost-ip</b></td>
<td>The same as fromhost, but always as an IP address. Local inputs
(like imklog) use 127.0.0.1 in this property.</td>
</tr>
<tr>
<td><b>syslogtag</b></td>
<td>TAG from the message</td>
</tr>
<tr>
<td><b>programname</b></td>
<td>the "static" part of the tag, as defined by
BSD syslogd. For example, when TAG is "named[12345]", programname is
"named".</td>
</tr>
<tr>
<td><b>pri</b></td>
<td>PRI part of the message - undecoded (single value)</td>
</tr>
<tr>
<td><b>pri-text</b></td>
<td>the PRI part of the message in textual form (e.g.  "syslog.info")</td>
</tr>
<tr>
<td><b>iut</b></td>
<td>the monitorware InfoUnitType - used when talking
to a <a href="http://www.monitorware.com">MonitorWare</a>
backend (also for <a href="http://www.phplogcon.org/">phpLogCon</a>)</td>
</tr>
<tr>
<td><b>syslogfacility</b></td>
<td>the facility from the message - in numerical form</td>
</tr>
<tr>
<td><b>syslogfacility-text</b></td>
<td>the facility from the message - in text form</td>
</tr>
<tr>
<td><b>syslogseverity</b></td>
<td>severity from the message - in numerical form</td>
</tr>
<tr>
<td><b>syslogseverity-text</b></td>
<td>severity from the message - in text form</td>
</tr>
<tr>
<td><b>syslogpriority</b></td>
<td>an alias for syslogseverity - included for historical
reasons (be careful: it still is the severity, not PRI!)</td>
</tr>
<tr>
<td><b>syslogpriority-text</b></td>
<td>an alias for syslogseverity-text</td>
</tr>
<tr>
<td><b>timegenerated</b></td>
<td>timestamp when the message was RECEIVED. Always in high
resolution</td>
</tr>
<tr>
<td><b>timereported</b></td>
<td>timestamp from the message. Resolution depends on
what was provided in the message (in most cases,
only seconds)</td>
</tr>
<tr>
<td><b>timestamp</b></td>
<td>alias for timereported</td>
</tr>
<tr>
<td><b>protocol-version</b></td>
<td>The contents of the PROTOCOL-VERSION field from IETF
draft draft-ietf-syslog-protocol</td>
</tr>
<tr>
<td><b>structured-data</b></td>
<td>The contents of the STRUCTURED-DATA field from IETF
draft draft-ietf-syslog-protocol</td>
</tr>
<tr>
<td><b>app-name</b></td>
<td>The contents of the APP-NAME field from IETF draft
draft-ietf-syslog-protocol</td>
</tr>
<tr>
<td><b>procid</b></td>
<td>The contents of the PROCID field from IETF draft
draft-ietf-syslog-protocol</td>
</tr>
<tr>
<td><b>msgid</b></td>
<td>The contents of the MSGID field from
IETF draft draft-ietf-syslog-protocol</td>
</tr>
<tr>
<td><b>parsesuccess</b></td>
<td>This returns the status of the <b>last</b> called higher level parser,
like mmjsonparse. A higher level parser parses the actual message for additional
structured data and maintains an extra property table while doing so (this is
often referred to as "cee data" because the idea was originally rooted in the
cee effort, but it has been extended since then). Note that higher level
parsers must explicitly support (and set) this property. So, depending on the
parser, it may not be set correctly.
<br>If the parser properly supports it, the value "OK" means that parsing was
successful, while "FAIL" means the parser could not successfully obtain any data.
A failure state is not necessarily an error. For example, it may simply indicate
that the cee-enhanced syslog parser (mmjsonparse) did not detect cee-enhanced format,
which can be totally valid. Using this property, further processing of the message
can be directed based on the parsing outcome. If no parser has been called at the
time this property is accessed, it will contain "FAIL".
<br><b>This property is available since version 6.3.8.</b>
</td>
</tr>
<tr>
<td><b>inputname</b></td>
<td>The name of the input module that generated the
message (e.g. "imuxsock", "imudp"). Note that not all modules
necessarily provide this property. If not provided, it is an
empty string. Also note that the input module may provide
any value of its liking. Most importantly, it is <b>not</b>
necessarily the module input name. Internal sources can also
provide inputnames. Currently, "rsyslogd" is defined as inputname
for messages internally generated by rsyslogd, for example startup
and shutdown and error messages.
This property is considered useful when trying to filter messages
based on where they originated - e.g. locally generated messages
("rsyslogd", "imuxsock", "imklog") should go to a different place
than messages generated somewhere else.
</td>
</tr>
<tr>
<td><b>$bom</b></td>
<td>The UTF-8 encoded Unicode byte-order mark (BOM). This may be useful
in templates for RFC5424 support, when the character set is known to be
Unicode.</td>
</tr>
<tr>
<td><b>$uptime</b></td>
<td>system-uptime in seconds (as reported by operating system).
</td>
</tr>
<tr>
<td><b>$now</b></td>
<td>The current date stamp in the format YYYY-MM-DD</td>
</tr>
<tr>
<td><b>$year</b></td>
<td>The current year (4-digit)</td>
</tr>
<tr>
<td><b>$month</b></td>
<td>The current month (2-digit)</td>
</tr>
<tr>
<td><b>$day</b></td>
<td>The current day of the month (2-digit)</td>
</tr>
<tr>
<td><b>$hour</b></td>
<td>The current hour in military (24 hour) time (2-digit)</td>
</tr>
<tr>
<td><b>$hhour</b></td>
<td>The current half hour we are in. From minute 0 to 29,
this is always 0 while
from 30 to 59 it is always 1.</td>
</tr>
<tr>
<td><b>$qhour</b></td>
<td>The current quarter hour we are in. Much like $HHOUR, but values
range from 0 to 3 (for the four quarter hours that are in each hour)</td>
</tr>
<tr>
<td><b>$minute</b></td>
<td>The current minute (2-digit)</td>
</tr>
<tr>
<td><b>$myhostname</b></td>
<td>The name of the current host as it knows itself (probably useful
for filtering in a generic way)</td>
</tr>
<tr>
<td><b>$!&lt;name&gt;</b></td>
<td>This is the "bridge" to syslog message normalization (via
<a href="mmnormalize.html">mmnormalize</a>): name is a name defined
inside the normalization rule. It has the value selected by the rule,
or none if no rule with this field matched.
</td>
</tr>
</tbody>
</table>
<p>Properties starting with a $-sign are so-called system
properties. These do NOT stem from the message but are rather
internally-generated.</p>
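<p>The time-based system properties listed above can be illustrated with a short
Python sketch. It is not rsyslog code: the function name <code>time_properties</code>
is made up for this example, and returning <code>$hhour</code> and <code>$qhour</code>
as plain integers is an assumption about formatting. It only mirrors the value ranges
described in the table.</p>

```python
from datetime import datetime


def time_properties(now):
    """Return the time-based system properties for the datetime `now`.

    Hypothetical helper for illustration; rsyslog computes these internally.
    """
    return {
        "$now": now.strftime("%Y-%m-%d"),   # date stamp, YYYY-MM-DD
        "$year": now.strftime("%Y"),        # 4-digit year
        "$month": now.strftime("%m"),       # 2-digit month
        "$day": now.strftime("%d"),         # 2-digit day of the month
        "$hour": now.strftime("%H"),        # 2-digit hour, 24 hour clock
        "$minute": now.strftime("%M"),      # 2-digit minute
        "$hhour": now.minute // 30,         # 0 for minutes 0-29, 1 for 30-59
        "$qhour": now.minute // 15,         # 0 to 3, one value per quarter hour
    }


props = time_properties(datetime(2012, 3, 7, 14, 47))
```

<p>For 14:47 the sketch yields half hour 1 and quarter hour 3, which matches the
ranges given for <b>$hhour</b> and <b>$qhour</b>.</p>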
<h2>Character Positions</h2>
<p><b><code>fromChar</code></b> and <b><code>toChar</code></b>
are used to build substrings. They specify the offset within the string
that should be copied. Offset counting starts at 1, so if you need to
obtain the first 2 characters of the message text, you can use this
syntax: "%msg:1:2%". If you do not wish to specify from and to, but
you want to specify options, you still need to include the colons. For
example, if you would like to convert the full message text to lower
case, use "%msg:::lowercase%". If you would like to extract from a
position until the end of the string, you can place a dollar-sign ("$")
in toChar (e.g. %msg:10:$%, which will extract from position 10 to the
end of the string).</p>
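<p>The position rules above (counting starts at 1, toChar is inclusive, "$" means
end of string) can be sketched in Python. This is only an illustration of the
semantics, not rsyslog's implementation; the function name <code>substring</code>
and the sample message are assumptions made for the example.</p>

```python
def substring(value, from_char, to_char):
    """Emulate %prop:fromChar:toChar% with 1-based, inclusive offsets.

    to_char may be the string "$", meaning "until the end of the string".
    Hypothetical helper for illustration only.
    """
    end = len(value) if to_char == "$" else int(to_char)
    # Convert the 1-based start offset to Python's 0-based slicing.
    return value[int(from_char) - 1:end]


msg = "Hello, rsyslog world"
first_two = substring(msg, 1, 2)   # corresponds to %msg:1:2%
tail = substring(msg, 8, "$")      # corresponds to %msg:8:$%
```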
<p>There is also support for <b>regular expressions</b>.
To use them, place an "R" into fromChar. This tells rsyslog
that a regular expression instead of position-based extraction is
desired. The actual regular expression must then be provided in toChar.
The regular expression <b>must</b> be followed by the
string "--end". It denotes the end of the regular expression and will
not become part of it. If you are using regular expressions, the
property replacer will return the part of the property text that
matches the regular expression. An example for a property replacer
sequence with a regular expression is: "%msg:R:.*Sev:. \(.*\)
\[.*--end%"</p>
<p>It is possible to specify some parameters after the "R". These are
comma-separated. They are:
<p>R,&lt;regexp-type&gt;,&lt;submatch&gt;,&lt;<a href="rsyslog_conf_nomatch.html">nomatch</a>&gt;,&lt;match-number&gt;
<p>regexp-type is either "BRE" for POSIX basic regular expressions or
"ERE" for extended ones. The string must be given in upper case. The
default is "BRE" to be consistent with earlier versions of rsyslog that
did not support ERE. The submatch identifies the submatch to be used
as the result. A single digit is supported. Submatch 0 is the full match,
while 1 to 9 are the actual submatches. The match-number identifies which match to
use if the expression occurs more than once inside the string. Please note
that the first match is number 0, the second 1 and so on. Up to 10 matches
(up to number 9) are supported. Please note that it would be more
natural to have the match-number in front of submatch, but this would break
backward-compatibility. So the match-number must be specified after "nomatch".
<p><a href="rsyslog_conf_nomatch.html">nomatch</a> specifies what should
be used in case no match is found.
<p>The following is a sample of an ERE expression that takes the first
submatch from the message string and replaces the expression with
the full field if no match is found:
<p>%msg:R,ERE,1,FIELD:for (vlan[0-9]*):--end%
<p>and this takes the first submatch of the second match of said expression:
<p>%msg:R,ERE,1,FIELD,1:for (vlan[0-9]*):--end%
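<p>How submatch and match-number interact can be sketched in Python. This is an
illustration only: it uses Python's <code>re</code> module, whose syntax is close to
but not identical with POSIX ERE, and the function name <code>regex_extract</code>
and the sample message are assumptions. On no match it returns the full value,
which is the FIELD behaviour described above.</p>

```python
import re


def regex_extract(value, pattern, submatch=0, match_number=0):
    """Return the requested submatch of the requested match.

    match_number counts matches from 0; submatch 0 is the full match.
    If there is no such match, the full value is returned (FIELD mode).
    Hypothetical helper for illustration only.
    """
    matches = list(re.finditer(pattern, value))
    if match_number >= len(matches):
        return value
    return matches[match_number].group(submatch)


msg = "link up for vlan10: ok, link down for vlan20: retry"
first = regex_extract(msg, r"for (vlan[0-9]*):", submatch=1)
second = regex_extract(msg, r"for (vlan[0-9]*):", submatch=1, match_number=1)
```

<p>Here <code>first</code> corresponds to the first sample sequence and
<code>second</code> to the one with match-number 1.</p>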
<p><b>Please note: there is also a
<a href="http://www.rsyslog.com/tool-regex">rsyslog regular expression checker/generator</a>
online tool available.</b> With that tool, you can check your regular expressions and
also generate a valid property replacer sequence. Usage of this tool is recommended.
Depending on the version offered, the tool may not cover all subtleties that can
be expressed with the property replacer. It concentrates on the most often used cases. So it
is still useful to hand-craft expressions for demanding environments.
<p><b>Also, extraction can be done based on so-called
"fields"</b>. To do so, place an "F" into FromChar. A field in its
current definition is anything that is delimited by a delimiter
character. The delimiter by default is TAB (US-ASCII value 9). However,
it can be changed to any other US-ASCII character by specifying a comma
and the <b>decimal</b> US-ASCII value of the delimiter
immediately after the "F". For example, to use comma (",") as a
delimiter, use this field specifier: "F,44".&nbsp; If your syslog
data is delimited, this is a quicker way to extract than via regular
expressions (actually, a *much* quicker way). Field counting starts at
1. Field zero is accepted, but will always lead to a "field not found"
error. The same happens if a field number higher than the number of
fields in the property is requested. The field number must be placed in
the "ToChar" parameter. An example where the 3rd field (delimited by
TAB) from the msg property is extracted is as follows: "%msg:F:3%". The
same example with semicolon as delimiter is "%msg:F,59:3%".</p>
<p>Field mode on its own does not permit selecting a substring of a field.
To solve this issue, starting with 6.3.9, fromPos and toPos
can be specified in field mode as well. The syntax is admittedly ugly, but
it was the only way to integrate this functionality into the already-existing
system. To do so, use ",fromPos" and ",toPos" during field extraction.
Let's assume you want to extract the substring from position 5 to 9 in the previous
example. Then, the syntax is as follows: "%msg:F,59,5:3,9%". As you can see,
"F,59" means field mode with semicolon delimiter, and ",5" means starting
at position 5. Then "3,9" means field 3 and string extraction up to position 9.
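<p>Field extraction as described above can be sketched in Python. This is a
hedged illustration, not rsyslog code: the function name <code>extract_field</code>
is hypothetical, <code>None</code> stands in for the "field not found" error, and
treating fromPos and toPos as 1-based and inclusive is an assumption carried over
from the character position rules.</p>

```python
def extract_field(value, field_no, delim_code=9, from_pos=None, to_pos=None):
    """Emulate %prop:F,delim_code,from_pos:field_no,to_pos%.

    delim_code is the decimal US-ASCII value of the delimiter (TAB by default).
    Returns None where rsyslog would report "field not found".
    Hypothetical helper for illustration only.
    """
    fields = value.split(chr(delim_code))
    # Field counting starts at 1; field 0 and too-high numbers are not found.
    if field_no not in range(1, len(fields) + 1):
        return None
    field = fields[field_no - 1]
    if from_pos is None and to_pos is None:
        return field
    start = (from_pos or 1) - 1
    end = len(field) if to_pos is None else to_pos
    return field[start:end]


msg = "a;b;hello world;d"
third = extract_field(msg, 3, 59)            # like %msg:F,59:3%
part = extract_field(msg, 3, 59, 5, 9)       # like %msg:F,59,5:3,9%
```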
<p>Please note that the special characters "F" and "R" are
case-sensitive. Only upper case works, lower case will return an error.
There are no white spaces permitted inside the sequence (that will lead
to error messages and will NOT provide the intended result).</p>
<p>Each occurrence of the field delimiter starts a new field. However,
if you add a plus sign ("+") after the field delimiter, multiple
delimiters that immediately follow one another are treated as a single
delimiter. This can be useful in cases where the syslog message contains
such sequences. A frequent case may be with code that is written as
follows:</p>
<code><pre>
int n, m;
...
syslog(LOG_ERR, "%d test %6d", n, m);
</pre></code>
<p>This results in syslog messages like these:
"1 test&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2", 
"1 test&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;23", 
"1 test&nbsp;&nbsp;234567" 
<p>As you can see, the fields are delimited by space characters, but
the exact number of spaces is unknown. The fields can be extracted properly
by changing the field specifier from "%msg:F,32:2%" to "%msg:F,32+:2%".
<p>This feature was suggested by Zhuang Yuyao and implemented by him.
It is modeled after perl compatible regular expressions.
</p>
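<p>The effect of the plus sign can be sketched in Python, assuming (as the
example above suggests) that "+" collapses a run of delimiters into a single
one. The function name <code>split_fields</code> and the flag name
<code>merge_repeats</code> are hypothetical; this is not how rsyslog implements it.</p>

```python
import re


def split_fields(value, delim_code, merge_repeats=False):
    """Split value on the delimiter given by its decimal US-ASCII code.

    With merge_repeats=True, a run of delimiters counts as one delimiter,
    which is the assumed meaning of the "+" modifier.
    Hypothetical helper for illustration only.
    """
    delim = re.escape(chr(delim_code))
    pattern = delim + "+" if merge_repeats else delim
    return re.split(pattern, value)


# "%d test %6d" with n=1, m=2 pads the second number to width 6.
msg = "1 test " + " " * 5 + "2"
plain = split_fields(msg, 32)                       # like F,32
merged = split_fields(msg, 32, merge_repeats=True)  # like F,32+
```

<p>Without the plus sign, the padding spaces produce empty fields, so the number
ends up at an unpredictable field position. With it, the number is always the
third field.</p>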

<h2>Property Options</h2>
<b><code>property options</code></b> are
case-insensitive. Currently, the following options are defined:
<p></p>
<table>
<tbody>
<tr>
<td><b>uppercase</b></td>
<td>convert property text to uppercase only</td>
</tr>
<tr>
<td><b>lowercase</b></td>
<td>convert property text to lowercase only</td>
</tr>
<tr>
<td><b>json</b></td>
<td>encode the value so that it can be used inside a JSON field. This means
that several characters (according to the JSON spec) are being escaped, for 
example US-ASCII LF is replaced by "\n".
The json option cannot be used together with either jsonf or csv options.
</td>
</tr>
<tr>
<td><b>jsonf</b></td>
<td><i>(available in 6.3.9+)</i>
This signifies that the property should be expressed as a JSON <b>f</b>ield.
That means not only the property value is written, but rather a complete JSON field in
the format<br>
<b>"fieldname":"value"</b><br>
where "fieldname" is the assigned field name (or the property name if none was assigned)
and value is the end result of the property replacer operation. Note that value supports
all property replacer options, like substrings, case conversion and the like.
Values are properly JSON-escaped. However, field names are (currently) not. It is
expected that proper field names are configured.
The jsonf option cannot be used together with either json or csv options.
</td>
</tr>
<tr>
<td valign="top"><b>csv</b></td>
<td>formats the resulting field (after all modifications) in CSV format
as specified in <a href="http://www.ietf.org/rfc/rfc4180.txt">RFC 4180</a>.
Rsyslog will always use double quotes. Note that in order to have full CSV-formatted
text, you need to define a proper template. An example is this one:
<br>$template csvline,"%syslogtag:::csv%,%msg:::csv%"
<br>Most importantly, you need to provide the commas between the fields
inside the template.
The csv option cannot be used together with either json or jsonf options.
<br><i>This feature was introduced in rsyslog 4.1.6.</i>
</td>
</tr>
<tr>
<td><b>drop-last-lf</b></td>
<td>The last LF in the message (if any), is dropped.
Especially useful for PIX.</td>
</tr>
<tr>
<td><b>date-mysql</b></td>
<td>format as mysql date</td>
</tr>
<tr>
<td><b>date-rfc3164</b></td>
<td>format as RFC 3164 date</td>
</tr>
<tr>
<td valign="top"><b>date-rfc3164-buggyday</b></td>
<td>similar to date-rfc3164, but emulates a common coding error: RFC 3164 demands
that a space is written for single-digit days. With this option, a zero is
written instead. This format seems to be used by syslog-ng and the
date-rfc3164-buggyday option can be used in migration scenarios where otherwise
lots of scripts would need to be adjusted. It is recommended <i>not</i> to use this
option when forwarding to remote hosts - they may treat the date as invalid
(especially when parsing strictly according to RFC 3164).
<br><i>This feature was introduced in rsyslog 4.6.2 (and is available in later v4 versions) and in
5.5.3 and all later versions.</i></td>
</tr>
<tr>
<td><b>date-rfc3339</b></td>
<td>format as RFC 3339 date</td>
</tr>
<tr>
<td><b>date-unixtimestamp</b></td>
<td>format as unix timestamp (seconds since epoch)</td>
</tr>
<tr>
<td><b>date-subseconds</b></td>
<td>just the subseconds of a timestamp (always 0 for a low precision timestamp)</td>
</tr>
<tr>
<td valign="top"><b>escape-cc</b></td>
<td>replace control characters (ASCII value 127 and values
less than 32) with an escape sequence. The sequence is
"#&lt;charval&gt;" where charval is the 3-digit decimal value
of the control character. For example, a tabulator would be replaced by
"#009".<br>
Note: using this option requires that <a href="rsconf1_escapecontrolcharactersonreceive.html">$EscapeControlCharactersOnReceive</a>
is set to off.</td>
</tr>
<tr>
<td valign="top"><b>space-cc</b></td>
<td>replace control characters by spaces<br>
Note: using this option requires that <a href="rsconf1_escapecontrolcharactersonreceive.html">$EscapeControlCharactersOnReceive</a>
is set to off.</td>
</tr>
<tr>
<td valign="top"><b>drop-cc</b></td>
<td>drop control characters - the resulting string will
neither contain control characters, escape sequences nor any other
replacement character like space.<br>
Note: using this option requires that <a href="rsconf1_escapecontrolcharactersonreceive.html">$EscapeControlCharactersOnReceive</a>
is set to off.</td>
</tr>
<tr>
<td valign="top"><b>sp-if-no-1st-sp</b></td>
<td>This option looks scary and should probably not be used by a user. For any field
given, it returns either a single space character or no character at all. Field content
is never returned. A space is returned if (and only if) the first character of the
field's content is NOT a space. This option is kind of a hack to solve a problem rooted
in RFC 3164: 3164 specifies no delimiter between the syslog tag sequence and the actual
message text. Almost all implementations in fact delimit the two by a space. According to
RFC 3164, this space is part of the message text itself. This leads to a problem when
building the message (e.g. when writing to disk or forwarding). Should a delimiting
space be included if the message does not start with one? If not, the tag is immediately
followed by another non-space character, which can lead some log parsers to misinterpret
what is the tag and what is the message. The problem finally surfaced when the klog module
was restructured and the tag correctly written. It exists with other message sources,
too. The solution was the introduction of this special property replacer option. Now,
the default template can contain a conditional space, which exists only if the
message does not start with one. While this does not solve all issues, it should
work well enough in the vast majority of cases. If you read this text and have
no idea of what it is talking about - relax: this is a good indication you will never
need this option. Simply forget about it ;)
</td>
</tr>
<tr>
<td valign="top"><b>secpath-drop</b></td>
<td>Drops slashes inside the field (e.g. "a/b" becomes "ab").
Useful for secure pathname generation (with dynafiles).
</td>
</tr>
<tr>
<td valign="top"><b>secpath-replace</b></td>
<td>Replace slashes inside the field by an underscore. (e.g. "a/b" becomes "a_b").
Useful for secure pathname generation (with dynafiles).
</td>
</tr>
<tr>
<td><b>optional-field</b></td>
<td>In templates that are used for building field lists (in particular, ommongodb), completely remove this field if the corresponding property is not present.  Currently implemented only for the <b>$!&lt;name&gt;</b> properties.</td>
</tr>
</tbody>
</table>
<p>To use multiple options, simply place them one after another, delimited by commas.
For example "escape-cc,sp-if-no-1st-sp". If you use conflicting options together,
the last one will override the previous one. For example, using "escape-cc,drop-cc" will
use drop-cc and "drop-cc,escape-cc" will use escape-cc mode.
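<p>The "last option wins" rule for the control character options can be sketched
in Python. This is an illustration of the behaviour described above, not rsyslog's
code; the function name <code>apply_cc_options</code> is hypothetical, and only the
three control character options are modelled.</p>

```python
def apply_cc_options(value, options):
    """Apply the last control-character option named in `options`.

    options is a comma-separated, case-insensitive option string.
    Hypothetical helper for illustration only.
    """
    modes = {
        "escape-cc": lambda code: "#%03d" % code,  # e.g. TAB becomes #009
        "space-cc": lambda code: " ",
        "drop-cc": lambda code: "",
    }
    chosen = None
    for name in options.lower().split(","):
        if name in modes:
            chosen = modes[name]  # a later option overrides an earlier one
    if chosen is None:
        return value
    # Control characters are ASCII 127 and all values below 32.
    return "".join(
        chosen(ord(ch)) if ord(ch) == 127 or ord(ch) in range(32) else ch
        for ch in value
    )


escaped = apply_cc_options("a\tb", "drop-cc,escape-cc")
dropped = apply_cc_options("a\tb", "escape-cc,drop-cc")
```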
<h2>Fieldname</h2>
<p><i>(available in 6.3.9+)</i>
<p>This field allows specifying a field name for structured-data emitting property replacer
options. It was initially introduced to support the "jsonf" option, for which it provides
the capability to set an alternative field name. If it is not specified, it defaults to 
the property name.
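<p>The fallback from field name to property name can be sketched in Python for the
"jsonf" case. This is only an approximation: the function name <code>jsonf</code> is
hypothetical, and Python's <code>json.dumps</code> is used as a stand-in for the
value escaping, which may differ in detail from what rsyslog emits. As described
for the jsonf option, the field name itself is not escaped.</p>

```python
import json


def jsonf(propname, value, fieldname=None):
    """Build a "fieldname":"value" pair as the jsonf option would.

    Falls back to the property name when no field name is given.
    Hypothetical helper for illustration only.
    """
    name = fieldname if fieldname else propname
    # The value is JSON-escaped; the field name is written as configured.
    return '"%s":%s' % (name, json.dumps(value, ensure_ascii=False))


default_name = jsonf("msg", "line1\nline2")
custom_name = jsonf("msg", "hi", "message")
```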
<h2>Further Links</h2>
<ul>
<li>Article on "<a href="rsyslog_recording_pri.html">Recording
the Priority of Syslog Messages</a>" (describes use of templates
to record severity and facility of a message)</li>
<li><a href="rsyslog_conf.html">Configuration file
format</a>, this is where you actually use the property replacer.</li>
</ul>
<p>[<a href="manual.html">manual index</a>]
[<a href="rsyslog_conf.html">rsyslog.conf</a>]
[<a href="http://www.rsyslog.com/">rsyslog site</a>]</p>
<p><font size="2">This documentation is part of the
<a href="http://www.rsyslog.com/">rsyslog</a> project.<br>
Copyright &copy; 2008, 2009 by <a href="http://www.gerhards.net/rainer">Rainer Gerhards</a> and
<a href="http://www.adiscon.com/">Adiscon</a>. Released under the GNU GPL
version 2 or higher.</font></p>

</body></html>