The property replacer is a core component in rsyslogd's output system. A syslog message has a number of well-defined properties (see below). Each of these properties can be accessed and manipulated by the property replacer. With it, it is easy to use only part of a property value or to manipulate the value, e.g. by converting all characters to lower case.
Syslog message properties are used inside templates. They are accessed by putting them between percent signs. Properties can be modified by the property replacer. The full syntax is as follows:
%propname:fromChar:toChar:options%
propname is the name of the property to access. It is case-insensitive (prior to 3.17.0, property names were case-sensitive).
Currently supported are:
msg | the MSG part of the message (aka "the message" ;)) |
rawmsg | the message exactly as it was received from the socket. Should be useful for debugging. |
uxtradmsg | will disappear soon - do NOT use! |
hostname | hostname from the message |
source | alias for HOSTNAME |
fromhost | hostname of the system the message was received from (in a relay chain, this is the system immediately in front of us and not necessarily the original sender). This is a DNS-resolved name, except if that is not possible or DNS resolution has been disabled. |
fromhost-ip | The same as fromhost, but always as an IP address. Local inputs (like imklog) use 127.0.0.1 in this property. |
syslogtag | TAG from the message |
programname | the "static" part of the tag, as defined by BSD syslogd. For example, when TAG is "named[12345]", programname is "named". |
pri | PRI part of the message - undecoded (single value) |
pri-text | the PRI part of the message in a textual form (e.g. "syslog.info") |
iut | the monitorware InfoUnitType - used when talking to a MonitorWare backend (also for phpLogCon) |
syslogfacility | the facility from the message - in numerical form |
syslogfacility-text | the facility from the message - in text form |
syslogseverity | severity from the message - in numerical form |
syslogseverity-text | severity from the message - in text form |
syslogpriority | an alias for syslogseverity - included for historical reasons (be careful: it still is the severity, not PRI!) |
syslogpriority-text | an alias for syslogseverity-text |
timegenerated | timestamp when the message was RECEIVED. Always in high resolution |
timereported | timestamp from the message. Resolution depends on what was provided in the message (in most cases, only seconds) |
timestamp | alias for timereported |
protocol-version | The contents of the PROTOCOL-VERSION field from IETF draft draft-ietf-syslog-protocol |
structured-data | The contents of the STRUCTURED-DATA field from IETF draft draft-ietf-syslog-protocol |
app-name | The contents of the APP-NAME field from IETF draft draft-ietf-syslog-protocol |
procid | The contents of the PROCID field from IETF draft draft-ietf-syslog-protocol |
msgid | The contents of the MSGID field from IETF draft draft-ietf-syslog-protocol |
inputname | The name of the input module that generated the message (e.g. "imuxsock", "imudp"). Note that not all modules necessarily provide this property. If not provided, it is an empty string. Also note that the input module may provide any value of its liking. Most importantly, it is not necessarily the module input name. Internal sources can also provide inputnames. Currently, "rsyslogd" is defined as inputname for messages internally generated by rsyslogd, for example startup, shutdown and error messages. This property is considered useful when trying to filter messages based on where they originated - e.g. locally generated messages ("rsyslogd", "imuxsock", "imklog") should go to a different place than messages generated somewhere else. |
$now | The current date stamp in the format YYYY-MM-DD |
$year | The current year (4-digit) |
$month | The current month (2-digit) |
$day | The current day of the month (2-digit) |
$hour | The current hour in military (24 hour) time (2-digit) |
$hhour | The current half hour we are in. From minute 0 to 29, this is always 0 while from 30 to 59 it is always 1. |
$qhour | The current quarter hour we are in. Much like $HHOUR, but values range from 0 to 3 (for the four quarter hours that are in each hour) |
$minute | The current minute (2-digit) |
$myhostname | The name of the current host as it knows itself (probably useful for filtering in a generic way) |
Properties starting with a $-sign are so-called system properties. These do NOT stem from the message but are rather internally-generated.
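The time-based system properties can be illustrated with a small Python model. This is only a sketch of the rules stated in the table above, not rsyslog code; the function name system_time_properties is made up for this example, and returning $hhour and $qhour as plain integers is an assumption (the table does not specify their width).

```python
from datetime import datetime


def system_time_properties(now: datetime) -> dict:
    """Model the time-based system properties for one instant (illustrative only)."""
    return {
        "$now": now.strftime("%Y-%m-%d"),   # date stamp, YYYY-MM-DD
        "$year": now.strftime("%Y"),        # 4-digit year
        "$month": now.strftime("%m"),       # 2-digit month
        "$day": now.strftime("%d"),         # 2-digit day of month
        "$hour": now.strftime("%H"),        # 2-digit hour, 24-hour clock
        "$minute": now.strftime("%M"),      # 2-digit minute
        "$hhour": now.minute // 30,         # 0 for minutes 0-29, 1 for 30-59
        "$qhour": now.minute // 15,         # 0..3, one value per quarter hour
    }


props = system_time_properties(datetime(2008, 3, 9, 14, 47))
print(props["$now"], props["$hhour"], props["$qhour"])
```

Minute 47 falls into the second half hour and the fourth quarter hour, so the model yields 1 for $hhour and 3 for $qhour.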
FromChar and toChar are used to build substrings. They specify the offsets within the string that should be copied. Offset counting starts at 1, so if you need to obtain the first 2 characters of the message text, you can use this syntax: "%msg:1:2%". If you do not wish to specify from and to, but you want to specify options, you still need to include the colons. For example, if you would like to convert the full message text to lower case, use "%msg:::lowercase%". If you would like to extract from a position until the end of the string, you can place a dollar-sign ("$") in toChar (e.g. %msg:10:$%, which will extract from position 10 to the end of the string).
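The offset rules above (counting starts at 1, both ends are included, "$" means end of string) can be sketched in Python. This is a model for illustration only; the function name substring and the sample message are assumptions made for this example.

```python
def substring(value: str, from_char: int, to_char) -> str:
    """Return characters from_char..to_char, 1-based and inclusive.

    to_char may be the string "$", meaning "up to the end of the string".
    """
    end = len(value) if to_char == "$" else int(to_char)
    # Python slices are 0-based and end-exclusive, hence the "- 1" on the start only.
    return value[from_char - 1:end]


msg = "Interface eth0 is down"
first_two = substring(msg, 1, 2)        # models "%msg:1:2%"
tail = substring(msg, 11, "$")          # models "%msg:11:$%"
print(first_two, "|", tail)
```

With this sample message, the first call yields "In" and the second yields "eth0 is down".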
There is also support for regular expressions. To use them, you need to place an "R" into FromChar. This tells rsyslog that a regular expression instead of position-based extraction is desired. The actual regular expression must then be provided in toChar. The regular expression must be followed by the string "--end". It denotes the end of the regular expression and will not become part of it. If you are using regular expressions, the property replacer will return the part of the property text that matches the regular expression. An example of a property replacer sequence with a regular expression is: "%msg:R:.*Sev:. \(.*\) \[.*--end%"
It is possible to specify some parameters after the "R". These are comma-separated. They are:
R,<regexp-type>,<submatch>,<nomatch>,<match-number>
regexp-type is either "BRE" for Posix basic regular expressions or "ERE" for extended ones. The string must be given in upper case. The default is "BRE" to be consistent with earlier versions of rsyslog that did not support ERE. The submatch identifies the submatch to be used with the result. A single digit is supported. Match 0 is the full match, while 1 to 9 are the actual submatches. The match-number identifies which match to use, if the expression occurs more than once inside the string. Please note that the first match is number 0, the second 1 and so on. Up to 10 matches (up to number 9) are supported. Please note that it would be more natural to have the match-number in front of submatch, but this would break backward-compatibility. So the match-number must be specified after "nomatch".
nomatch is either "DFLT", "BLANK", "ZERO" or "FIELD" (all upper case!). It tells what to use if no match is found. With "DFLT", the string "**NO MATCH**" is used. This was the only supported value up to rsyslog 3.19.5. With "BLANK" a blank text is used (""). With "ZERO", "0" is used. Finally, "FIELD" uses the full property text instead of the expression. Some folks have requested that, so it seems to be useful.
The following is a sample of an ERE expression that takes the first submatch from the message string and replaces the expression with the full field if no match is found:
%msg:R,ERE,1,FIELD:for (vlan[0-9]*):--end%
and this takes the first submatch of the second match of said expression:
%msg:R,ERE,1,FIELD,1:for (vlan[0-9]*):--end%
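The interplay of submatch, nomatch and match-number can be illustrated with a short Python model. Note the assumptions: Python's re module is not a POSIX ERE engine (the expression used here happens to mean the same in both), the function name regex_extract and the sample message are invented for this example, and successive matches are modeled as non-overlapping matches found from left to right.

```python
import re

NO_MATCH_DEFAULT = "**NO MATCH**"


def regex_extract(value, pattern, submatch=0, nomatch="DFLT", match_number=0):
    """Model of R,<regexp-type>,<submatch>,<nomatch>,<match-number> (illustrative)."""
    matches = list(re.finditer(pattern, value))
    if match_number < len(matches):
        # submatch 0 is the full match, 1..9 are the parenthesized groups
        return matches[match_number].group(submatch)
    fallbacks = {
        "DFLT": NO_MATCH_DEFAULT,  # fixed marker string
        "BLANK": "",               # empty text
        "ZERO": "0",               # the single character "0"
        "FIELD": value,            # the full property text
    }
    return fallbacks[nomatch]


pattern = r"for (vlan[0-9]*):"
msg = "link up for vlan12: ok, link down for vlan7: failed"

first = regex_extract(msg, pattern, 1, "FIELD")        # models R,ERE,1,FIELD
second = regex_extract(msg, pattern, 1, "FIELD", 1)    # models R,ERE,1,FIELD,1
missing = regex_extract("no match here", pattern, 1, "FIELD")
print(first, second, missing)
```

In this model the first call yields "vlan12", the second "vlan7", and the third returns the whole input text because nothing matched and nomatch is "FIELD".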
Please note: there is also a rsyslog regular expression checker/generator online tool available. With that tool, you can check your regular expressions and also generate a valid property replacer sequence. Usage of this tool is recommended. Depending on the version offered, the tool may not cover all subtleties that can be done with the property replacer. It concentrates on the most often used cases. So it is still useful to hand-craft expressions for demanding environments.
Also, extraction can be done based on so-called "fields". To do so, place an "F" into FromChar. A field in its current definition is anything that is delimited by a delimiter character. The delimiter by default is TAB (US-ASCII value 9). However, it can be changed to any other US-ASCII character by specifying a comma and the decimal US-ASCII value of the delimiter immediately after the "F". For example, to use comma (",") as a delimiter, use this field specifier: "F,44". If your syslog data is delimited, this is a quicker way to extract than via regular expressions (actually, a *much* quicker way). Field counting starts at 1. Field zero is accepted, but will always lead to a "field not found" error. The same happens if a field number higher than the number of fields in the property is requested. The field number must be placed in the "ToChar" parameter. An example where the 3rd field (delimited by TAB) from the msg property is extracted is as follows: "%msg:F:3%". The same example with semicolon as delimiter is "%msg:F,59:3%".
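The field rules above can be sketched in Python as follows. This is an illustrative model, not rsyslog's implementation: the function name extract_field is made up, and signalling "field not found" by raising LookupError is an assumption chosen for the example (the text above does not say what the replacer actually emits in that case).

```python
def extract_field(value: str, field_number: int, delimiter_code: int = 9) -> str:
    """Return the field_number-th field (1-based) of value.

    delimiter_code is the decimal US-ASCII value of the delimiter;
    the default 9 is TAB, as for a plain "F" specifier.
    """
    fields = value.split(chr(delimiter_code))
    if field_number < 1 or field_number > len(fields):
        # Field zero and out-of-range numbers are both "field not found".
        raise LookupError("field not found")
    return fields[field_number - 1]


third_tab = extract_field("alpha\tbeta\tgamma", 3)       # models "%msg:F:3%"
third_semi = extract_field("alpha;beta;gamma", 3, 59)    # models "%msg:F,59:3%"
first_comma = extract_field("alpha,beta", 1, 44)         # models "%msg:F,44:1%"
print(third_tab, third_semi, first_comma)
```

Both three-field examples yield "gamma"; only the delimiter differs.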
Please note that the special characters "F" and "R" are case-sensitive. Only upper case works, lower case will return an error. There are no white spaces permitted inside the sequence (that will lead to error messages and will NOT provide the intended result).
Each occurrence of the field delimiter starts a new field. However, if you add a plus sign ("+") after the field delimiter, multiple delimiters, one immediately after the other, are treated as a single delimiter. This can be useful in cases where the syslog message contains such sequences. A frequent case may be with code that is written as follows:
int n, m;
...
syslog(LOG_ERR, "%d test %6d", n, m);
This will result in things like this in syslog messages: "1 test      2", "1 test     23", "1 test 234567"
As you can see, the fields are delimited by space characters, but their exact number is unknown. They can be extracted properly by changing the field specifier from "%msg:F,32:2%" to "%msg:F,32+:2%".
This feature was suggested by Zhuang Yuyao and implemented by him. It is modeled after perl compatible regular expressions.
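The effect of the plus sign can be seen in a small Python model that builds the same strings as the C snippet above and splits them both ways. It is a sketch under assumptions: split_fields is a made-up helper, and the handling of leading or trailing delimiters in rsyslog is not modeled.

```python
import re


def split_fields(value: str, delimiter_code: int, collapse: bool) -> list:
    """Split value into fields; with collapse=True a run of delimiters counts once."""
    delimiter = re.escape(chr(delimiter_code))
    pattern = delimiter + "+" if collapse else delimiter
    return re.split(pattern, value)


# Same format string as in the C example: "%6d" pads the number to width 6.
samples = ["%d test %6d" % (1, m) for m in (2, 23, 234567)]

plain = [split_fields(s, 32, collapse=False) for s in samples]   # models "F,32"
merged = [split_fields(s, 32, collapse=True) for s in samples]   # models "F,32+"

for fields in merged:
    print(fields)
```

Without the plus sign, the padding spaces produce empty fields, so the position of the number varies from message to message. With it, every sample splits into exactly three fields.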
property options are case-insensitive. Currently, the following options are defined:
uppercase | convert property text to uppercase only |
lowercase | convert property text to lowercase only |
drop-last-lf | The last LF in the message (if any), is dropped. Especially useful for PIX. |
date-mysql | format as mysql date |
date-rfc3164 | format as RFC 3164 date |
date-rfc3339 | format as RFC 3339 date |
date-subseconds | just the subseconds of a timestamp (always 0 for a low precision timestamp) |
escape-cc | replace control characters (ASCII value 127 and values less than 32) with an escape sequence. The sequence is "#<charval>" where charval is the 3-digit decimal value of the control character. For example, a tabulator would be replaced by "#009". Note: using this option requires that $EscapeControlCharactersOnReceive is set to off. |
space-cc | replace control characters by spaces. Note: using this option requires that $EscapeControlCharactersOnReceive is set to off. |
drop-cc | drop control characters - the resulting string will neither contain control characters, escape sequences nor any other replacement character like space. Note: using this option requires that $EscapeControlCharactersOnReceive is set to off. |
sp-if-no-1st-sp | This option looks scary and should probably not be used by a user. For any field given, it returns either a single space character or no character at all. Field content is never returned. A space is returned if (and only if) the first character of the field's content is NOT a space. This option is kind of a hack to solve a problem rooted in RFC 3164: 3164 specifies no delimiter between the syslog tag sequence and the actual message text. Almost all implementations in fact delimit the two by a space. As of RFC 3164, this space is part of the message text itself. This leads to a problem when building the message (e.g. when writing to disk or forwarding). Should a delimiting space be included if the message does not start with one? If not, the tag is immediately followed by another non-space character, which can lead some log parsers to misinterpret what is the tag and what the message. The problem finally surfaced when the klog module was restructured and the tag correctly written. It exists with other message sources, too. The solution was the introduction of this special property replacer option. Now, the default template can contain a conditional space, which exists only if the message does not start with one. While this does not solve all issues, it should work well enough in the vast majority of all cases. If you read this text and have no idea of what it is talking about - relax: this is a good indication you will never need this option. Simply forget about it ;) |
secpath-drop | Drops slashes inside the field (e.g. "a/b" becomes "ab"). Useful for secure pathname generation (with dynafiles). |
secpath-replace | Replace slashes inside the field by an underscore. (e.g. "a/b" becomes "a_b"). Useful for secure pathname generation (with dynafiles). |
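To show why the two secpath options matter for dynafiles, here is a minimal Python model. It is only a sketch: the function names, the hostile hostname value and the /var/log/remote path layout are assumptions made up for this example.

```python
def secpath_drop(value: str) -> str:
    """Model of secpath-drop: remove every slash."""
    return value.replace("/", "")


def secpath_replace(value: str) -> str:
    """Model of secpath-replace: turn every slash into an underscore."""
    return value.replace("/", "_")


# A sender-controlled value that tries to escape the log directory.
hostile_hostname = "../../etc/cron.d"

dropped = secpath_drop(hostile_hostname)
replaced = secpath_replace(hostile_hostname)

# Hypothetical dynamic file name built from the sanitized value.
path = "/var/log/remote/" + replaced + "/messages.log"
print(dropped, replaced, path)
```

Because no slash survives either option, the sender-supplied value can no longer add directory levels to the generated file name.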
To use multiple options, simply place them one after another with a comma delimiting them. For example "escape-cc,sp-if-no-1st-sp". If you use conflicting options together, the last one will override the previous one. For example, using "escape-cc,drop-cc" will use drop-cc and "drop-cc,escape-cc" will use escape-cc mode.
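The three control character options and the "last one wins" rule can be modeled in a few lines of Python. This is an illustration of the behavior described above, not rsyslog's code; the names CC_HANDLERS, is_control and apply_options are invented for the example, and options other than the three control character modes are simply ignored by the model.

```python
# One handler per control character mode; each maps a control character to its replacement.
CC_HANDLERS = {
    "escape-cc": lambda ch: "#%03d" % ord(ch),  # "#" plus 3-digit decimal value
    "space-cc": lambda ch: " ",                  # replace by a space
    "drop-cc": lambda ch: "",                    # remove entirely
}


def is_control(ch: str) -> bool:
    """Control characters are ASCII 127 and all values below 32."""
    code = ord(ch)
    return code < 32 or code == 127


def apply_options(value: str, options: str) -> str:
    """Apply a comma-separated option list; the last conflicting mode overrides."""
    mode = None
    for name in options.lower().split(","):  # option names are case-insensitive
        if name in CC_HANDLERS:
            mode = name
    if mode is None:
        return value
    handler = CC_HANDLERS[mode]
    return "".join(handler(ch) if is_control(ch) else ch for ch in value)


text = "a\tb"
print(apply_options(text, "escape-cc"))
print(apply_options(text, "escape-cc,drop-cc"))
print(apply_options(text, "drop-cc,escape-cc"))
```

For the input "a", TAB, "b", the three calls print "a#009b", "ab" and "a#009b" in this model.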