Introduction
This page is a 'catch-all' for various topics that are (generally) too small to justify a topic or heading of their own.
It's kind of an 'administrative trivia' page.
Limitations of Syslogd2
It is reasonable to ask about the implementation limits of Syslogd2:
- Configuration file(s):
- Number of lines per file: 4095
- Total Number of configuration files: 127
- Maximum 'nesting' depth for configuration files: 127
- Number of additional facilities: Default: 32, Range: 0 to 1000 inclusive. (Must recompile to change this value.)
- Maximum lines in a FIFO queue: SEM_VALUE_MAX: (Non-Linux posix systems: 32767. Linux: 2147483647.)
- Maximum number of threadpools: No programmed limit - Depends on system resources and 'ulimit' settings.
- Maximum number of entries in internal-cache: No programmed limit. Cache is implemented as a set of self-optimizing binary trees in memory.
- Maximum number of threads supported: No programmed limit - Depends on system resources and performance.
- Highest threadpool number supported: Largest positive 32-bit integer.
- Max length of a logical config file input-line: 4096 bytes
- Max length of a hostname / pathname: 255 bytes
- Max length of a filename (excluding path): 64 bytes
- Max length of a 'finalized' output line: 1500 bytes
- Max length of the 'msg-string' component of a message: 1024 bytes
- Max number of lines per filter-file: No programmed limit. All filter elements are dynamically-allocated.
- Max number of entries per threadpool: No programmed limit. Depends on system resources and performance (memory speed, thread counts, etc.).
Compile and Install
Contents:
Initial Compilation for Test & Evaluation
Customizing Syslogd2 for Production
Because no one-size-fits-all set of Syslogd2 options satisfies all network- and host-management needs, Syslogd2 allows you to compile multiple
variations at one time (with each variation, or 'variant', having a different set of compiled-in features). This creates a 'deployment set' of binaries, allowing different
hosts to use different variants of Syslogd2 based on their role and syslog-processing needs.
In the 'default set' of Syslogd2 binaries, there are (currently) 8 variations as described below, ranging from ultra-small (just 2 threads and no internal queues) to
the 'mega' variant that can easily employ hundreds of threads and process thousands of events per second (targeted for firewalls and central-collection stations).
For those new to Syslogd2, I recommend that you start with the default set of configurations until you get a feel for how Syslogd2 works, then re-compile a set of
binaries for your actual roll-out after determining how many variations of Syslogd2 you will field and to which 'classes' of hosts.
Initial Compilation for Test & Evaluation
The following text is also in the 'INSTALL' file in the
build directory: ----------------------->
Step 1: Obtain and Unpack the Software
- Download the Syslogd2 package file.
- Unpack it with
tar -xvf <filename>
- Change into the created directory. You are now in the build directory.
Step 2: Compile the code.
The autoconf process scans the operating system to find various header files, functions, and library packages needed by Syslogd2.
It creates a script file (./configure) that is run to create header-files and makefiles from template files (./shared/config.h.in
and Makefile.in) provided by Syslogd2.
The package that is most likely to be missing in most distributions is the ncurses development package containing the ncurses header files.
The ncurses library would have been installed to support the text menus used to install the Linux operating system, but the
development package containing the header files required for code compilation would not have been required. These header files
are needed to support the command-tool.
Run autoconf and configure.
autoconf
./configure
If you need to install additional packages, do so, then run autoreconf and re-run configure:
autoreconf
./configure
The configure shell-script creates an include file (shared/config.h) from the template file (shared/config.h.in) that records the
autoconf findings for the compiler.
It also creates ./Makefile from the ./Makefile.in template file to use in compiling the code.
Compile the code:
make
This will produce 8 binary variations of Syslogd2 code, each containing different features and run-time capabilities.
- syslogd2t: A tiny version.
- syslogd2s: A small version.
- syslogd2m: A medium version.
- syslogd2l: A large version.
- syslogd2h: A huge version.
- syslogd2o: An output version.
- syslogd2g: A mega version.
- syslogd2d: A demonstration version.
In the ./tools directory, there will also be a set of compiled binaries.
Most of these will have 3- to 5-letter names (abcxy) where:
- 'a': [t]ransmit or [r]eceive.
- 'b': [s]tream protocol (includes TCP and named-pipe) or [d]atagram protocol.
- 'c': IPv[4], IPv[6], [u]nix, [p]ipe
- 'x': [f]ile (as in 'from a file'), [c]ommand (Collectively, these are the 'command-tool')
- The command-tool files (tsuc*) are matched to the binaries and take the last letter (actually a string) of the binary for identification.
There are a couple of other programs there as well: sscalc: A selector-string calculator for working with syslog selector-strings, and checknet: a tool
using Syslogd2's internal library to check the state of the network. 'popen' is not yet completed code.
<----------------------- End of 'INSTALL' file text
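As an illustration of the tool-naming scheme described in the INSTALL text, the following Python sketch decodes a tool name into its parts. The function name decode_tool_name and the keys it returns are hypothetical (they are not part of Syslogd2), and the sketch assumes that a fourth letter of 'f' or 'c' is always the source letter, with any remaining characters being the variant suffix.

```python
# Letter tables taken from the naming scheme in the INSTALL text.
DIRECTION = {"t": "transmit", "r": "receive"}
PROTOCOL = {"s": "stream", "d": "datagram"}
FAMILY = {"4": "IPv4", "6": "IPv6", "u": "unix", "p": "pipe"}
SOURCE = {"f": "file", "c": "command"}


def decode_tool_name(name):
    """Decode a tool name such as 'tsucm' (hypothetical helper)."""
    if len(name) < 3:
        raise ValueError(f"tool name too short: {name!r}")
    try:
        decoded = {
            "direction": DIRECTION[name[0]],
            "protocol": PROTOCOL[name[1]],
            "family": FAMILY[name[2]],
        }
    except KeyError as exc:
        raise ValueError(f"unrecognized letter {exc} in {name!r}") from None
    rest = name[3:]
    # Assumption: an 'f' or 'c' in fourth position is the source letter;
    # whatever follows is the suffix that matches a Syslogd2 binary.
    if rest and rest[0] in SOURCE:
        decoded["source"] = SOURCE[rest[0]]
        decoded["suffix"] = rest[1:]
    else:
        decoded["source"] = None
        decoded["suffix"] = rest
    return decoded
```

For example, decode_tool_name("tsucm") describes a transmitting, stream-protocol, unix-socket command-tool matched to the 'm' (medium) binary.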
Customizing Syslogd2 for Production
Contents:
Planning the number and content of binary versions
Executing the plan
Planning the number and content of binary versions
There are 3 files to edit prior to re-compilation.
- ./shared/config.h : This file is created by the './configure' shell script.
It contains the results of various system-tests that identify installed libraries and header files used by Syslogd2.
It also contains the number of 'extra' facilities that may have been set on the command-line of './configure' in the variable LOG_EXTRAFACILITIES.
- ./Makefile : This file is also created by the './configure' shell script.
It not only details how to compile Syslogd2, but (for our purposes) contains instructions on how many variants of Syslogd2 to compile and what
extensions they will use.
- ./shared/defines.h: This file contains the list of optional features for each compiled variant of Syslogd2
Once the binaries have been compiled the first time (unless you need to change installation-directory parameters), there is no need to either run autoreconf
or to re-run ./configure. You can directly edit the files concerned and re-run the 'make' command to recompile the set.
(Manual modification of the applicable files is both faster and easier.)
./shared/config.h: This file only has one parameter that is user-settable: LOG_EXTRAFACILITIES. The possible range of values is 0 to 1000 inclusive.
./Makefile: The only lines of interest in this file occur near the top of the file:
# The suffix values cannot be the same character string differentiated only by capitalization. Attempts to do so will result in binaries not being created by make.
VARVALUES = 1 2 3 4 5 6 7 8
# VARNAMES = .tiny .small .medium .large .huge .output .mega .demo
VARNAMES = t s m l h o g d
# These are the variant indices that will actually be built by this makefile (4 = 'large', 5 = 'huge'). These are the INDICES into the above lists of definitions that will be built.
VARIANTS = 1 2 3 4 5 6 7 8
The first two lines assign numeric values to a list of strings that will be appended to 'syslogd2' to get the names of binaries and to 'tsuc' to get the names
of corresponding command-tools.
The 3rd line (as the comment indicates) determines what gets compiled. If it only contains '3 4 5', then only the medium, large and huge models of Syslogd2 would be compiled.
The commented-out VARNAMES line containing '.tiny', '.small', etc. can be used in place of the one containing just letters. In that case the suffix to 'syslogd2' would be '.tiny' instead of 't', '.small' instead of 's', etc.
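To preview which files a given Makefile edit would produce, the relationship between the three lists can be sketched in Python. This is only an illustration of the description above: the function planned_binaries is hypothetical (it does not ship with Syslogd2), and the case-insensitive uniqueness check is an assumption based on the comment at the top of the Makefile fragment.

```python
def planned_binaries(varvalues, varnames, variants,
                     base="syslogd2", tool_base="tsuc"):
    """Return (daemon, command-tool) name pairs for the selected variants."""
    if len(varvalues) != len(varnames):
        raise ValueError("VARVALUES and VARNAMES must have the same length")
    # Suffixes may not differ only by capitalization.
    lowered = [name.lower() for name in varnames]
    if len(set(lowered)) != len(lowered):
        raise ValueError("VARNAMES entries must be unique ignoring case")
    suffix_for = dict(zip(varvalues, varnames))
    unknown = [v for v in variants if v not in suffix_for]
    if unknown:
        raise ValueError(f"VARIANTS entries not in VARVALUES: {unknown}")
    return [(base + suffix_for[v], tool_base + suffix_for[v]) for v in variants]
```

With the default lists and VARIANTS reduced to '3 4 5', the sketch reports only the medium, large and huge pairs: planned_binaries([1, 2, 3, 4, 5, 6, 7, 8], list("tsmlhogd"), [3, 4, 5]).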
./shared/defines.h: Near the top of this file you will find two sets of '#define' statements:
The first assignment-block defines a human-readable name for each numeric value of 'VARIANT' that the Makefile will pass to the 'C' code via the compiler's command-line.
For each individual binary compiled, a different value of 'VARIANT' will be passed to the 'C' code, resulting in a different set of features being assigned to that binary variant.
---------------------------------------------------------------------------
#define DEMO 8
#define MEGA 7
#define OUTPUT 6
#define HUGE 5
#define LARGE 4
#define MEDIUM 3
#define SMALL 2
#define TINY 1
---------------------------------------------------------------------------
The subsequent blocks (below the list of '#undef' statements) declare which symbols will be defined for each iteration of Syslogd2 (each 'binary image').
There are a total of 8 of these blocks, similar to:
---------------------------------------------------------------------------
#if VARIANT == MEGA
#define CAP_WHATIF
#define CAP_PIPESIN
#define CAP_TAILFILES
#define CAP_STREAMIN
#define CAP_STREAMOUT
#define CAP_SPOOLFILES
#define CAP_HOUSEKEEPING
#define CAP_FILTERSIN
#define CAP_FILTERSOUT
#define CAP_FILEROTATE
#define CAP_COMMAND
#define CAP_USERTHREADS
#define CAP_CACHE
#define CAP_STATS
#define CAP_KERNELTHREADS
#define CAP_OUTPUTTHREADS
// #define CAP_RECONFIG
#define CAP_WORKERTHREADS
// #define CAP_SINGLETHREAD // not recommended to combine with options above outside of experimentation or testing.
// #define CAP_SINGLEPOOL // not recommended to combine with options above outside of experimentation or testing.
// #define CAP_SINGLEPORT // not recommended to combine with options above outside of experimentation or testing.
#endif
----------------------------------------------------------------------------
Executing the plan
Once you have decided on how many different variants of Syslogd2 and what suffixes to use for your deployment, the mechanics are a straightforward, 3-step process.
Don't forget to make backups of the files before editing them.
Two quick notes on comments:
(1) Two slash characters ('//') start a comment that runs to end-of-line.
(2) The sequences '/*' [slash + asterisk] and '*/' [asterisk + slash] act like 'bookends' for a comment-block anywhere in a line or across multiple lines.
- Edit the file shared/defines.h with a text editor:
- At the top of the file, find the block of lines that assigns configuration names to numeric values.
It will be somewhere around line 30 and identified by the text 'Step 1 of custom-compile.'
Edit the list of '#define' statements so it fits the number of unique images you wish to compile.
Feel free to modify the names in these statements as well.
There should be no gaps in the numeric sequence.
- Now find the section of lines that assign features to each configuration-name created above.
This section of code will start somewhere around line 160 and will be identified by the text 'Step 2 of custom-compile.'
--> This section is a sequence of definition-blocks.
There is one block for each configuration-number specified above.
- The '#if ...' statement in each block of definitions must exactly match one of the configuration-names defined above or it will be ignored.
Each line of these '#define' blocks is case-sensitive.
- Each block starts with '#if VARIANT == name' followed by an optional comment.
- Each block ends with an '#endif' statement.
- The indentation is optional.
Add or remove blocks based on the quantity of configurations you will have.
Add or remove comments to define the features to be compiled into each binary instance.
- Now edit the file ./Makefile with a text editor. Specifically edit the 3 lines starting around line 19:
- The VARVALUES line should be a simple 'count'. This contains the numeric values for 'VARIANT' that will be sent to the 'C' code.
- The VARNAMES line should contain unique (case-sensitive) suffixes - one for each entry in the VARVALUES array.
(The suffix will be appended to the base name 'syslogd2' and to the base command-tool name 'tsuc'.)
- The VARIANTS line should contain only those entries from the arrays defined above that you wish to actually compile. In most cases this
will be a duplicate of the VARVALUES array.
- Finally, edit the ./shared/config.h file to change the value of LOG_EXTRAFACILITIES if desired. The range of this value is 0 to 1000 inclusive.
The higher the value, the more memory Syslogd2 will require for each output record.
Now that all customizations have been made, run 'make clean', then re-run the make command to re-compile your binaries:
root#> make clean
root#> make
Now find the ./install sub-directory containing 2 sub-directories: One for installing on systemd-based systems and one for systems that use /etc/init.d files.
Please refer to the 'Install' file in the directory corresponding to your system-startup-type for further instructions.
Note: Because Syslogd2 is network-aware and uses delayed-resolution, it does NOT require the network to be up-and-running before it is started.
This allows Syslogd2 to be started in run-states that do not contain network services or before systemd starts the network service.
The ./install/systemd directory contains a service file and instructions for configuring Syslogd2 to start before systemd initializes the network.
Using network-state spooling, traffic collected prior to network startup can still be forwarded to remote network hosts after startup.
The Configuration File
Contents:
Re-evaluating the syslog configuration-file (Concepts)
Re-evaluating the syslog configuration-file (Syntax & Rules)
Re-evaluating the syslog output-line syntax
Using SoftComment and Skip to Share Config Files With Legacy Daemons
Re-evaluating the syslog configuration-file (Concepts)
Though the changes introduced to the Syslogd2 configuration file look extensive, they are conceptually quite minimal (as minimal as possible
consistent with providing the configuration information that Syslogd2 is capable of using). The following is a summary of the conceptual changes to the
Syslogd2 configuration file format and syntax.
- The traditional syslog configuration file consists of output-lines and comments.
To this, Syslogd2 adds a 'configuration-line' identified by a tilde ('~') as the first (non-whitespace) character of the line.
The configuration-line contains command-line options that have been moved into the configuration file from the actual command-line.
- For purposes of interoperability (allowing the same configuration file to be used for hosts running either Syslogd2 or legacy syslog daemons),
the identifying tilde of a Syslogd2 configuration-line may optionally be preceded by a single hashtag ('#')
(to transform the line into a 'comment' from the viewpoint of legacy syslog processors).
- Syslogd2 introduces the idea of a soft comment. By having the softcomment functionality in an always-on
state for configuration statements, softcomment can be turned on and off as needed to control output-line parsing. (Since it is never 'off' for commands,
using a soft-comment to (re-)enable softcomment is non-contradictory.)
- An output-line-option-list is conceptually added to the end of each output-line to contain Syslogd2-specific options and settings.
This option-list may alternatively be placed in a soft-comment-line that precedes the output-line so the actual output-line itself does not need to be modified.
Re-evaluating the syslog configuration-file (Syntax & Rules)
Re-evaluating syslog configuration-file syntax
Re-evaluating syslog configuration-file rules
Re-evaluating the syslog configuration-file (Syntax)
- With only a very few exceptions, any command-line option defined by Syslogd2 may be placed into the configuration file instead.
When moving configuration statements from the actual command-line into the configuration file, all quotes and escape characters are removed.
The exceptions (for use on actual-command-line-only) are:
- --configfile (-c)
- --help (-h or -?)
- --version (-v)
- --TestConfig (-T)
- Likewise there are a very few 'command-line-options' that only make sense when used in the configuration file. These options
(related to controlling the parsing of the file) are:
- --Network
- --Skip
- --SkipTo
- --enable/disable SoftComment
- With only a very few exceptions (parsing-control options and log-file specifications), Syslogd2 reads all configuration entries into temporary
structures before analysing them to build its working structures.
This process removes all dependencies on the order in which various configuration-commands are encountered in the configuration file.
- In general, Syslogd2 will latch each configuration-command and option to detect and prevent duplication or conflicts unless the
documentation specifically states that multiple instances of that setting or command may be used.
- Syslogd2 maintains a strict first-come-first-serve policy toward configuration-file parsing. In some cases, this policy is relied upon to provide
certain results.
- One example is the implementation of the 'default' Linux and IP socket ports.
These ports are defined by Syslogd2 with line-numbers that are higher than the total lines in the file.
If the user re-defines these ports with customized options but fails to disable the creation of the default ports, then upon detecting the conflict during build,
Syslogd2 will apply first-come-first-serve and will (silently in this case) ignore the (automatically-created) default settings in favor of the user over-rides.
- A second example is that any command placed directly on the command-line (effectively line 0 [zero]) cannot be over-ridden or modified by lines in the configuration
file itself. This can be used to 'force' selected settings from the command-line regardless of configuration-file contents.
- All keywords in Syslogd2 are non-case-sensitive except single-letter (short-form) aliases.
(Short-form keywords are case-sensitive to allow for the use of upper- and lower-case values for different purposes).
Whitespace before and after commas and equal-signs is optional.
All equal-signs used in command-line options are optional if replaced with whitespace with one exception:
Any (comma-separated) output-line-option-list that follows a (comma-separated) list of users must start with an option that contains an equal-sign.
The Syslogd2 parser needs the equal-sign to determine the end of the user-list and the start of the option-list.
- A Syslogd2 line may be continued over multiple lines to the maximum size of the read-buffer (currently 8 KB).
The continuation character is the backslash ('\') and must be the last character in the line (no comments or whitespace may follow it).
When using the continuation character, whitespace at the front of the next line will be ignored.
If continuing a soft-comment, a soft-comment-hashtag may be added at the start of the continuation-line 'for free' (not counting as a 'hard-comment-character').
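The continuation rule in the last item above can be sketched in Python as follows. This is a minimal illustration, not Syslogd2's parser: the function join_continuations is hypothetical, the 8 KB limit is taken from the text, and the optional soft-comment hashtag on continuation-lines is deliberately not modeled.

```python
MAX_LOGICAL_LINE = 8 * 1024  # read-buffer size quoted in the text


def join_continuations(lines, max_bytes=MAX_LOGICAL_LINE):
    """Join physical lines ending in a backslash into logical lines."""
    logical = []
    pending = None  # text accumulated so far for an unfinished logical line
    for raw in lines:
        line = raw.rstrip("\r\n")
        if pending is not None:
            # Whitespace at the front of a continuation-line is ignored.
            line = pending + line.lstrip()
        if len(line.encode("utf-8")) > max_bytes:
            raise ValueError("logical line exceeds the read-buffer size")
        # The backslash must be the very last character: trailing
        # whitespace or a comment after it means "no continuation".
        if line.endswith("\\"):
            pending = line[:-1]
        else:
            pending = None
            logical.append(line)
    if pending is not None:
        # The file ended on a continuation; keep what was collected.
        logical.append(pending)
    return logical
```

Note that only the line terminator is stripped before the test for a trailing backslash, so a backslash followed by a space is treated as ordinary text, as the rule requires.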
Re-evaluating the syslog configuration-file (Rules)
Syslogd2 also modifies the way the configuration-file is traditionally processed. The startup process is divided into multiple phases.
- The first (parse phase) reads data into temporary structures. Other than parsing-control entries (below), nothing is actually
analysed or built in this process.
- The second (build phase) uses the parsed structures to build an off-line configuration.
During this process, Syslogd2 rejects any configuration entries that are 'impossible' to satisfy, such as:
- An IP socket definition when the IP (inet) support has been explicitly disabled
- A user-defined socket when CAP_SINGLEPORT is declared
- A filter-file specification when the filter-support features are not present
- etc...
- This process also 'weeds out' (rejects) invalid network-specifications (such as 'julie' or 'home', which do not include 'other' or 'local'), as well
as any invalid formats for IP addresses or other sources/targets.
- Other entries (things that *might* be fixable without a complete re-parse of the configuration), such as missing directories or missing input files,
are accepted, but with a 'pending' (non-active, or 'unresolved') status.
- The build phase attempts to resolve all sources and destinations. Success results in sockets being built (but not initialized) and
output lines being 'grouped' by common destination.
IP outputs that can not be resolved but that are provided in numeric form get a special status of 'usable'. These connections will be opened with
the intent that they will eventually be resolved and 'merged' with any other expressions of the same host that may have been previously resolved.
Failure to resolve means that source or destination is not yet usable, though efforts to resolve it during run-time will continue. (Eventually that
missing directory may get created or that unreachable DNS may become available or that missing input file will become available or that missing application
will be started, etc...)
- Any conflicts between any two connections are identified and resolved based on first-come-first-serve. All such collisions are logged to the error file.
Examples include:
- The declaration of both '::' and '0.0.0.0' as listening ports ('0.0.0.0' is included in the definition of '::' per Linux IP implementation).
- The declaration of '*' and any IP address on the same port using the same protocol.
- The declaration of a filesystem path as both a Linux socket and a pipe, a file, or a character device (or any such combination).
- The declaration of the same Linux socket as both stream and datagram in two different declarations.
- The declaration of the same (non-default) threadpool-id for both socket and file input. (These two threadpool types are mutually exclusive.)
- etc...
- The third (stage phase) does 'clean-up' and moves all temporary-location entries into production locations.
'Clean-up' entails:
- Any threadpools that are declared but that have no actual connections (input or output) to service are deleted.
- Duplicate sources or destinations are deleted.
- Any permanently-invalid entries are deleted (invalid IP address formats, entries foreclosed by administrative configuration [such as disabled
IPv6, disabled user threads, or disabled cache-support], etc.).
The third phase also initializes all memory buffers and system-services required for multi-thread operation.
- The final phase of startup (execution phase) starts the various threads defined for each threadpool, then starts the various schedules.
This phase also makes the initial requests to the parent thread to run the first execution iteration of timed and scheduled routines after which
the routine itself (before terminating) will schedule its next execution.
Because of the phased startup, the order in which the configuration file startup commands are given makes no difference.
--> The phased startup is what allows the TestConfig feature to be called at different points.
--> The phased startup also allows the creation of 'deployment modules'.
For example, a MariaDB module might consist of a small '.conf' file containing one or more input specifications and
one or more output specifications combined with a couple of filter-files.
When copied to a pre-determined directory, these commands would take effect at the next syslog restart.
Using the options to the --includeconfig option, the pre-declared thread pool would be activated instead of deleted during the cleanup portion of the staging phase.
Re-evaluating the syslog output-line syntax
The output-line (Location Component)
The output-line (Selector-String Component)
Perhaps the most notable change in Syslogd2's version of the syslog output-line is the addition of an optional comma-separated list (offset by a comma)
at the end of the output-line (following the 'location' field). Specifically, the output-line that was previously limited to
selector-string + whitespace + location
might now look the same or might look like:
selector-string + whitespace + location + ',' + comma-separated-options + ['##' + optional in-line comment]
The addition of the option-list allows for better definition of the locations, but more importantly, it provides an open-ended mechanism for specifying parameters
that are specific to a given output-line.
-->For example, the option 'stream' combined with a location of '@hostname' specifies a TCP connection, while the same 'stream' option combined with
a location of '@/tmp.socket' specifies a streaming Linux socket.
-->Additionally options can be used to specify which threadpool a specific connection should be handled by, as well as other location-specific parameters.
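The extended output-line layout shown above (selector-string, whitespace, location, then an optional comma-separated option-list and an optional '##' comment) can be illustrated with a small Python sketch. The function parse_output_line and the host name '@loghost' are hypothetical. The sketch also makes simplifying assumptions: every option is written either as 'name' or 'name=value', and the location is a single item rather than a comma-separated list of users.

```python
def parse_output_line(line):
    """Split one output-line into selector, location, options and comment."""
    # Everything after '##' is the optional in-line comment.
    body, _, comment = line.partition("##")
    fields = body.split(None, 1)
    if len(fields) != 2:
        raise ValueError("expected: selector-string, whitespace, location")
    selector, rest = fields
    parts = [part.strip() for part in rest.split(",")]
    location = parts[0]
    options = {}
    for item in parts[1:]:
        if not item:
            continue  # tolerate a stray trailing comma
        name, has_value, value = item.partition("=")
        # Option names are kept exactly as written, since single-letter
        # aliases are described as case-sensitive.
        options[name.strip()] = value.strip() if has_value else None
    return {
        "selector": selector,
        "location": location,
        "options": options,
        "comment": comment.strip(),
    }
```

A traditional line such as 'mail.* /var/log/mail' passes through with an empty option set and an empty comment, which mirrors the point that the extended syntax reduces to the traditional one when no options are used.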
Unfortunately, the option-list by itself is not sufficient to meet all the needs of Syslogd2 so additional re-evaluation is warranted.
Regardless of the additions made to the output-line syntax by Syslogd2, it should always be kept in mind that in the degenerate case (the case where
no special output options are applied at all and no unique features, such as TCP or Linux socket pathnames, threadpool identifiers or other options, are used),
there is no need for the option-list, nor is the optional in-line comment needed. This reduces the Syslogd2 output-line to the same traditional format that
most Unix and Linux administrators already recognize and are used to.
A final thought before ending this section: options in the option-list need not be limited to modifying the location. They may also be applied to elements of the selector-string.
The Output-Line Syntax (Location Component)
With the addition of an option-list to the syntax of the output-line, Syslogd2 can provide several additional output-features. Most (though not all) of these features
affect the entire (composite) output stream that is going to a specific location.
Syslogd2 views the 'location' component of the output stream as similar to a mass-transit stop in a mass-transit system.
There are multiple destinations (pick-up/drop-off stops) in the system, but for any given location (stop), the sources of traffic (whether selector-string components or people)
will vary as to where they came from and what they are carrying. Just about the only thing the various selector-string components in any given output-line have in common
is that they are all going to the same location (mass-transit stop) and may need access to shared resources (the connection itself and a file for spooling).
Syslogd2 has divided the set of output options into two classifications: Location-specific and
Selector-String-specific.
Location-specific options are those that cannot vary between selector-string elements without causing insurmountable
'downstream' issues (imagine a (long-distance) mass-transit bus having 3 drivers, each insisting on a different route to get to the same destination).
Location-options help to uniquely identify the host (port and protocol), generally affect spooling to a particular host (spoolfile size, name, action), define characteristics
of the location (uid/gid/mode), or affect the format of data transmitted over the connection (format, allmsgs).
The Output-Line Syntax (Selector-String Component)
Selector-String Element Syntax
Selector-String Options: Parsing and Processing
Selector-String Element Syntax
I believe it may now be classified as a 'sin' to create a new syslog daemon without expanding (in some way) on the syntax of selector-string elements.
Rather than invoke the wrath of the general public, Syslogd2 offers the following as its current 'submission' to the 'great selector-string syntax race':
The general syntax is (with no internal whitespace allowed):
<facility-expression> + '.' [ + <priority-options>] + <priority-expression>
<facility-expression> may be:
- The keyword 'None': This produces a 'no-op' element (a do-nothing) that may be useful in some SoftComment scenarios.
- The keyword '*': This wild card has the usual meaning of 'all facilities'.
- A single facility: This selection is enhanced through the use of Syslogd2's 'reserved' and 'extra' facilities.
- A comma-separated list of facilities followed by a single '.[priority-options]<priority>' clause.
This produces the same effect as listing each facility separately with the same '.[priority-options]<priority>' option.
- A range of numeric facility values (<facility1>-<facility2>) followed by a single priority-clause.
This syntax is similar to the comma-separated list of facilities above, but substitutes an inclusive range of facilities for the comma-separated list of facilities. For example:
kern-ftp.warn;local0-local7.*;reserved0-reserved3.none;extra0-extra31.none; ...
(Facility ranges are available in Syslogd2 release 1.0.3 and higher).
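The facility lists and ranges described above can be illustrated by expanding them into numeric facility codes. The Python sketch below is an assumption-laden illustration rather than Syslogd2's own logic: the function expand_facilities is hypothetical, the table holds only the standard Linux syslog facility numbers, and Syslogd2's 'reserved' and 'extra' facilities are left out because their numeric values are not given in this text. Mixing ranges and single names in one comma-separated list is also an assumption. The '*' and 'None' keywords are not handled.

```python
# Standard Linux syslog facility codes (from <syslog.h>).
FACILITIES = {
    "kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4, "syslog": 5,
    "lpr": 6, "news": 7, "uucp": 8, "cron": 9, "authpriv": 10, "ftp": 11,
    "local0": 16, "local1": 17, "local2": 18, "local3": 19,
    "local4": 20, "local5": 21, "local6": 22, "local7": 23,
}


def _facility_code(name):
    try:
        return FACILITIES[name.strip().lower()]  # keywords ignore case
    except KeyError:
        raise ValueError(f"unknown facility: {name!r}") from None


def expand_facilities(expression):
    """Expand 'kern,mail' or 'kern-ftp' into a list of facility codes."""
    codes = []
    for item in expression.split(","):
        first, is_range, last = item.partition("-")
        low = _facility_code(first)
        high = _facility_code(last) if is_range else low
        if low > high:
            raise ValueError(f"range runs backwards: {item!r}")
        # Ranges are inclusive at both ends.
        codes.extend(range(low, high + 1))
    return codes
```

Under these assumptions 'kern-ftp' expands to codes 0 through 11 and 'local0-local7' to codes 16 through 23, the same sets that listing each facility separately would select.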
<priority-expression> may be:
- The keyword 'None': This clears all priority levels for this facility (including any settings made earlier in the selector-string).
- The keyword '*': This wild card has the usual meaning of 'all priority levels'.
- A single priority: This selection has the expected meaning of a single priority value.
All <priority-options> may be used in combination with each other:
- <No option provided> :: (user.notice;...) This is the default. The meaning is 'the set of priorities that is numerically less than or
equal to the stated priority value.'
- '<', '=', or '>' :: These options result in the set of priorities that would be expected by a mathematical comparison against the priority-expression component.
(Not all possible values will make sense and multiple expressions may result in the same priority-set:
'user.<=warn' is the same as just 'user.warn' for example.)
- '!' :: This is a 'logical not'. For example: 'user.!=warn' means 'all priority levels that are not equal to warn'.
Likewise 'user.!>notice' is the same as 'user.<=notice', which is the same as just 'user.notice'.
- '~' (tilde) :: This is a 'bitwise not' or 'negate'. Instead of setting bits that are specified by the selector-string component, the negate
symbol causes Syslogd2 to clear those bits.
This works something like the priority value of '.none' but at a more granular level. ('user.none' and 'user.~*' produce the same results when
encountered in a selector-string.)
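The interaction of the priority-options can be checked with a small Python model. This is a sketch under stated assumptions, not Syslogd2's implementation: the functions priority_bits and apply_clause are hypothetical, priorities use the standard syslog numbering (emerg = 0 through debug = 7), and bit n of the mask is assumed to stand for priority n.

```python
PRIORITIES = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warn": 4, "warning": 4, "notice": 5, "info": 6, "debug": 7,
}
ALL_LEVELS = set(range(8))


def priority_bits(clause):
    """Return (bitmask, clear) for a clause such as '!=warn' or '~*'."""
    split = 0
    while split < len(clause) and clause[split] in "<=>!~":
        split += 1
    options, name = clause[:split], clause[split:].lower()
    if name == "none":
        return 0xFF, True  # clears every priority level
    if name == "*":
        selected = set(ALL_LEVELS)
    else:
        if name not in PRIORITIES:
            raise ValueError(f"unknown priority: {name!r}")
        level = PRIORITIES[name]
        if not any(op in options for op in "<=>"):
            # Default: the stated priority and everything numerically below it.
            selected = {p for p in ALL_LEVELS if p <= level}
        else:
            selected = {
                p for p in ALL_LEVELS
                if ("<" in options and p < level)
                or ("=" in options and p == level)
                or (">" in options and p > level)
            }
    if "!" in options:  # logical not: invert the selected set
        selected = ALL_LEVELS - selected
    mask = sum(1 << p for p in selected)
    return mask, "~" in options  # tilde: clear these bits instead of setting


def apply_clause(mask, clause):
    """Apply one priority clause to a facility's current priority mask."""
    bits, clear = priority_bits(clause)
    return mask & ~bits if clear else mask | bits
```

In this model apply_clause(0, "<=warn") and apply_clause(0, "warn") give the same mask, as do "!>notice" and "notice", and both "~*" and "none" reduce any mask to zero, matching the equivalences stated above.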
Selector-String Options: Parsing and Processing
Traditional thinking about the syslog configuration file has conditioned people to consider the selector-string component as a monolithic,
indivisible element.
Syslogd2 challenges this assertion based on the grounds that traditional syslog processors and the administrators that use them simply never had a reason
to think otherwise.
Syslogd2 recognizes that each selector-string element represents a different (unique) stream of data that
has been isolated from all other traffic for a reason.
Isolating data in such a way always implies the intent to isolate the handling of the data in this stream in order to facilitate the identification and
routing of the syslog entries to specific destinations.
As long as all selector-string elements are treated the same (as long as there is no reason for individualized treatment), the selector-string will always degenerate to
the simple list of elements that everyone is familiar with.
However, Syslogd2 output-line facilities and features
provide abilities and incentives to treat the different data-streams represented by different selector-string elements as individual and unique entities with
potentially different processing needs.
The primary consequence of these needs is that the (no-longer-monolithic) selector-string needs to be re-evaluated and re-interpreted so that each component may
have its own list of options that will control how that particular data-stream should be processed.
Syslogd2 allows each output-line to be split into multiple lines with each line containing a portion of the overall selector-string.
Splitting the original output-line in this way allows different selector-string options to be specified
in the (new) option-list component located at the end of the output-line.
Where the traditional format said "Send the data-streams represented by selector-components 1, 2 and 3 to location X", the expanded Syslogd2 format appends
"-- but process the data-streams differently before sending them - use options from list 1, list 2, and list 3 respectively". These options may specify different
output filters, different spooling options or other parameters that control when, or what portions of, each data-stream to send.
A selector-string in an output-line may look like this for a traditional syslog processor:
kern,user,mail,daemon.*;local0.warn @<some IP location>
The same output-line may look like this when specified with options in Syslogd2:
## The 1st data-stream could contain kernel-generated, security-related data that was selected for forwarding (no expiration).
## It also sets the location-specific options for this output-location because it has the lowest line number.
kern.* @<some IP location>, filter=filter-file,sf,stream,spoolfileage=0, id=3, format=clock
user,mail.* @<some IP location> ## These streams will be discarded if unable to be transmitted.
## This stream contains err msgs from raid controllers and other applications.
daemon.* @<some IP location>, filter=2nd-filter-file,sf, spoolfileage=5m
## The final stream could contain network-device-access-activity by network-engineers.
local0.warn @<some IP location>, spool, network=other, sf, spoolfileage=0
In the break-out above, only the first entry needs to specify 'stream', 'format', etc., since all subsequent entries will have their location options ignored
due to first-come-first-serve.
Note also that were it not for the options being applied to different elements of the selector-string, the 'broken out' string would 'degenerate' to the same
traditional selector-string as above.
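The 'degenerate' case can be illustrated with a short Python sketch. This is not Syslogd2 code: it simply strips the comments and option-lists from a set of split output-lines and merges the remaining selector components per location. The function name and the address 192.0.2.10 are placeholders, and the sketch assumes that every line has both a selector and a location and that a location contains neither whitespace nor commas.

```python
def degenerate(lines):
    """Drop '##' comments and option-lists, then merge selectors per location."""
    merged = {}  # location -> selector components, in first-seen order
    for line in lines:
        text = line.split("##", 1)[0].strip()  # remove any trailing comment
        if not text:
            continue
        selector, rest = text.split(None, 1)  # selector, then location[, options]
        location = rest.split(",", 1)[0].strip()  # option-list starts at first comma
        merged.setdefault(location, []).append(selector)
    return [";".join(parts) + " " + loc for loc, parts in merged.items()]


split_lines = [
    "kern.*       @192.0.2.10, filter=filter-file,sf,stream,spoolfileage=0, id=3, format=clock",
    "user,mail.*  @192.0.2.10  ## These streams will be discarded if unable to be transmitted.",
    "daemon.*     @192.0.2.10, filter=2nd-filter-file,sf, spoolfileage=5m",
    "local0.warn  @192.0.2.10, spool, network=other, sf, spoolfileage=0",
]

combined = degenerate(split_lines)
```

The result is a single line whose selector-string ('kern.*;user,mail.*;daemon.*;local0.warn') selects the same streams as the traditional 'kern,user,mail,daemon.*;local0.warn' form.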
[Top of Section]
One of the major design goals for Syslogd2 is to facilitate the deployment of Syslogd2 in a variety of potential scenarios.
One of those potential scenarios envisions that Syslogd2 deployments will be intermixed with legacy systems (either temporarily
during a roll-out or on a permanent basis) in a company that wants to standardize its log-files (and therefore the syslog configuration file
that creates the logs).
In this hypothetical scenario, the same syslog file would be pushed to all machines in an ideal situation.
Any requirement for
Syslogd2-equipped hosts to have a unique log file pushed out to them means duplication of work on the part of network-management personnel.
A similar scenario would be a 'test' or 'proof-of-concept' roll-out where SOME systems get Syslogd2 and others continue with their legacy processors.
Pushing the same file to all systems just simplifies the upkeep of the network-management infrastructure.
Of the 3 types of lines in a Syslogd2 configuration file (comments, command-line-configuration-lines and output-lines), comments are comments
in all files. Unless the legacy system has changed the comment-character, comment-lines won't be a problem.
Input (command-line-configuration-lines) won't be a problem because they can all be 'hidden' by adding a single hashtag character ('#') at the front of the line.
That leaves output-lines - specifically output-lines that contain an option-list, a non-standard facility name or a non-traditional location. It was for the purpose
of 'managing' these lines that the parsing parameter 'SoftComment' was created.
To get the 'traditional' parsing of output-lines (where a single hashtag starts a comment), disable SoftComment. (Its default state is disabled.)
When SoftComment is disabled, a hashtag at the start of a line is treated as a hard-comment and the line is ignored. But how does one use full-featured Syslogd2
configuration lines with 'SoftComment' while not exposing the legacy parser to syntax it cannot handle?
Using the first-come-first-serve policy of Syslogd2, we can convert a line such as:
extra3,user,kern.warn;*.*;extra4.*;mail,auth,authpriv.none <some_location>, option-list
into a series of statements:
# ~ --enable softcomment
# extra3.* <some_location>, selector-and-location-option-list
# extra4.* <some_location>, selector-option-list
user,kern.warn;*.*;mail,auth,authpriv.none <some_location>
# ~ --disable softcomment
By preceding the original line with soft-comments, we can assign selector-string-options to any subset of the selector-string or to the 'null' selector: 'none.none'.
We can also assign location-control options for Syslogd2 that the legacy processor will never see.
Should the legacy system use syntax or complex selector-string expressions that Syslogd2 cannot parse, you can write a replacement line as a soft-comment,
then 'skip' over the 'legacy-version' of the line. For example, a modified excerpt from my Ubuntu system (that defaults to rsyslog) might look like:
# --skip 15
module(load="imuxsock") # provides support for local system logging
#module(load="immark") # provides --MARK-- message capability
# provides UDP syslog reception
#module(load="imudp")
#input(type="imudp" port="514")
# provides TCP syslog reception
#module(load="imtcp")
#input(type="imtcp" port="514")
# provides kernel logging support and enable non-kernel klog messages
module(load="imklog" permitnonkernelfacility="on")
In the previous example the last 'module()' line after the last entry was also skipped by Syslogd2. As a second example, the rsyslog line:
*.emerg :omusrmsg:*
'translates' to:
#~ -- enable softcomment
# *.emerg *
#~ -- skip 1
*.emerg :omusrmsg:*
# ~ --disable softcomment
If you wish to skip the rest of the file, a very large 'skip-number' may be entered since skip will terminate on either line-count or end-of-file.
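To make the two 'views' of a shared file concrete, the following Python sketch simulates which lines each parser would act on. It is a simplified model, not the real parser: it assumes that a line beginning with '#', optional spaces and '~' is a directive that Syslogd2 always honors, that only the 'enable softcomment', 'disable softcomment' and 'skip N' forms shown in the examples above exist, and that the legacy parser ignores every line that starts with '#'. The function names are hypothetical.

```python
import re

# Matches directive lines such as '# ~ --enable softcomment' or '#~ -- skip 1'.
DIRECTIVE = re.compile(r"^#\s*~\s*--\s*(enable|disable|skip)\s+(\S+)", re.IGNORECASE)


def legacy_view(lines):
    """Lines a traditional parser acts on: anything not starting with '#'."""
    return [ln.strip() for ln in lines if ln.strip() and not ln.strip().startswith("#")]


def syslogd2_view(lines):
    """Lines this simplified Syslogd2 model acts on."""
    active, soft, skip = [], False, 0
    for ln in lines:
        text = ln.strip()
        if skip > 0:  # still inside a 'skip N' span
            skip -= 1
            continue
        match = DIRECTIVE.match(text)
        if match:
            verb, arg = match.group(1).lower(), match.group(2)
            if verb == "skip":
                skip = int(arg)
            elif arg.lower() == "softcomment":
                soft = verb == "enable"
            continue
        if text.startswith("#"):
            if not soft:
                continue  # hard comment: ignored
            text = text[1:].strip()  # soft comment: drop the '#' and parse the rest
        if text:
            active.append(text)
    return active


shared_file = [
    "#~ -- enable softcomment",
    "# *.emerg *",
    "#~ -- skip 1",
    "*.emerg :omusrmsg:*",
    "# ~ --disable softcomment",
]

seen_by_syslogd2 = syslogd2_view(shared_file)
seen_by_legacy = legacy_view(shared_file)
```

With the rsyslog example above as input, the model gives Syslogd2 only the traditional '*.emerg *' line, while the legacy parser sees only the ':omusrmsg:' line. Because the skip counter simply runs out at end-of-file, a very large skip value skips the remainder of the file in this model as well.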
In some cases, you may wish to keep the output-line as-is, but just add some options (such as threadpool-control parameters or a filter-spec that handles all selector-string components):
local0,local1.* <some_location>, id=3,w=6,filter=combination_filter
might also be written as:
# ~ --enable softcomment
# none.none <some_location>, id=3,w=6,filter=combination_filter
local0,local1.* <some_location>
# ~ --disable softcomment
Sample Files
[Top of page]
Configuration files:
- This configuration file is based on the one from my development system. It is written to support all threadpool-types in the 'mega' architecture.
This means that there will be options that are not supported in smaller Syslogd2 variant architectures. The error-file will identify all such entries by filename and line-number so invalid configuration settings can be disabled or removed. This file is written to be Syslogd2-specific.
- This is the same file as in #1 above, but written using SoftComment so it is usable by either Syslogd2 or Rsyslog.
For Syslogd2, the '20-ufw.conf' file is not used since it is not parsable by Syslogd2.
Also the '50-default.conf' file has a single change to tell Syslogd2 to ignore the rsyslog line with invalid content.
The line is then rewritten in traditional format.
- This is a copy of the default syslog file from CentOS-8 re-written using SoftComment(s) to work with either Syslogd2 or rsyslog.
- The default '50-default.conf' include-file from the Ubuntu rsyslog configuration. This file has been modified to allow it to work with Syslogd2.
Specifically, the non-traditional syntax for the '*' user-list has been re-written in traditional format for Syslogd2 and the rsyslog version has been 'skipped over'.
- My '60-extended.conf' include-file.
This file contains the commands to create output files from the 'extra' facilities created by the 'defLinux.input.filter' input-filter. These output files are written to
a log directory (/opt2/log on my development filesystem '/opt2'). By using the '--outputparms' option to the 'include' directive,
I specify that all entries in this file default to output threadpool number 1 instead of number 0.
- My '70-streams.conf' include-file.
This file mostly contains output streams to DBD2 with some dbd2-related files to provide parallel output so I can monitor for data transfer issues.
It is configured to default to output-threadpool 2.
Filter Files:
- My default Linux socket input-filter ('defLinux.input.filter') file.
I use this filter to sort my logs by application instead of by facility.
This allows me to better understand what each application is doing and to see if there are any configuration 'tweaks' I can make to improve their efficiency or
reduce their error-rates.
This gives me many more log files overall, but also provides greater insight because services and applications that do not do much logging do not get 'drowned out' by
more voluminous application logging.
It also allows for far greater understanding of the health of my Linux environment and allows me greater control when selecting data to forward to DBD2.
Because of the number of application (and driver) files generated, I generally use a version of Syslogd2 that enables CAP_WORKERTHREADS so I can process the
application-based log files in parallel with the facility-based log-files that I have not yet decided I no longer need, using two separate output threadpools.
Cache Files:
[Top of page]
The Syslogd2 project compiles a set of tools that are located in the build/tools directory.
While most of these tools are little more than 'stubs' (small programs that do little more than convert between command-line input/displays and a
socket or named-pipe), some of the tools are intended for more extensive use.
Most of the tools in this directory use a common 3-or-4-letter name-scheme:
- The first character indicates the direction of data-flow from the standpoint of the stub program: either '[t]ransmit' or '[r]eceive'
- The second character indicates the connection-type to use for the connection: either [s]tream (which includes TCP and pipes) or [d]atagram (which includes UDP)
- The third character indicates the protocol to use for the connection: IPv[4], IPv[6], [u]nix, or named-[p]ipe.
- The optional fourth character indicates any special functionality the tool may have over-and-above just 'translating' command-line-input/display to/from a remote connection:
- 't*f': The ability to read input from a [f]ile instead of directly from the command-line.
- 'r*f': The ability to write incoming data to a [f]ile instead of to the terminal.
- 't*cX': The ability to establish and maintain a [c]ommand-connection to an executing copy of Syslogd2.
The 'tsuc' tool is compiled once for each Syslogd2 variant that is compiled.
Each command-tool instance is mated to the specific instance of Syslogd2 that it was compiled to support and will not run with any instance of
Syslogd2 that does not share its suffix.
This suffix-dependency is a result of variations in the compiler-symbol-makeup of the host binaries that cause capability-differences in the executing host-component
that the command-tool needs to be cognizant of.
Thus the tool named 'td4f' transmits the contents of a specified file over an IPv4 datagram connection, and the tool named 'tsucg' maintains a command-connection
over a streaming unix socket to the syslogd2g binary image (and so on).
Likewise, the tool 'rduf' writes all data received from a Unix datagram socket to a designated file.
As a special exception to the naming rules, the 'rkf' tool writes data received directly from the kernel interface to a file.
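The name-scheme can be summarized as a small decoder. The Python sketch below is only an illustration of the rules listed above. The function and dictionary names are hypothetical, and 'tsucg' is used as an assumed example of the 't*cX' pattern with a variant suffix of 'g'.

```python
DIRECTION = {"t": "transmit", "r": "receive"}
CONNECTION = {"s": "stream", "d": "datagram"}
PROTOCOL = {"4": "IPv4", "6": "IPv6", "u": "unix", "p": "named-pipe"}


def decode_tool_name(name):
    """Split a tool name into the fields described by the name-scheme."""
    if name == "rkf":  # documented exception: reads the kernel interface, writes a file
        return {"direction": "receive", "connection": None,
                "protocol": "kernel", "extra": "file", "variant": None}
    if len(name) < 3:
        raise ValueError("tool names have at least three characters: " + name)
    info = {
        "direction": DIRECTION[name[0]],
        "connection": CONNECTION[name[1]],
        "protocol": PROTOCOL[name[2]],
        "extra": None,
        "variant": None,
    }
    tail = name[3:]
    if tail == "f":
        info["extra"] = "file"
    elif tail.startswith("c") and name[0] == "t":
        info["extra"] = "command"
        info["variant"] = tail[1:] or None  # suffix of the matching Syslogd2 binary
    elif tail:
        raise ValueError("unrecognized suffix in tool name: " + name)
    return info


td4f = decode_tool_name("td4f")
rduf = decode_tool_name("rduf")
tsucg = decode_tool_name("tsucg")
```

Names that do not follow the scheme (for example 'sscalc') raise an error in this sketch rather than being decoded.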
The 'r*f' tools (currently only 'rduf' and 'rkf') are useful for creating 'input-files' for use with the 'testFilter' tool.
For tools that receive data, the data is displayed (by default) to the local terminal screen.
Usually these tools precede each line with information about the established connection; the '-q' ('quiet') option suppresses this extra information.
This option may vary slightly from tool-to-tool. (These are not in any way polished, production tools.)
The remaining tools (those that do NOT subscribe to the above name-scheme) are:
- sscalc: This is a [s]elector-[s]tring [calc]ulator that uses the actual Syslogd2 selector-string parsing code to
display the results of any selector-string provided to it.
This tool is intended to display the parsed results of compiling complex selector-strings in an attempt to avoid unexpected results.
- checknet: This is a wrapper around the current (and possible subproject) library code that determines which network-state
the host is currently in.
This 'tool' is currently of little practical use, but may one day provide a starting point for better laptop performance by Syslogd2 or other apps or even for other
operating system services.
The 'checknet' program is provided for the 'grins and giggles' of anyone that wants to play with it.
- testFilter: This tool is complex enough to warrant its own subsection.
Besides serving as a 'test-and-development' tool for me, it can also serve as a valuable diagnostic tool for confirming proper filter and tracefilter file syntax and operation.
All tools in the build/tools subdirectory will give usage summaries if run without command-line options.
testFilter
[Top of page]
The testFilter tool takes several filename parameters and (optional) boolean options.
The boolean options control the actions to take as follows:
- <omitted>: Compile and execute the specified filter-file against each line of the specified input-file after (optionally) pre-processing
each input-file line through any applicable input-option-list modifications (such as facility, ignore and forceprintable).
The pre-processing is intended to simulate the same processing steps that Syslogd2 would go through in processing the same input message(s) and
uses copies of the same parsing routines (or what started as the same parsing routines before all superfluous parts were 'pruned' out for coding simplicity).
- display (d): Compile and display the contents of the specified filter-file in an intuitive format. This display clearly shows how the match/transform fields,
the throttle-fields and the extraction-fields are interpreted by Syslogd2 as well as any errors in either the syntax or structure of each filter-line.
- nonfilter (n): When present, this option shifts the focus of testFilter from filter-files (CAP_FILTERSIN/CAP_FILTERSOUT) to
tracefilter-files (CAP_TRACEFILTER).
- mode (m): When present, this option changes the execution-mode of filter-files from input-filters to output-filters.
This option is primarily useful to developers to allow me (and others) to confirm proper operation regardless of whether the filter code is executed as
an input-filter or an output-filter. From a user perspective (in the isolated test scenario of testFilter), there should be no
perceptible difference.
The filename options are:
- configfile (c): Specifies an abbreviated configuration file containing (at most) an abbreviated '--defaults' option and an abbreviated
'--input' line.
The --defaults options actually implemented are listed below.
Almost all --defaults options will parse without error, though most of them will not actually do anything within testFilter.
The most important of the 'actually-implemented' options is the 'configdir' option that allows setting a non-default directory for
use by testFilter.
- hostname: Sets or modifies the system-wide host-name that is auto-detected at startup.
- domainname: Sets or modifies the system-wide domain-name that is auto-detected at startup.
- configdir: Sets or modifies the default directory used for ancillary configuration files.
The --input (--socket) option in testFilter implements the following options, which allow for the full set of modifications
that Syslogd2 can make to syslog events as they pass through the system. Other input options supported by Syslogd2 are recognized, but not
implemented. Only one instance of --input may be specified in a testFilter configuration file.
This is because the --input line is actually a simulated structure and not designed for actual use.
- hostname: Sets the default hostname that will be used for any msgs that do not already contain one.
- facility: Sets the default facility value that will be used for any msgs that do not already contain one.
- priority: Sets the default priority value that will be used for any msgs that do not already contain one.
- nohost: Tells Syslogd2 that incoming data has a time-field but no hostname field.
- noheader: Tells Syslogd2 that incoming data has neither a time-field nor a hostname field.
Use this option only if you know that NONE of the incoming data contains timestamps or hostnames (for example if data is from a remote network device).
- noforceprintable: Tells Syslogd2 not to scan incoming data for binary content and not to convert that
binary content to printable characters.
- ignore: Tells Syslogd2 to ignore incoming facility, priority, or hostname content and to use the
'default' values (usually specified in the same option-list) instead.
- tracefilter: Specifies a filename containing a tracefilter.
- filter: Specifies a filename containing a filter.
- inputfile (i): Specifies a file containing syslog messages.
This file will be used to simulate events incoming to Syslogd2.
A good way to produce this file is to actually record incoming data using tools such as 'rkf' or 'rduf' while the
standard syslog daemon is shut down.
- tracefilter (t): Specifies a file containing a tracefilter specification.
- filter (f): Specifies a file containing a filter specification.
The 'testFilter' binary is always compiled as if the CAP_FILTERSIN, CAP_FILTERSOUT, CAP_THROTTLE, and CAP_TRACEFILTER
symbols were declared.
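The interaction between the default-value options ('hostname', 'facility', 'priority') and 'ignore' on a simulated --input line can be modeled in a few lines of Python. This is a rough sketch of the behavior described above, not the testFilter implementation. The event is represented as a plain dictionary, the function name is hypothetical, and a field with no configured default is simply left as None.

```python
FIELDS = ("hostname", "facility", "priority")


def apply_input_options(event, defaults, ignore=()):
    """Fill in missing fields from defaults; override fields listed in 'ignore'."""
    result = dict(event)  # leave the caller's event untouched
    for field in FIELDS:
        missing = result.get(field) is None
        if missing or field in ignore:
            # Missing content takes the default; ignored content is replaced by it.
            result[field] = defaults.get(field)
    return result


incoming = {"hostname": None, "facility": "user", "priority": "info", "msg": "link up"}
defaults = {"hostname": "router1", "facility": "local0"}

processed = apply_input_options(incoming, defaults, ignore=("facility",))
```

Here the missing hostname is filled in from the defaults, the incoming facility is replaced because 'facility' is listed under 'ignore', and the incoming priority is kept because it is present and not ignored.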
[Top of page]