Welcome to Syslogd2

(The Home of Syslogd2 On-Line Documentation)



What is Syslogd2 ?


Syslogd2 is a replacement for rsyslog (or other Linux syslog processing-service) that is designed to be a 'toolkit' for network-management / host-management applications. In designing/writing Syslogd2, I have avoided as many Linux-specific features as possible to maximize the ability to port Syslogd2 to other platforms. Where a feature MUST use Linux-specific function calls, that feature is either marked as optional or the function calls are bracketed by compiler directives that will substitute alternative code when compiled on the ported OS.

By 'toolkit', I mean that Syslogd2 is a base server with many optional components. When paired with DBD2, it provides the data-collection and database-insertion functions that form a working foundation for a full-fledged, syslog-based network- and host-management reporting system.

The 'base' version of Syslogd2 will function as a fully-capable syslog processor if all that is needed is basic data-collection from standard syslog inputs and logging data to disk. To fully realize the capabilities of Syslogd2, additional features must be compiled into the binary. There are (as of this writing) over 20 features that can be selected into the code at compile-time.

In support of a network-management / host-management system, Syslogd2 provides the data-collection, filter-based data-volume-reduction, data-transmission and data-extraction functions. Together these produce a 'parameter-list' of name=value pairs that DBD2 then uses as input to create and populate individual data-fields in one or more user-defined MySQL databases or database tables.
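As a rough illustration of the data-extraction idea described above, the sketch below pulls fields out of a free-form message and renders them as name=value pairs. It is not Syslogd2 or DBD2 code: the rule table, the function names and the space-separated output format are all assumptions made for the example.

```python
import re

# Hypothetical extraction rules (not Syslogd2 syntax):
# field name -> regex with exactly one capture group.
RULES = {
    "user": re.compile(r"\buser[=:]\s*(\S+)"),
    "src": re.compile(r"\bfrom\s+(\d{1,3}(?:\.\d{1,3}){3})"),
}


def extract_fields(message):
    """Return the name=value pairs found in a free-form syslog message."""
    fields = {}
    for name, pattern in RULES.items():
        match = pattern.search(message)
        if match:
            fields[name] = match.group(1)
    return fields


def to_parameter_list(fields):
    """Render the fields as a space-separated name=value string (assumed format)."""
    return " ".join(f"{name}={value}" for name, value in fields.items())
```

A message such as "Failed password for user=alice from 10.1.2.3 port 22" would yield the two pairs user=alice and src=10.1.2.3, which a database loader could then map onto columns.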

To get the most out of Syslogd2, it is necessary to revise many of the assumptions and thought-processes that surround syslog processing today. This website attempts to document every Syslogd2 feature and option. This page and the New Concepts page are overviews of the ways in which Syslogd2 requires users to change how they think about syslog processing.

A partial list of Syslogd2's most distinguishing features includes:

Syslogd2 is a replacement for whatever syslog daemon is currently running on a Linux host. Syslogd2 is written to be portable between Linux and other Unix-like platforms. At one point it was ported to (and running on) Apple's OSX operating system. That port has not been kept current due to lack of interest, but can quickly be re-instated if interest is shown.

Why Syslogd2 ?


Whenever an application fails or a network 'drops service', the first (and truthfully 'only') place administrators and engineers usually turn to for additional information is their log files. This is because failures rarely occur without some prior 'warning' appearing in some log file somewhere. Usually, competing 'hypotheses' (little more than informed 'guesses') as to the cause of the outage(s) are confirmed or denied by log file entries.

Why then do we not use syslog entries for management of hosts and networks ? If we were to do so, we could (in theory) see the preliminary errors and warnings that could be fixed to 'head off' complete systems failure. We could find minor issues that (if heeded) could improve system performance (both throughput performance and stability) and we could track and verify compliance with a variety of corporate policies (policies such as 'no network changes outside the declared change window', 'no USB devices that could be used for corporate espionage', etc).

The answer to why we do not use syslog for day-to-day network management is a simple one: the software tools required to effectively manage the massive amount of syslog traffic generated by either host-based applications or network appliances on a day-to-day basis have simply never been written.

Syslogd2 was written to address that perceived need. It was written in the belief that network-management systems should always be customized to the networks they are intended to manage (that there is no 'one-size-fits-all' solution to either network management or host management).

Network- and host-management personnel should not have to duplicate (or spend lots of money on) the same functional code for each network-management implementation. Syslogd2 and its sister projects were written to handle the 'mechanics' of data-collection and consolidation, allowing network-management teams to configure rather than write the fundamental data-collection code required for virtually all significant network-management systems. If you take the time to examine Syslogd2, I think you'll find that it is more about customizing itself to your requirements than about customizing your requirements to its capabilities. Syslogd2 and its siblings can do the 'mechanical' data collection and even load the data into a user-defined MySQL database. This gives network-management teams a MASSIVE head-start in their efforts to build a successful management infrastructure.

Another issue with today's network management offerings is that commercial enterprises (the larger the enterprise, the more likely this will be) push their own agendas and 'vision' of the one-size-fits-all solutions to network management issues -- visions that always seem to center around their own products or offerings (of course). By making Syslogd2 and its siblings open-source, my intent is to throw a wrench into the existing 'money-trains' of big network companies while simultaneously providing proof that there is no agenda behind Syslogd2. Furthermore, by making Syslogd2 open-source, I MAY be able to encourage a new service industry -- one that exists to implement network-management solutions. Smaller companies seem to be more willing (especially at affordable prices) to put service over product sales.

For whatever reason, the IT industry has chosen to chase all sorts of non-syslog-based 'shiny silver bullets' in the name of 'network management' rather than to demand (or produce) software that actually uses the log files (the primary and often 'only' go-to source for post-failure analysis) that are produced (if not disabled) by applications specifically for the purpose of alerting administrators to potential failure situations. For the purpose of my diatribe, 'shiny silver bullets' include (but are not limited to):


Given the above approaches to network management, the obvious question is: "If syslog log files are the 'go-to' for post-failure analysis, why don't we (the IT community) consider using syslog as our primary notification source for pending network or application failures ?" Indeed, most IT managers and engineers do not see the irony in not using syslog events as a primary means of network management -- even though they are often the only source of post-failure information.

After working for years on a Fortune 500 campus network, I can say with certainty that there is virtually no place where a commercially-sold syslog-based system using currently-fielded software can be attached and still handle the requisite volume of incoming syslog events from even a fraction of the network. This shortcoming is not the fault of the protocol itself, but rather the fault of the way the protocol is handled and implemented.

Due to the free-form nature of the syslog protocol (the contents of a syslog msg-string are intentionally undefined other than the specification that it consist of a single string composed of printable ASCII characters), syslog is an ideal protocol for logging the unique requirements of applications and network devices because it does not enforce any structure or limitations on what may be reported. In practice, a syslog message string is limited to a bit less than 1500 bytes by the size of an unfragmented UDP packet on Ethernet (UDP being the traditional transport for syslog); RFC 3164 itself caps the packet at 1024 bytes.

The downside of this openness is that there are no limits on how many messages may get generated per unit time (per device) nor any kind of enforcement over what the content must look like (either format or verbiage). There are also no enforced controls over what kinds of messages may be put into each of the 8 named priority-levels in the syslog protocol definition.
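The only structure a traditional syslog packet carries ahead of the free-form text is the leading `<PRI>` value, which RFC 3164 and RFC 5424 define as facility * 8 + severity. The sketch below decodes it; the function name is illustrative, and no upper bound is placed on the facility number because Syslogd2 mentions facilities beyond the standard set.

```python
import re

# The 8 severity (priority-level) keywords, indexed by numeric value.
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(packet):
    """Split a raw syslog packet into (facility number, severity name, rest)."""
    match = PRI_RE.match(packet)
    if match is None:
        raise ValueError("packet does not start with a <PRI> field")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, SEVERITIES[severity], packet[match.end():]
```

Everything after the `<PRI>` field is returned untouched, which reflects the point made above: nothing constrains what a sender places there or which severity it chooses to claim.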

A few examples may illustrate my point:


In recognition of the massive volume of available syslog data, a syslog-based network management system would need to be decentralized, with remote nodes ruthlessly minimizing the data being forwarded to the central host while simultaneously logging copious amounts of data to local disk in an easy-to-access manner (preferably individual files for each application vs a single [mixed] file) for after-action reviews.

Syslogd2 was designed and written to address all the shortcomings I am aware of in today's syslog processing implementations. It is written to run either as the primary syslog service for Linux (or other *nix hosts) or as a regular application co-existing with an existing syslog service. Syslogd2 can scale down to the smallest laptop or up to handle high-speed firewalls (it was originally designed to handle a PIX firewall on an OC-3 network pipe at full debug level). When Syslogd2 functions as the central syslog collection/consolidation point of a network-management system, it can be combined with its sister-projects (DBD2 and SLP2) to extract user-specified information fields from the free-form syslog text strings to populate one or more user-defined MariaDB or MySQL databases.

About Syslogd2


Syslogd2 has been re-imagined and re-designed to be the syslog collector that is required for today's corporate networks and their network-management needs. It is focused not just on logging data to disk, but on obtaining and pre-processing all possible sources of information (including user-written processes) locally before it forwards selected information to one or more central hosts to be inserted into databases (with DBD2) or sent to alerting systems. Syslogd2 is open source so that no one company can buy it out and then smother it with proprietary intentions (patents and copyrights). As a result of its re-design, Syslogd2 introduces many new features and implements many new concepts not found in previous syslog processors.


Syslogd2 development has been focused on usage in networked environments to collect and pre-process syslog traffic at point-of-receipt (with initial emphasis on Linux platforms). Syslogd2 is specifically designed for use as part of an overall network-management system (a design that is a superset of the functionality of current syslog processors). As such, Syslogd2 has been developed with several (sometimes competing) goals. Syslogd2 is still under development and will continue to gain features and capabilities as its sister projects (DBD2 and SLP2) mature. The following list of design goals is not in any particular order:


  1. Portability. This has already been discussed above.
  2. Minimal 'learning curve' for administrators. Syslog configuration files should (as much as feasible) look and act the same to minimize surprises due to platform differences.
    • Syslogd2 is designed to be portable and (though initially Linux-only) should be easy to port to other operating systems with (minimal) effort.
    • Syslogd2 configuration file syntax is a proper superset of the traditional UNIX syntax, meaning that all changes to the syslog file format over Unix take the form of 'additions' rather than 'changes'. Most importantly, the Syslogd2 syntax allows for virtually unlimited 'growth' of new features. Its use of compiler-symbols to literally include/exclude selected code and structures from the compiled binaries allows admins to only compile the pieces they need.
    • Syslogd2 abides by as many Linux (and UNIX) software standards as possible regarding file locations and file names:
      • The default location of the syslog configuration file is '/etc/syslog.conf'.
      • Syslogd2's ancillary configuration directory (for ancillary input files) is '/etc/syslog.d'.
      • Syslogd2's default spool directory is '/var/spool/syslog'.
      • Syslogd2 stores its 'pid' (process-id) file in '/var/run'.
    • Syslogd2 attempts to maintain full 'backwards' compatibility with Unix-style system log-daemons:
      • Like traditional syslog systems, Syslogd2 disables support for IP by default. The traditional command-line options '-i' or '-r' will enable IP support and the traditional '-f' option will enable the forwarding of received IP traffic to IP destinations (actually '-r' does both).
      • Syslogd2 accepts (extended) traditional syntax for output-lines in the configuration file. ('Extended' in the sense of using '.=' and '.none' in selector-strings as well as the 'comma-syntax' to specify multiple facilities with a common priority value.)
  3. Minimize the impact of 'change' to configuration-file syntax while allowing for the addition of configuration elements to control new features and functionality.
    • Syslogd2 uses a highly modular design that allows for a lot of network- and host-management-related features.
    • These features need to be configured. Due to the number of features (and their options), command-line management can become unwieldy (even limiting) due to aggregate length.
    • Syslogd2's solution is to define new-feature configurations as command-line parameters (since that 'fits' the traditional syntax), but then to allow any command-line parameters to be re-located into the configuration file (where linear-access [of a single command-line string] and overall length cease to be problematic).
      • To identify a line as containing command-line input, Syslogd2 defines the tilde ('~') as a 'recognition character' if found as the 1st character of a line.
      • Extensions to output-lines are implemented by adding an optional 'options-list' at the end of the line.
      • Documentation of the syslog file is enhanced by allowing 'in-line comments' (comments that follow configuration entries in the same line).
    • To allow 'partial' roll-outs of Syslogd2, it may be desirable to define and deploy a common configuration file to be used by different syslog processors depending on need and phase of the roll-out. This allows for a common log-file configuration across multiple machines making both network administration and multi-host management easier (especially when multiple binaries may be involved since only one configuration file will be in use at a fixed file-system location). This ability may also be desirable as a 'hedge' against 'roll-backs' of binary images due to updates or during initial implementations of ported version of Syslogd2 to UNIX systems.
    • Because the addition of any syntax element to an existing syslog-file syntax may cause the existing syslog daemon to fail, Syslogd2 re-imagines the traditional view of a 'comment' (since 'comments' are about the only syntax that all parsers agree on).
      • SoftComment: This global boolean variable only affects how Syslogd2 parses files. It causes parsing to continue after the first hashtag found in a line. SoftComment may be enabled or disabled as often as desired during parsing.
        • If the next token found in the line is a tilde ('~') the command will be read as Syslogd2-command-line parameters.
        • If the next token found in the line is a valid facility name or wild-card:
          • If the SoftComment boolean is currently enabled, the line will be parsed as a Syslogd2-compatible output-line.
          • If the SoftComment boolean is currently disabled, the line will be considered a comment and ignored.
        • Otherwise the line is considered to be a comment and is ignored.
        • When SoftComment is enabled, a 2nd hashtag character ('#') marks the start of an unconditional ('hard') comment.
        • Syslogd2 always treats SoftComment as 'on' for command-line entries, so the "--enable softcomment" instruction can be 'hidden' from legacy syslog daemons.
      • First-come-first-serve parsing policy: In order to catch and report duplicated configuration settings, Syslogd2 implements a strict 'first-come-first-serve' policy when parsing variables. Any attempt to set a variable that was already set by a previous line (and that is not identified for 'multiple instance' use) will result in an error-log message and rejection of the subsequent setting.

        One advantage of this policy is that Syslogd2 can create a 'null' (SoftComment) entry for an output-line that contains the Syslogd2-specific option-list or 'extra' facilities, followed by a 'duplicate' entry that the legacy parser will see. Modifications to the 'active' (legacy) line will then affect both service-daemons.
      • --Skip <n>: This command-line option is (admittedly) only useful in the configuration file. It causes Syslogd2 to 'skip' (ignore) the next <n> lines or to end-of-file (whichever comes first). This option is provided for those few (primarily Linux) configuration lines that Syslogd2 simply cannot parse or properly support.
  4. Provide an ultra-high-speed (don't you just HATE superlatives ?) syslog processor capable of keeping up with the high syslog volumes experienced in fast, busy firewalls and larger corporate syslog hubs. (Worker-threadpools and output-threadpools)
  5. Provide a syslog daemon that is effective for use on laptops and work-from-home systems despite the ability of laptop and work-from-home users to access multiple networks. (Network Awareness)
  6. Provide a store-and-forward capability for syslog daemons to provide 'other-end' log data after a network outage has been restored. (CAP_SPOOLFILES)
  7. Provide a means of controlling the insanely large amount of syslog traffic that is currently required of any syslog-based management system and that routinely congests network links. (Filters, Spooling)
  8. Provide a simple method for administrators to select and process text-file input from either existing syslog or non-syslog log files. (CAP_TAILFILES and Filters)
  9. Do all the above in a cost-effective, vendor-neutral manner. (Open-source project(s) + Linux OS on PC hardware)
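The SoftComment rules listed under goal 3 can be read as a small line classifier. The sketch below is one interpretation of those rules, not the Syslogd2 parser: the facility set contains only the traditional names, the function names are invented for the example, leading whitespace is tolerated for simplicity, and only the "--enable softcomment" instruction quoted above is recognised.

```python
import re

# Traditional facility names only; Syslogd2's additional facilities are omitted.
FACILITIES = {"kern", "user", "mail", "daemon", "auth", "syslog", "lpr",
              "news", "uucp", "cron", "authpriv", "ftp"}
FACILITIES |= {f"local{i}" for i in range(8)}


def starts_with_facility(text):
    """True if the first token begins with a facility name or the '*' wild-card."""
    parts = text.split(None, 1)
    if not parts:
        return False
    name = re.split(r"[.,;]", parts[0], maxsplit=1)[0]
    return name == "*" or name in FACILITIES


def classify_line(line, soft_comment):
    """Return (kind, payload); kind is 'blank', 'comment', 'command' or 'output'."""
    text = line.strip()
    if not text:
        return ("blank", "")
    if not text.startswith("#"):
        if text.startswith("~"):
            return ("command", text[1:].strip())
        return ("output", text)
    body = text[1:].lstrip()
    if body.startswith("~"):
        # Command-line entries are always read, whatever the SoftComment state.
        return ("command", body[1:].strip())
    if soft_comment and starts_with_facility(body):
        # A second '#' starts an unconditional ('hard') comment.
        return ("output", body.split("#", 1)[0].rstrip())
    return ("comment", "")


def parse(lines):
    """Classify each line, switching SoftComment on when instructed."""
    soft_comment = False
    result = []
    for line in lines:
        kind, payload = classify_line(line, soft_comment)
        if kind == "command" and payload.lower() == "--enable softcomment":
            soft_comment = True
        if kind in ("command", "output"):
            result.append((kind, payload))
    return result
```

A legacy daemon reading the same file sees every line that starts with '#' as a plain comment, which is what lets one configuration file serve both parsers.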

About this site


This site is the home for on-line Syslogd2 documentation and reference material. If you have downloaded the Syslogd2 project source code and compiled from scratch, you should find a 'docs' directory in your download containing '.odt' files suitable for producing hard-copies. This site will always be more current than the files made available for download and off-line viewing. However, attempting to print hardcopies from this website may lead to frustration due to the eye-saving black background-color that will cause excessive use of printer ink.


This site is still under construction, but I wanted to make as much information on Syslogd2 available as soon as possible, so please bear with me while I build this documentation site.

Some Key Features of Syslogd2


Scalable

Syslogd2 is scalable in both capacity and feature-set.


  • Syslogd2 has over 20 optional features to be selected at compile-time (multiple binaries may be compiled at once - each with a different set of features). These optional features are referred to as 'capabilities' (or 'CAP_*-abilities' after the names of the compiler-symbols).
    When capabilities are added to a Syslogd2 binary image, they will generally default to off (or to a 'neutral' or 'inactive' condition) until called upon by configuration.

    The modular design of Syslogd2 allows for new capabilities to be added when time permits (I have a growing list).

  • For input, Syslogd2 supports reading text-files ('tailing' a log file), named-pipes, TCP and UDP (IPv4 and IPv6) on user-specified ports, and arbitrary (user-defined) Linux sockets (supporting both stream and datagram protocols).

    For output, Syslogd2 supports writing to arbitrary character devices (such as pseudo-terminals), TCP hosts and arbitrary (user-defined) Linux sockets (both stream and datagram protocols) in addition to the full suite of traditional syslog output (files, tty, console, udp, named-pipe, user-list, etc). Output support for IP connections includes both IPv4 and IPv6 as well as user-defined connections on user-specified ports.

  • In terms of capacity (throughput), Syslogd2 baselines as a multi-threaded process (much like other multi-threaded systems).
    As the need grows, Syslogd2 can add not just additional threads, but entire new threadpools that allow resources to be concentrated where needed. As the need continues to grow, Syslogd2 can divide the processing of individual messages to further increase processing efficiency, allowing reader-threads to concentrate on accepting input while processing threads focus on parsing and processing the buffered traffic. Output threadpools allow the use of many output files or connections with little to no risk of dropped (lost) input by offloading the tasks of testing and writing to devices, files, sockets and pipes.

  • Pre-loadable Name-Cache

    Syslogd2's integrated name cache was originally focused on speeding up host-name resolution while simultaneously reducing the load that syslog name-resolution places on DNS servers.

    The optional (user-provided) cache-file allows the cache to be pre-loaded at startup, ideally even replacing the need for DNS services.
    Since initial implementation, the cache-file has proven useful in other roles as well: ensuring that all remote output destinations can be resolved, and normalizing secondary and tertiary addresses (and hostnames / aliases) to the canonical hostname that will be sent to 'downstream processes'.
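    To make the pre-loadable name cache concrete, here is a minimal sketch of the idea. The cache-file layout is an assumption borrowed from /etc/hosts (address, canonical name, then aliases); the real Syslogd2 cache-file format is not described on this page, and the function names are invented for the example.

```python
def load_cache(lines):
    """Build a lookup table mapping every address and alias to one canonical name."""
    cache = {}
    for line in lines:
        entry = line.split("#", 1)[0].split()   # drop trailing comments
        if len(entry) < 2:
            continue                            # blank or malformed line
        address, canonical, *aliases = entry
        for key in (address, canonical, *aliases):
            # First definition wins if a key appears twice.
            cache.setdefault(key.lower(), canonical)
    return cache


def canonical_name(cache, key, resolver=None):
    """Look a key up in the cache, falling back to an optional resolver callable."""
    name = cache.get(key.lower())
    if name is None and resolver is not None:
        name = resolver(key)        # e.g. a DNS lookup, done once per key
        cache[key.lower()] = name
    return name
```

    Mapping a secondary address and every alias to the same canonical name means downstream processes see one hostname per device, and a fully populated cache never calls the resolver at all.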


    Spooling

    Syslogd2 supports two types of spooling:

      Connection Spooling: This is the 'standard' store-and-forward type of spooling that is commonly discussed when the topic of 'spooling' is brought up. In order to work, connection spooling must be using a connection that supports a handshake protocol (Linux streaming socket, TCP or Named-Pipe output) in order to sense a broken connection. Whenever Syslogd2 recovers a previously-down connection, it checks for a spool-file and (if it finds one) it will transmit any (delayed) contents of the spoolfile that are still relevant to the remote host.

      Network State Spooling: This is a new type of spooling. It works with any type of remote-process protocol (any type of socket or named-pipe).
      The basic idea behind network-state spooling is that a syslog host may be on a network that periodically changes state while traffic input is relatively constant (it is always potentially present). Sometimes, traffic is received for a remote host when the network is not in a state that allows transmission to that host. In these cases, Syslogd2 can spool the data for transmission once the network reaches an 'acceptable' state.

    1. Example: Syslog is receiving input before the network is started that should be sent to a remote host (perhaps a disk is failing or a memory fault is found by the kernel's POST test).
      The current network 'state' is 'down' (or perhaps there's a network issue and the state is 'local', but unable to achieve the state of 'other' (fully up and available)). Syslogd2 can spool this information until the network starts up or otherwise becomes available, THEN transmit the data to the remote host.

    2. Example (Illustration only since the network-check routines that would actually enable this functionality have not yet been written): A Linux laptop is configured to forward selected events to a central loghost at the employer's location. Since this user often works from home or other locations where he accesses the internet, 'standard' connection-spooling cannot facilitate event-logging from this laptop. Because Syslogd2 is always aware of the current 'network state', it can use network-state spooling to spool events until the laptop encounters the specific set of conditions that have been specified as defining that the laptop has connected to the employer's network. Only when the network state has been confirmed to be 'acceptable' (ie: connected to the correct network) will Syslogd2 'unspool' the stored events to the employer's syslog collector.

      Because network-state spooling is not dependent on the connection or handshake, it can work with either TCP or UDP protocols (or streaming or datagram Linux sockets), but will most likely be used over TCP with connection-state spooling.
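      Both kinds of spooling share the same store-and-forward core, sketched below. This is an illustration rather than Syslogd2's implementation: the class name and its two callables are invented for the example, and an in-memory queue stands in for the on-disk spool-file. A failing send models connection spooling, while the `can_send` check models the network-state test.

```python
from collections import deque


class SpoolingOutput:
    """Store-and-forward wrapper around a send function (illustrative only)."""

    def __init__(self, send, can_send):
        self._send = send          # callable(message); raises ConnectionError on failure
        self._can_send = can_send  # callable() -> bool, the 'network state' check
        self._spool = deque()      # in-memory stand-in for a spool-file

    def write(self, message):
        """Queue the message behind any spooled ones, then try to drain."""
        self._spool.append(message)
        return self.flush()

    def flush(self):
        """Send spooled messages in order; return how many remain spooled."""
        while self._spool and self._can_send():
            try:
                self._send(self._spool[0])
            except ConnectionError:
                break              # connection dropped: keep the message spooled
            self._spool.popleft()  # remove only after a successful send
        return len(self._spool)
```

      Because new messages always join the back of the spool, events reach the remote host in their original order once the connection or network state recovers.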

    Filters

    Syslogd2 filters act either to modify the contents of a syslog message or in a go/no-go mode to either accept or drop (discard) a message.
    All Syslogd2 filters use the same format and syntax. There are two types of filter: input filters and output filters. The difference is whether they are declared as an option to an input specification or as an option on an output line.
    Each filter can mix 'default-pass' and 'default-discard' modes of operation. Each filter can inspect or modify any field of the syslog message except the time field.


  • An input-filter is declared as an option to an input source (file, kernel, socket, pipe, etc). An input filter will receive and process all traffic that is received from the source on which it is declared. The same filter may be applied to multiple sources, however each source may only declare one filter.
    Input filters are applied at the point in processing after all the message has been fully parsed and resolved so that 'final' parsing results are presented to the filter. They are applied just before the point where the facility+priority of the message is compared to the selector-strings of the defined output locations.

    The power of input filters is that by changing the facility and/or priority values of a message being processed, that message can be re-routed to different destinations based on the contents of any component (or combination of components) of the message.

  • This allows (for example) the sorting of log events into different files based on the 'tag' field (the application name at the start of most messages).


    It also allows the extraction and selection and redirection of any message containing the string "username: " or "user=".


  • An output-filter is declared as an option on an output line in the configuration file and applies only to the data-stream that is being sent to that destination based on matching a particular selector-string subset. (Space here does not permit excessive details).
    Output filters are applied after a destination has been determined to match a particular message. The only processing events that occur after the message is processed by an output filter (assuming the message is not discarded) are the adjustments made by the 'localhosts' and 'stripdomains' options to the hostname field and final output-line formatting prior to actually being written to the destination.

  • The placement and timing of the output-filters allow them to provide two key features to Syslogd2:

  • Traffic reduction. By acting as 'discard' filters based on message content, they can act as 'gatekeepers' by discarding all outgoing events except those specifically desired by the administrator. This can drastically reduce the overall network traffic of a syslog-based management system, and at the same time provides a control to prevent the central collection-host from being overwhelmed by the sheer volume of incoming traffic -- no matter how much is being generated or how many devices are being monitored.

  • Per-destination formatting. Output filters give Syslogd2 the ability to specify both the output format and content of traffic sent to each destination. Syslogd2 can change the facility+priority value (or any other part) of a given message to whatever the remote host wants to see (or can handle). (Example: Syslogd2 has internally moved all events containing the string "username" to 'extra5', but the remote host does not support the 'extra5' facility, so Syslogd2 'moves' that message 'down' to 'local0' with an output filter. The same Syslogd2 sends the same message to DBD2 to be inserted into a database, but DBD2 needs to receive it with facility+priority of 'extra12.notice' for proper routing by DBD2. Syslogd2 uses a different output filter to change the same message to 'extra12.notice' for DBD2.)
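    The 'username' example above can be sketched as follows. This is a simplified model, not Syslogd2's filter syntax: the `Message` and `Rule` types, the `apply_filter` function and the first-matching-rule-wins behaviour are assumptions made for the illustration. Only the facility names come from the example in the text.

```python
import re
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Message:
    facility: str
    priority: str
    host: str
    tag: str
    text: str


@dataclass(frozen=True)
class Rule:
    pattern: str                    # regex searched for in the chosen field
    field: str = "text"             # which message field to inspect
    discard: bool = False           # drop the message on a match
    facility: Optional[str] = None  # new facility on a match, if any
    priority: Optional[str] = None  # new priority on a match, if any


def apply_filter(rules, message, default_pass=True):
    """Return the (possibly rewritten) message, or None if it is discarded."""
    for rule in rules:
        if re.search(rule.pattern, getattr(message, rule.field)):
            if rule.discard:
                return None
            changes = {}
            if rule.facility:
                changes["facility"] = rule.facility
            if rule.priority:
                changes["priority"] = rule.priority
            return replace(message, **changes)
    # No rule matched: fall back to default-pass or default-discard.
    return message if default_pass else None


# Input filter: re-route anything mentioning "username" to 'extra5'.
INPUT_FILTER = [Rule(r"username", facility="extra5")]
# Output filter for a remote host that lacks 'extra5'; used default-discard.
TO_REMOTE = [Rule(r"^extra5$", field="facility", facility="local0")]
# Output filter for DBD2, which expects 'extra12.notice'.
TO_DBD2 = [Rule(r"^extra5$", field="facility",
                facility="extra12", priority="notice")]
```

    Running the remote-host filter in default-discard mode also gives the traffic reduction described above: only the re-routed 'extra5' events leave the collector, and everything else stays on local disk.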
Vision for Syslogd2


    My hope for Syslogd2 is that it becomes a popular (even standard) tool for network administrators (and Linux administration teams) to use to collect syslog data from all their various sources.

    Syslogd2's scalable design and open-source license (Affero GPL) are intended to keep it free from commercialization so it remains a cheap (free-as-in-$$$) tool that can replace standard Linux syslog services to collect, store, filter and forward syslog events at distributed collection-points throughout a network. The goal is to prevent the network degradation that occurs today as massive amounts of syslog data overwhelm network links.

    Syslogd2 can act as a vendor-agnostic syslog collection tool for even the largest networks.

    While Syslogd2 specializes in data-collection and transmission, my DBD2 project specializes in inserting name=value data-fields (syslog or command-line data) into databases at relatively high speed and with maximum versatility. This provides a ready-made 'baseline' for either home-grown or open-source (or even commercial) syslog-based management tools.
