
Version 4 (Lukas Zapletal, 09/02/2015 08:20 AM) → Version 5/6 (Lukas Zapletal, 11/16/2015 10:19 AM)

h1. BitScout Logstash Integration

h2. Summary

Foreman orchestrates several other components (more when Katello and other plugins are installed):

* Puppet Master
* Puppet CA
* Compute Resources: libvirt, oVirt, VMware, OpenStack, Docker...
* Candlepin
* Pulp, Crane
* Dynflow
* Cron jobs
* Other configuration management tools: Salt, Chef

A typical transaction for creating a new host in Foreman involves:

* acquiring next free IP on DHCP server
* making DHCP reservation
* creating DNS record
* deploying the PXELinux configuration, kernel and initial RAM disk on TFTP
* signing certs on Puppet CA
* contacting the compute resource (e.g. oVirt) to configure and start the instance
* and more (depending on plugins installed)

All those components generate logs depending on their configuration, and they all log to either files or syslog/journald. Centralised logging with metadata brings transparency to Foreman's orchestration capabilities, as well as those of other plugins, which is something we currently struggle with. It also makes it possible to correlate messages coming out of individual systems with orchestration runs, sessions or users. The possibility to pass the metadata IDs into subsystems would be a big advantage. Possible correlation identifiers are:

* Foreman user id
* Foreman session id
* Foreman transaction id
* Foreman host IP address
* Foreman host MAC address
* Foreman host provisioning token
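
As a minimal sketch of how the identifiers listed above could travel together with a log message, the following Ruby snippet groups them in one value object and turns the ones that are set into uppercase field names. The `CorrelationContext` name and the `FOREMAN_` prefix are assumptions made for illustration, not part of Foreman.

```ruby
# Hypothetical container for the correlation identifiers listed above.
CorrelationContext = Struct.new(
  :user_id, :session_id, :transaction_id,
  :host_ip, :host_mac, :provisioning_token,
  keyword_init: true
) do
  # Returns only the identifiers that are set, keyed by an uppercase
  # field name. The FOREMAN_ prefix is an assumption of this sketch.
  def to_fields
    to_h.reject { |_, value| value.nil? }
        .map { |key, value| ["FOREMAN_#{key.to_s.upcase}", value.to_s] }
        .to_h
  end
end

# Example: only the user and the host MAC address are known.
context = CorrelationContext.new(user_id: 4, host_mac: "52:54:00:aa:bb:cc")
fields = context.to_fields
```

Leaving unset identifiers out keeps messages compact when, for example, a cron job logs without any user session.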

h2. Stories

* As a Foreman user, I would like to effectively collect logs from Smart Proxies and managed services in a single place
* As a Foreman user, I would like to extend the logs with flexible metadata information
* As a Dashboard user, I would like to browse Foreman logs and correlate them
* As a Foreman user, I would like to be able to deploy Dashboard

The open-source Logstash project was picked as the target to deploy as the log Dashboard and to integrate Foreman with. It is a modular Ruby framework (JRuby, actually) with modular backends, and its interface is a big plus.

h2. Owners

* Lukáš Zapletal (Foreman) - `lzap_at_redhat_dot_com`

h2. Current status

* Design done
* Not yet implemented

h2. Implementation

The implementation must build on top of tools that are available in RHEL 6 and 7 today. It looks like rsyslog/systemd would be the correct choice.

Foreman uses the "logging" gem, a modular log4j-like logging framework that allows great extensibility. Although direct integration via logging-logstash should be possible (it tries to implement the logstash API), integration via journald would provide better flexibility for logging from all other components (foreman-proxy, backend systems) or even scripts (seed, migrate, foreman_hooks), with support for all metadata. There is a POC journald reader for logstash available:

Using journald as a mediator opens up other possibilities for integration with other (even proprietary) logging solutions for those who need it. And the best thing is that it allows Foreman to be configured without any additional logging solution while still having all the logs in one place (the Foreman server log journal).

This solution has a limitation: it can only be used on systems with journald. For the initial version, this is not a problem, and if this feature proves to be useful, we can add a fallback mode via syslog. The fallback will not support custom metadata (see below), though.

h2. Required steps

1. Extend Foreman logging framework with custom metadata (below)
1. Send session ID with each message
1. Send correlation IDs for Host orchestration
1. Implement logging-journald plugin with support of custom metadata (using gem)
1. Document how to configure journald with logstash for remote collection
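
To illustrate what a journald plugin with custom metadata would have to produce, here is a rough sketch of serializing a set of fields into journald's native datagram format: `FIELD=value` lines, with a length-prefixed form for values that contain newlines. The function names and the `FOREMAN_TRANSACTION_ID` field are hypothetical, and the field-name check is deliberately conservative. This is not the proposed plugin, only a sketch of the wire format it would need to emit.

```ruby
require "socket"

# Serialize a hash of fields into journald's native protocol payload.
def journald_payload(fields)
  fields.map do |name, value|
    name = name.to_s
    # Conservative check: uppercase letters, digits and underscores,
    # starting with a letter (leading underscores are reserved).
    unless name.match?(/\A[A-Z][A-Z0-9_]*\z/)
      raise ArgumentError, "invalid field name: #{name}"
    end
    value = value.to_s.b
    if value.include?("\n")
      # Multi-line form: name, newline, 64-bit little-endian length,
      # raw data, newline.
      name.b + "\n" + [value.bytesize].pack("Q<") + value + "\n"
    else
      "#{name}=".b + value + "\n"
    end
  end.join
end

# Send one entry to the local journal over its datagram socket.
# Only works on hosts that run journald; not exercised in this sketch.
def send_to_journald(payload, path = "/run/systemd/journal/socket")
  sock = Socket.new(:UNIX, :DGRAM)
  sock.connect(Socket.pack_sockaddr_un(path))
  sock.send(payload, 0)
ensure
  sock&.close
end

payload = journald_payload(
  "MESSAGE" => "Creating DNS record",
  "FOREMAN_TRANSACTION_ID" => "example-id" # hypothetical field name
)
```

Custom fields sent this way are stored next to `MESSAGE` in the journal entry, which is what makes later filtering by correlation ID possible.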

Optional steps

1. Hand over correlation id to smart proxy
1. Implement logging gem on smart proxy as well
1. Configure journald on smart-proxy machines via Puppet to send logs to the Foreman node

h2. Metadata

Foreman would benefit from the following extra logging fields:

* Session ID: User session - usually a browser session if applicable.

* Transaction/Correlation ID: Usually the UUID of the transaction that the logs belong to. This way we can correlate logs from orchestration runs for all services that are under our control: Foreman, Foreman-Proxy, Plugins, Candlepin, Pulp.
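
One possible way to attach a transaction ID to every message logged during an orchestration run is sketched below. It uses only Ruby's standard `Logger` rather than the "logging" gem Foreman actually uses, and the `LogContext` module and `build_logger` helper are hypothetical names introduced for illustration.

```ruby
require "logger"
require "securerandom"
require "stringio"

# Hypothetical per-thread store for logging metadata.
module LogContext
  KEY = :foreman_log_context

  def self.current
    Thread.current[KEY] ||= {}
  end

  # Runs the block with a transaction ID set (a fresh UUID by default)
  # and restores the previous context afterwards.
  def self.with_transaction(id = SecureRandom.uuid)
    previous = current.dup
    current[:transaction_id] = id
    yield id
  ensure
    Thread.current[KEY] = previous
  end
end

# Builds a logger whose formatter prefixes each message with the
# metadata currently stored in LogContext.
def build_logger(io)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, _time, _progname, msg|
    meta = LogContext.current.map { |key, value| "#{key}=#{value}" }.join(" ")
    meta.empty? ? "#{severity} #{msg}\n" : "#{severity} [#{meta}] #{msg}\n"
  end
  logger
end

output = StringIO.new
logger = build_logger(output)
LogContext.with_transaction do
  logger.info("making DHCP reservation")
  logger.info("creating DNS record")
end
```

Both messages inside the block carry the same `transaction_id`, so they can be grouped later. A session ID could be stored in the same context in the same way.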