Tracker #31142

Updated by Lukas Zapletal over 3 years ago

* New model class is created: `ReportTranscript` (can be later renamed to just Report): 
 > * host_id 
 > * reported_at 
 > * status (StatusCalculator from the old reports is reused and extended to use 64 bits) 
 > * body (as a text PostgreSQL type which is compressed but not indexed on purpose) 
 > * origin (Puppet-9, Ansible, OpenSCAP, Unknown; Puppet-9 stands for report format V9, which is compatible with V10) 
 * New model class `ReportKeyword(id: int, report_id: int, name: varchar)`, associated with `ReportTranscript` via a join table and with a B-TREE index on `report_keyword.name` for quick lookup 
 * Example keywords (this is a free-form value and plugin authors will decide what to use): 
 > * `PuppetHasFailedResource` 
 > * `PuppetHasFailedRestartResource` 
 > * `PuppetHasChangedResource` 
 > * `AnsibleHasUnreachableHost` 
 > * `AnsibleHasFailedTask` 
 > * `AnsibleHasChangedTask` 
 > * `ScapHasFailedRule` 
 > * `ScapHasOtheredRule` 
 > * `ScapHasHighSeverityFailure` 
 > * `ScapHasMediumSeverityFailure` 
 > * `ScapHasLowSeverityFailure` 
 > * `ScapFailure:xccdf_org.ssgproject.content_rule_ensure_redhat_gpgkey_installed` 
 > * `ScapFailure:xccdf_org.ssgproject.content_rule_security_patches_up_to_date` 
 * Even with all plugins enabled (Ansible, Puppet, OpenSCAP) it is expected to have up to 2000 keywords in the worst case 
 * Keywords can be added with a detail level (a numeric constant, one of: IMPORTANT, REGULAR, DETAILED). This will be unused in the first stage, but having it defined by plugin authors enables us in the future, on large-scale deployments, to filter out some (e.g. DETAILED) keywords to shrink the join table down to a reasonable level. 
 * Plugin authors have complete control over how to store data in the `body` field. It can be JSON, YAML or plain text. There will be two APIs available for plugins to extend: import and view 
 * New import processing pipeline API will discourage plugins from accessing the model directly: 
 > * New report comes in 
 > * Foreman detects the origin 
 > * Foreman creates an instance of a plugin input transformation class 
 > * Report body (as Ruby hash) is passed into the class 
 > * Plugin performs transformation: hash-in - hash-out + status (big int) + keywords (hash or set) 
 * The same transformation is done during data migration (upgrade process from legacy reports to the new report) 
 * For report displaying, similar pipeline is available: 
 > * Report is loaded for display 
 > * Foreman creates an instance of a plugin view transformation class 
 > * Report body (as Ruby hash) is passed into the class 
 > * Plugin performs transformations (hash-in - hash-out - JSON for API output or UI) 
 > * Data is passed into views (ERB, RABL) for final display 
 * Plugin authors should not abuse keywords to report things that are likely to be set for most reports. For example, OpenSCAP should not create `ScapPassed:xyz` keywords because there would be too many of them. 
 * Searching is supported via: 
 > * Indexed keywords (e.g. `origin = scap and keyword = ScapHasHighSeverityFailure` or simply just the keyword which will be the default scoped_search field) 
 > * Full text in body (hidden by default in the UI, not advertised in docs) 

 The full plan is in the README: https://github.com/theforeman/foreman_host_reports
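
 The import pipeline and keyword lookup described above could be sketched roughly as follows. This is only an illustration in plain Ruby, not the plugin's actual API: `ReportImporter`, `PuppetImportTransformer`, `ImportResult`, the in-memory arrays standing in for the `ReportTranscript` and keyword tables, the Puppet log levels used as input, and the way two counters are packed into the status integer are all assumptions made for the example. The origin is passed in explicitly here, whereas the plan has Foreman detect it.

```ruby
require "json"
require "set"

# Hypothetical result of an import transformation:
# hash-out + status (integer) + keywords (set).
ImportResult = Struct.new(:body, :status, :keywords)

# Hypothetical input transformation class a Puppet plugin might provide.
class PuppetImportTransformer
  def transform(report)
    logs = report.fetch("logs", [])
    failed = logs.count { |log| log["level"] == "err" }
    changed = logs.count { |log| log["level"] == "notice" }

    keywords = Set.new
    keywords << "PuppetHasFailedResource" if failed.positive?
    keywords << "PuppetHasChangedResource" if changed.positive?

    # Assumed layout: two 8-bit counters packed into one integer.
    status = ([failed, 255].min << 8) | [changed, 255].min
    ImportResult.new(report, status, keywords)
  end
end

# Stand-in for the core side; plugins never touch the storage directly.
class ReportImporter
  def initialize
    @transformers = {}                              # origin => transformer class
    @reports = []                                   # stands in for ReportTranscript rows
    @keyword_index = Hash.new { |h, k| h[k] = [] }  # keyword name => report ids
  end

  def register(origin, transformer_class)
    @transformers[origin] = transformer_class
  end

  def import(host_id, origin, report)
    klass = @transformers.fetch(origin) { raise ArgumentError, "unknown origin: #{origin}" }
    result = klass.new.transform(report)
    id = @reports.size + 1
    @reports << { id: id, host_id: host_id, origin: origin, reported_at: Time.now,
                  status: result.status, body: JSON.generate(result.body) }
    result.keywords.each { |name| @keyword_index[name] << id }
    id
  end

  # Keyword lookup, optionally narrowed by origin.
  def search(keyword, origin: nil)
    ids = @keyword_index.fetch(keyword, [])
    @reports.select { |r| ids.include?(r[:id]) && (origin.nil? || r[:origin] == origin) }
  end
end

importer = ReportImporter.new
importer.register("puppet", PuppetImportTransformer)
importer.import(1, "puppet", { "logs" => [{ "level" => "err", "message" => "boom" }] })
failed_reports = importer.search("PuppetHasFailedResource", origin: "puppet")
```

 In this sketch the plugin class only sees a hash and returns a hash, a status and a set of keywords, while storage stays in the core, which is the separation the pipeline is meant to encourage. In principle the same `transform` call could be reused for the legacy data migration.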

 Discussion on how we got there: 

 https://community.theforeman.org/t/rfc-optimized-reports-storage/15573
