Bug #3851
closed
Reports with errors on the puppet master side are not shown as failed
Added by Anonymous almost 11 years ago.
Updated over 6 years ago.
Description
If a puppet agent gets an error like "Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to parse template" or a syntax error on the puppet master, this is correctly shown as
Level "err"
Resource "Puppet"
Message "Could not retrieve catalog..."
when I click into such a report. However, when I browse the reports list, such reports are marked with 0 "Failed" events. I then have to search for e.g. "log ~ err" to see whether such reports exist.
These reports should be marked as failed.
- Related to Bug #2976: Reports page not showing when an Error was returned added
Do you think this is the same as #2976? Does the host show the error, but the report metrics don't?
The host has a green zero and "No changes", so it's not set into error state...
I recall that it worked with Puppet 2.7 and that the behaviour changed with Puppet 3.x (but I can't recall if it still worked with 2.7 clients against a 3.x master).
Also, I was just tinkering around, and the error does correctly set the error state when I do a puppet run with "puppet agent --test" on the shell of such a client. But if the normal 30 minute puppet run as a daemon hits such an error, it does not set the error state.
The difference between daemon and --test mode might be "usecacheonfailure", which I think is enabled by default and disabled with --test. In daemon mode it might be falling back to a copy of the old catalog.
Ah, right... in the case of the non-cached catalog I also get the error "err Puppet Could not retrieve catalog; skipping run"
Puppet itself doesn't include these types of errors in its summary anymore, and that seems to be the root cause:
--- !ruby/object:Puppet::Transaction::Report
  metrics:
    resources: !ruby/object:Puppet::Util::Metric
      name: resources
      label: Resources
      values:
        - - total
          - Total
          - 73
        - - skipped
          - Skipped
          - 0
        - - failed
          - Failed
          - 0
        - - failed_to_restart
          - "Failed to restart"
          - 0
        - - restarted
          - Restarted
          - 0
        - - changed
          - Changed
          - 0
        - - out_of_sync
          - "Out of sync"
          - 0
        - - scheduled
          - Scheduled
          - 0
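To make the metrics excerpt easier to read, here is a minimal Ruby sketch that flattens the [name, label, value] rows into a hash. The rows are copied from the report above; `metric_counts` is a hypothetical helper for illustration only, not something Puppet or Foreman provides. It shows that the "failed" count is 0 even though the catalog could not be retrieved.

```ruby
# Resource metric rows as they appear in the report above: [name, label, value].
resource_values = [
  ['total', 'Total', 73],
  ['skipped', 'Skipped', 0],
  ['failed', 'Failed', 0],
  ['failed_to_restart', 'Failed to restart', 0],
  ['restarted', 'Restarted', 0],
  ['changed', 'Changed', 0],
  ['out_of_sync', 'Out of sync', 0],
  ['scheduled', 'Scheduled', 0]
]

# Hypothetical helper (not a Puppet or Foreman API):
# turn the rows into a name => value hash, dropping the display label.
def metric_counts(rows)
  rows.each_with_object({}) { |(name, _label, value), counts| counts[name] = value }
end

counts = metric_counts(resource_values)
puts "failed=#{counts['failed']} total=#{counts['total']}"
```

Anything that only looks at these counts will see a clean run, which matches the green zero on the host.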
The error itself is correctly tagged and gets imported correctly:
  logs:
    - !ruby/object:Puppet::Util::Log
      level: !ruby/sym err
      tags:
        - err
      message: "Could not retrieve catalog from remote server: Error 400 on SERVER: Could not parse for environment production: Syntax error at 'mcollective::server::setting'; expected '}' at /etc/puppet/manifests/site.pp:35 on node xxx"
      source: Puppet
      time: 2014-01-05 22:49:11.379313 +01:00
    - !ruby/object:Puppet::Util::Log
      level: !ruby/sym notice
      tags:
        - notice
      message: "Using cached catalog"
      source: Puppet
      time: 2014-01-05 22:49:16.229270 +01:00
    - !ruby/object:Puppet::Util::Log
      level: !ruby/sym notice
      tags:
        - notice
      message: "Finished catalog run in 10.71 seconds"
      source: Puppet
      time: 2014-01-05 22:49:29.766312 +01:00
This seems to be the relevant epic on the Puppet side: https://tickets.puppetlabs.com/browse/PUP-283
Since this seems to have low priority on the Puppet end (some of the relevant issues are targeted for 4.x), is there anything that can be done on the Foreman side of things? It's a little user-unfriendly to see hosts listed with no activity, only to find compilation and other errors when you click into their reports.
A simple implementation could be to alter the Foreman report processor to add its own Foreman-internal flag into the report if it detects 'error' log lines in the report. The Foreman UI could then pick up on this flag and display it. (Though I'm not sure what the precedent is for altering a report to add custom data)
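A rough sketch of that idea in Ruby, assuming the report has already been reduced to plain hashes. The `'logs'`, `'level'` and `'metrics'` keys and both helper names are made up for illustration and are not the actual Foreman report processor or its upload format. Only the `err` level seen in the report above is checked.

```ruby
# Hypothetical, simplified report: plain hashes instead of Puppet objects.
sample_report = {
  'metrics' => { 'resources' => { 'failed' => 0, 'total' => 73 } },
  'logs' => [
    { 'level' => 'err',    'source' => 'Puppet', 'message' => 'Could not retrieve catalog from remote server' },
    { 'level' => 'notice', 'source' => 'Puppet', 'message' => 'Using cached catalog' },
    { 'level' => 'notice', 'source' => 'Puppet', 'message' => 'Finished catalog run in 10.71 seconds' }
  ]
}

# Hypothetical helper: true if any log entry has level "err".
# to_s lets this accept both strings and symbols for the level.
def err_logged?(report)
  report.fetch('logs', []).any? { |log| log['level'].to_s == 'err' }
end

# Hypothetical helper: treat the report as failed if the resource metrics
# say so OR an err-level log line is present.
def effective_status(report)
  failed_resources = report.dig('metrics', 'resources', 'failed').to_i
  failed_resources > 0 || err_logged?(report) ? 'failed' : 'ok'
end

puts effective_status(sample_report)
```

Whether such a flag should live inside the uploaded report or be computed on import is a separate design question; the sketch only shows the detection step.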
- Project changed from Foreman to Installer
- Category changed from Reporting to Foreman modules
- Status changed from New to Ready For Testing
- Assignee set to Dominic Cleal
- Target version set to 1.8.3
- Status changed from Ready For Testing to Closed
- % Done changed from 0 to 100
Applied in changeset puppet-foreman|commit:eb7410a3787432f417dc13465f25e46353362d91.
- Has duplicate Bug #5797: Foreman does not flag catalog retieve failures as a puppet failure if the cached catalog runs without issues added
- Release set to 16
- Bugzilla link set to https://bugzilla.redhat.com/show_bug.cgi?id=1110365
- Has duplicate Bug #855: some non-empty reports are not in interesting reports added
- Related to Bug #8734: Failed puppet report shown as successfull added