Bug #3851

Reports with errors on the puppet master side are not shown as failed

Added by Anonymous almost 7 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Foreman modules
Target version:
Difficulty:
Triaged:
Bugzilla link:
Pull request:
Fixed in Releases:
Found in Releases:

Description

If a puppet agent gets an error like "Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to parse template" or a syntax error on the puppet master, this is correctly shown as
Level "err"
Resource "Puppet"
Message "Could not retrieve catalog..."
when I open such a report. However, when I browse the list of reports, such reports are shown with 0 "Failed" events. I then have to search for e.g. "log ~ err" to find out whether such reports exist.

These reports should be marked as failed.


Related issues

Related to Foreman - Bug #2976: Reports page not showing when an Error was returned (Duplicate, 2013-08-28)
Related to Foreman - Bug #8734: Failed puppet report shown as successfull (Rejected, 2014-12-17)
Has duplicate Foreman - Bug #5797: Foreman does not flag catalog retieve failures as a puppet failure if the cached catalog runs without issues (Duplicate, 2014-05-19)
Has duplicate Foreman - Bug #855: some non-empty reports are not in interesting reports (Duplicate, 2011-04-20)

Associated revisions

Revision eb7410a3 (diff)
Added by Dominic Cleal over 6 years ago

fixes #3851 - increment error counter for non-resource Puppet errors

History

#1 Updated by Dominic Cleal almost 7 years ago

  • Related to Bug #2976: Reports page not showing when an Error was returned added

#2 Updated by Dominic Cleal almost 7 years ago

Do you think this is the same as #2976? Does the host show the error while the report metrics don't?

#3 Updated by Anonymous almost 7 years ago

The host has a green zero and "No changes", so it's not set into the error state...
I recall that it worked with Puppet 2.7 and that the behaviour changed with Puppet 3.x (but I can't recall whether it still worked with 2.7 clients against a 3.x master).
Also, I was just tinkering around: the error does correctly set the error state when I do a puppet run with "puppet agent --test" on the shell of such a client, but if the normal 30-minute puppet runs in daemon mode hit such an error, the error state is not set.

#4 Updated by Dominic Cleal almost 7 years ago

The difference between daemon and --test mode might be "usecacheonfailure", which I think is enabled by default and disabled with --test. In daemon mode it might be falling back to a copy of the old catalog.

#5 Updated by Anonymous almost 7 years ago

Ah, right... in the case of the non-cached catalog I also get the error "err Puppet Could not retrieve catalog; skipping run"
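
To make the difference between the two modes concrete, here is a toy model of the behaviour described in the last two comments. It is only a sketch and not Puppet's actual code: the method name and the keyword arguments are made up for illustration, and the log messages are the ones quoted in this ticket. With the cache fallback enabled, the run still ends normally after the retrieval error, which would explain why the host is not put into the error state.

```ruby
# Toy model of the agent behaviour discussed in this ticket; NOT Puppet's code.
# `use_cache_on_failure` stands in for the "usecacheonfailure" setting, and
# `agent_run_logs` is a hypothetical name used only for this illustration.
def agent_run_logs(catalog_error:, use_cache_on_failure:, cached_catalog_available: true)
  # No retrieval problem: the run simply finishes.
  return [[:notice, "Finished catalog run"]] if catalog_error.nil?

  logs = [[:err, "Could not retrieve catalog from remote server: #{catalog_error}"]]
  if use_cache_on_failure && cached_catalog_available
    # Daemon-style run: fall back to the old catalog and finish normally.
    logs << [:notice, "Using cached catalog"]
    logs << [:notice, "Finished catalog run"]
  else
    # --test-style run: no fallback, so the run is skipped with a second error.
    logs << [:err, "Could not retrieve catalog; skipping run"]
  end
  logs
end

daemon_run = agent_run_logs(catalog_error: "Error 400 on SERVER", use_cache_on_failure: true)
test_run   = agent_run_logs(catalog_error: "Error 400 on SERVER", use_cache_on_failure: false)
puts daemon_run.last.last
puts test_run.last.last
```

In this model both runs log the retrieval error, but only the run without the fallback ends on an error.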

#6 Updated by Anonymous almost 7 years ago

Puppet itself doesn't include these types of errors in its summary anymore; that seems to be the root cause:

--- !ruby/object:Puppet::Transaction::Report
  metrics: 
    resources: !ruby/object:Puppet::Util::Metric
      name: resources
      label: Resources
      values: 
        - - total
          - Total
          - 73
        - - skipped
          - Skipped
          - 0
        - - failed
          - Failed
          - 0
        - - failed_to_restart
          - "Failed to restart" 
          - 0
        - - restarted
          - Restarted
          - 0
        - - changed
          - Changed
          - 0
        - - out_of_sync
          - "Out of sync" 
          - 0
        - - scheduled
          - Scheduled
          - 0

The error itself is tagged and imported correctly:
  logs: 
    - !ruby/object:Puppet::Util::Log
      level: !ruby/sym err
      tags: 
        - err
      message: "Could not retrieve catalog from remote server: Error 400 on SERVER: Could not parse for environment production: Syntax error at 'mcollective::server::setting'; expected '}' at /etc/puppet/manifests/site.pp:35 on node xxx" 
      source: Puppet
      time: 2014-01-05 22:49:11.379313 +01:00
    - !ruby/object:Puppet::Util::Log
      level: !ruby/sym notice
      tags: 
        - notice
      message: "Using cached catalog" 
      source: Puppet
      time: 2014-01-05 22:49:16.229270 +01:00
    - !ruby/object:Puppet::Util::Log
      level: !ruby/sym notice
      tags: 
        - notice
      message: "Finished catalog run in 10.71 seconds" 
      source: Puppet
      time: 2014-01-05 22:49:29.766312 +01:00
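
The mismatch between the two excerpts above can be checked mechanically. The following is a minimal sketch that assumes the report has been reduced to plain hashes; `SAMPLE_REPORT`, `error_logs` and `hidden_failure?` are hypothetical names for this illustration, not part of Puppet or Foreman.

```ruby
# Plain-hash stand-in for the report excerpts above
# (not a real Puppet::Transaction::Report object).
SAMPLE_REPORT = {
  "metrics" => { "resources" => { "total" => 73, "failed" => 0, "changed" => 0 } },
  "logs" => [
    { "level" => "err", "source" => "Puppet",
      "message" => "Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to parse template" },
    { "level" => "notice", "source" => "Puppet", "message" => "Using cached catalog" },
    { "level" => "notice", "source" => "Puppet", "message" => "Finished catalog run in 10.71 seconds" }
  ]
}.freeze

# Hypothetical helper: all err-level log entries of a report.
def error_logs(report)
  report.fetch("logs", []).select { |log| log["level"].to_s == "err" }
end

# Hypothetical helper: true when the summary counts no failed resources
# although at least one err-level log entry exists.
def hidden_failure?(report)
  failed = report.dig("metrics", "resources", "failed").to_i
  failed.zero? && !error_logs(report).empty?
end

puts hidden_failure?(SAMPLE_REPORT)
```

A report like the one in this comment would be caught by such a check, while a report with only notice-level logs would not.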

#7 Updated by Jon McKenzie over 6 years ago

This seems to be the relevant epic on the Puppet side: https://tickets.puppetlabs.com/browse/PUP-283

Since this seems to have low priority on the Puppet end (some of the relevant issues are targeted for 4.x), is there anything that can be done on the Foreman side of things? It's not very user-friendly to see hosts with no activity, only to find compilation and other errors when you click into their reports.

A simple implementation could be to alter the Foreman report processor to add its own Foreman-internal flag into the report if it detects 'error' log lines in the report. The Foreman UI could then pick up on this flag and display it. (Though I'm not sure what the precedent is for altering a report to add custom data)

#8 Updated by Dominic Cleal over 6 years ago

  • Project changed from Foreman to Installer
  • Category changed from Reporting to Foreman modules
  • Status changed from New to Ready For Testing
  • Assignee set to Dominic Cleal
  • Target version set to 1.8.3

https://github.com/theforeman/puppet-foreman/pull/186

This works on the same principle as some other fixes we make to the counters: the adjusted counter causes Foreman to flag the host as being in an error state.
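
As a rough illustration of that principle (not the code from the pull request), the counter adjustment could look like the sketch below. It assumes the counters and logs are available as plain hashes; `with_puppet_errors_counted` is a hypothetical name, and matching the log source against the exact string "Puppet" is an assumption based on the report excerpt earlier in this ticket.

```ruby
# Hypothetical helper, not the actual patch: returns a copy of the resource
# counters with "failed" increased by the number of err-level log entries
# whose source is "Puppet" itself rather than a managed resource.
def with_puppet_errors_counted(counters, logs)
  puppet_errors = logs.count do |log|
    log["level"].to_s == "err" && log["source"] == "Puppet"
  end
  counters.merge("failed" => counters.fetch("failed", 0) + puppet_errors)
end

counters = { "total" => 73, "failed" => 0, "changed" => 0 }
logs = [
  { "level" => "err", "source" => "Puppet",
    "message" => "Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to parse template" },
  { "level" => "notice", "source" => "Puppet", "message" => "Using cached catalog" }
]

puts with_puppet_errors_counted(counters, logs)["failed"]
```

With a non-zero "failed" counter, the report no longer looks like a run without failures, so the host can be flagged.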

#9 Updated by Dominic Cleal over 6 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100

Applied in changeset puppet-foreman|commit:eb7410a3787432f417dc13465f25e46353362d91.

#10 Updated by Dominic Cleal over 6 years ago

  • Has duplicate Bug #5797: Foreman does not flag catalog retieve failures as a puppet failure if the cached catalog runs without issues added

#11 Updated by Dominic Cleal over 6 years ago

  • Legacy Backlogs Release (now unused) set to 16

#12 Updated by Bryan Kearney over 6 years ago

  • Bugzilla link set to https://bugzilla.redhat.com/show_bug.cgi?id=1110365

#13 Updated by Anonymous about 6 years ago

  • Has duplicate Bug #855: some non-empty reports are not in interesting reports added

#14 Updated by Dominic Cleal almost 6 years ago

  • Related to Bug #8734: Failed puppet report shown as successfull added
