Bug #1151
closed
Too many systems in dashboard summary
Description
I am now noticing that hosts with modifications (active) are treated as both active and good. So I have 16 systems currently and it says 16 are good and 1 is active in my graph on the dashboard when I would expect 15 good and 1 active. This also affects the "Good Host Reports in the last x minutes" text summary.
Updated by Ohad Levy about 13 years ago
Looking at the code, I can't figure out why you get duplicates:
- active hosts = hosts that have applied or restarted resource
- good hosts = hosts that don't have applied, restarted, failed or failed restarts
Neither cares about skipped resources, so I can't see why one host would be in both groups.
any idea?
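The two definitions above can be sketched as predicates over a host's metric counts. This is only an illustration: the helper names (`active?`, `good?`) and the hash layout are hypothetical and are not Foreman's actual code. If the definitions hold, no host can satisfy both at once.

```ruby
# Hypothetical helpers mirroring the definitions in this comment; not Foreman's code.
CHANGE_METRICS  = [:applied, :restarted].freeze
PROBLEM_METRICS = [:failed, :failed_restarts].freeze

# Active: at least one resource was applied or restarted.
def active?(metrics)
  CHANGE_METRICS.any? { |m| metrics.fetch(m, 0) > 0 }
end

# Good: nothing applied, restarted, failed, or failed to restart.
# Skipped resources are deliberately ignored.
def good?(metrics)
  (CHANGE_METRICS + PROBLEM_METRICS).all? { |m| metrics.fetch(m, 0).zero? }
end

changed = { applied: 7, skipped: 9 } # illustrative counts
idle    = { skipped: 9 }

puts "changed: active=#{active?(changed)} good=#{good?(changed)}" # active=true good=false
puts "idle:    active=#{active?(idle)} good=#{good?(idle)}"       # active=false good=true
```

Under these definitions a host with applied resources counts only as active, so a dashboard that reports it as both suggests the actual queries differ from this description.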
Updated by Jacob McCann about 13 years ago
I'll do some more digging today. I'm hitting this regularly and it's easy to reproduce. I don't think it has anything to do with skipped resources this time. ;)
Updated by Jacob McCann about 13 years ago
I don't know if this will help.
mysql> select id,name,puppet_status from hosts;
+----+---------------+
| id | puppet_status |
+----+---------------+
|  2 |     150994944 |
|  3 |     150994944 |
| 10 |     150994944 |
| 11 |     150994944 | 9 skipped, no failures/errors
| 13 |     150994944 |
| 14 |     150994944 |
| 15 |     150994944 |
| 17 |     150994944 |
| 18 |     150994944 |
| 24 |     150994944 |
| 25 |     184549376 |
| 26 |     184549376 |
| 27 |     184549376 |
| 28 |     184549376 |
| 36 |     150994944 | 9 skipped, no failures/errors
| 37 |     184549376 |
| 38 |     184549376 |
| 39 |     184549376 | 11 skipped, no failures/errors
| 40 |     184549376 |
| 41 |     184549376 | 11 skipped, no failures/errors
| 42 |     150994944 | 9 skipped, no failures/errors
| 43 |     150994951 | 7 applied, 9 skipped, no failures/errors
+----+---------------+
I couldn't figure out exactly how to translate the puppet_status, but I correlated it with the last run report, which suggests it's some mathematical encoding of the system's status.
Anyway, with the systems and statuses above, my dashboard shows:
Description                                 Data
Good Host Reports in the last 60 minutes    22 / 22 hosts (100%)
Hosts that had performed modifications      1
Out Of Sync Hosts                           0
Hosts in Error State                        0
Hosts With Alerts Disabled                  0
And it's the host with id 43 that is showing as both 'good' and 'active'.
Updated by Ohad Levy about 13 years ago
Jacob McCann wrote:
I don't know if this will help.
[...]
I couldn't figure out exactly how to translate the puppet_status, but I correlated it with the last run report, which suggests it's some mathematical encoding of the system's status.
The status number is actually a bit field, where each group of 6 bits represents one metric.
This allows us to save all of the metrics (failed, restarted, etc.) in one integer.
It's probably easier to read the code or, even better, use the Rails console to play with the values.
cd ~foreman
./script/console -e production
Host.all.each do |host|
  puts "#{host}: status => #{host.status.inspect}"
end
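For anyone without console access, the bit field described above can also be decoded by hand. The following is a minimal sketch, not Foreman's implementation: the metric order (`applied`, `restarted`, `failed`, `failed_restarts`, `skipped`, lowest bits first) is an assumption, and `decode_status` is a hypothetical helper. The positions of `applied` and `skipped` do agree with the values posted earlier in this thread (150994944 = 9 × 2^24, matching "9 skipped").

```ruby
# Assumed layout: five 6-bit counters packed into one integer, lowest bits first.
# The metric order is an assumption, not confirmed from Foreman's source.
METRICS  = %w[applied restarted failed failed_restarts skipped].freeze
BIT_NUM  = 6
MAX_MASK = (1 << BIT_NUM) - 1 # 0b111111, i.e. 63

# Hypothetical helper: split a packed status integer into per-metric counts.
def decode_status(status)
  METRICS.each_with_index.map do |metric, index|
    [metric, (status >> (BIT_NUM * index)) & MAX_MASK]
  end.to_h
end

# The three distinct puppet_status values from the table in this thread.
[150994944, 184549376, 150994951].each do |status|
  puts "#{status}: #{decode_status(status).inspect}"
end
```

With this layout, 150994951 decodes to 7 applied and 9 skipped, which matches the last run report for host 43. Note that a 6-bit counter cannot represent more than 63, so larger counts would presumably have to be capped before packing.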
Updated by Jacob McCann about 13 years ago
Here is a paste with: http://pastebin.com/w18RiGxV
- Status of all recent hosts
- Status of all hosts
- Count of all hosts
- Count of hosts recent.successful
- Count of hosts recent.with_changes
- Count of hosts recent.out_of_sync
So you can see that the successful count is currently off by 1, unless (some) systems with recent changes are included in that count.
I say 'some' because there are times when everything does match up ...
If you want more output, let me know. I'm not sure how to dig much deeper into this to help with troubleshooting. :(
Updated by Tim Speetjens about 13 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
Applied in changeset f443d54ae95f47d0e2015ef6ae84753f272d239f.
Updated by Ohad Levy about 13 years ago
- Category set to Dashboard
- Assignee set to Tim Speetjens