Bug #1151

Too many systems in dashboard summary

Added by Jacob McCann almost 8 years ago. Updated over 6 years ago.

Status: Closed
Priority: Low
Assignee:
Category: Dashboard
Target version:
Difficulty:
Triaged: No
Bugzilla link:
Pull request:
Team Backlog:
Fixed in Releases:
Found in Releases:

Description

I am now noticing that hosts with modifications (active) are treated as both active and good. So I have 16 systems currently and it says 16 are good and 1 is active in my graph on the dashboard when I would expect 15 good and 1 active. This also affects the "Good Host Reports in the last x minutes" text summary.

Associated revisions

Revision f443d54a (diff)
Added by Tim Speetjens almost 8 years ago

fixes #1151 Fix dashboard pie, to contain correct total number of hosts

to me this makes sense. Don't shoot me if the logic isn't 100% correct...

in short, active hosts are counted as ok hosts, which makes the
counters get higher than the number of total hosts sometimes.
To bring the pie to the sum of the hosts, only the missing reports
should be added, which is included too.

Tim

Signed-off-by: Tim Speetjens <>

Revision 87e42d24 (diff)
Added by Tim Speetjens almost 8 years ago

refs #1151 Fix the scopes so they behave as expected and Adapt dashboard pie data to the corrected scopes

Signed-off-by: Tim Speetjens <>

History

#1 Updated by Ohad Levy almost 8 years ago

looking at the code, I can't figure out why you get duplicates:

  • active hosts = hosts that have applied or restarted resources
  • good hosts = hosts that don't have applied, restarted, failed or failed restarts

neither cares about skipped resources, so I can't see a reason why one host would be in both groups.

any idea?
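
The two definitions above can be sketched as plain predicates. This is only an illustration of the stated rules, not Foreman's actual scopes: the helper names `active?` and `good?` and the metrics hash are hypothetical, and the sample counters are made up to show one host with changes and one without.

```ruby
# Hypothetical helpers: `metrics` is a Hash of resource counters
# taken from a host's last report (missing keys count as zero).

# active = the host applied or restarted at least one resource
def active?(metrics)
  metrics.fetch(:applied, 0) > 0 || metrics.fetch(:restarted, 0) > 0
end

# good = nothing applied, restarted, failed or failed to restart
def good?(metrics)
  %i[applied restarted failed failed_restarts].all? do |name|
    metrics.fetch(name, 0).zero?
  end
end

changed = { applied: 7, skipped: 9 } # a run that modified resources
quiet   = { skipped: 9 }             # a run that only skipped resources

puts "changed: active=#{active?(changed)} good=#{good?(changed)}"
puts "quiet:   active=#{active?(quiet)} good=#{good?(quiet)}"
```

Under these rules the two groups are disjoint: any host with applied or restarted resources fails the `good?` test, and skipped resources affect neither predicate.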

#2 Updated by Jacob McCann almost 8 years ago

I'll do some more digging today. I'm hitting this regularly and it's easy to reproduce. I don't think it has anything to do with skipped resources this time. ;)

#3 Updated by Jacob McCann almost 8 years ago

I don't know if this will help.

mysql> select id,name,puppet_status from hosts;
+----+---------------+
| id | puppet_status |
+----+---------------+
|  2 |     150994944 | 
|  3 |     150994944 | 
| 10 |     150994944 | 
| 11 |     150994944 | 9 skipped, no failures/errors
| 13 |     150994944 | 
| 14 |     150994944 | 
| 15 |     150994944 | 
| 17 |     150994944 | 
| 18 |     150994944 | 
| 24 |     150994944 | 
| 25 |     184549376 | 
| 26 |     184549376 | 
| 27 |     184549376 | 
| 28 |     184549376 | 
| 36 |     150994944 | 9 skipped, no failures/errors
| 37 |     184549376 | 
| 38 |     184549376 | 
| 39 |     184549376 | 11 skipped, no failures/errors
| 40 |     184549376 | 
| 41 |     184549376 | 11 skipped, no failures/errors
| 42 |     150994944 | 9 skipped, no failures/errors
| 43 |     150994951 | 7 applied, 9 skipped, no failures/errors
+----+---------------+  

I couldn't figure out how exactly to translate the puppet_status, but I correlated it to the last run report, which gives me the idea that it's some mathematical way of showing the status of the system.

Anyway, the above systems/statuses produce this on my dashboard:

Description                                      Data
Good Host Reports in the last 60 minutes         22 / 22 hosts (100%)
Hosts that had performed modifications           1
Out Of Sync Hosts                                0
Hosts in Error State                             0
Hosts With Alerts Disabled                       0

And it's the host with id 43 that is showing as both 'good' and 'active'.
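
As a rough cross-check of the expected counts, the puppet_status values from the table can be tallied directly. This sketch assumes a layout that the table only partly confirms: applied in bits 0-5 and restarted in bits 6-11, with failed and failed_restarts filling the rest of the low 24 bits and skipped above them. The mask names are hypothetical.

```ruby
# puppet_status values copied from the table above, grouped by value:
# 12 hosts with 150994944, 9 hosts with 184549376, and host 43.
statuses = [150_994_944] * 12 + [184_549_376] * 9 + [150_994_951]

# Assumed layout (not confirmed by the thread beyond applied/skipped):
active_mask   = (1 << 12) - 1 # applied (bits 0-5) + restarted (bits 6-11)
not_good_mask = (1 << 24) - 1 # the above + failed + failed_restarts

active = statuses.count { |s| (s & active_mask) != 0 }
good   = statuses.count { |s| (s & not_good_mask).zero? }

puts "total=#{statuses.size} good=#{good} active=#{active}"
```

With disjoint groups this gives 22 hosts in total, 21 good and 1 active, whereas the dashboard above reports 22 / 22 good plus 1 with modifications.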

#4 Updated by Ohad Levy almost 8 years ago

Jacob McCann wrote:

I don't know if this will help.

[...]
I couldn't figure out how exactly to translate the puppet_status, but I correlated it to the last run report, which gives me the idea that it's some mathematical way of showing the status of the system.

the status number is actually a bit field, where each group of 6 bits represents one metric.
so it allows us to save all of the metrics (failed, restarted etc) in one integer.

it's probably easier to read the code, or even better, use the rails console to play with the values.

cd ~foreman
./script/console -e production
Host.all.each do |host|
  puts "#{host}: status => #{host.status.inspect}" 
end
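
The bit-field encoding described above can also be unpacked by hand, outside the Rails console. The following is a minimal sketch, not Foreman's own code: `decode_status` is a hypothetical helper, and the field order is an assumption. Only applied (lowest 6 bits) and skipped (bits 24-29) can be checked against the values posted in this thread; the positions of restarted, failed and failed_restarts in between are assumed.

```ruby
# Hypothetical decoder: splits a status integer into 6-bit counters.
# The field order is an assumption; only applied and skipped are
# confirmed by the values posted in this thread.
def decode_status(status, bits_per_field: 6,
                  fields: %w[applied restarted failed failed_restarts skipped])
  mask = (1 << bits_per_field) - 1 # 0b111111, so each counter caps at 63
  fields.each_with_index.map do |name, index|
    [name, (status >> (index * bits_per_field)) & mask]
  end.to_h
end

# The three distinct values from the hosts table
[150_994_944, 184_549_376, 150_994_951].each do |status|
  puts "#{status}: #{decode_status(status).inspect}"
end
```

Decoded this way, 150994944 is 9 skipped, 184549376 is 11 skipped, and 150994951 is 7 applied plus 9 skipped, which matches the report summaries noted next to the table.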

#5 Updated by Ohad Levy almost 8 years ago

did you find anything?

thanks

#6 Updated by Jacob McCann almost 8 years ago

Here is a paste (http://pastebin.com/w18RiGxV) with:

  • Status of all recent hosts
  • Status of all hosts
  • Count of all hosts
  • Count of hosts recent.successful
  • Count of hosts recent.with_changes
  • Count of hosts recent.out_of_sync

So you can see the successful count is currently off by 1 ... unless (some) systems with recent changes are part of that count.

I say 'some' because there are times when everything does match up ...

If you want more output let me know. I'm not sure how to dig much deeper into this to help troubleshooting. :(

#7 Updated by Tim Speetjens almost 8 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

#8 Updated by Ohad Levy almost 8 years ago

  • Category set to Dashboard
  • Assignee set to Tim Speetjens
