Project

General

Profile

Actions

Bug #1954

closed

Puppet report status incorrectly set to failed

Added by Frederik Nosi over 11 years ago. Updated almost 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Reporting
Target version:
-
Difficulty:
Triaged:
Fixed in Releases:
Found in Releases:

Description

Hi,

i'm using RedHat Enterprise Linux Server release 5.8 fully patched, with this aditional repositories:
epel, puppetlabs and foreman's yum repo.

Relevant software versions used are:

[root@puppet-test ~]# rpm -qa | egrep 'ruby|puppet|foreman'
rubygem-ruby_parser-2.3.1-1.el5
rubygem-ruby2ruby-1.3.1-1.el5
rubygem-mail-2.3.3-2.el5
rubygem-activerecord-3.0.15-1.el5
rubygem-rails-3.0.15-1.el5
rubygem-hirb-0.6.2-1.el5
rubygem-foremancli-1.0-1.el5
foreman-console-1.0.1-1.el5
rubygem-passenger-native-3.0.12-1
puppetlabs-release-5-6
puppet-2.7.19-1.el5
libselinux-ruby-1.33.4-5.7.el5
rubygem-json-1.6.6-2.el5
rubygem-rack-1.2.5-4.el5
rubygem-mime-types-1.18-1.el5
rubygem-uuidtools-2.1.1-2.el5
rubygem-actionmailer-3.0.15-1.el5
rubygem-bundler-1.0.15-1.el5
rubygem-net-ssh-2.3.0-1.el5
rubygem-awesome_print-1.0.2-1.el5
rubygem-fog-1.4.0-1.el5
foreman-sqlite-1.0.1-1.el5
puppetdb-terminus-1.0.2-1.el5
ruby-1.8.7.370-1.el5
ruby-irb-1.8.7.370-1.el5
rubygem-thor-0.14.6-2.el5
rubygem-builder-2.1.2-2.el5
rubygem-rest-client-1.6.7-1.el5
rubygem-rack-mount-0.6.14-1.el5
rubygem-polyglot-0.3.3-1.el5
rubygem-abstract-1.0.0-2.el5
rubygem-audited-3.0.0.rc1-1.el5
rubygem-scoped_search-2.3.7-1.el5
rubygem-rdoc-3.12-2.el5
rubygem-jquery-rails-1.0.19-1.el5
rubygem-sqlite3-1.3.5-2.el5
rubygem-mysql-2.8.1-2.el5
rubygem-multi_json-1.2.0-1.el5
rubygem-formatador-0.2.1-1.el5
foreman-ec2-1.0.1-1.el5
rubygem-rbvmomi-1.5.1-1.el5
foreman-release-1.0.1-1.el5
foreman-vmware-1.0.1-1.el5
rubygem-daemon_controller-0.2.5-1
ruby-augeas-0.4.1-1.el5
ruby-rdoc-1.8.7.370-1.el5
rubygem-activesupport-3.0.15-1.el5
rubygem-sexp_processor-3.1.0-1.el5
rubygem-activemodel-3.0.15-1.el5
rubygem-activeresource-3.0.15-1.el5
rubygem-rack-test-0.5.7-2.el5
rubygem-treetop-1.4.10-1.el5
rubygem-erubis-2.6.6-1.el5
rubygem-arel-2.0.10-1.el5
rubygem-ancestry-1.2.5-1.el5
rubygem-railties-3.0.15-1.el5
foreman-1.0.1-1.el5
rubygem-net-scp-1.0.4-1.el5
rubygem-unicode-display_width-0.1.1-1.el5
rubygem-sinatra-1.0-2.el5
rubygem-net-ping-1.3.7-1.el5
rubygem-wirb-0.4.2-1.el5
foreman-mysql-1.0.1-1.el5
foreman-proxy-1.0.0-3.el5
ruby-libs-1.8.7.370-1.el5
ruby-devel-1.8.7.370-1.el5
rubygem-passenger-3.0.12-1
puppetdb-1.0.2-1.el5
rubygem-pg-0.13.2-2.el5
ruby-shadow-1.4.1-7.el5
rubygems-1.8.10-1.el5
rubygem-i18n-0.5.0-1.el5
rubygem-tzinfo-0.3.32-1.el5
rubygem-net-ldap-0.3.1-1.el5
rubygem-actionpack-3.0.15-1.el5
rubygem-has_many_polymorphs-3.0.0.beta1-3.el5
rubygem-nokogiri-1.5.2-1.el5
rubygem-excon-0.14.0-1.el5
rubygem-ruby-hmac-0.4.0-1.el5
foreman-cli-1.0.1-1.el5
rubygem-fastthread-1.0.7-1.el5
foreman-postgresql-1.0.1-1.el5
ruby-libs-1.8.7.370-1.el5
rubygem-rake-0.9.2.2-2.el5
rubygem-safemode-1.0.1-1.el5
rubygem-will_paginate-3.0.3-1.el5
rubygem-audited-activerecord-3.0.0.rc1-1.el5
rubygem-acts_as_audited-2.0.0-1.el5
rubygem-mysql2-0.2.18-1.el5
rubygem-hirb-unicode-0.0.5-1.el5
rubygem-trollop-1.16.2-1.el5
foreman-mysql2-1.0.1-1.el5
ruby-devel-1.8.7.370-1.el5
rubygem-passenger-native-libs-3.0.12-1_1.8.7.370
puppet-server-2.7.19-1.el5

I'm using MySQL as a DB, versions:

mysql-5.0.95-1.el5_7.1
mysql-server-5.0.95-1.el5_7.1

The database configuration is:

production:
adapter: mysql
database: foreman
username: foreman
password: XXXXXXXXXX
host: localhost
pool: 5

[ similar for development and test environments ]

Puppet agents are running with the --noop option.
The issue i'm seeing is that although the reports do not have errors, changes not applied, the hosts are in error state. Looking at the DB:

(Not

mysql> describe reports;
------------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra |
------------------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | host_id | int(11) | NO | MUL | NULL | | | reported_at | datetime | YES | MUL | NULL | | | created_at | datetime | YES | | NULL | | | updated_at | datetime | YES | | NULL | | | status | smallint(6) | YES | MUL | NULL | | | metrics | longtext | YES | | NULL | |
------------------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)

mysql> select distinct(status) from reports;
------ | status |
------ | -32768 | | 0 | | 1 | | 74 | | 198 | | 278 | | 4096 | | 8472 | | 32767 |
------
9 rows in set (0.00 sec)

Here's an entry in reports which is shown in error, but when opened (in https://FQDN_FOREMAN/reports/1200) i see just notices:

status column: 32767
metrics column:

changes: 
total: 0
resources:
out_of_sync: 1
total: 19
events:
total: 1
noop: 1
time:
service: 0.394067
exec: 0.000916
file: 0.212078
package: 0.001021
filebucket: 0.000249
total: 1.59548393463135
schedule: 0.002159
config_retrieval: 0.984993934631348 |

I'm attaching the report.yaml of this entry just in case.

Any ideas?

Thanks in advance,
Frederik


Files

report.yaml report.yaml 12.1 KB Frederik Nosi, 11/16/2012 12:22 PM
Actions #1

Updated by Frederik Nosi over 11 years ago

Forgot formatting table info

mysql> show columns from reports; 
+-------------+-------------+------+-----+---------+----------------+
| Field       | Type        | Null | Key | Default | Extra          |
+-------------+-------------+------+-----+---------+----------------+
| id          | int(11)     | NO   | PRI | NULL    | auto_increment | 
| host_id     | int(11)     | NO   | MUL | NULL    |                | 
| reported_at | datetime    | YES  | MUL | NULL    |                | 
| created_at  | datetime    | YES  |     | NULL    |                | 
| updated_at  | datetime    | YES  |     | NULL    |                | 
| status      | smallint(6) | YES  | MUL | 0       |                | 
| metrics     | longtext    | YES  |     | NULL    |                | 
+-------------+-------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)

mysql> select distinct(status) from reports; 
+--------+
| status |
+--------+
| -32768 | 
|      0 | 
|      1 | 
|     74 | 
|    198 | 
|    278 | 
|   4096 | 
|   8472 | 
|  32767 | 
+--------+
9 rows in set (0.00 sec)

Actions #2

Updated by Frederik Nosi over 11 years ago

Ok, today had some time and tried debugging this issue (i know very little Ruby).
Added some logging to app/models/report.rb, specifically to this functions:

Report.metrics_to_hash
Report.calc_status
Report.status

and here's the output of a report which applies resources correctly and makes a single modification (This correctly sets this hosts status in the dashboard as "hosts that had performed modifications without error"):

[Report.metrics_to_hash] - {"applied"=>1, "failed_restarts"=>0, "skipped"=>0, "pending"=>0, "failed"=>0, "restarted"=>0}
[Report.calc_status] - st: 1 BIT_NUM: 6, METRIC: ["applied", "restarted", "failed", "failed_restarts", "skipped", "pending"]
[Report.status] - st is a: Fixnum
[Report.status] - s: 1

Here's the output of the report of the same host which has pending modifications as puppet was launched with the --noop option. This report set's incorrectly the host in foreman dashboard as "Host in error state":

[Report.metrics_to_hash] - {"pending"=>1, "failed_restarts"=>0, "applied"=>0, "failed"=>0, "skipped"=>0, "restarted"=>0}
[Report.calc_status] - st: 1073741824 BIT_NUM: 6, METRIC: ["applied", "restarted", "failed", "failed_restarts", "skipped", "pending"]
[Report.status] - st is a: Fixnum

I would expect the second case to put the host state as "Hosts that had pending changes" ... I have yet to understand the status calculation though.

Ah, another data point, when i open the second report in the "report view" it shows correctly without errors ...

Any ideas?

Thanks,
Frederik

Actions #3

Updated by Frederik Nosi over 11 years ago

Ok i think i understand now why this happens.

In app/models/report.rb , method self.calc_status there is a snippet of code which should insure that the state field fits in a 16 bit integer (in the DB, table hosts, field puppet_status). To do so it checks each value of puppet_status with MAX:

Here's the calculation, with some lines of debugging added:

    hash.each do |type, value|
      value = MAX if value > MAX # we store up to 2^BIT_NUM -1 values as we want to use only BIT_NUM bits.
      st |= value << (BIT_NUM*METRIC.index(type))
      File.open('/tmp/foreman_rep.log', File::WRONLY|File::APPEND|File::CREAT) { |f| f.write("[Report.calc_status] - type: #{type}, index(type): #{METRIC.index(type)}, value: #{value}, st: #{st}\n") }
    end
    File.open('/tmp/foreman_rep.log', File::WRONLY|File::APPEND|File::CREAT) { |f| f.write("[Report.calc_status] - st: #{st} BIT_NUM: #{BIT_NUM}, MAX: #{MAX}, METRIC: #{METRIC.inspect}\n") }
    return st

But from the debugging output i see that MAX is assigned a value of 63 which is way to high:

[...]
[Report.calc_status] - st: 3221225472 BIT_NUM: 6, MAX: 63, METRIC: ["applied", "restarted", "failed", "failed_restarts", "skipped", "pending"]
[Report.status] - st is a: Fixnum
[Report.status] - s: 3221225472

The place where MAX is calculated is in models/report_common.rb, class ReportCommon:

module ReportCommon
  METRIC = %w[applied restarted failed failed_restarts skipped pending]
  BIT_NUM = 6
  MAX = (1 << BIT_NUM) -1 # maximum value per metric

Having a 16 bit for the field, we can use at max 2 bits for the 6 types of statuses so using dynamic calculation like the existing code, that should be:

module ReportCommon
  METRIC = %w[applied restarted failed failed_restarts skipped pending]
  FIELD_BITS = 16
  BIT_NUM = FIELD_BITS / METRIC.length
  MAX = 1 << BIT_NUM -1 # maximum value per metric

With this fix reports are correctly displayed in the dashboard and hosts views, but it tops the maximum number of changes per status type to two which is a bit too low. Would it be more convenient to make the field a bit larger?

Frederik

Actions #4

Updated by Anonymous almost 7 years ago

  • Description updated (diff)
  • Status changed from New to Resolved

should be fixed by now

Actions

Also available in: Atom PDF