Bug #21705

open

chef-client enters endless loop with foreman handler

Added by Thomas Berger about 7 years ago. Updated almost 6 years ago.

Status: New
Priority: High
Assignee: -
Category: -
Target version: -
Difficulty:
Triaged: No
Fixed in Releases:
Found in Releases:

Description

On some systems, the chef-client enters an endless loop if the foreman handler is installed and the chef-client runs in daemon mode.

The client resubmits the facts every few seconds until it gets killed.

This results in massive CPU usage on the foreman master.


Files

debug1.log (563 KB) Thomas Berger, 11/20/2017 09:23 PM
client.rb (491 Bytes) Thomas Berger, 11/20/2017 10:35 PM
Actions #1

Updated by Marek Hulán about 7 years ago

Could you please set the log level in chef client to debug and upload a log from such a run? Feel free to cut it when it starts repeating. Also, do you see any errors in foreman production.log? And what is your Foreman version?

Actions #2

Updated by Thomas Berger about 7 years ago

We use:
- foreman-1.15.6
- foreman-proxy-1.15.6
- foreman_chef-0.5.0

I attached the end of a run that shows at least one duplicate send of the attributes.

What I have found in the meantime is that chef-client sometimes sends the same attributes multiple times, and sometimes this ends in an endless loop.

Actions #3

Updated by Marek Hulán about 7 years ago

This is weird, nothing indicates any issue but I clearly see facts being sent twice. How did you configure the handler? Could you also upload your chef client.rb? (I suppose you added it there.)

Actions #4

Updated by Thomas Berger about 7 years ago

I attached the client.rb

The weirdest part is that I didn't see this happen while running in the foreground or via a cron job.

Actions #5

Updated by Elias Abacioglu about 7 years ago

I'm also affected by this, running chef-handler-foreman v0.2.0.

[2017-12-19T17:39:38+01:00] INFO: Sending resource update report to foreman (run-id: )
[2017-12-19T17:39:38+01:00] INFO: Sending resource update report to foreman (run-id: )
[2017-12-19T17:39:38+01:00] INFO: Sending resource update report to foreman (run-id: )

There's also this
[2017-12-19T08:34:23+01:00] INFO: No attributes have changed - not uploading to foreman
[2017-12-19T08:34:23+01:00] INFO: No attributes have changed - not uploading to foreman
[2017-12-19T08:34:23+01:00] INFO: No attributes have changed - not uploading to foreman

It's kind of killing Foreman, because it seems to happen on multiple nodes now and Foreman runs out of MySQL connections.
I don't know if it's a bug in newer chef-clients. This node runs Chef v13.6.4.
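
For anyone who wants to check their own client logs for the same pattern, here is a minimal sketch that counts log entries repeated with an identical timestamp and message. It assumes the line layout shown in the excerpts above ("[timestamp] LEVEL: message"); the function name `repeated_messages` and the threshold are made up for this illustration and are not part of chef-client or chef-handler-foreman.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpts above: "[timestamp] LEVEL: message"
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>\w+):\s+(?P<msg>.*)$")


def repeated_messages(lines, threshold=2):
    """Return {(timestamp, message): count} for entries seen at least `threshold` times."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["ts"], match["msg"])] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


# The three identical lines quoted in this comment.
sample = [
    "[2017-12-19T17:39:38+01:00] INFO: Sending resource update report to foreman (run-id: )",
    "[2017-12-19T17:39:38+01:00] INFO: Sending resource update report to foreman (run-id: )",
    "[2017-12-19T17:39:38+01:00] INFO: Sending resource update report to foreman (run-id: )",
]

for (timestamp, message), count in repeated_messages(sample).items():
    print(f"{count}x at {timestamp}: {message}")
```

To run it against a real log, pass an open file object instead of `sample`; the path of the chef-client log depends on your setup.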

Actions #6

Updated by Elias Abacioglu about 7 years ago

I can confirm it's not the latest chef-client.
We've had this issue with nodes that run both chef v12 and v13 under different minor versions.

Actions #7

Updated by Björn Zettergren about 7 years ago

I've done some tracing with strace and tcpflow, and can see that the first "POST /api/reports" receives an "HTTP/1.1 201 Created" from foreman. But instead of stopping, a loop of new "POST /api/reports" requests follows, all of which receive "HTTP/1.1 422 Unprocessable Entity", and foreman reports in production.log:

Unprocessable entity ConfigReport (id: new):
Report time has already been taken

Which is correct, foreman has already accepted it.
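
To make the observed exchange concrete, here is a small self-contained simulation. It is only a sketch under assumptions: `FakeReportStore` and `send_report` are hypothetical names, the optional 503 responses are invented to give the retry loop something to do, and none of this claims to be how Foreman or chef-handler-foreman is implemented (the root cause is not identified in this thread). It shows a server that accepts one report per host and report time, and a client that treats both 201 and 422 as final and bounds its attempts instead of resending forever.

```python
class FakeReportStore:
    """Stand-in for the server side: accepts one report per (host, reported_at)."""

    def __init__(self, transient_failures=0):
        self._seen = set()
        # Number of made-up 503 responses to return before behaving normally.
        self._transient_failures = transient_failures

    def post_report(self, host, reported_at):
        if self._transient_failures > 0:
            self._transient_failures -= 1
            return 503
        key = (host, reported_at)
        if key in self._seen:
            return 422  # mirrors "Report time has already been taken"
        self._seen.add(key)
        return 201


def send_report(store, host, reported_at, max_attempts=3):
    """Post a report and return the list of status codes received.

    201 (created) and 422 (already accepted) both end the loop; any other
    status is retried, but never more than `max_attempts` times in total.
    """
    statuses = []
    for _ in range(max_attempts):
        status = store.post_report(host, reported_at)
        statuses.append(status)
        if status in (201, 422):
            break
    return statuses


store = FakeReportStore(transient_failures=1)
print(send_report(store, "node1", "2017-12-19T17:39:38+01:00"))
print(send_report(store, "node1", "2017-12-19T17:39:38+01:00"))
```

The first call is retried once after the simulated 503 and then accepted; the second call gets a single 422 and stops, which is the behaviour one would expect from the client instead of the loop seen in the traces.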

Actions #8

Updated by Marek Hulán about 7 years ago

  • Translation missing: en.field_release deleted (215)
Actions #9

Updated by Björn Zettergren about 7 years ago

Elias Abacioglu wrote:

I can confirm it's not the latest chef-client.
We've had this issue with nodes that run both chef v12 and v13 under different minor versions.

I'm sorry, this is not valid. I told you that there were v12 chef-clients, but in fact there aren't any. My grep from production.log was faulty and had false positives in it. The actual status is that we have 22 hosts running chef 13.6.4 and one running 13.4.24-1 that experience the problem. Apologies for my fake news outlet :)

Actions #10

Updated by Marek Hulán almost 7 years ago

Raboo, were you able to find out specific chef client version and/or whether it could be its fault? I haven't seen other users reporting it.

Actions #11

Updated by Björn Zettergren almost 7 years ago

Marek Hulán wrote:

Raboo, were you able to find out specific chef client version and/or whether it could be its fault? I haven't seen other users reporting it.

I work with Raboo, and we're seeing the same environment :-)
We don't experience the issue with chef 12, only with chef 13.6.4-1 and chef 13.4.24-1 (we haven't tested other 13.x versions).

Actions #12

Updated by Elias Abacioglu almost 7 years ago

I told you I would report this bug to chef. I haven't done it, sorry.
I'll try to do it during next week.

Actions #13

Updated by Thomas Berger over 6 years ago

Any news on this?

Actions #14

Updated by Elias Abacioglu almost 6 years ago

The reason I didn't report this to Chef is that we were running a very old v13 chef-client.
This week I upgraded to Chef v13.12.3, and today I saw that we are still afflicted by the bug.
So I guess I need to take it up with Chef.
