Bug #31113

closed

Unresumable task due to timeout

Added by Markus Bucher over 3 years ago. Updated over 2 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version:
Difficulty:
Triaged: Yes
Fixed in Releases:

Description

After upgrading to foreman-2.1.3 and katello-3.16.1.2, we are seeing hanging tasks on some of our systems.

These are mainly caused by Candlepin blocking requests because more than 90% of the disk space on /var was in use. Adding enough disk space resolves the blocking.
However, after we added enough disk space, the task was still hanging and could not be resumed.
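To spot the condition before tasks start to hang, the usage of /var can be compared against the 90% level mentioned above. The following is a minimal sketch, not part of Foreman or Katello; the function names and the default threshold are illustrative assumptions, and the percentage computed here may differ slightly from what df or Candlepin's broker reports.

```python
import shutil

# Assumption: 90% is the usage level reported in this issue; adjust as needed.
THRESHOLD_PERCENT = 90.0


def disk_usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero; treat them as empty.
        return 0.0
    return usage.used / usage.total * 100.0


def exceeds_threshold(path, threshold=THRESHOLD_PERCENT):
    """Return True if the filesystem holding `path` is fuller than `threshold` percent."""
    return disk_usage_percent(path) > threshold
```

For example, exceeds_threshold("/var") returns True once /var is more than 90% full, which could be wired into a monitoring check.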

In foreman-tasks, the task was marked as stopped with result pending. Because the task was creating an activation key, the ActivationKey existed in Katello but not in Candlepin, and therefore could not be removed.

The dynflow-console shows the step (here Katello::Resources::Candlepin::ActivationKey) as failed with the following error, and the later steps as (pending):

Label: Actions::Katello::ActivationKey::Create
Status: stopped
Result: error
Started at: 2020-10-17 18:45:49 UTC
Ended at: 2020-10-18 09:03:50 UTC

3: Actions::Candlepin::ActivationKey::Create (error) [ 55070.43s / 7200.74s ]
Queue: default
Started at: 2020-10-17 18:46:00 UTC
Ended at: 2020-10-18 10:03:50 UTC
Real time: 55070.43s
Execution time (excluding suspended state): 7200.74s

Error:

RestClient::Exceptions::ReadTimeout

Katello::Resources::Candlepin::ActivationKey: Timed out reading data from server (POST /candlepin/owners/Atix/activation_keys)

---
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:733:in
  `rescue in transmit'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:642:in
  `transmit'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:145:in
  `execute'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:52:in
  `execute'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/resource.rb:67:in
  `post'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/katello/http_resource.rb:101:in
  `post'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/katello/resources/candlepin/activation_key.rb:25:in
  `create'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/actions/candlepin/activation_key/create.rb:15:in
  `run'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action.rb:571:in
  `block (3 levels) in execute_run'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware/stack.rb:27:in
  `pass'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware.rb:19:in
  `pass'" 
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/actions/middleware/keep_session_id.rb:11:in
  `block in run'" 
...

I guess the foreman-tasks question is: is stopped:pending a valid task state?
If so, I guess this issue is for Katello, to make sure the task is resumable?


Files

Screenshot_20201020_160114.png (47.6 KB) Markus Bucher, 10/20/2020 03:26 PM
Screenshot_20201020_160105.png (167 KB) Markus Bucher, 10/20/2020 03:26 PM
Actions #1

Updated by Adam Ruzicka over 3 years ago

Did it get to stopped-pending by itself?

Actions #2

Updated by Richard Stempfl over 3 years ago

Adam Ruzicka wrote:

Did it get to stopped-pending by itself?

The task was running. Since more than 90% of the disk space was in use, it got stuck and its state no longer changed.
Then more disk space was added and the services were restarted, after which the task got the status described above.

Actions #3

Updated by Adam Ruzicka over 2 years ago

  • Project changed from foreman-tasks to Katello

Switching the component to Katello. Tasks, as an engine, cannot really do anything about it if an action decides to stop on error.

Actions #4

Updated by Jonathon Turel over 2 years ago

The following line needs to be added to /etc/candlepin/broker.xml somewhere within the main configuration block:

<max-disk-usage>99</max-disk-usage>

After that, run systemctl restart tomcat and see whether things work. Hopefully you've upgraded from 3.16 by now; I think we made this option permanent in 4.0 and later.
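To confirm that the setting is actually present before restarting, the file can be parsed and searched for the element. This is a rough sketch under assumptions: the helper names are hypothetical, and the element is matched by its local name anywhere in the document, so the script does not depend on the exact layout or XML namespaces of broker.xml.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{namespace}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def find_max_disk_usage(root):
    """Return the first max-disk-usage value as an int, or None if absent or empty."""
    for element in root.iter():
        if local_name(element.tag) == "max-disk-usage":
            text = (element.text or "").strip()
            return int(text) if text else None
    return None


def max_disk_usage_in_file(path):
    """Parse the XML file at `path` and look up max-disk-usage."""
    return find_max_disk_usage(ET.parse(path).getroot())
```

Calling max_disk_usage_in_file("/etc/candlepin/broker.xml") should return 99 once the line above has been added, and None if the element is missing.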

Actions #5

Updated by Ryan Verdile over 2 years ago

  • Status changed from New to Rejected
  • Target version set to Katello Recycle Bin
  • Triaged changed from No to Yes