Bug #31113
closed
Unresumable task due to timeout
Description
After upgrading to foreman-2.1.3, katello-3.16.1.2, we experience hanging tasks on some of our systems.
These are mainly caused by Candlepin blocking requests because /var had more than 90% of its disk space in use. Adding enough disk space resolves that.
However, after adding enough disk space, the task was still hanging and could not be resumed.
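To spot this condition before tasks start hanging, a check along the following lines might help. This is only a minimal sketch: the 90% threshold and the /var path are taken from the description above, the function names are hypothetical and not part of Foreman, Katello, or Candlepin, and the percentage computed here may differ slightly from what Candlepin's broker calculates internally.

```python
import shutil


def used_percent(total, used):
    """Return used space as a percentage of the total size."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def is_over_threshold(path="/var", threshold=90.0):
    """Return True if the filesystem holding `path` is fuller than `threshold` percent.

    Both defaults are assumptions taken from this report, not values read
    from any Candlepin configuration.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return used_percent(usage.total, usage.used) > threshold
```

Calling is_over_threshold() with no arguments checks /var against 90%; a monitoring job could warn when it returns True.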
In Foreman-Task the task was marked as stopped with result pending. Given that it was the creation of an Activation Key, the ActivationKey was there in Katello, but not in Candlepin and therefore was not removable.
The dynflow-console shows the step (here Katello::Resources::Candlepin::ActivationKey) as failed with the following error, and the later steps as (pending):
Label: Actions::Katello::ActivationKey::Create
Status: stopped
Result: error
Started at: 2020-10-17 18:45:49 UTC
Ended at: 2020-10-18 09:03:50 UTC

3: Actions::Candlepin::ActivationKey::Create (error) [ 55070.43s / 7200.74s ]
Queue: default
Started at: 2020-10-17 18:46:00 UTC
Ended at: 2020-10-18 10:03:50 UTC
Real time: 55070.43s
Execution time (excluding suspended state): 7200.74s

Error:
RestClient::Exceptions::ReadTimeout
Katello::Resources::Candlepin::ActivationKey: Timed out reading data from server (POST /candlepin/owners/Atix/activation_keys)
---
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:733:in `rescue in transmit'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:642:in `transmit'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:145:in `execute'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:52:in `execute'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/resource.rb:67:in `post'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/katello/http_resource.rb:101:in `post'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/katello/resources/candlepin/activation_key.rb:25:in `create'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/actions/candlepin/activation_key/create.rb:15:in `run'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action.rb:571:in `block (3 levels) in execute_run'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware/stack.rb:27:in `pass'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware.rb:19:in `pass'"
- "/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.1.2/app/lib/actions/middleware/keep_session_id.rb:11:in `block in run'"
...
I guess the Foreman-Task question is: Is stopped:pending a valid Task-state?
If so, I guess this issue is for Katello to make sure the Task is resumable!?
Updated by Adam Ruzicka about 4 years ago
Did it get to stopped-pending by itself?
Updated by Richard Stempfl about 4 years ago
Adam Ruzicka wrote:
Did it get to stopped-pending by itself?
The task was running. Since /var had more than 90% of its disk space in use, it got stuck and did not change anymore.
Then more disk space was added and the services were restarted, after which the task ended up in the state described above.
Updated by Adam Ruzicka about 3 years ago
- Project changed from foreman-tasks to Katello
Switching the component to Katello; tasks as an engine cannot really do anything about it if an action decides to stop on error.
Updated by Jonathon Turel about 3 years ago
The following line needs to be added to /etc/candlepin/broker.xml somewhere within the main configuration block:
<max-disk-usage>99</max-disk-usage>
After that, run systemctl restart tomcat and see whether things work. Hopefully you've upgraded from 3.16 by now; I think we made this option permanent in 4.0 and later.
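To confirm that the setting is actually present after editing, something like the following could be used. It is a rough sketch only: the function name is hypothetical, and the sample document assumes the usual ActiveMQ Artemis layout (a configuration element wrapping a core element), which may not match the exact contents of /etc/candlepin/broker.xml on a given system.

```python
import xml.etree.ElementTree as ET

# Assumed, heavily shortened layout for illustration; a real broker.xml
# contains many more settings.
SAMPLE_BROKER_XML = """\
<configuration xmlns="urn:activemq">
  <core xmlns="urn:activemq:core">
    <max-disk-usage>99</max-disk-usage>
  </core>
</configuration>
"""


def find_max_disk_usage(xml_text):
    """Return the max-disk-usage value as an int, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Strip a "{namespace}" prefix, if any, to compare the local tag name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "max-disk-usage" and elem.text:
            return int(elem.text.strip())
    return None
```

To check a real installation, the file contents can be read and passed in, for example find_max_disk_usage(open("/etc/candlepin/broker.xml").read()).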
Updated by Ryan Verdile about 3 years ago
- Status changed from New to Rejected
- Target version set to Katello Recycle Bin
- Triaged changed from No to Yes