Bug #25528
Closed
Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error
Description
Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1652056
Description of problem:
With the memory recycler enabled, tasks get interrupted during execution
more often. To keep the recycling process transparent, we should
handle this situation better so that the user doesn't have to deal with
the error explicitly.
How reproducible:
Occasionally
Steps to Reproduce:
1. Set a memory limit in /etc/sysconfig/foreman-tasks (EXECUTOR_MEMORY_LIMIT=2gb; to reproduce more easily, decrease EXECUTOR_MEMORY_MONITOR_DELAY so that restarts happen more often)
2. Restart foreman-tasks
3. Start using Satellite in a larger environment (continuous registration of hosts + content view publishes in combination with multiple capsules)
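As a rough sketch, the settings from step 1 could look like the fragment below. The two variable names and the 2gb limit come from this report; the delay value is a made-up example, and its unit is assumed to be seconds.

```shell
# /etc/sysconfig/foreman-tasks (fragment)
# Memory limit for the executor, as given in step 1.
EXECUTOR_MEMORY_LIMIT=2gb
# Hypothetical low value (assumed to be seconds) so that the memory check,
# and therefore the executor restart, kicks in sooner than usual.
EXECUTOR_MEMORY_MONITOR_DELAY=600
```

The service then needs to be restarted (step 2) for the new values to take effect.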
Actual results:
After some time, some tasks can end up in paused/error state `Abnormal termination (previous state: running)`
Expected results:
We should analyse these cases and find a way to resume the affected tasks
automatically, before requiring the user to intervene manually.
Additional info:
We will try to find a more reliable reproducer as we develop the fix for this issue.
Updated by Adam Ruzicka over 5 years ago
- Subject changed from Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error to Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error
- Category set to Dynflow
Updated by Adam Ruzicka over 2 years ago
- Status changed from New to Closed
Since we moved to Sidekiq, we have added some measures to prevent this. It is more of a best-effort solution, since there's very little we can do against kill -9. Closing.
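To illustrate why kill -9 limits what any such measure can do, here is a minimal, generic shell sketch. It is not the actual Dynflow or Sidekiq mechanism; `worker` and `run_and_signal` are hypothetical names used only for this demonstration. A process can trap SIGTERM and record a clean shutdown, but SIGKILL cannot be trapped, so the last recorded state stays "running".

```shell
#!/bin/sh
# Illustrative only: worker and run_and_signal are hypothetical names,
# not part of Dynflow, Sidekiq, or foreman-tasks.

# A worker that records its state and saves a clean-shutdown marker on SIGTERM.
worker() {
  state_file=$1
  trap 'echo "clean shutdown" > "$state_file"; exit 0' TERM
  echo "running" > "$state_file"   # written only after the trap is installed
  while :; do sleep 1; done
}

# Start a worker, send it the given signal, and print the state it left behind.
run_and_signal() {
  sig=$1
  state_file=$2
  worker "$state_file" &
  pid=$!
  # Wait until the worker reports that it is running.
  while [ ! -s "$state_file" ]; do sleep 1; done
  kill -s "$sig" "$pid"
  # The worker's exit status is irrelevant here; only the state file matters.
  wait "$pid" 2>/dev/null || true
  cat "$state_file"
}

tmpdir=$(mktemp -d)
term_state=$(run_and_signal TERM "$tmpdir/term.state")
kill_state=$(run_and_signal KILL "$tmpdir/kill.state")
echo "after SIGTERM: $term_state"
echo "after SIGKILL: $kill_state"
rm -rf "$tmpdir"
```

After SIGTERM the state file contains the clean-shutdown marker, while after SIGKILL it still says "running". Any recovery therefore has to happen afterwards, by detecting the stale state from the outside.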