Bug #25528

closed

Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error

Added by Adam Ruzicka over 5 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Dynflow
Target version: -
Difficulty:
Triaged: No
Fixed in Releases:
Found in Releases:

Description

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1652056

Description of problem:
With the memory recycler enabled, tasks get interrupted during execution more
often. For the sake of keeping the recycling process transparent, we should
try to handle this situation better so that the user doesn't have to deal with
the error explicitly.

How reproducible:
Occasionally

Steps to Reproduce:
1. Set a memory limit in /etc/sysconfig/foreman-tasks (EXECUTOR_MEMORY_LIMIT=2gb; to reproduce more easily, one might decrease EXECUTOR_MEMORY_MONITOR_DELAY so that restarts happen more often)
2. Restart foreman-tasks
3. Start using Satellite in a larger environment (continuous registration of hosts + content view publishes in combination with multiple capsules)
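
For illustration, the settings from step 1 might look roughly like the fragment below. The 2gb limit is the value given above; the delay value of 60 is only an assumed example of a "decreased" setting (the unit is assumed to be seconds), not a value taken from this report.

```shell
# /etc/sysconfig/foreman-tasks (fragment)

# Memory limit for the executor, as given in step 1.
EXECUTOR_MEMORY_LIMIT=2gb

# Assumed example value: a lower delay so that the memory monitor
# starts checking sooner and restarts happen more often.
EXECUTOR_MEMORY_MONITOR_DELAY=60
```

After editing the file, foreman-tasks has to be restarted (step 2) for the new values to take effect.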

Actual results:

After some time, some tasks can end up in the paused/error state with `Abnormal termination (previous state: running)`.

Expected results:
We should analyse these cases and find a way to resume the affected tasks before
requiring the user to intervene manually.

Additional info:

We will try to find a more reliable reproducer as we develop the fix for this issue.

Actions #1

Updated by Adam Ruzicka over 5 years ago

  • Subject changed from Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error to Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error
  • Category set to Dynflow
Actions #2

Updated by Adam Ruzicka over 2 years ago

  • Status changed from New to Closed

Since we moved to Sidekiq, we have added some measures to prevent this. It is more of a best-effort solution, since there's very little we can do against kill -9. Closing.
