Feature #17175

max_memory_per_executor support

Added by Ivan Necas about 1 year ago. Updated 6 months ago.

Status: Closed
Priority: Normal
Assigned To: Shimon Shtein
Category: -
Target version: Foreman - Team Ivan Iteration 12
Difficulty: -
Bugzilla link: 1434069
Found in release: -
Pull request: https://github.com/theforeman/foreman-tasks/pull/216, https://github.com/Dynflow/dynflow/pull/211
Story points: -
Velocity based estimate: -
Release: foreman-tasks-0.9.2
Release relationship: Auto

Description

Since Ruby does not give memory back to the operating system once it
has allocated it, a bigger set of larger actions can lead to quite
significant memory consumption that persists and accumulates over
time. This makes it hard to keep memory consumption fully under
control, especially in an environment shared with other systems
(Passenger, Pulp, Candlepin, Qpid). Since the executors can terminate
gracefully without affecting the tasks themselves, it should be pretty
easy to extend them to watch their own memory consumption.

The idea:

1. config options:
max_memory_per_executor - the memory threshold per executor
min_executors_count - the minimal number of running executors (default 1)
minimal_executor_age - the minimal time an executor has to run before it becomes eligible for a restart (default 1h)

2. the executor will periodically check its memory usage
(http://stackoverflow.com/a/24423978/457560 seems to be a sane
approach for us)

3. if the memory usage exceeds `max_memory_per_executor`, the executor
is older than `minimal_executor_age` (to prevent a situation where the
memory grows past `max_memory_per_executor` so quickly that we would do
nothing but restart executors without getting any work done), and the
number of running executors would not drop under `min_executors_count`,
politely terminate the executor (see the sketch after this list)

4. the polite termination should be able to hand all the tasks over to
the other executors and, once everything on the executor is finalized, just exit

5. the daemon monitor would notice the executor exiting and start a new executor
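
To make the flow above concrete, here is a minimal Ruby sketch of such
a watcher. The option names come from this proposal; `MemoryWatcher`,
`politely_terminate` and `running_executors_count` are illustrative
placeholders, not the actual foreman-tasks/Dynflow API (the real
implementation landed via the pull requests linked above).

<pre>
# Illustrative sketch only -- not the actual foreman-tasks/Dynflow code.
class MemoryWatcher
  DEFAULTS = {
    max_memory_per_executor: nil,     # off by default (development)
    min_executors_count:     1,
    minimal_executor_age:    60 * 60  # seconds (1 hour)
  }.freeze

  def initialize(executor, options = {})
    @executor   = executor
    @options    = DEFAULTS.merge(options)
    @started_at = Time.now
  end

  # Resident set size of this process in bytes; shelling out to `ps`
  # follows the Stack Overflow answer linked in step 2 (Linux `ps`
  # reports RSS in kilobytes).
  def current_rss_bytes
    `ps -o rss= -p #{Process.pid}`.to_i * 1024
  end

  # Meant to be called periodically from inside the executor process.
  def check!
    limit = @options[:max_memory_per_executor]
    return unless limit                                                 # feature disabled
    return if current_rss_bytes < limit                                 # under the threshold
    return if Time.now - @started_at < @options[:minimal_executor_age]  # too young: avoid restart churn
    return if running_executors_count <= @options[:min_executors_count] # keep the minimal pool

    # Hand the running tasks over to the other executors, finalize what
    # is left here, then exit; the daemon monitor notices the exit and
    # spawns a replacement (steps 4 and 5 above).
    @executor.politely_terminate
  end

  private

  # Placeholder: a real implementation would ask the executors' registry
  # (e.g. the Dynflow coordinator) how many executors are alive.
  def running_executors_count
    1
  end
end
</pre>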

The feature would be configurable and turned off by default (for
development), but we would enable it in production, where we can rely
on the monitor being present.


Related issues

Related to foreman-tasks - Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks Closed 09/08/2016
Related to foreman-tasks - Bug #16487: Continuous memory leak while tasks are getting run Closed 09/08/2016
Related to foreman-tasks - Bug #20875: max_memory_per_executor can lead to stuck executor, waiting for an event that would not arrive Closed 09/07/2017
Blocked by foreman-tasks - Bug #14806: Add option to set the amount of dynflow executors to be running Closed 04/25/2016

Associated revisions

Revision d3051b91
Added by Shimon Shtein 6 months ago

Fixes #17175 - Added support for memory monitoring.

A world will be terminated if memory limit is exceeded.
This will cause the running process to be killed.
Since we are monitoring our processes, a new process will
be spawned.

Relies on https://github.com/Dynflow/dynflow/pull/211

History

#1 Updated by Ivan Necas about 1 year ago

  • Target version set to Team Ivan Iteration 6

#2 Updated by Ivan Necas about 1 year ago

  • Blocks Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks added

#3 Updated by Ivan Necas about 1 year ago

  • Blocks deleted (Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks)

#4 Updated by Ivan Necas about 1 year ago

  • Related to Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks added

#5 Updated by Ivan Necas about 1 year ago

  • Blocked by Bug #14806: Add option to set the amount of dynflow executors to be running added

#6 Updated by Shimon Shtein about 1 year ago

A couple of thoughts:

First, since the process's memory stabilizes at some point, the real
problem is that finished tasks' leftovers stay "stuck" in memory for too
long. Maybe we can address it with more aggressive cleanup after a task
finishes - for example, triggering a full GC after each task so that its
leftovers are purged (see the sketch below).
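
For illustration, a minimal sketch of that idea on top of Dynflow's
middleware mechanism; the class and its registration are hypothetical,
not what was eventually merged:

<pre>
# Hypothetical middleware forcing a major GC after an action's run phase.
# Note: GC.start reclaims dead objects inside the Ruby heap, but it does
# not necessarily return the freed memory to the operating system.
class GCAfterRun < Dynflow::Middleware
  def run(*args)
    pass(*args)
  ensure
    GC.start
  end
end

# Registered on an action class, e.g.:
#   class MyAction < Dynflow::Action
#     middleware.use GCAfterRun
#   end
</pre>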

Second, again because the process stabilizes at some point, maybe we
should enhance the algorithm that spawns new executors: monitor the
amount of memory consumed by all executors together, and when a
threshold is passed, reduce the number of live executors. The memory
would then be divided between fewer executors, allowing them to reach
the point where they can stabilize. The same can work the other way
around: if the executors have stabilized before reaching the memory
threshold, we can spawn an extra executor and get the task queue
cleared faster (see the sketch below).
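
A rough sketch of that balancing loop; the pool object and its
`executor_pids`, `shrink_by`, `grow_by` and `stabilized?` helpers are
all hypothetical:

<pre>
# Hypothetical rebalancing check, run periodically by the process that
# supervises the executor pool.
def rebalance(pool, total_limit_bytes)
  total_rss = pool.executor_pids.sum do |pid|
    `ps -o rss= -p #{pid}`.to_i * 1024  # ps reports RSS in kilobytes
  end

  if total_rss > total_limit_bytes && pool.size > 1
    pool.shrink_by(1) # fewer executors, so each can stabilize within the limit
  elsif total_rss < total_limit_bytes * 0.7 && pool.stabilized?
    pool.grow_by(1)   # memory headroom available: clear the task queue faster
  end
end
</pre>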

#7 Updated by The Foreman Bot 12 months ago

  • Status changed from New to Ready For Testing
  • Assigned To set to Shimon Shtein
  • Pull request https://github.com/theforeman/foreman-tasks/pull/216 added

#8 Updated by Shimon Shtein 12 months ago

  • Pull request https://github.com/Dynflow/dynflow/pull/211 added

#9 Updated by Ivan Necas 12 months ago

  • Related to Bug #16487: Continuous memory leak while tasks are getting run added

#10 Updated by Ivan Necas 10 months ago

  • Target version changed from Team Ivan Iteration 6 to Team Ivan Iteration 10

#11 Updated by Ivan Necas 8 months ago

  • Target version changed from Team Ivan Iteration 10 to Team Ivan Iteration 12

#12 Updated by Mike McCune 8 months ago

  • Bugzilla link set to 1434069

#13 Updated by Shimon Shtein 6 months ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100

#14 Updated by Ivan Necas 6 months ago

  • Release set to foreman-tasks-0.9.2

#15 Updated by Ivan Necas 2 months ago

  • Related to Bug #20875: max_memory_per_executor can lead to stuck executor, waiting for an event that would not arrive added
