Feature #17175

max_memory_per_executor support

Added by Ivan Necas 5 months ago. Updated 3 days ago.

Status:Ready For Testing
Priority:Normal
Assigned To:Shimon Shtein
Category:-
Target version:Foreman - Team Ivan Iteration 12
Difficulty:
Bugzilla link:1434069
Found in release:
Pull request:https://github.com/Dynflow/dynflow/pull/211, https://github.com/theforeman/foreman-tasks/pull/216
Story points:-
Velocity based estimate:-

Description

Once Ruby allocates some memory, it doesn't give it back, so a large
set of bigger actions can lead to high memory consumption that
persists and can accumulate over time. This makes it hard to keep
memory consumption fully under control, especially in an environment
shared with other systems (passenger, pulp, candlepin, qpid). Since the
executors can terminate gracefully without affecting the tasks themselves,
it should be fairly easy to extend them to watch their memory consumption.

The idea:

1. config options:
max_memory_per_executor - the memory threshold per executor
min_executors_count - minimal number of executors (default 1)
minimal_executor_age - the minimal age an executor must reach before it can be terminated for exceeding the memory threshold (default 1h)

2. the executor will periodically check its memory usage
(http://stackoverflow.com/a/24423978/457560 seems to be a sane
approach for us)

3. if memory usage exceeds `max_memory_per_executor`, the executor is
older than `minimal_executor_age` (to prevent a situation where the
memory grows past `max_memory_per_executor` too quickly, which
would mean we would do nothing but restart the executors
without getting anything done), and the number of running executors
would not drop below `min_executors_count`, politely terminate the executor

4. the polite termination should be able to hand over all the tasks to
the other executors and once everything is finalized on the executor, it would just exit

5. the daemon monitor would notice the executor exiting and start a new executor

It would be configurable and turned off by default (for development), but we would enable
it in production, where we can rely on the monitor being present.
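
The termination rule from steps 1 to 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `MemoryWatcher` class and its method names are hypothetical and are not the Dynflow API, and reading the resident set size through `ps` is one possible way to measure memory, not necessarily the approach from the linked answer.

```ruby
# Hypothetical sketch of the proposed check; names are illustrative only.
class MemoryWatcher
  # max_memory_per_executor: threshold in bytes, nil means the check is off
  # minimal_executor_age:    seconds the executor must have been running
  # min_executors_count:     number of executors that must stay alive
  def initialize(max_memory_per_executor:, minimal_executor_age: 3600,
                 min_executors_count: 1, started_at: Time.now)
    @max_memory = max_memory_per_executor
    @minimal_age = minimal_executor_age
    @min_executors_count = min_executors_count
    @started_at = started_at
  end

  # Resident set size of this process in bytes (assumes `ps` reports
  # RSS in kilobytes, as it does on Linux).
  def current_memory
    `ps -o rss= -p #{Process.pid}`.to_i * 1024
  end

  # True when the executor should politely terminate itself.
  def terminate?(running_executors:, memory: current_memory, now: Time.now)
    return false if @max_memory.nil?                 # turned off by default
    return false if memory <= @max_memory            # still under the limit
    return false if now - @started_at < @minimal_age # too young, avoid restart loops
    running_executors > @min_executors_count         # keep the minimum alive
  end
end
```

An executor would call `terminate?` periodically and, when it returns true, hand over its tasks and exit so that the daemon monitor can start a replacement.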


Related issues

Related to foreman-tasks - Bug #16488: ruby consumes 43GB RSS when there is lots of stucked erra... Closed 09/08/2016
Related to foreman-tasks - Bug #16487: Continuous memory leak while tasks are getting run Closed 09/08/2016
Blocked by foreman-tasks - Bug #14806: Add option to set the amount of dynflow executors to be r... Closed 04/25/2016

History

#1 Updated by Ivan Necas 5 months ago

  • Target version set to Team Ivan Iteration 6

#2 Updated by Ivan Necas 5 months ago

  • Blocks Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks added

#3 Updated by Ivan Necas 5 months ago

  • Blocks deleted (Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks)

#4 Updated by Ivan Necas 5 months ago

  • Related to Bug #16488: ruby consumes 43GB RSS when there is lots of stucked errata apply tasks added

#5 Updated by Ivan Necas 5 months ago

  • Blocked by Bug #14806: Add option to set the amount of dynflow executors to be running added

#6 Updated by Shimon Shtein 5 months ago

A couple of thoughts:

First, since the process stabilizes at some point, the concern is that objects stay "stuck" in memory for too long.
Maybe we can address it with a more aggressive cleanup after the task finishes, for example calling a full GC after each task, so its leftovers get purged.
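
A minimal sketch of the "full GC after each task" idea, assuming a hypothetical `run_with_full_gc` wrapper around whatever runs a task (it is not part of Dynflow). `GC.start` with `full_mark` and `immediate_sweep` is standard Ruby; whether the freed memory is actually returned to the operating system is a separate question that this sketch does not settle.

```ruby
# Hypothetical wrapper: runs the given block (the task) and then forces
# a full garbage collection, even if the task raised an error.
def run_with_full_gc
  yield
ensure
  GC.start(full_mark: true, immediate_sweep: true)
end

# Example usage with a stand-in for a real task:
result = run_with_full_gc { Array.new(10_000) { |i| i.to_s }.size }
```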

Second, again, since the process stabilizes at some point, maybe we should enhance the algorithm that spawns new executors.
I mean monitoring the amount of memory consumed by all executors and, when a threshold is passed, reducing the number of live executors. The memory would then be divided between fewer executors, allowing them to reach the point where they can stabilize. It can also work the other way around: if the executors have stabilized before reaching the memory threshold, we can spawn an extra executor and get the task queue cleared faster.
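
The scaling rule described above could look roughly like this. It is a sketch under assumptions: the function name, the `stabilized` flag (how stabilization is detected is left open), and the `min_count`/`max_count` bounds are all hypothetical and not taken from the pull requests.

```ruby
# Hypothetical sketch: decide how many executors should be running,
# based on the memory consumed by all executors together.
def target_executor_count(current_count:, total_memory:, memory_threshold:,
                          stabilized:, min_count: 1, max_count: 4)
  if total_memory > memory_threshold
    # Over the limit: run one executor fewer, but never below the minimum.
    [current_count - 1, min_count].max
  elsif stabilized
    # Memory has settled below the limit: allow one more executor.
    [current_count + 1, max_count].min
  else
    # Still growing but under the limit: leave the count alone.
    current_count
  end
end
```

Changing the count by one executor per check keeps the adjustment gradual, which gives the remaining executors time to stabilize before the next decision.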

#7 Updated by The Foreman Bot 4 months ago

  • Status changed from New to Ready For Testing
  • Assigned To set to Shimon Shtein
  • Pull request https://github.com/theforeman/foreman-tasks/pull/216 added

#8 Updated by Shimon Shtein 4 months ago

  • Pull request https://github.com/Dynflow/dynflow/pull/211 added

#9 Updated by Ivan Necas 4 months ago

  • Related to Bug #16487: Continuous memory leak while tasks are getting run added

#10 Updated by Ivan Necas about 1 month ago

  • Target version changed from Team Ivan Iteration 6 to Team Ivan Iteration 10

#11 Updated by Ivan Necas 12 days ago

  • Target version changed from Team Ivan Iteration 10 to Team Ivan Iteration 12

#12 Updated by Mike McCune 5 days ago

  • Bugzilla link set to 1434069
