Bug #16487

Continuous memory leak while tasks are getting run

Added by Ivan Necas 11 months ago. Updated 8 months ago.

Status: Closed
Priority: High
Assigned To: -
Category: -
Target version: Foreman - Team Ivan Iteration 6
Difficulty: -
Bugzilla link: 1362168
Found in release: -
Pull request: -
Story points: -
Velocity based estimate: -

Description

When Dynflow is in use with Katello (it might also happen without it, but I have not checked), memory consumption grows continuously: the growth is not large, but it never drops back down, which might be an indicator of something getting stuck.


Related issues

Related to foreman-tasks - Feature #17175: max_memory_per_executor support Closed 11/01/2016

History

#1 Updated by Ivan Necas 11 months ago

  • Subject changed from Continuous memory leak while tasks are getting run to Continuous memory leak while tasks are getting run
  • Target version set to Team Ivan Iteration 3

#2 Updated by Lukas Zapletal 10 months ago

For the record, when I was looking into this issue the other day, I used SystemTap. At the time I could not continue because I did not know the correct syntax for SCL Ruby, but I finally figured it out:

https://lukas.zapletalovi.com/2016/08/probing-ruby-20-apps-with-systemtap-in-rhel7.html

Maybe this could help:

scl enable rh-ruby22 -- stap rubystack.stp -c "ruby dynflow.rb"

#3 Updated by Shimon Shtein 10 months ago

I have done a bit of investigation, but didn't find anything significant in the Dynflow engine itself.
What I did:
I ran a simple task based on the Dynflow "remote_executor" example; the task was changed to call GC.start (note: Ruby's method for forcing a collection is GC.start, not GC.collect) in part of the cases and to write GC.stat output to a file in /tmp.
What I found:
with the forced GC.start in the task, the memory footprint was fairly large but steady.
without it, the memory footprint kept growing, but the number of garbage collections stayed the same.
Conclusions:
1. Dynflow as an engine is not leaking.
2. It allocates a lot of objects to perform even a simple task - this can become a problem, but it is less critical than a leak.
3. Since I was using Ruby 2.2, where generational GC was already in place, I could see many objects surviving the young-generation collections and being promoted to the old generation. This can explain the bloat without a forced GC: the system simply never triggered a full GC.
4. (2) leads to (3): massive allocation rates cause young-generation collections, which in turn promote objects to the old generation.
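The measurement described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual reproducer: the method name, file path, and options are assumptions.

```ruby
require 'json'

# Hypothetical sketch of the experiment above: optionally force a full
# collection, then dump GC.stat to a file under /tmp for later comparison.
def dump_gc_stats(path, force_gc: true)
  GC.start(full_mark: true, immediate_sweep: true) if force_gc
  stats = GC.stat
  File.write(path, JSON.pretty_generate(stats))
  stats
end

stats = dump_gc_stats('/tmp/gc_stats.json')
# Ruby 2.1+ distinguishes minor and major collections in GC.stat, which is
# what makes observation (3) above visible in the numbers:
puts "minor GCs: #{stats[:minor_gc_count]}, major GCs: #{stats[:major_gc_count]}"
```

Comparing the :old_objects and :major_gc_count values between runs with and without the forced GC.start shows whether objects are being promoted to the old generation without full collections reclaiming them.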

#4 Updated by Ivan Necas 10 months ago

  • Target version changed from Team Ivan Iteration 3 to Team Ivan Iteration 4

#5 Updated by Ivan Necas 9 months ago

  • Target version changed from Team Ivan Iteration 4 to Team Ivan Iteration 5

#6 Updated by Ivan Necas 9 months ago

  • Related to Bug #14806: Add option to set the amount of dynflow executors to be running added

#7 Updated by Ivan Necas 9 months ago

  • Related to deleted (Bug #14806: Add option to set the amount of dynflow executors to be running)

#8 Updated by Ivan Necas 9 months ago

  • Blocked by Bug #14806: Add option to set the amount of dynflow executors to be running added

#9 Updated by Ivan Necas 9 months ago

  • Blocked by deleted (Bug #14806: Add option to set the amount of dynflow executors to be running)

#10 Updated by Ivan Necas 9 months ago

  • Target version changed from Team Ivan Iteration 5 to Team Ivan Iteration 6

#11 Updated by Ivan Necas 8 months ago

#12 Updated by Ivan Necas 8 months ago

  • Status changed from New to Closed

I'm closing this issue, as we were not able to find a place in the code where there would be a leak; from all angles, it looks like an issue with memory fragmentation. At the core, the problem is that Ruby has no memory defragmentation capability, which leads the process to hold more memory than it actually uses. In the end, however, consumption should converge to some value. The only mitigation I can think of is setting a memory limit and restarting the service once it is exceeded.
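The limit-and-restart mitigation mentioned here (tracked as max_memory_per_executor support in the related issue #17175) could be sketched like this. This is an illustrative, Linux-only sketch, not the actual foreman-tasks implementation; the limit value and method name are made up.

```ruby
# Illustrative watchdog sketch (not the real max_memory_per_executor code):
# read the process's resident set size from /proc and request a restart
# once it exceeds a configured limit.
MAX_RSS_KB = 2 * 1024 * 1024 # 2 GiB, an arbitrary example limit

# Linux-specific: parse the VmRSS line from /proc/self/status.
def current_rss_kb
  File.read('/proc/self/status')[/^VmRSS:\s+(\d+)\s+kB/, 1].to_i
end

if current_rss_kb > MAX_RSS_KB
  # A real executor would finish in-flight tasks and exit gracefully,
  # letting the service manager start a fresh process.
  warn "RSS #{current_rss_kb} kB exceeds #{MAX_RSS_KB} kB, requesting restart"
end
```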
