Bug #31200 (open): Puma memory leak

Added by David Goetschius over 3 years ago. Updated over 3 years ago.

Status: Need more information
Priority: Normal
Assignee: -
Category: Packaging
Target version: -
Difficulty:
Triaged: No
Fixed in Releases:
Found in Releases:

Description

Noticed a memory leak in Katello 3.16.0; Foreman 2.1.2; tfm-rubygem-puma-4.3.3-4.el7.x86_64.
The Foreman server has 16 CPUs and 64 GB RAM, with Foreman Tuning = Large.
We then changed the following Puma settings:
Environment=FOREMAN_PUMA_THREADS_MIN=8
Environment=FOREMAN_PUMA_THREADS_MAX=32
Environment=FOREMAN_PUMA_WORKERS=8
Attached: Foreman server with % memory calculation; Foreman server top command.
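For reference, a typical way to apply these settings persistently is a systemd drop-in for the foreman service; this is a minimal sketch, and the drop-in path is an assumption that may differ per install:

    # Sketch: persist the Puma tuning above in a systemd drop-in (path is an assumption)
    # /etc/systemd/system/foreman.service.d/puma-tuning.conf
    [Service]
    Environment=FOREMAN_PUMA_THREADS_MIN=8
    Environment=FOREMAN_PUMA_THREADS_MAX=32
    Environment=FOREMAN_PUMA_WORKERS=8

    # Reload units and restart the service so the new limits take effect
    sudo systemctl daemon-reload
    sudo systemctl restart foreman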


Files

MemoryLeak.png (28.5 KB) - Foreman server with % memory calculation - David Goetschius, 10/28/2020 07:47 PM
MemoryLeak2.png (226 KB) - Foreman server top command - David Goetschius, 10/28/2020 07:57 PM
MemoryLeak3.PNG (35.4 KB) - David Goetschius, 11/05/2020 06:42 PM

Related issues (1 open, 0 closed)

Related to Foreman - Feature #31274: Reports generation is memory heavy (Ready For Testing, Lukas Zapletal)
Actions #1

Updated by Lukas Zapletal over 3 years ago

  • Category changed from Compute resources to Packaging

I think the closest category would be Performance or Packaging. Eric/Ewoud should be able to comment on whether this is a known issue.

Actions #2

Updated by David Goetschius over 3 years ago

A few extra notes: on 8/25 we upgraded from 3.9 to 3.16. Puma was set to 2 workers, with threads min 0 and max 8. We had long-running tasks, so on 10/23 we updated Puma to 8 workers, with threads min 8 and max 32. That is when we started experiencing the memory leak.

Actions #3

Updated by Lukas Zapletal over 3 years ago

  • Status changed from New to Need more information

If you changed max workers to 32 on 10/24, then what you are seeing is not a memory leak. Our app requires 300-900 MB per worker; a worker starts low (around 100 MB) and then, due to the Linux copy-on-write mechanism for child processes, it grows until it reaches the top of that range.

For 32 workers you need at least 32 GB of RAM, plus some extra for other processes. I'd suggest 40 GB if that's a VM.
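To sanity-check how much each worker actually holds, something like the following reports per-worker RSS; this is a rough sketch, and the Puma process title pattern is an assumption that may differ between versions:

    # Average and total RSS of the Puma cluster workers (title pattern is an assumption)
    ps -eo rss=,cmd= | grep '[p]uma: cluster worker' \
      | awk '{sum += $1; n++} END {if (n) printf "workers=%d  avg=%.0f MB  total=%.0f MB\n", n, sum/n/1024, sum/1024}'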

Actions #4

Updated by Lukas Zapletal over 3 years ago

Oh, I see this instance has 64 GB. So when you had 8 workers you saw 40% utilization, which is about 25 GB (roughly 3 GB per worker). How many plugins do you have? We know that a clean Foreman instance takes about 300 MB and grows to around 500 MB for core, 900 MB with the Katello plugin, and so on.

In your case, to hold 32 workers at 3 GB each you would need 96 GB of RAM. I suggest you lower the number of workers.

I don't believe this is a leak per se; however, we obviously do have some memory hogs in the codebase, I'd say many. To get those fixed, please file a more concrete report.
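The back-of-envelope sizing behind those numbers, with the figures taken from this thread rather than any fixed rule, is simply workers times per-worker RSS plus headroom:

    # required RAM ≈ workers × per-worker RSS + headroom for other services
    echo $(( 8 * 3 ))    # 8 workers × ~3 GB ≈ 24 GB, close to the ~25 GB (40% of 64 GB) observed
    echo $(( 32 * 3 ))   # 32 workers × ~3 GB ≈ 96 GB, more than this 64 GB host can hold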

Actions #5

Updated by Lukas Zapletal over 3 years ago

I see in the top output up to 9 GB of RSS memory consumption per worker. That's too much, yeah. Are you able to isolate which operation makes it grow, so we can identify the code paths that need investigation?
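One way to narrow that down is to record worker RSS over time while exercising one suspect operation at a time, then correlate the jumps with what was running; this is a sketch, and the interval and log file name are arbitrary:

    # Log per-worker RSS once a minute; stop with Ctrl-C
    while true; do
      date '+%F %T'
      ps -eo pid=,rss=,cmd= | grep '[p]uma: cluster worker' \
        | awk '{printf "  pid=%s rss=%.0f MB\n", $1, $2/1024}'
      sleep 60
    done | tee -a puma-rss.log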

Actions #6

Updated by David Goetschius over 3 years ago

The Foreman add-ons and versions are as follows:

Name Version
foreman-tasks 2.0.2
foreman_ansible 5.1.1
foreman_bootdisk 17.0.2
foreman_column_view 0.4.0
foreman_discovery 16.1.0
foreman_docker 5.0.0
foreman_hooks 0.3.16
foreman_openscap 4.0.2
foreman_remote_execution 3.3.5
foreman_setup 7.0.0
foreman_templates 9.0.1
katello Katello 3.16.0

What do you recommend for isolating which operations make it grow? Please advise.

Here is some additional log information if it helps...

Actions #7

Updated by Lukas Zapletal over 3 years ago

That was useful. Most of the time spent is on facts, which is expected. We plan to improve how we store them; not all of them need to be stored in normal form.

What I see as more suspicious, however, is report generation: there were four requests to report_template generate, which in total took 30% of all time spent. Can you make sure nobody clicks/starts report generation, restart the workers, and check whether memory still spikes that hard? Reports are a new feature; there could be a memory hog there.
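To check whether report generation is being requested while you watch memory, the production log can be grepped for those requests; this is a sketch in which the log path assumes a default install and the pattern assumes the standard report_templates generate route:

    # How many report-generation requests hit the app, and the most recent ones
    grep -c 'report_templates.*generate' /var/log/foreman/production.log
    grep 'report_templates.*generate' /var/log/foreman/production.log | tail -n 5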

To put less stress on the DB, I suggest you define some facts to be filtered out. We currently store all of them, and if you have hypervisors or container hosts with many interfaces, disks, etc., they can heavily hit our fact tables - try to filter those facts out completely (Administer - Settings - Facts exclude).
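If the CLI is more convenient than the UI path above, the same setting can usually be inspected and changed with hammer; the setting name excluded_facts and the patterns below are assumptions based on recent Foreman versions, so verify them on your install first:

    # Inspect the current fact-exclusion setting (setting name is an assumption)
    hammer settings list --search 'name = excluded_facts'
    # Illustrative only: exclude high-churn virtual-interface facts (patterns are examples)
    hammer settings set --name excluded_facts --value '["macvtap*", "vnet*", "veth*"]'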

Actions #9

Updated by Lukas Zapletal over 3 years ago
