Project

General

Profile

Actions

Bug #4565

closed

Foreman consumes all available memory and start swapping.

Added by Peter Gustafsson over 10 years ago. Updated about 6 years ago.

Status:
Closed
Priority:
High
Category:
Search
Target version:
Difficulty:
Triaged:
Fixed in Releases:
Found in Releases:

Description

Hi,

My foreman 1.4.1 system consumes all available memory 8GB and start swapping and that in turn leads to Foreman respond slowly.
I think that all the memory is consumed when an search on an puppet module is triggered on the puppet classes page.
when doing a search on an puppet module no result is shown even after up to 20 min.

Doing an strace on the process hogging all memory just shows:
  1. strace -p 20055
    Process 20055 attached - interrupt to quit
    select(10, [0 9], NULL, NULL, NULL
Below is the output from passenger-memory-stats, the private memory usage is extremely high:
---------- Apache processes ----------
PID PPID VMSize Private Name
--------------------------------------
19798 1 199.5 MB 0.0 MB /usr/sbin/httpd
19817 19798 202.0 MB 0.2 MB /usr/sbin/httpd
19818 19798 202.0 MB 0.1 MB /usr/sbin/httpd
19819 19798 202.0 MB 0.3 MB /usr/sbin/httpd
19820 19798 202.0 MB 0.2 MB /usr/sbin/httpd
19821 19798 202.0 MB 0.1 MB /usr/sbin/httpd
19822 19798 202.0 MB 0.1 MB /usr/sbin/httpd
19823 19798 202.0 MB 0.2 MB /usr/sbin/httpd
19824 19798 202.0 MB 0.1 MB /usr/sbin/httpd
20064 19798 202.0 MB 0.4 MB /usr/sbin/httpd
20295 19798 202.0 MB 0.4 MB /usr/sbin/httpd
21523 19798 201.8 MB ? /usr/sbin/httpd
22798 19798 202.0 MB 0.3 MB /usr/sbin/httpd
22891 19798 201.8 MB 0.2 MB /usr/sbin/httpd
22950 19798 202.0 MB 0.3 MB /usr/sbin/httpd
22957 19798 202.0 MB 0.2 MB /usr/sbin/httpd
  1. Processes: 16
  2. Total private dirty RSS: 3.18 MB (?)

-------- Nginx processes --------

  1. Processes: 0
  2. Total private dirty RSS: 0.00 MB
------ Passenger processes -------
PID VMSize Private Name
----------------------------------
19800 209.0 MB 0.0 MB PassengerWatchdog
19803 1143.2 MB 0.4 MB PassengerHelperAgent
19808 148.5 MB 0.0 MB PassengerLoggingAgent
19946 176.2 MB 2.8 MB Passenger RackApp: /etc/puppet/rack
19964 920.6 MB 155.1 MB Passenger RackApp: /usr/share/foreman
19989 600.8 MB 63.0 MB Passenger RackApp: /usr/share/foreman
20031 983.4 MB 375.8 MB Passenger RackApp: /usr/share/foreman
20055 7672.2 MB 5421.3 MB Passenger RackApp: /usr/share/foreman
22921 468.0 MB 0.0 MB Passenger RackApp: /usr/share/foreman
  1. Processes: 9
  2. Total private dirty RSS: 6018.36 MB
Actions #1

Updated by Peter Gustafsson over 10 years ago

This occur when an free text search is performed. I can see in the production.log that LIMIT 20 and ORDER BY puppetclasses.name is used on the db query and we have 1200+ puppet modules. searching on an module starting with a e.g apache goes relative fast but if i search on e.g pulp the search query use all available memory on the system and foreman freeze. This foreman installation has 100+ users so free text searches on modules will occur and every time it's done it hangs the foreman server.

I can also see that it takes ~80 Sec to retrieve 20 rows and the time is increasing per every 20 rows retrieved and since we have 1200+ rows the search query takes ~60+ min, if not the memory runs out first.

'%ipa%')) ORDER BY puppetclasses.name LIMIT 20 OFFSET 0
Puppetclass Load (79723.1ms) SELECT DISTINCT "puppetclasses".id, puppetclasses.name AS alias_0 FROM "puppetclasses" LEFT OUTER JOIN "environment_classes" ON "environment_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "environments" ON "environments"."id" = "environment_classes"."environment_id" LEFT OUTER JOIN "hostgroup_classes" ON "hostgroup_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "hostgroups" ON "hostgroups"."id" = "hostgroup_classes"."hostgroup_id" LEFT OUTER JOIN "environment_classes" "environment_classes_puppetclasses_join" ON "environment_classes_puppetclasses_join"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "lookup_keys" ON "lookup_keys"."id" = "environment_classes_puppetclasses_join"."lookup_key_id" AND environment_classes.lookup_key_id is NOT NULL WHERE (("puppetclasses"."name" ILIKE '%ipa%' OR "environments"."name" ILIKE '%ipa%' OR "hostgroups"."name" ILIKE '%ipa%' OR "lookup_keys"."key" ILIKE '%ipa%')) ORDER BY puppetclasses.name LIMIT 20 OFFSET 0
Puppetclass Load (80497.4ms) SELECT DISTINCT "puppetclasses".id, puppetclasses.name AS alias_0 FROM "puppetclasses" LEFT OUTER JOIN "environment_classes" ON "environment_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "environments" ON "environments"."id" = "environment_classes"."environment_id" LEFT OUTER JOIN "hostgroup_classes" ON "hostgroup_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "hostgroups" ON "hostgroups"."id" = "hostgroup_classes"."hostgroup_id" LEFT OUTER JOIN "environment_classes" "environment_classes_puppetclasses_join" ON "environment_classes_puppetclasses_join"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "lookup_keys" ON "lookup_keys"."id" = "environment_classes_puppetclasses_join"."lookup_key_id" AND environment_classes.lookup_key_id is NOT NULL WHERE (("puppetclasses"."name" ILIKE '%ipa%' OR "environments"."name" ILIKE '%ipa%' OR "hostgroups"."name" ILIKE '%ipa%' OR "lookup_keys"."key" ILIKE '%ipa%')) ORDER BY puppetclasses.name LIMIT 20 OFFSET 0
Puppetclass Load (80924.2ms) SELECT DISTINCT "puppetclasses".id, puppetclasses.name AS alias_0 FROM "puppetclasses" LEFT OUTER JOIN "environment_classes" ON "environment_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "environments" ON "environments"."id" = "environment_classes"."environment_id" LEFT OUTER JOIN "hostgroup_classes" ON "hostgroup_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "hostgroups" ON "hostgroups"."id" = "hostgroup_classes"."hostgroup_id" LEFT OUTER JOIN "environment_classes" "environment_classes_puppetclasses_join" ON "environment_classes_puppetclasses_join"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "lookup_keys" ON "lookup_keys"."id" = "environment_classes_puppetclasses_join"."lookup_key_id" AND environment_classes.lookup_key_id is NOT NULL WHERE (("puppetclasses"."name" ILIKE '%ipa%' OR "environments"."name" ILIKE '%ipa%' OR "hostgroups"."name" ILIKE '%ipa%' OR "lookup_keys"."key" ILIKE '%ipa%')) ORDER BY puppetclasses.name LIMIT 20 OFFSET 0
Puppetclass Load (82233.8ms) SELECT DISTINCT "puppetclasses".id, puppetclasses.name AS alias_0 FROM "puppetclasses" LEFT OUTER JOIN "environment_classes" ON "environment_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "environments" ON "environments"."id" = "environment_classes"."environment_id" LEFT OUTER JOIN "hostgroup_classes" ON "hostgroup_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "hostgroups" ON "hostgroups"."id" = "hostgroup_classes"."hostgroup_id" LEFT OUTER JOIN "environment_classes" "environment_classes_puppetclasses_join" ON "environment_classes_puppetclasses_join"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "lookup_keys" ON "lookup_keys"."id" = "environment_classes_puppetclasses_join"."lookup_key_id" AND environment_classes.lookup_key_id is NOT NULL WHERE (("puppetclasses"."name" ILIKE '%ipa%' OR "environments"."name" ILIKE '%ipa%' OR "hostgroups"."name" ILIKE '%ipa%' OR "lookup_keys"."key" ILIKE '%ipa%')) ORDER BY puppetclasses.name LIMIT 20 OFFSET 0

Br, Peter

Actions #2

Updated by Lukas Zapletal over 10 years ago

Hello,

are you able to measure how fast is the query when you paste it to the SQL server directly (e.g. without Rails and ActiveRecord involved)?

In other words, is this Rails bug/leak or not optimized SQL?

Actions #3

Updated by Peter Gustafsson over 10 years ago

Hello,

Seems like an Rails bug/leak:

$ time psql <<EOF

SELECT DISTINCT "puppetclasses".id, puppetclasses.name AS alias_0 FROM "puppetclasses" LEFT OUTER JOIN "environment_classes" ON "environment_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "environments" ON "environments"."id" = "environment_classes"."environment_id" LEFT OUTER JOIN "hostgroup_classes" ON "hostgroup_classes"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "hostgroups" ON "hostgroups"."id" = "hostgroup_classes"."hostgroup_id" LEFT OUTER JOIN "environment_classes" "environment_classes_puppetclasses_join" ON "environment_classes_puppetclasses_join"."puppetclass_id" = "puppetclasses"."id" LEFT OUTER JOIN "lookup_keys" ON "lookup_keys"."id" = "environment_classes_puppetclasses_join"."lookup_key_id" AND environment_classes.lookup_key_id is NOT NULL WHERE (("puppetclasses"."name" ILIKE '%ipa%' OR "environments"."name" ILIKE '%ipa%' OR "hostgroups"."name" ILIKE '%ipa%' OR "lookup_keys"."key" ILIKE '%ipa%')) ORDER BY puppetclasses.name LIMIT 20 OFFSET 0;
EOF

id | alias_0
------+-----------------------
485 | certmonger
1480 | foreman_hooks
380 | ipa
45 | ipa::client
385 | ipa::client::ad_trust
387 | ipa::client::autofs
388 | ipa::client::base
384 | ipa::client::install
389 | ipa::client::package
383 | ipa::client::services
386 | ipa::client::sudo
87 | ipa::params
381 | ipa::replica::base
382 | ipa::replica::package
77 | ipa::server
(15 rows)

real 0m26.541s
user 0m0.004s
sys 0m0.019s

Actions #4

Updated by Peter Gustafsson over 10 years ago

but 26s is long time for the query as well.

Br, Peter

Actions #5

Updated by Ohad Levy over 10 years ago

Peter Gustafsson wrote:

but 26s is long time for the query as well.

you can use EXPLAIN before the select to get more info where its spending time, but i agree.

Actions #6

Updated by Lukas Zapletal over 10 years ago

  • Category set to Search
  • Status changed from New to Assigned
  • Assignee set to Lukas Zapletal
  • Target version set to 1.9.0
amos_ | lzap, puppetclass model  scoped_search :in => :class_params, :on => :key, :complete_value => :true, :only_explicit => true
Actions #7

Updated by Lukas Zapletal over 10 years ago

  • Status changed from Assigned to Ready For Testing
Actions #8

Updated by Dominic Cleal over 10 years ago

  • Translation missing: en.field_release set to 7
Actions #9

Updated by Lukas Zapletal over 10 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100
Actions

Also available in: Atom PDF