Bug #20932
open
rake process dying with memory errors
Description
Hi,
We're using Foreman 1.13.0.
The Foreman host was initially provisioned with 8G of memory.
It worked fine for a few months, and then the OOM killer started killing the rake process,
so we increased RAM from 8G to 16G.
After a few months rake again started taking up all the memory,
so we have now increased RAM to 32G.
The issue now is that I see 2 rake processes running all the time.
Even if I kill both of them, after some time I see both processes running again, and one of them gets killed by the OOM killer.
Is this a known issue?
Is there a resolution for this?
Thanks in advance,
Bhanu
Updated by Ohad Levy over 7 years ago
Which rake task are you actually running? I assume it's started from cron?
Also, 1.13 is really old at this stage; please consider upgrading.
Updated by Bhanu Prasad Ganguru over 7 years ago
Hi Ohad,
Yes, it's a cron job for `foreman-rake`.
I know 1.13 is old, but I'm worried about upgrading since we're in production.
What is the impact of upgrading from 1.13.0 to 1.14.3, and do we have to update Puppet as well?
We're using Puppet 4.8.2.
What other dependencies might break?
Bhanu
Updated by Ivan Necas over 7 years ago
There can be a lot of subcommands in foreman-rake; please provide the full command that is consuming the memory.
Updated by Bhanu Prasad Ganguru over 7 years ago
These are the two commands that are running:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18307 foreman 20 0 12.346g 0.012t 1780 R 63.5 38.2 24:42.35 /opt/rh/rh-ruby22/root/usr/bin/ruby /opt/rh/rh-ruby22/root/usr/bin/rake trends:counter
15431 foreman 20 0 13.278g 0.013t 1228 R 62.1 41.2 48:09.28 /opt/rh/rh-ruby22/root/usr/bin/ruby /opt/rh/rh-ruby22/root/usr/bin/rake trends:counter
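The two `trends:counter` processes in the listing above are consistent with a cron run starting before the previous one has finished, though the thread does not confirm that this is the cause. A minimal sketch of a guard against overlapping runs, assuming a POSIX shell; the function name `run_exclusive` and the lock path in the usage comment are hypothetical and not part of Foreman:

```shell
#!/bin/sh
# run_exclusive LOCKDIR CMD [ARGS...]
# Runs CMD only if LOCKDIR can be created. mkdir is atomic, so an
# invocation that starts while another one holds the lock returns early.
run_exclusive() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "skipped: $lockdir exists, a previous run may still be active" >&2
        return 75
    fi
    "$@"
    rc=$?
    # Release the lock and pass on the command's exit status.
    rmdir "$lockdir"
    return "$rc"
}

# Usage from a cron wrapper script (hypothetical lock path):
#   run_exclusive /tmp/foreman-trends-counter.lock foreman-rake trends:counter
```

If the wrapper shell itself is killed, the lock directory is left behind and has to be removed by hand. Where util-linux is installed, `flock -n` in front of the command avoids that problem because the kernel releases the lock when the process exits.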
Updated by Ivan Necas over 7 years ago
Bhanu: do you think it would be possible to share the data from the trends and trend_counters tables from your setup, provided it doesn't contain sensitive data, for further analysis?
Updated by Bhanu Prasad Ganguru over 7 years ago
We don't have any sensitive data.
Here you go:
foreman=> SELECT count(*) FROM trends;
count
---------
4656994
(1 row)
foreman=> select count(*) from trend_counters;
count
---------
4182107
(1 row)
foreman=> SELECT * FROM trend_counters;
id | trend_id | count | created_at | updated_at | interval_start | interval_end
---------+----------+-------+----------------------------+----------------------------+----------------------------+----------------------------
1216781 | 609217 | 0 | 2017-04-07 16:30:23.460929 | 2017-05-01 00:52:34.951262 | 2017-04-07 16:30:23.460929 | 2017-04-30 23:30:29.987925
1584036 | 795547 | 1 | 2017-04-23 18:30:25.152193 | 2017-04-23 19:42:01.619967 | 2017-04-23 18:30:25.152193 | 2017-04-23 19:00:25.00961
391505 | 195174 | 0 | 2017-03-24 10:00:11.799516 | 2017-05-05 10:17:52.432887 | 2017-03-24 10:00:11.799516 | 2017-05-05 09:00:33.869969
1843682 | 923705 | 0 | 2017-04-28 06:00:28.884791 | 2017-05-11 02:04:03.060378 | 2017-04-28 06:00:28.884791 | 2017-05-11 01:00:36.391698
3482888 | 3209176 | 1 | 2017-07-16 09:31:26.047687 | 2017-07-16 10:10:44.469751 | 2017-07-16 09:31:26.047687 | 2017-07-16 10:01:26.102196
256204 | 128217 | 1 | 2017-03-22 02:00:11.894811 | 2017-03-22 02:38:46.610683 | 2017-03-22 02:00:11.894811 | 2017-03-22 02:30:12.895528
3256428 | 2510624 | 1 | 2017-06-22 06:31:08.186384 | 2017-06-22 07:37:58.758556 | 2017-06-22 06:31:08.186384 | 2017-06-22 07:01:08.487333
617004 | 308905 | 1 | 2017-03-28 07:30:12.558068 | 2017-03-28 08:04:20.130362 | 2017-03-28 07:30:12.558068 | 2017-03-28 08:00:12.775755
1484306 | 746300 | 1 | 2017-04-22 02:00:24.346014 | 2017-04-22 02:41:03.12227 | 2017-04-22 02:00:24.346014 | 2017-04-22 02:30:24.141439
1074555 | 537074 | 1 | 2017-04-05 04:30:19.695444 | 2017-04-05 05:22:43.2218 | 2017-04-05 04:30:19.695444 | 2017-04-05 05:00:20.671482
foreman=> SELECT * FROM trends;
id | trendable_type | trendable_id | name | type | fact_value | fact_name | created_at | updated_at
---------+----------------+--------------+------------------------------------------------+-----------+------------------------------------------------+---------------+-------------------------+----------------------------
1 | FactName | 115 | host uptime | FactTrend | | system_uptime | 2017-03-17 15:52:57.564875 | 2017-03-17 15:52:57.564875
2 | FactName | 115 | uptime18 dayshours448days18seconds1612821 | FactTrend | uptime18 dayshours448days18seconds1612821 | system_uptime | 2017-03-17 15:52:57.602467 | 2017-03-17 15:52:57.602467
3 | FactName | 115 | hours452days18seconds1628830uptime18 days | FactTrend | hours452days18seconds1628830uptime18 days | system_uptime | 2017-03-17 15:52:57.606234 | 2017-03-17 15:52:57.606234
4 | FactName | 115 | days121uptime121 daysseconds10539341hours2927 | FactTrend | days121uptime121 daysseconds10539341hours2927 | system_uptime | 2017-03-17 15:52:57.609622 | 2017-03-17 15:52:57.609622
5 | FactName | 115 | uptime170 daysseconds14760749hours4100days170 | FactTrend | uptime170 daysseconds14760749hours4100days170 | system_uptime | 2017-03-17 15:52:57.613055 | 2017-03-17 15:52:57.613055
6 | FactName | 115 | uptime150 daysdays150seconds13017020hours3615 | FactTrend | uptime150 daysdays150seconds13017020hours3615 | system_uptime | 2017-03-17 15:52:57.616343 | 2017-03-17 15:52:57.616343
7 | FactName | 115 | hours2106uptime87 daysseconds7582934days87 | FactTrend | hours2106uptime87 daysseconds7582934days87 | system_uptime | 2017-03-17 15:52:57.619632 | 2017-03-17 15:52:57.619632
8 | FactName | 115 | hours452seconds1629759days18uptime18 days | FactTrend | hours452seconds1629759days18uptime18 days | system_uptime | 2017-03-17 15:52:57.622916 | 2017-03-17 15:52:57.622916
9 | FactName | 115 | days17seconds1541917hours428uptime17 days | FactTrend | days17seconds1541917hours428uptime17 days | system_uptime | 2017-03-17 15:52:57.626159 | 2017-03-17 15:52:57.626159
10 | FactName | 115 | days191seconds16504647hours4584uptime191 days | FactTrend | days191seconds16504647hours4584uptime191 days | system_uptime | 2017-03-17 15:52:57.629555 | 201
Updated by Shimon Shtein over 7 years ago
Could you please export the data from your trends and trend_counters tables to a gzipped file, so that I can reproduce the memory consumption?
For psql you can use:
psql -c "COPY trends TO stdout DELIMITER ',' CSV HEADER" | gzip > trends.csv.gz
psql -c "COPY trend_counters TO stdout DELIMITER ',' CSV HEADER" | gzip > trend_counters.csv.gz
Sorry, I don't know how to do it on MySQL.
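Before sending a multi-million-row export, it can help to check how the rows are spread over time. A minimal sketch that counts trend_counters rows per month of created_at, assuming the CSV layout produced by the COPY command above (header row, created_at as the 4th column, no embedded commas, which holds for this table's numeric and timestamp columns). The function name `rows_per_month` is hypothetical:

```shell
#!/bin/sh
# rows_per_month: reads a CSV with a header row on stdin and prints
# "YYYY-MM count" for the 4th column (created_at in trend_counters).
rows_per_month() {
    awk -F, 'NR > 1 { n[substr($4, 1, 7)]++ }
             END { for (m in n) print m, n[m] }' | sort
}

# Usage:
#   gunzip -c trend_counters.csv.gz | rows_per_month
```

This is not reliable for the trends export, because the name and fact_value columns there are free text and may contain commas.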
Updated by Bhanu Prasad Ganguru over 7 years ago
Hi Shimon,
I am unable to attach the exported tables due to the upload size limit.
I can email them directly if you give me your email address.
Bhanu
Updated by Bhanu Prasad Ganguru about 7 years ago
Hi Ivan,
We've upgraded to Foreman 1.14.3,
and I found that foreman-rake trends:counter is what is taking all the memory.
My problem is that I can't even load the trends from the Foreman API:
it takes almost 50G but still sits at loading.
We only have one trend, named host uptime.
I have stopped the trends:counter cron job.
Is there a way to purge some of the trends?
Looking at Postgres, none of the trends in the DB is older than 6 months.
Any help would be appreciated.
Bhanu