Bug #27974

closed

virt-who hypervisor update may cause the rhsm certs check to get stuck for several minutes, leading to a 503 or connection timeout

Added by Hao Yu over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: High
Assignee:
Category: Subscriptions
Target version:
Difficulty:
Triaged: Yes
Fixed in Releases:
Found in Releases:

Description

Description of problem:
I think there may be a regression in the following commit.

https://github.com/Katello/katello/commit/81530a06de177a78275b229d0ec491579ce016f4#diff-bf897becee6d218f2e9b589c5f66dcfdR21

The transaction can be huge and, I suspect, slow to commit when there are many hosts with thousands of guests to update. It seems that during the commit, most of the rows in the katello_subscription_facet table are locked because of the following line. When I comment out this line in my reproducer, the "/rhsm/consumers/<uuid>/certificates/serials" requests are no longer blocked while the hypervisor update is running.

https://github.com/Katello/katello/blob/master/app/models/katello/host/subscription_facet.rb#L131

To minimize the performance impact, I think we may need to either move the transaction inside the loop so that each host is updated in its own transaction, or remove the transaction completely.

For example:
@hosts.each do |uuid, host|
  ActiveRecord::Base.transaction do
    update_subscription_facet(uuid, host)
  end
end
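
To illustrate why the scope of the transaction matters, here is a toy model in plain Ruby with no ActiveRecord and no database. It assumes, as suspected above, that every updated row stays locked until the enclosing transaction commits. ToyDb and its methods are hypothetical names made up for this sketch and are not part of Katello.

```ruby
# Toy model: rows updated inside a transaction stay locked until commit.
# ToyDb is hypothetical; it only counts locks and never talks to a database.
class ToyDb
  attr_reader :peak_locked

  def initialize
    @locked = {}       # rows locked by the currently open transaction
    @peak_locked = 0   # largest number of rows locked at the same time
  end

  # Run the block as one transaction; all locks are released on "commit".
  def transaction
    yield
  ensure
    @locked.clear
  end

  # Updating a row locks it for the rest of the current transaction.
  def update(row_id)
    @locked[row_id] = true
    @peak_locked = [@peak_locked, @locked.size].max
  end
end

hosts = (1..1000).map { |i| "host-#{i}" }

# One transaction wrapped around the whole loop.
single = ToyDb.new
single.transaction { hosts.each { |h| single.update(h) } }

# One transaction per host, as proposed above.
per_host = ToyDb.new
hosts.each { |h| per_host.transaction { per_host.update(h) } }

puts "single transaction:    up to #{single.peak_locked} rows locked at once"
puts "per-host transactions: up to #{per_host.peak_locked} row locked at once"
```

The trade-off of the per-host variant is that the batch is no longer atomic: if the update fails halfway through, the hosts processed earlier stay updated.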

How reproducible:
I used a crude way to reproduce the issue, so it might not accurately reflect a real environment.

1. I modified the code to run the update 100 times within the transaction.

ActiveRecord::Base.transaction do
  100.times do
    @hosts.each do |uuid, host|
      update_subscription_facet(uuid, host)
    end
  end
end

2. Then trigger the hypervisor update with "virt-who -do".

3. On the Satellite, run the following to watch the requests currently held by Passenger:

watch passenger-status --show=requests

4. On a content host, run the following request repeatedly until it is blocked:

curl -k --cert /etc/pki/consumer/cert.pem --key /etc/pki/consumer/key.pem https://my_satellite_fqdn/rhsm/consumers/<uuid>/certificates/serials
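
Repeating the request by hand makes it hard to see when it starts to block, so step 4 could also be scripted. The following is a minimal sketch using Ruby's standard Net::HTTP library. The helper names (timed, consumer_client, probe_serials), the port 443, and the timeout and attempt defaults are assumptions made for illustration; the certificate paths and URL path are taken from the curl command above.

```ruby
require "net/http"
require "openssl"

# Run the block and return [result, elapsed_seconds] using a monotonic clock.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Build an HTTPS client that authenticates with the consumer certificate.
# VERIFY_NONE mirrors curl's -k flag and is only suitable for a test setup.
def consumer_client(fqdn, cert_path, key_path, read_timeout: 120)
  http = Net::HTTP.new(fqdn, 443) # port 443 is an assumption
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  http.cert = OpenSSL::X509::Certificate.new(File.read(cert_path))
  http.key = OpenSSL::PKey.read(File.read(key_path))
  http.read_timeout = read_timeout
  http
end

# Issue the serials request `attempts` times and print how long each one took.
def probe_serials(fqdn, uuid, attempts: 50)
  http = consumer_client(fqdn, "/etc/pki/consumer/cert.pem", "/etc/pki/consumer/key.pem")
  path = "/rhsm/consumers/#{uuid}/certificates/serials"
  attempts.times do |i|
    begin
      response, elapsed = timed { http.get(path) }
      puts format("%3d  HTTP %s  %.2fs", i + 1, response.code, elapsed)
    rescue Net::OpenTimeout, Net::ReadTimeout => e
      puts format("%3d  timed out (%s)", i + 1, e.class)
    end
  end
end

# Example call, with the same placeholders as the curl command:
# probe_serials("my_satellite_fqdn", "<uuid>")
```

A request that is blocked by the hypervisor update should show up as a sudden jump in the elapsed time or as a timeout line.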

Actual results:
The RHSM certs check request gets stuck, as shown by Passenger:

passenger-status --show=requests
Version : 4.0.18
Date : 2019-09-30 14:34:34 +1000
Instance: 30428
1 clients:
Client 19:
host = my_satellite.com
uri = /rhsm/consumers/a40cc335-8ba9-481c-8d10-59bc5420601a/certificates/serials
connected at = 2019-09-30 14:33:50 (43 sec ago)
state = FORWARDING_BODY_TO_APP
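
Output like the above could also be checked automatically instead of being watched by eye. The sketch below parses the text and lists requests that have been connected for longer than a threshold. The method name long_running_requests and the 30-second default are hypothetical, and only the "(N sec ago)" form seen in this sample is handled; how Passenger prints longer durations is not covered here.

```ruby
# Parse `passenger-status --show=requests` text and return the requests that
# have been connected for at least `min_seconds`. Only the "(N sec ago)" form
# seen in the sample output is recognized; other forms are silently skipped.
def long_running_requests(status_text, min_seconds: 30)
  requests = []
  current = nil
  status_text.each_line do |line|
    case line.strip
    when /\AClient (\d+):\z/
      current = { client: Regexp.last_match(1).to_i }
      requests << current
    when /\Auri\s*=\s*(\S+)\z/
      current[:uri] = Regexp.last_match(1) if current
    when /\Aconnected at\s*=.*\((\d+) sec ago\)\z/
      current[:age] = Regexp.last_match(1).to_i if current
    when /\Astate\s*=\s*(\S+)\z/
      current[:state] = Regexp.last_match(1) if current
    end
  end
  requests.select { |r| r[:age] && r[:age] >= min_seconds }
end

# Sample taken from the output in this report.
sample = <<~STATUS
  1 clients:
  Client 19:
  host = my_satellite.com
  uri = /rhsm/consumers/a40cc335-8ba9-481c-8d10-59bc5420601a/certificates/serials
  connected at = 2019-09-30 14:33:50 (43 sec ago)
  state = FORWARDING_BODY_TO_APP
STATUS

long_running_requests(sample).each do |r|
  puts "client #{r[:client]}: #{r[:uri]} in #{r[:state]} for #{r[:age]}s"
end
```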

Expected results:
RHSM certs check requests should be processed quickly, even while a hypervisor update is running.
