Bug #16170
Updated by Tomáš Strachota over 8 years ago
Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1325879

+++ This bug was initially created as a clone of Bug #1320557 +++

Description of problem:
When a content host is unregistered (by subscription-manager unregister, by deleting the Content Host, or by deleting the Host), the Actions::Katello::System::Destroy task is processed. This task has a concurrency bug that, with some probability, causes the Actions::Candlepin::ListenOnCandlepinEvents task to be paused with the error:

    Katello::Resources::Candlepin::Consumer: 410 Gone {"displayMessage":"Unit b3012a90-41b6-4788-98a9-5b41839b6dca has been deleted","requestUuid":"04dbfeda-75cb-4ffa-ad9f-9f4c818fc868","deletedId":"b3012a90-41b6-4788-98a9-5b41839b6dca"} (GET /candlepin/consumers/b3012a90-41b6-4788-98a9-5b41839b6dca)

(The UUID matches the UUID of the Content Host that was just deleted.)

Sequence of steps leading to the bug:
- The Actions::Candlepin::Consumer::Destroy sub-task is executed.
- It deletes the consumer from candlepin and recalculates compliance for it (imho a redundant step when we delete the consumer, but katello is aware of it).
- It announces the compliance.create event to the ListenOnCandlepinEvents task.
- katello finds that the katello_system is present, so it runs "reindex_consumer" and logs "re-indexing content host ..".
- The parent task Actions::Katello::System::Destroy only _now_ enters its Finalize phase (the phase visible in the log below).
- This step deletes the system from katello_systems _after_ the check that precedes "re-indexing content host .." has been made, so the re-index is _not_ skipped.
- The re-index of the content host calls GET consumer/<uuid> on candlepin, which triggers the 410 Gone in the ListenOnCandlepinEvents task.

Version-Release number of selected component (if applicable):
Sat 6.1.7

How reproducible:
100% within a few minutes

Steps to Reproduce:
1. Run:

    tail -f /var/log/foreman/production.log | grep -e "in phase Finalize Actions::Katello::System::Destroy" -e "skip re-indexing of non-existent content host" -e "re-indexing content host"

2. Have the Actions::Candlepin::ListenOnCandlepinEvents task open in the WebUI.
3. On some Content Host, register and unregister it in a loop (RHEL7 is used here; adjust accordingly if using RHEL5 or 6):

    while true; do
      subscription-manager register --force --org="Default_Organization" --environment="Library" --username=admin --password=faYakexMm5XN543x
      subscription-manager subscribe --pool=8aa2d415526494380152732fc8d20dd7
      subscription-manager repos --enable rhel-7-server-rpms --enable rhel-7-server-satellite-tools-6.1-rpms
      date
      subscription-manager unregister
      date
      sleep 5
    done

4. Monitor the tail -f output and the ListenOnCandlepinEvents task.

Actual results:
- tail shows:

    2016-03-23 14:17:05 [D] re-indexing content host pmoravec-rhel7.gsslab.brq.redhat.com
    2016-03-23 14:17:05 [D] Step f5705893-2577-41c4-9af6-9b7c10ccb646: 6 running >> success in phase Finalize Actions::Katello::System::Destroy

- The ListenOnCandlepinEvents task is paused/error with the error:

    Katello::Resources::Candlepin::Consumer: 410 Gone {"displayMessage":"Unit b3012a90-41b6-4788-98a9-5b41839b6dca has been deleted","requestUuid":"04dbfeda-75cb-4ffa-ad9f-9f4c818fc868","deletedId":"b3012a90-41b6-4788-98a9-5b41839b6dca"} (GET /candlepin/consumers/b3012a90-41b6-4788-98a9-5b41839b6dca)

Expected results:
- The tail output can be ok (it is rather a symptom for developers); the ListenOnCandlepinEvents task just needs to keep running without an error.

Additional info:
This seems like a lack of locking / a concurrency bug, where the following code in reindex_pool_subscription_handler.rb:

    def reindex_consumer(message)
      if message.content['newEntity']
        uuid = JSON.parse(message.content['newEntity'])['consumer']['uuid']
        system = ::Katello::System.find_by_uuid(uuid)
        if system.nil?
          @logger.debug "skip re-indexing of non-existent content host #{uuid}"
        else
          @logger.debug "re-indexing content host #{system.name}"
          system.update_index
        end
      end
    end

needs to be executed as an atomic operation (not concurrently with Actions::Katello::System::Destroy).
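
For illustration only, the interleaving described above can be replayed deterministically outside Katello. The sketch below is not the Katello code and not the atomic-operation fix suggested in this report; it shows a different, simpler mitigation, where the handler treats a consumer that vanished between the existence check and the re-index as "nothing to do". All names in it (ConsumerGone, FakeCandlepin, FakeSystem, host.example.com, and the changed reindex_consumer signature) are hypothetical stand-ins.

```ruby
# Hypothetical stand-in for the Candlepin "410 Gone" error; not the real Katello class.
class ConsumerGone < StandardError; end

# Hypothetical in-memory stand-in for Candlepin's consumer store.
class FakeCandlepin
  def initialize
    @consumers = {}
  end

  def register(uuid)
    @consumers[uuid] = true
  end

  def delete(uuid)
    @consumers.delete(uuid)
  end

  # Mimics GET /candlepin/consumers/<uuid>: raises once the consumer is deleted.
  def fetch(uuid)
    raise ConsumerGone, "Unit #{uuid} has been deleted" unless @consumers.key?(uuid)
    { 'uuid' => uuid }
  end
end

# Hypothetical stand-in for a katello_systems row.
FakeSystem = Struct.new(:uuid, :name, :candlepin) do
  def update_index
    candlepin.fetch(uuid) # the call that hits 410 Gone in the report
    :indexed
  end
end

# Tolerant variant of the handler: a consumer deleted between the existence
# check and the re-index is logged and skipped instead of failing the task.
def reindex_consumer(uuid, systems, log)
  system = systems[uuid]
  if system.nil?
    log << "skip re-indexing of non-existent content host #{uuid}"
    return :skipped
  end
  log << "re-indexing content host #{system.name}"
  system.update_index
rescue ConsumerGone
  log << "content host #{uuid} was deleted during re-index, skipping"
  :skipped
end

# Replay the interleaving from the report in a fixed order.
candlepin = FakeCandlepin.new
systems = {}
uuid = 'b3012a90-41b6-4788-98a9-5b41839b6dca'
candlepin.register(uuid)
systems[uuid] = FakeSystem.new(uuid, 'host.example.com', candlepin)

candlepin.delete(uuid)                        # consumer removed from Candlepin first
log = []
result = reindex_consumer(uuid, systems, log) # event handled while the Katello row still exists
systems.delete(uuid)                          # Katello row removed only afterwards
puts result.inspect
puts log
```

Rescuing the error keeps the listener alive but does not remove the race itself; serialising the check and the re-index against the destroy action, as proposed above, would still be the stricter fix.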