Bug #21110
closed
virt-who can't talk to Foreman anymore
Description
Hi there,
I have a problem with virt-who and Foreman / Katello.
Versions:
katello 3.4.5
virt-who-0.19-6.el7_4.noarch
candlepin-2.0.40-1.el7.noarch
Here's the output of virt-who -o -d:
https://thepasteb.in/p/k5hYzyvyBGAfE
and the corresponding candlepin.log and production.log
https://thepasteb.in/p/y8h65P3qkOKcO
https://thepasteb.in/p/3lh7zgvQ8KLu1
In the Dynflow console I can see the following task, which has been running for 6 days:
3: Actions::Candlepin::AsyncHypervisors (waiting for Candlepin to finish the task) [ 522541.42s / 1701.01s ]
I also have many other "Hypervisor Tasks" in Dynflow that seem to be stuck.
In Foreman -> Tasks I have none.
Thank you for the help.
Updated by Justin Sherrill over 7 years ago
Can you navigate to Monitor > Tasks, find the async hypervisors task that you are seeing the issue with, then click on the 'raw' tab?
Finally, can you copy and paste the full 'raw input' and full 'raw output' into the ticket?
Thanks,
Justin
Updated by Michael Stead over 7 years ago
Could you also show the output from the following query against the candlepin database:
select * from cp_job where id like 'hypervisor_update%';
Updated by Philipp Mueller over 7 years ago
The query produces 12 MB of output. Basically, it contains lots of lines like this:
hypervisor_update_e48d618e-50b3-4c7c-b000-bce2ec402f55 | 2017-09-26 16:42:37.207+02 | 2017-09-26 16:42:37.207+02 | | async group | foreman_admin | | | 6 | rs | 0 | org.candlepin.pinsetter.tasks.HypervisorUpdateJob | rs |
Updated by Philipp Mueller over 7 years ago
Id: 2d47c542-b87e-4b86-8dfd-f602ac5922e8
Label: Actions::Katello::Host::Hypervisors
Duration: less than a minute
Raw input:
{"services_checked"=>["candlepin", "candlepin_auth"],
"hypervisors"=>Step(3).output[:hypervisors]}
Raw output:
{}
External Id: 791a438e-4ea2-42c7-92fe-bf560d104c91
Updated by Michael Stead over 7 years ago
Thanks for the update.
Could you please run the following query on the candlepin DB to get a feel for how many of these jobs there are and the state they are in:
select distinct(state), count(id) as total from cp_job where id like 'hypervisor_update%' group by state;
Updated by Philipp Mueller over 7 years ago
117 rows
Updated by Michael Stead over 7 years ago
Please provide the full output.
Updated by Justin Sherrill over 7 years ago
- Status changed from New to Need more information
- Assignee set to Justin Sherrill
- Target version set to 217
Updated by Philipp Mueller over 7 years ago
Sorry, here is the full output.
candlepin=# select distinct(state), count(id) as total from cp_job where id like 'hypervisor_update%' group by state;
state | total
-------+-------
6 | 3
0 | 2
(2 rows)
Thanks for your help.
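For readers following along, the numeric state values in this output can be translated into names. The sketch below is only an illustration: the mapping assumes the states are stored as ordinals in the order CREATED, PENDING, RUNNING, FINISHED, CANCELED, FAILED, WAITING. Only 2 (running) and 4 (cancelled) are named explicitly elsewhere in this thread, so the remaining values should be treated as an assumption. The names `JOB_STATES` and `summarize` are hypothetical helpers, not part of Candlepin.

```python
# Assumed ordinal mapping for cp_job.state.
# Only 2 (running) and 4 (cancelled) are spelled out in this thread.
JOB_STATES = {
    0: "CREATED",
    1: "PENDING",
    2: "RUNNING",
    3: "FINISHED",
    4: "CANCELED",
    5: "FAILED",
    6: "WAITING",
}


def summarize(rows):
    """Turn (state, total) rows from the GROUP BY query into readable labels."""
    return {JOB_STATES.get(state, f"UNKNOWN({state})"): total for state, total in rows}


# Rows as pasted above: state 6 has 3 jobs, state 0 has 2 jobs.
summary = summarize([(6, 3), (0, 2)])
print(summary)
```

Under that assumption, the five hypervisor_update jobs are either waiting or freshly created, and none has started running.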
Updated by Michael Stead over 7 years ago
- Assignee deleted (Justin Sherrill)
- Target version deleted (217)
1. Please run the following query and paste the full output:
select * from cp_job where state in (0, 6) order by state;
2. We need to determine what candlepin job status records katello is looking for. Please look in the candlepin.log and grep for the latest instances of:
uri=/candlepin/jobs/hypervisor_update
3. I would like to see the full candlepin log, but I'm not sure that it can be uploaded here. If you could find a way for me to download it, it would be very helpful.
Thanks.
Updated by Philipp Mueller over 7 years ago
Here is the candlepin.log as a tar.gz:
https://pm92.de/nc/index.php/s/Xr6neXwgGwxI3VD
and here is the query output:
https://pm92.de/nc/index.php/s/9UBJAyMGOeTR1cZ
Thank you.
Updated by Michael Stead over 7 years ago
Some notes:
Based on the info provided, it appears that candlepin's hypervisor update async job is not migrating out of the CREATED state, therefore katello never stops checking the Job status.
At this point, I'm not sure why the checkin jobs are remaining stuck in this state. Based on info from Philipp, the stuck jobs are both targeting the same Org.
Looking into this further, both for the issue at hand and a work-around.
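The effect described in these notes can be sketched in a few lines: a poller that waits for the job status to report completion never returns while the job sits in CREATED. This is a simplified illustration, not Katello's actual polling code. `fetch_status` is a hypothetical callable standing in for a GET on the job's status URI, and the `state` and `done` fields are taken from the job status JSON posted in this ticket.

```python
import json


def job_is_done(status_body: str) -> bool:
    """Return True once the job status payload reports completion."""
    return bool(json.loads(status_body).get("done"))


def poll(fetch_status, max_polls):
    """Poll until the job is done or max_polls attempts are used up.

    fetch_status is a hypothetical callable returning the JSON body of the
    job status request; it is not a real Katello or Candlepin API.
    Returns the attempt number on success, or None if the job never finished.
    """
    for attempt in range(1, max_polls + 1):
        if job_is_done(fetch_status()):
            return attempt
    return None  # still not done: the caller would keep waiting


# A job that never leaves CREATED keeps reporting done=false.
stuck = '{"state": "CREATED", "done": false}'
result = poll(lambda: stuck, max_polls=5)
print(result)
```

The sketch adds a poll limit only so that it terminates; without one, the waiting task stays open for as long as the job remains unfinished.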
Updated by Philipp Mueller over 7 years ago
Hi,
You're absolutely right, the job never exits the CREATED state. When I set the state to CREATED manually in the psql DB, it stays in the WAITING state:
{"id":"hypervisor_update_c28c8d1c-9777-48e5-bdac-9b0bf4aa67fe","state":"CREATED","startTime":null,"finishTime":null,"result":null,"principalName":"foreman_admin","targetType":"owner","targetId":"rs","ownerId":"rs","correlationId":null,"resultData":null,"statusPath":"/jobs/hypervisor_update_c28c8d1c-9777-48e5-bdac-9b0bf4aa67fe","done":false,"group":"async group","created":"2017-10-02T10:56:41+0000","updated":"2017-10-02T10:56:41+0000"}
...
{"id":"hypervisor_update_c28c8d1c-9777-48e5-bdac-9b0bf4aa67fe","state":"WAITING","startTime":null,"finishTime":null,"result":null,"principalName":"foreman_admin","targetType":"owner","targetId":"rs","ownerId":"rs","correlationId":null,"resultData":null,"statusPath":"/jobs/hypervisor_update_c28c8d1c-9777-48e5-bdac-9b0bf4aa67fe","done":false,"group":"async group","created":"2017-10-02T10:56:41+0000","updated":"2017-10-02T10:56:41+0000"}
Updated by Michael Stead over 7 years ago
The jobs that are currently in the candlepin DB... you set them to the CREATED state manually... from WAITING? If that is the case, candlepin will never pick them up.
Out of curiosity, did you at any point restart tomcat after virt-who started the hypervisor checkin request?
Something you could try (though I haven't tested it) would be to set all the cp Jobs to the Cancelled state (4) and see if that stops the katello tasks. Once this is done, and the candlepin log is silent again, we can try another virt-who checkin while capturing the candlepin logs.
1) First check to make sure that no hypervisor update jobs are currently in the running state:
select count(*) from cp_job where id like 'hypervisor_update%' and state=2;
2) If there are no running jobs, let's cancel all the other hypervisor checkin jobs:
-- Cancel all of the hypervisor_update jobs.
update cp_job set state=4 where id like 'hypervisor_update%';
-- Make sure that all the hypervisor_update jobs have a state of 4 (cancelled)
select id, state from cp_job where id like 'hypervisor_update%';
3) Make sure that the katello tasks have now all stopped due to the cancelled candlepin jobs.
4) Make sure that the candlepin log is relatively silent.
5) Enable candlepin DEBUG logging and restart tomcat (all katello services if possible).
6) Once up and running again, try a manual virt-who checkin again while capturing the candlepin logs from start to finish.
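Steps 1 and 2 above amount to a guarded update: cancel only if nothing is running. A minimal sketch of that logic follows, using an in-memory SQLite table as a stand-in for the real candlepin PostgreSQL database. The table here has only the two columns used by the queries, and `cancel_hypervisor_jobs` is a hypothetical helper name. It is meant to show the order of operations, not to be run against a production database as-is.

```python
import sqlite3


def cancel_hypervisor_jobs(conn):
    """Cancel hypervisor_update jobs only if none is running (state 2).

    Returns the number of rows set to state 4 (cancelled).
    """
    cur = conn.cursor()
    cur.execute(
        "select count(*) from cp_job where id like 'hypervisor_update%' and state = 2"
    )
    running = cur.fetchone()[0]
    if running:
        return 0  # leave everything untouched while a job is still running
    cur.execute("update cp_job set state = 4 where id like 'hypervisor_update%'")
    conn.commit()
    return cur.rowcount


# In-memory stand-in for cp_job, reduced to the two columns used here.
conn = sqlite3.connect(":memory:")
conn.execute("create table cp_job (id text primary key, state integer)")
conn.executemany(
    "insert into cp_job values (?, ?)",
    [("hypervisor_update_a", 0), ("hypervisor_update_b", 6), ("other_job", 0)],
)
changed = cancel_hypervisor_jobs(conn)
states = dict(conn.execute("select id, state from cp_job"))
print(changed, states)
```

Jobs whose id does not start with hypervisor_update are left alone, which matches the `like` filter in the original statements.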
Let me know if you have questions with anything here. I work AST but I'll try and log on earlier than normal in the morning to help out.
Updated by Philipp Mueller over 7 years ago
I did all the steps, but virt-who still says there are unfinished hypervisor_update jobs:
2017-10-06 12:02:39,618 [virtwho.destination_-4231981242631048948 DEBUG] MainProcess(14173):Thread-3 @subscriptionmanager.py:check_report_state:233 - Checking status of job hypervisor_update_189d8b51-eb61-4a8d-85d2-e551a408b27e
2017-10-06 12:02:39,673 [rhsm.connection DEBUG] MainProcess(14173):Thread-3 @connection.py:_request:602 - Response: status=200
2017-10-06 12:02:39,673 [virtwho.destination_-4231981242631048948 DEBUG] MainProcess(14173):Thread-3 @subscriptionmanager.py:check_report_state:247 - Job hypervisor_update_189d8b51-eb61-4a8d-85d2-e551a408b27e not finished
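When the debug output is long, the job ids that virt-who reports as unfinished can be pulled out with a small script so they can be compared against the cp_job table. This is a sketch that assumes the "Job ... not finished" message format shown in the lines above; the sample text abbreviates the log prefix with "...", and `unfinished_jobs` is a hypothetical helper name.

```python
import re

# Matches the "Job <id> not finished" message seen in the virt-who debug log.
NOT_FINISHED = re.compile(r"Job (hypervisor_update_[0-9a-f-]+) not finished")


def unfinished_jobs(log_text):
    """Return the distinct job ids reported as not finished, in order of appearance."""
    seen = []
    for match in NOT_FINISHED.finditer(log_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Abbreviated sample built from the log lines above.
sample = (
    "... check_report_state:233 - Checking status of job "
    "hypervisor_update_189d8b51-eb61-4a8d-85d2-e551a408b27e\n"
    "... check_report_state:247 - Job "
    "hypervisor_update_189d8b51-eb61-4a8d-85d2-e551a408b27e not finished\n"
)
jobs = unfinished_jobs(sample)
print(jobs)
```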
Updated by Michael Stead over 7 years ago
Did you capture the candlepin log while virtwho did its checkin? I'm assuming that you did only one manual start of virt-who? Need to make sure that it eventually finishes.
Please provide:
1) /var/log/candlepin/candlepin.log
2) Current output of:
select id, state from cp_job where id like 'hypervisor_update%';
NOTE: With a large virt environment, it may take a while for candlepin to finish processing the host/guest updates.
Updated by Philipp Mueller over 7 years ago
I started virt-who as a service; it shows the same behaviour as before setting the jobs to cancelled.
The candlepin.log doesn't show any difference, even with debug logging:
{"id":"hypervisor_update_8daa745d-0925-4ab8-90d2-9ec7bdb78996","state":"CREATED","startTime":null,"finishTime":null,"result":null,"principalName":"foreman_admin","targetType":"owner","targetId":"rs","ownerId":"rs","correlationId":null,"resultData":null,"statusPath":"/jobs/hypervisor_update_8daa745d-0925-4ab8-90d2-9ec7bdb78996","done":false,"group":"async group","created":"2017-10-11T07:34:14+0000","updated":"2017-10-11T07:34:14+0000"}
The hypervisor job is stuck.
Updated by Philipp Mueller over 7 years ago
candlepin=# select id, state from cp_job where id like 'hypervisor_update%';
id | state
--------------------------------------------------------+-------
hypervisor_update_d43a1813-a807-41f2-ba73-693a77118f84 | 4
hypervisor_update_1a052e72-2959-4cd6-b4d3-4859829361e0 | 4
hypervisor_update_78bdad15-caea-409b-87d3-b677615d23de | 4
hypervisor_update_5c0a1380-6cc0-41b5-aec5-1bb3608cb9e0 | 4
hypervisor_update_cdee6944-31eb-4e2b-a57e-22647732db38 | 0
hypervisor_update_8daa745d-0925-4ab8-90d2-9ec7bdb78996 | 0
hypervisor_update_189d8b51-eb61-4a8d-85d2-e551a408b27e | 4
Updated by Michael Stead over 7 years ago
I'm looking for the full candlepin log to see if I can spot any errors during the initial checkins.
Also, please provide the output from:
select id, result from cp_job where id like 'hypervisor_update%';
Updated by Philipp Mueller over 7 years ago
Here is the candlepin.log with debug enabled:
https://pm92.de/nc/index.php/s/2z6V2A0Bi2EHZTk
Output of the query:
candlepin=# select id, result from cp_job where id like 'hypervisor_update%';
id | result
--------------------------------------------------------+--------
hypervisor_update_d43a1813-a807-41f2-ba73-693a77118f84 |
hypervisor_update_1a052e72-2959-4cd6-b4d3-4859829361e0 |
hypervisor_update_78bdad15-caea-409b-87d3-b677615d23de |
hypervisor_update_5c0a1380-6cc0-41b5-aec5-1bb3608cb9e0 |
hypervisor_update_cdee6944-31eb-4e2b-a57e-22647732db38 |
hypervisor_update_8daa745d-0925-4ab8-90d2-9ec7bdb78996 |
hypervisor_update_9ec361ad-755a-4509-8e4f-066e0f917bfd |
hypervisor_update_34d28a4e-8c94-4f86-9700-1ddbbcddcf86 |
hypervisor_update_189d8b51-eb61-4a8d-85d2-e551a408b27e |
Also, I have this in the Foreman task log:
Exception:
NoMethodError: undefined method `[]' for nil:NilClass
Backtrace:
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/katello/host/hypervisors.rb:23:in `block in parse_hypervisors'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/katello/host/hypervisors.rb:22:in `each'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/katello/host/hypervisors.rb:22:in `parse_hypervisors'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/candlepin/async_hypervisors.rb:12:in `poll_external_task'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action/polling.rb:98:in `poll_external_task_with_rescue'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action/polling.rb:21:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/candlepin/abstract_async_task.rb:9:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:512:in `block (3 levels) in execute_run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:17:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:30:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:17:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/propagate_candlepin_errors.rb:9:in `block in run'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/propagate_candlepin_errors.rb:19:in `propagate_candlepin_errors'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/propagate_candlepin_errors.rb:9:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:17:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/remote_action.rb:16:in `block in run'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/remote_action.rb:40:in `block in as_remote_user'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/models/katello/concerns/user_extensions.rb:21:in `cp_config'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/remote_action.rb:27:in `as_cp_user'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/remote_action.rb:39:in `as_remote_user'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/remote_action.rb:16:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:17:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action/progress.rb:30:in `with_progress_calculation'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action/progress.rb:16:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:17:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/keep_locale.rb:11:in `block in run'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/keep_locale.rb:22:in `with_locale'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.4.5/app/lib/actions/middleware/keep_locale.rb:11:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:17:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware.rb:30:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/middleware/world.rb:30:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:511:in `block (2 levels) in execute_run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:510:in `catch'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:510:in `block in execute_run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:425:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:425:in `block in with_error_handling'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:425:in `catch'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:425:in `with_error_handling'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:505:in `execute_run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/action.rb:266:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:9:in `block (2 levels) in execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/execution_plan/steps/abstract.rb:155:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/execution_plan/steps/abstract.rb:155:in `with_meta_calculation'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:8:in `block in execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:22:in `open_action'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:7:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/director.rb:55:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/executors/parallel/worker.rb:11:in `on_message'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/context.rb:46:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/executes_context.rb:7:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.24/lib/dynflow/actor.rb:26:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/awaits.rb:15:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/buffer.rb:38:in `process_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/buffer.rb:31:in `process_envelopes?'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/buffer.rb:20:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/termination.rb:55:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/removes_child.rb:10:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:161:in `process_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:95:in `block in on_envelope'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:118:in `block (2 levels) in schedule_execution'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:115:in `block in schedule_execution'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:18:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:18:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:96:in `work'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:77:in `block in call_job'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:348:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:348:in `run_task'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:337:in `block (3 levels) in create_worker'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `loop'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `block (2 levels) in create_worker'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `catch'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `block in create_worker'
/opt/theforeman/tfm/root/usr/share/gems/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'
Updated by Justin Sherrill over 7 years ago
- Has duplicate Bug #21191: Error update VM Information with virt-who added
Updated by Ian Ginn about 7 years ago
I'm also having this issue. The Hypervisors job is stuck at 17%. I killed all of these running jobs manually then started it again by resetting the virt-who service but the issue remains. This is the output from the Raw tab of the task.
Id: 5d37625e-01df-4dc9-8195-aed95d3c1915
Label: Actions::Katello::Host::Hypervisors
Duration: about 2 hours
Raw input:
{"services_checked"=>["candlepin", "candlepin_auth"],
"hypervisors"=>Step(3).output[:hypervisors]}
Raw output:
{}
External Id: 61751a58-3abc-445c-911e-7a0ddd74b04a
Updated by Kart Nico about 7 years ago
Hi,
I have the same problem in my production environment (Katello 3.4.4, and also after updating to Katello 3.4.5).
In my QA environment (only 2 RHEL systems registered, Katello 3.4.5) it works.
Regards,
Nicolas
Updated by Kart Nico about 7 years ago
My product versions:
CentOS Linux release 7.3.1611 (Core)
virt-who 0.17-11.el7_3
katello 3.4.5-1.el7
candlepin 2.0.40-1.el7
The Hypervisors job is also stuck at 17%.
Regards,
Nicolas
Updated by Vritant Jain about 7 years ago
Kart, any chance I can get my hands on a copy of your database? It would help me investigate further and verify any fix.
Updated by Vritant Jain about 7 years ago
Ian, any chance I can get my hands on a copy of your candlepin database? It would help me investigate further and verify any fix.
Updated by Kart Nico about 7 years ago
Hi,
In which format do you want the export?
pg_dump candlepin | gzip -c > /var/backup/db
Is that OK for you?
Regards,
Nicolas
Updated by Vritant Jain about 7 years ago
Kart,
A pg_dump is perfect.
Updated by Kart Nico about 7 years ago
Hi,
I sent you the whole database in a private message. If you want, you can publish only generic data ;).
Regards,
Nicolas
Updated by Philipp Mueller about 7 years ago
Hello Jain,
Our Foreman installation is also affected by this bug. Since the problem is becoming very urgent, I was wondering whether you think deleting all hosts / subscriptions / hypervisors would fix this problem?
KR
Philipp
Updated by Vritant Jain about 7 years ago
While we work on a fix, I have some SQL to alleviate and unblock any Candlepin affected by this issue. Could you:
1. Stop the katello / candlepin services.
2. Log in as the postgres user: su - postgres
3. psql candlepin -c " delete from qrtz_paused_trigger_grps;"
4. Start the katello / candlepin services.
NOTE: This will allow us to schedule future candlepin jobs, but the already created jobs will not successfully execute. Any future virt-who reports should be successfully consumed.
Updated by Philipp Mueller about 7 years ago
Hi,
Thanks for the reply. Unfortunately, the delete statement didn't fix the problem.
3 rows got deleted. After that I restarted the katello services and ran virt-who -d -o
I still get the same errors as before:
Job hypervisor_update_8a4e519b-811d-4266-b287-3a862df8b1ea not finished
thank you.
Updated by Vritant Jain about 7 years ago
Can you join Freenode #candlepin please?
Updated by Philipp Mueller about 7 years ago
FIX IS WORKING!
It was my fault for being too impatient. Thank you.
Updated by Kart Nico about 7 years ago
Could you share it please :)
I will try it on my environment.
Regards,
Nicolas
Updated by Ian Ginn about 7 years ago
I just want to add that Vritant's fix worked for me as well. My hosts' last check-in is today. Thanks for the fix!
Updated by Eric Helms about 7 years ago
Vritant,
Given that a fix has been working for users, can you let us know if we need to do anything to address this for users going forward? A fix on our side? Is there a Candlepin fix or build that's needed?
Updated by Vritant Jain about 7 years ago
Eric,
This issue occurs when there are jobs scheduled but candlepin is shut down / suspended.
I am currently investigating a fix for this issue so we do not get into this situation in the first place.
Updated by Philipp Mueller about 7 years ago
Hi,
after implementing the fix with the delete from qrtz_paused_trigger_grps, virt-who was working fine for some days, but now we are experiencing the same problems again.
Updated by Andrew Kofink about 7 years ago
- Assignee set to Justin Sherrill
- Release set to 329
This requires a new version of candlepin to be released and packaged with Katello.
Updated by Justin Sherrill about 7 years ago
candlepin-2.1.12-1.el7 tagged to 3.5.1
Updated by Justin Sherrill about 7 years ago
- Status changed from Need more information to Closed