Project

General

Profile

Actions

Bug #26684

open

Jobs runs correctly but tasks keeps running and error occurs

Added by Elias Pereira over 5 years ago. Updated over 5 years ago.

Status:
New
Priority:
High
Assignee:
-
Category:
Smart Proxy Dynflow
Target version:
-
Difficulty:
Triaged:
No
Found in Releases:

Description

Hello,

When executing a Remote Execution Job on a host the job runs successfully but the task remains in Pending.

foreman: 1.21.3
foreman-proxy: 1.21.3

Active features:
Dynflow
Logs
Puppet
Puppet CA
SSH
TFTP


Files

dynflow_executor.log dynflow_executor.log 21.4 KB Elias Pereira, 04/25/2019 04:17 PM
proxy.log proxy.log 219 KB Elias Pereira, 04/25/2019 04:17 PM
remote_execution_log.txt remote_execution_log.txt 12.1 KB Elias Pereira, 04/25/2019 04:21 PM
Actions #1

Updated by Elias Pereira over 5 years ago

  1. /etc/init.d/dynflowd status
    ● dynflowd.service - Foreman jobs daemon
    Loaded: loaded (/lib/systemd/system/dynflowd.service; enabled; vendor preset: enabled)
    Active: active (running) since Thu 2019-04-25 13:07:57 -03; 6min ago
    Docs: https://theforeman.org
    Process: 20670 ExecStop=/usr/bin/bundle exec script/dynflowd stop (code=exited, status=0/SUCCESS)
    Process: 20731 ExecStart=/usr/bin/bundle exec script/dynflowd start (code=exited, status=0/SUCCESS)
    Tasks: 11 (limit: 4915)
    CGroup: /system.slice/dynflowd.service
    ├─20789 dynflow_executor
    └─20792 dynflow_executor_monitor

Apr 25 13:07:53 puppet systemd1: Starting Foreman jobs daemon...
Apr 25 13:07:56 puppet bundle20731: Dynflow Executor: start in progress
Apr 25 13:07:57 puppet bundle20731: /usr/share/foreman/vendor/ruby/2.3.0/gems/daemons-1.3.1/lib/daemons/daemonize.rb:75: warning: conflicting…chdir block
Apr 25 13:07:57 puppet bundle20731: /usr/share/foreman/vendor/ruby/2.3.0/gems/daemons-1.3.1/lib/daemons/daemonize.rb:108: warning: conflictin…chdir block
Apr 25 13:07:57 puppet bundle20731: dynflow_executor: process with pid 20789 started.
Apr 25 13:07:57 puppet systemd1: Started Foreman jobs daemon.
Hint: Some lines were ellipsized, use -l to show in full.

Actions #2

Updated by Elias Pereira over 5 years ago

proxy.log 26 apr, 09:11

> proxy.log <
2019-04-26T09:06:28 [D] accept: 200.132.218.151:44922
2019-04-26T09:06:28 [D] Rack::Handler::WEBrick is invoked.
2019-04-26T09:06:28 [I] Execution plan cleaner removing 0 execution plans.
2019-04-26T09:06:28 35b61a3a [I] Started GET /tasks/count state=running
2019-04-26T09:06:28 35b61a3a [I] Finished GET /tasks/count with 200 (5.01 ms)
2019-04-26T09:06:28 [D] close: 200.132.218.151:44922
2019-04-26T09:06:29 [D] accept: 200.132.218.151:44924
2019-04-26T09:06:29 [D] Rack::Handler::WEBrick is invoked.
2019-04-26T09:06:29 66c67360 [I] Started POST /tasks/
2019-04-26T09:06:29 66c67360 [D] ExecutionPlan 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d pending >> planning
2019-04-26T09:06:29 66c67360 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 1 pending >> running in phase Plan ForemanRemoteExecutionCore::Actions::RunScript
2019-04-26T09:06:29 66c67360 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 4 pending >> running in phase Plan SmartProxyDynflowCore::Callback::Action
2019-04-26T09:06:29 66c67360 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 4 running >> success in phase Plan SmartProxyDynflowCore::Callback::Action
2019-04-26T09:06:29 66c67360 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 1 running >> success in phase Plan ForemanRemoteExecutionCore::Actions::RunScript
2019-04-26T09:06:29 66c67360 [D] ExecutionPlan 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d planning >> planned
2019-04-26T09:06:29 66c67360 [I] Finished POST /tasks/ with 200 (46.59 ms)
2019-04-26T09:06:29 [D] ExecutionPlan 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d planned >> running
2019-04-26T09:06:29 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 2 pending >> running in phase Run ForemanRemoteExecutionCore::Actions::RunScript
2019-04-26T09:06:29 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 2 running >> suspended in phase Run ForemanRemoteExecutionCore::Actions::RunScript
2019-04-26T09:06:29 [D] start runner ec84f157-126b-4165-a94e-3b5cf3ef384e
2019-04-26T09:06:29 [D] setting timeout for ec84f157-126b-4165-a94e-3b5cf3ef384e to 2019-04-26 09:06:39 -0300
2019-04-26T09:06:29 [D] opening session to
2019-04-26T09:06:29 [D] close: 200.132.218.151:44924
2019-04-26T09:06:29 [D] Sending data to /var/tmp/foreman-ssh-cmd-ec84f157-126b-4165-a94e-3b5cf3ef384e/script on remote host:
cat /etc/network/interfaces
2019-04-26T09:06:29 [D] executing script: | sh <<WRAPPER | (/var/tmp/foreman-ssh-cmd-ec84f157-126b-4165-a94e-3b5cf3ef384e/script < /dev/null; echo \$?>/var/tmp/foreman-ssh-cmd-ec84f157-126b-4165-a94e-3b5cf3ef384e/exit_code) | /usr/bin/tee /var/tmp/foreman-ssh-cmd-ec84f157-126b-4165-a94e-3b5cf3ef384e/output | exit \$(cat /var/tmp/foreman-ssh-cmd-ec84f157-126b-4165-a94e-3b5cf3ef384e/exit_code) | WRAPPER

2019-04-26T09:06:29 [D] refresh runner ec84f157-126b-4165-a94e-3b5cf3ef384e
2019-04-26T09:06:29 [D] refreshing runner
2019-04-26T09:06:29 [D] planning to refresh ec84f157-126b-4165-a94e-3b5cf3ef384e
2019-04-26T09:06:30 [D] Ticker ticking for 1 events
2019-04-26T09:06:30 [D] refresh runner ec84f157-126b-4165-a94e-3b5cf3ef384e
2019-04-26T09:06:30 [D] refreshing runner
2019-04-26T09:06:30 [D] finish runner ec84f157-126b-4165-a94e-3b5cf3ef384e
2019-04-26T09:06:30 [D] closing session for command [ec84f157-126b-4165-a94e-3b5cf3ef384e],0 actors left
2019-04-26T09:06:30 [D] terminate ec84f157-126b-4165-a94e-3b5cf3ef384e
2019-04-26T09:06:30 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 2 got event #<ForemanTasksCore::Runner::Update:0x005617ec2567a0>
2019-04-26T09:06:30 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 2 suspended >> running in phase Run ForemanRemoteExecutionCore::Actions::RunScript
2019-04-26T09:06:30 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 2 running >> success in phase Run ForemanRemoteExecutionCore::Actions::RunScript
2019-04-26T09:06:30 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 5 pending >> running in phase Run SmartProxyDynflowCore::Callback::Action
2019-04-26T09:06:30 [W] Error details
RestClient::SSLCertificateNotVerified: SSL_connect returned=1 errno=0 state=error: certificate verify failed
/usr/lib/ruby/vendor_ruby/restclient/request.rb:445:in `rescue in transmit'
/usr/lib/ruby/vendor_ruby/restclient/request.rb:350:in `transmit'
/usr/lib/ruby/vendor_ruby/restclient/request.rb:176:in `execute'
/usr/lib/ruby/vendor_ruby/restclient/request.rb:41:in `execute'
/usr/lib/ruby/vendor_ruby/restclient/resource.rb:67:in `post'
/usr/lib/ruby/vendor_ruby/smart_proxy_dynflow_core/callback.rb:51:in `callback'
/usr/lib/ruby/vendor_ruby/smart_proxy_dynflow_core/callback.rb:15:in `send_to_foreman_tasks'
/usr/lib/ruby/vendor_ruby/smart_proxy_dynflow_core/callback.rb:72:in `run'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:539:in `block (3 levels) in execute_run'
/usr/lib/ruby/vendor_ruby/dynflow/middleware/stack.rb:26:in `pass'
/usr/lib/ruby/vendor_ruby/dynflow/middleware.rb:18:in `pass'
/usr/lib/ruby/vendor_ruby/dynflow/action/progress.rb:30:in `with_progress_calculation'
/usr/lib/ruby/vendor_ruby/dynflow/action/progress.rb:16:in `run'
/usr/lib/ruby/vendor_ruby/dynflow/middleware/stack.rb:22:in `call'
/usr/lib/ruby/vendor_ruby/dynflow/middleware/stack.rb:26:in `pass'
/usr/lib/ruby/vendor_ruby/dynflow/middleware.rb:18:in `pass'
/usr/lib/ruby/vendor_ruby/dynflow/middleware.rb:31:in `run'
/usr/lib/ruby/vendor_ruby/dynflow/middleware/stack.rb:22:in `call'
/usr/lib/ruby/vendor_ruby/dynflow/middleware/world.rb:30:in `execute'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:538:in `block (2 levels) in execute_run'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:537:in `catch'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:537:in `block in execute_run'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:451:in `block in with_error_handling'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:451:in `catch'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:451:in `with_error_handling'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:532:in `execute_run'
/usr/lib/ruby/vendor_ruby/dynflow/action.rb:278:in `execute'
/usr/lib/ruby/vendor_ruby/dynflow/execution_plan/steps/abstract_flow_step.rb:17:in `block (2 levels) in execute'
/usr/lib/ruby/vendor_ruby/dynflow/execution_plan/steps/abstract.rb:162:in `with_meta_calculation'
/usr/lib/ruby/vendor_ruby/dynflow/execution_plan/steps/abstract_flow_step.rb:16:in `block in execute'
/usr/lib/ruby/vendor_ruby/dynflow/execution_plan/steps/abstract_flow_step.rb:30:in `open_action'
/usr/lib/ruby/vendor_ruby/dynflow/execution_plan/steps/abstract_flow_step.rb:15:in `execute'
/usr/lib/ruby/vendor_ruby/dynflow/director.rb:43:in `execute'
/usr/lib/ruby/vendor_ruby/dynflow/executors/parallel/worker.rb:14:in `block in on_message'
/usr/lib/ruby/vendor_ruby/dynflow/executors.rb:12:in `run_user_code'
/usr/lib/ruby/vendor_ruby/dynflow/executors/parallel/worker.rb:13:in `on_message'
/usr/lib/ruby/vendor_ruby/concurrent/actor/context.rb:46:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/executes_context.rb:7:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/dynflow/actor.rb:26:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/awaits.rb:15:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/buffer.rb:38:in `process_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/buffer.rb:31:in `process_envelopes?'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/buffer.rb:20:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/termination.rb:55:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/removes_child.rb:10:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/abstract.rb:25:in `pass'
/usr/lib/ruby/vendor_ruby/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/core.rb:161:in `process_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/core.rb:95:in `block in on_envelope'
/usr/lib/ruby/vendor_ruby/concurrent/actor/core.rb:118:in `block (2 levels) in schedule_execution'
/usr/lib/ruby/vendor_ruby/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'
/usr/lib/ruby/vendor_ruby/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/usr/lib/ruby/vendor_ruby/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
/usr/lib/ruby/vendor_ruby/concurrent/actor/core.rb:115:in `block in schedule_execution'
/usr/lib/ruby/vendor_ruby/concurrent/executor/serialized_execution.rb:18:in `call'
/usr/lib/ruby/vendor_ruby/concurrent/executor/serialized_execution.rb:96:in `work'
/usr/lib/ruby/vendor_ruby/concurrent/executor/serialized_execution.rb:77:in `block in call_job'
/usr/lib/ruby/vendor_ruby/concurrent/executor/ruby_thread_pool_executor.rb:333:in `run_task'
/usr/lib/ruby/vendor_ruby/concurrent/executor/ruby_thread_pool_executor.rb:322:in `block (3 levels) in create_worker'
/usr/lib/ruby/vendor_ruby/concurrent/executor/ruby_thread_pool_executor.rb:305:in `loop'
/usr/lib/ruby/vendor_ruby/concurrent/executor/ruby_thread_pool_executor.rb:305:in `block (2 levels) in create_worker'
/usr/lib/ruby/vendor_ruby/concurrent/executor/ruby_thread_pool_executor.rb:304:in `catch'
/usr/lib/ruby/vendor_ruby/concurrent/executor/ruby_thread_pool_executor.rb:304:in `block in create_worker'
/usr/lib/ruby/vendor_ruby/logging/diagnostic_context.rb:448:in `block in create_with_logging_context'
2019-04-26T09:06:30 [E] SSL_connect returned=1 errno=0 state=error: certificate verify failed (RestClient::SSLCertificateNotVerified)
2019-04-26T09:06:30 [D] Step 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d: 5 running >> error in phase Run SmartProxyDynflowCore::Callback::Action
2019-04-26T09:06:30 [D] ExecutionPlan 8ef71a10-aee9-4371-b5e3-e6a6ba961c3d running >> paused
2019-04-26T09:06:40 [D] accept: 200.132.218.151:44930
2019-04-26T09:06:40 [D] Rack::Handler::WEBrick is invoked.
2019-04-26T09:06:40 2c05366e [I] Started GET /tasks/8ef71a10-aee9-4371-b5e3-e6a6ba961c3d/status
2019-04-26T09:06:40 2c05366e [I] Finished GET /tasks/8ef71a10-aee9-4371-b5e3-e6a6ba961c3d/status with 200 (5.13 ms)
2019-04-26T09:06:40 [D] close: 200.132.218.151:44930
2019-04-26T09:06:40 [D] accept: 200.132.218.151:44932
2019-04-26T09:06:40 [D] Rack::Handler::WEBrick is invoked.
2019-04-26T09:06:40 2c05366e [I] Started GET /tasks/8ef71a10-aee9-4371-b5e3-e6a6ba961c3d/status
2019-04-26T09:06:40 2c05366e [I] Finished GET /tasks/8ef71a10-aee9-4371-b5e3-e6a6ba961c3d/status with 200 (4.46 ms)
2019-04-26T09:06:40 [D] close: 200.132.218.151:44932
2019-04-26T09:06:43 [D] Executor heartbeat

Actions #3

Updated by Adam Ruzicka over 5 years ago

When the job finishes, smart proxy tries to call back to foreman. Apparently this fails because of a SSL related error

2019-04-26T09:06:30 [E] SSL_connect returned=1 errno=0 state=error: certificate verify failed (RestClient::SSLCertificateNotVerified)

Check that you have ssl_ca_file, ssl_certificate and ssl_private_key (and possibly foreman_ssl_ca, foreman_ssl_cert and foreman_ssl_key) configured correctly in /etc/foreman-proxy/settings.yml

Actions #4

Updated by Elias Pereira over 5 years ago

Adam Ruzicka wrote:

When the job finishes, smart proxy tries to call back to foreman. Apparently this fails because of a SSL related error
[...]

Check that you have ssl_ca_file, ssl_certificate and ssl_private_key (and possibly foreman_ssl_ca, foreman_ssl_cert and foreman_ssl_key) configured correctly in /etc/foreman-proxy/settings.yml

Thanks for the feedback.

I was able to resolve it by upgrading the version and running the foreman-installer again.

I had configured https with self-signed certificate, but now it came back with the old certificate. Via tutorial: https://theforeman.org/2015/11/foreman-ssl.html informs that I can automate through "hiera" data, as in the example shown:

foreman::ssl: true
puppet::server_foreman_ssl_ca: '/etc/pki/tls/certs/cachain.crt'
puppet::server_foreman_url: 'https://puppet.example.com'
foreman::server_ssl_key: '/etc/pki/tls/private/puppet.example.com.key'
foreman::server_ssl_cert: '/etc/pki/tls/certs/puppet.example.com.crt'
foreman::server_ssl_chain: '/etc/pki/tls/certs/cachain.crt'
foreman::servername: 'puppet.example.com'
foreman::foreman_url: 'https://puppet.example.com'
foreman::websockets_ssl_key: '/etc/pki/tls/private/puppet.example.key'
foreman::websockets_ssl_cert: '/etc/pki/tls/certs/puppet.example.crt'

How can I do this?

Actions #5

Updated by Adam Ruzicka over 5 years ago

I was able to resolve it by upgrading the version and running the foreman-installer again.

My bet would be that rerunning the installer fixed it. Especially if it overwrote your custom certs.

How can I do this?

Writing that into /etc/foreman-installer/custom-hiera.yaml should be enough to make the installer pick it up.

Actions #6

Updated by Elias Pereira over 5 years ago

Adam Ruzicka wrote:

I was able to resolve it by upgrading the version and running the foreman-installer again.

My bet would be that rerunning the installer fixed it. Especially if it overwrote your custom certs.

How can I do this?

Writing that into /etc/foreman-installer/custom-hiera.yaml should be enough to make the installer pick it up.

But will I have to run the foreman-installer again? If so, is there another way to automate it through a module?

Actions #7

Updated by Elias Pereira over 5 years ago

Adam Ruzicka wrote:

I was able to resolve it by upgrading the version and running the foreman-installer again.

My bet would be that rerunning the installer fixed it. Especially if it overwrote your custom certs.

How can I do this?

Writing that into /etc/foreman-installer/custom-hiera.yaml should be enough to make the installer pick it up.

I made the changes to the file and ran the foreman-installer again. Https worked correctly, but remote execution had problems again.

What would be the corelation?

Actions #8

Updated by Adam Ruzicka over 5 years ago

As I mentioned previously, smart proxy has a problem with validating foreman's certificate. I assume it used to work, then it stopped working when you replaced the certificates with custom ones, then it started working again when you run the installer and it replaced your custom certificates with the original ones. And lastly when you configured the installer to deploy your custom certs, it broke again.

Actions #9

Updated by Elias Pereira over 5 years ago

Adam Ruzicka wrote:

As I mentioned previously, smart proxy has a problem with validating foreman's certificate. I assume it used to work, then it stopped working when you replaced the certificates with custom ones, then it started working again when you run the installer and it replaced your custom certificates with the original ones. And lastly when you configured the installer to deploy your custom certs, it broke again.

Hello @Adam

Any news about the problem (smart proxy has a problem with validating foreman's certificate) that you mencioned in your last answer?

Actions

Also available in: Atom PDF