Feature #26428

Ability to restart the machine while the remote execution job is still acting as running - backend

Added by Ivan Necas over 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Target version: -
Difficulty:
Triaged: No
Bugzilla link:

Description

Some remote execution (rex) scenarios require the managed host to disconnect from the network and continue working offline. Although `async` mode is supported, the job is still marked as failed when the managed host goes offline.

With this feature, the remote host should be able to take control of the job status, so that the job stays marked as running even when the host goes down temporarily as part of executing the job.

Example of such a template:

$CONTROL_SCRIPT manual-mode
cat <<HELP | $CONTROL_SCRIPT update >/dev/null 
The script has switched to manual-mode. It will be acting
as running after this script finishes.

The control script is available in the \$CONTROL_SCRIPT
env variable.

To send output data to the job, one can do something like this:

    echo Hello world | $CONTROL_SCRIPT update

To mark the script as finished, one can do

   $CONTROL_SCRIPT finish 0

where the second argument should be the exit code the
script ended with.
HELP

After running this, one should be able to go to the remote host and reboot it, and the job should keep running until `$CONTROL_SCRIPT finish 0` is called. Additional output can be sent to the job with `echo Hello world | $CONTROL_SCRIPT update`.
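The flow described above can be sketched as a small script body. This is only an illustration, not the shipped template: the helper names `report` and `run_and_finish` are hypothetical, and because `$CONTROL_SCRIPT` is only set inside a real job, the sketch falls back to a made-up stub that logs calls to a file. How the work is resumed after a reboot (cron, a systemd unit, and so on) is left out.

```shell
#!/bin/sh
# Outside a real job $CONTROL_SCRIPT is unset, so create a hypothetical stub
# that records each call in a log file instead of talking to the proxy.
if [ -z "${CONTROL_SCRIPT:-}" ]; then
  STUB_DIR=$(mktemp -d)
  CONTROL_LOG="$STUB_DIR/calls.log"
  CONTROL_SCRIPT="$STUB_DIR/control.sh"
  cat > "$CONTROL_SCRIPT" <<EOF
#!/bin/sh
# Stub: log the subcommand and arguments; for "update", log stdin as well.
echo "\$*" >> "$CONTROL_LOG"
[ "\$1" = update ] && cat >> "$CONTROL_LOG"
exit 0
EOF
  chmod +x "$CONTROL_SCRIPT"
fi

# Send one line of output to the job (hypothetical helper).
report() {
  echo "$1" | "$CONTROL_SCRIPT" update >/dev/null
}

# Run a command, report its result, then mark the job as finished
# with that command's exit code (hypothetical helper).
run_and_finish() {
  "$@"
  rc=$?
  report "command '$*' exited with $rc"
  "$CONTROL_SCRIPT" finish "$rc"
}

# Take over the job status: the job keeps running after this script exits.
"$CONTROL_SCRIPT" manual-mode
report "going offline; job stays running"

# ... the host may reboot or disconnect here; a later process resumes ...
run_and_finish true
```

Forwarding the real exit code to `finish` keeps the job result meaningful; a hard-coded `finish 0` would report success even when the offline work failed.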

Additional note:

Satellite needs to be installed with `--foreman-proxy-plugin-remote-execution-ssh-async-ssh=true` for this to work.


Related issues

Blocked by foreman-tasks - Feature #26691: Allow sending partial updates via smart_proxy_dynflow (Closed)
Blocked by foreman-tasks - Feature #26692: Allow sending partial updates from smart_proxy_dynflow to runner (Closed)

Associated revisions

Revision cc78145f (diff)
Added by Ivan Necas about 1 year ago

Fixes #26428 - exposed $CONTROL_SCRIPT and manual-mode (#413)

  • Fixes #26428 - exposed $CONTROL_SCRIPT and manual-mode

This patch adds the ability to take control over the job, allowing things
like surviving restarts and other advanced use cases. The main commands
allowing this are:

- `$CONTROL_SCRIPT manual-mode` : turn on manual mode
- `echo hello | $CONTROL_SCRIPT update` : send output to the job
- `$CONTROL_SCRIPT finish $EXIT_CODE` : mark the job as finished
with corresponding $EXIT_CODE

Currently, this only works when the proxy rex plugin is configured with
`:async: true`.

It also involves refactoring to make the majority of the control scripts
static files and to dynamically generate only the `env.sh` script, which
contains only dynamic data and which other scripts can source.

  • Refs #26428 - refresh in non-manual mode and fix exit code
  • Refs #26428 - more resiliency on failed curl

Make sure to move on with output only when the curl was successful.

  • Refs #26428 - Make exit_code the first key in payload

This prevents it from overriding other fields.

  • Refs #26428 - introduce retrieve script timeouts
  • Refs #26428 - prevent loss of output
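
The rule from the revision above, "move on with output only when the curl was successful", can be pictured with a minimal sketch. This is not the code from the patch: `send_chunk` is a hypothetical stand-in for the real curl call, and the file names and the `FAIL_SEND` switch are assumptions made for the illustration. The point is only that the stored position advances after a successful send, so a failed send is retried with the same data and no output is lost.

```shell
#!/bin/sh
WORK=$(mktemp -d)
printf 'line1\nline2\n' > "$WORK/output"   # output produced by the job so far
echo 0 > "$WORK/position"                  # number of bytes already delivered

# Hypothetical sender standing in for the curl call.
# It consumes stdin and fails while FAIL_SEND is set to 1.
send_chunk() {
  cat > "$WORK/chunk"
  [ "${FAIL_SEND:-0}" = 0 ] || return 1
  cat "$WORK/chunk" >> "$WORK/received"
}

# Send everything after the stored position; advance the position
# only when the send succeeded.
flush_output() {
  pos=$(cat "$WORK/position")
  size=$(wc -c < "$WORK/output" | tr -d ' ')
  [ "$size" -gt "$pos" ] || return 0       # nothing new to send
  if tail -c "+$((pos + 1))" "$WORK/output" | head -c "$((size - pos))" | send_chunk; then
    echo "$size" > "$WORK/position"
  else
    return 1                               # keep the old position, retry later
  fi
}

FAIL_SEND=1
flush_output || echo "send failed, position stays at $(cat "$WORK/position")"
FAIL_SEND=0
flush_output && echo "send succeeded, position is now $(cat "$WORK/position")"
```

The byte count is captured before sending and the chunk is capped with `head -c`, so output appended while the send is in flight is picked up by the next call rather than being skipped.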

History

#1 Updated by Ivan Necas over 1 year ago

  • Description updated (diff)

#2 Updated by Bryan Kearney over 1 year ago

  • Bugzilla link set to 1691453

#3 Updated by Ivan Necas over 1 year ago

  • Blocked by Feature #26691: Allow sending partial updates via smart_proxy_dynflow added

#4 Updated by Ivan Necas over 1 year ago

  • Blocked by Feature #26692: Allow sending partial updates from smart_proxy_dynflow to runner added

#5 Updated by The Foreman Bot over 1 year ago

  • Assignee set to Ivan Necas
  • Status changed from New to Ready For Testing
  • Pull request https://github.com/theforeman/foreman_remote_execution/pull/413 added

#6 Updated by The Foreman Bot about 1 year ago

  • Fixed in Releases foreman_remote_execution 1.8.5 added

#7 Updated by Ivan Necas about 1 year ago

  • Status changed from Ready For Testing to Closed

#8 Updated by Adam Ruzicka about 1 year ago

  • Fixed in Releases foreman_remote_execution_core 1.3.0 added
  • Fixed in Releases deleted (foreman_remote_execution 1.8.5)
