Project

General

Profile

Actions

Support #14993

closed

SSH provisioning always results in a non-zero exit code

Added by Dan Dunckel almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Orchestration
Target version:
-
Triaged:
Fixed in Releases:
Found in Releases:

Description

I'm testing the latest version (v1.11.1), and in my tests, I came across an issue where our Finish script always results in this error:

2016-05-06T18:24:42 [app] [W] Failed to launch script on test-new-foreman-01.ta.local: ERF42-0453 [Foreman::Exception]: Provision script had a non zero exit
 | Foreman::Exception: ERF42-0453 [Foreman::Exception]: Provision script had a non zero exit                                                                
 | /usr/share/foreman/app/models/concerns/orchestration/ssh_provision.rb:89:in `setSSHProvision'

Hopefully, this is happening because of something silly. As a result, this automatically deletes the newly created host. This required me to set clean_up_failed_deployment to false. Then I could dig into debugging.

As I understand it, this happens because the code checks the exit code of every command run during the SSH session. Therefore, when 1 of those commands returns an exit code that IS NOT 0, then it will throw an error and keep the host in a Pending Installation state (instead of being "successfully built").

Here's the relevant code (link):

def deploy!
  logger.debug "about to upload #{template} to remote system at #{remote_script}" 
  scp.upload(template, remote_script)
  logger.debug "about to execute #{command}" 
  @results = ssh.run(command)
  log_stdout
  log_stderr
  success?
end

private

def success?
  return true if results.empty?
  results.map(&:status).compact == [0]
end

def log_stdout
  results.each do |r|
    r.stdout.split("\n").each { |l| logger.debug l }
  end
end

def log_stderr
  results.each do |r|
    r.stderr.split("\n").each { |l| logger.warn l }
  end
end

Here's the content of the Finish template:

set -ex

<%#
kind: finish
name: Preseed default finish
oses:
- Debian 6.0
- Debian 7.
- Debian 8.
- Ubuntu 10.04
- Ubuntu 12.04
- Ubuntu 13.04
- Ubuntu 14.04
-%>
<%
  # safemode renderer does not support unary negation
  pm_set = @host.puppetmaster.empty? ? false : true
  puppet_enabled = pm_set || @host.params['force-puppet'] && @host.params['force-puppet'] == 'true'
-%>
/bin/hostname <%= @host.shortname %>
echo "== INFO == Setting hostname resulted in EXIT CODE: $?" 

/bin/echo <%= @host.shortname %> > /etc/hostname
echo "== INFO == Writing to /etc/hostname resulted in EXIT CODE: $?" 

/bin/sed -i "/127.0.1.1/d" /etc/hosts
echo "== INFO == Removing entry from hosts file resulted in EXIT CODE: $?" 

/bin/echo -e "\n<%= @host.ip %>  <%= @host.shortname %>.<%= @host.domain %>  <%= @host.shortname %>" >> /etc/hosts
echo "== INFO == Adding entry to hosts file resulted in EXIT CODE: $?" 

if [ -d "/var/lib/puppet/ssl" ]; then
  # clear out any previous ssl certs
  find /var/lib/puppet/ssl -type f -print | xargs rm -v
  echo "== INFO == Removing puppet ssl certs resulted in EXIT CODE: $?" 
fi

<% subnet = @host.subnet -%>
<% if subnet.respond_to?(:dhcp_boot_mode?) -%>
  <% dhcp = subnet.dhcp_boot_mode? && !@static -%>
<% else -%>
  <% dhcp = !@static -%>
<% end -%>
<% unless dhcp -%>
echo "== INFO == DHCP isn't enabled" 
# host and domain name need setting as these values may have come from dhcp if pxe booting
/bin/sed -i "s/^search.*$/search <%= @host.domain %>/g" /etc/resolv.conf
echo "== INFO == Updating search in resolv.conf resulted in EXIT CODE: $?" 

/bin/sed -i "s/.*dns-search.*/\tdns-search <%= @host.domain %>/g" /etc/network/interfaces
echo "== INFO == Updating search in /etc/network/interfaces resulted in EXIT CODE: $?" 

/bin/sed -i "s/^<%= @host.ip %>.*/<%= @host.ip %>\t<%= @host.shortname %>.<%= @host.domain %>\t<%= @host.shortname %>/g" /etc/hosts
echo "== INFO == Adding entry to hosts file resulted in EXIT CODE: $?" 

/bin/echo <%= @host.shortname %> > /etc/hostname
echo "== INFO == Setting hostname resulted in EXIT CODE: $?" 
<% end -%>
<% if puppet_enabled %>
cat > /etc/puppet/puppet.conf << EOF
<%= snippet 'puppet.conf' %>
EOF
echo "== INFO == Writing puppet.conf resulted in EXIT CODE: $?" 

if [ -f "/etc/default/puppet" ]; then
  /bin/sed -i 's/^START=no/START=yes/' /etc/default/puppet
  echo "== INFO == Making puppet run at startup resulted in EXIT CODE: $?" 
fi

/bin/touch /etc/puppet/namespaceauth.conf
echo "== INFO == Creating /etc/puppet/namespaceauth.conf resulted in EXIT CODE: $?" 

/usr/bin/puppet agent --enable
echo "== INFO == Enabling puppet agent resulted in EXIT CODE: $?" 

/usr/bin/puppet agent --config /etc/puppet/puppet.conf --onetime --tags no_such_tag <%= @host.puppetmaster.blank? ? '' : "--server #{@host.puppetmaster}" %> --no-daemonize
echo "== INFO == Syncing puppet client with master resulted in EXIT CODE: $?" 

/usr/bin/puppet agent
echo "== INFO == Daemonizing puppet agent resulted in EXIT CODE: $?" 
<% end -%>

/usr/bin/wget --no-proxy --quiet --output-document=/dev/null --no-check-certificate <%= foreman_url %>
echo "== INFO == Making wget request to complete the build resulted in EXIT CODE: $?" 

As you can see, I ended up adding an echo statement for each command, so the exit code of each command would be sent to the log file on the host. Here's the content of the log file:

+ /bin/hostname test-ubuntu-on-new-foreman-01
+ echo == INFO == Setting hostname resulted in EXIT CODE: 0
== INFO == Setting hostname resulted in EXIT CODE: 0
+ /bin/echo test-ubuntu-on-new-foreman-01
+ echo == INFO == Writing to /etc/hostname resulted in EXIT CODE: 0
== INFO == Writing to /etc/hostname resulted in EXIT CODE: 0
+ /bin/sed -i /127.0.1.1/d /etc/hosts
+ echo == INFO == Removing entry from hosts file resulted in EXIT CODE: 0
== INFO == Removing entry from hosts file resulted in EXIT CODE: 0
+ /bin/echo -e \n10.0.24.251  test-ubuntu-on-new-foreman-01.ta.local  test-ubuntu-on-new-foreman-01
+ echo == INFO == Adding entry to hosts file resulted in EXIT CODE: 0
== INFO == Adding entry to hosts file resulted in EXIT CODE: 0
+ [ -d /var/lib/puppet/ssl ]
+ cat
+ echo == INFO == Writing puppet.conf resulted in EXIT CODE: 0
== INFO == Writing puppet.conf resulted in EXIT CODE: 0
+ [ -f /etc/default/puppet ]
+ /bin/sed -i s/^START=no/START=yes/ /etc/default/puppet
+ echo == INFO == Making puppet run at startup resulted in EXIT CODE: 0
== INFO == Making puppet run at startup resulted in EXIT CODE: 0
+ /bin/touch /etc/puppet/namespaceauth.conf
+ echo == INFO == Creating /etc/puppet/namespaceauth.conf resulted in EXIT CODE: 0
== INFO == Creating /etc/puppet/namespaceauth.conf resulted in EXIT CODE: 0
+ /usr/bin/puppet agent --enable
+ echo == INFO == Enabling puppet agent resulted in EXIT CODE: 0
== INFO == Enabling puppet agent resulted in EXIT CODE: 0
+ /usr/bin/puppet agent --config /etc/puppet/puppet.conf --onetime --tags no_such_tag --server foreman-test-01.ta.local --no-daemonize
+ echo == INFO == Syncing puppet client with master resulted in EXIT CODE: 0
== INFO == Syncing puppet client with master resulted in EXIT CODE: 0
+ /usr/bin/puppet agent
+ echo == INFO == Daemonizing puppet agent resulted in EXIT CODE: 0
== INFO == Daemonizing puppet agent resulted in EXIT CODE: 0
+ /usr/bin/wget --no-proxy --quiet --output-document=/dev/null --no-check-certificate http://foreman-test-01.ta.local/unattended/provision?token=bf970197-a74f-436b-8991-21ca19902667
+ echo == INFO == Making wget request to complete the build resulted in EXIT CODE: 0
== INFO == Making wget request to complete the build resulted in EXIT CODE: 0

According to the script, all commands return a 0 exit code. So, I'm not sure what the problem is here.

Thanks for any help you can provide. Please let me know if there's any more info I can provide to help.

Notes: Questions:
  • Should the log_stderr and log_stdout methods result in log entries appearing in the production.log? If so, I'm not seeing any. If they're going somewhere else, then where?
Actions #1

Updated by Dominic Cleal almost 8 years ago

  • Status changed from New to Feedback

This may be #14957, which will happen on Debian/Ubuntu systems where sh is dash. Try applying the patch linked from that ticket, or reverting the original change.

Actions #2

Updated by Dan Dunckel almost 8 years ago

I actually updated to 1.11.2 this morning, which contained #14957. After upgrading, I provisioned the same VM and voila! No issues.

Thanks!

Actions #3

Updated by Dominic Cleal almost 8 years ago

  • Status changed from Feedback to Resolved

Thanks for confirming!

Actions

Also available in: Atom PDF