Support #14993
closedSSH provisioning always results in a non-zero exit code
Description
I'm testing the latest version (v1.11.1), and in my tests, I came across an issue where our Finish script always results in this error:
2016-05-06T18:24:42 [app] [W] Failed to launch script on test-new-foreman-01.ta.local: ERF42-0453 [Foreman::Exception]: Provision script had a non zero exit | Foreman::Exception: ERF42-0453 [Foreman::Exception]: Provision script had a non zero exit | /usr/share/foreman/app/models/concerns/orchestration/ssh_provision.rb:89:in `setSSHProvision'
Hopefully, this is happening because of something silly. As a result, this automatically deletes the newly created host. This required me to set clean_up_failed_deployment to false. Then I could dig into debugging.
As I understand it, this happens because the code checks the exit code of every command run during the SSH session. Therefore, when 1 of those commands returns an exit code that IS NOT 0, then it will throw an error and keep the host in a Pending Installation state (instead of being "successfully built").
Here's the relevant code (link):
def deploy! logger.debug "about to upload #{template} to remote system at #{remote_script}" scp.upload(template, remote_script) logger.debug "about to execute #{command}" @results = ssh.run(command) log_stdout log_stderr success? end private def success? return true if results.empty? results.map(&:status).compact == [0] end def log_stdout results.each do |r| r.stdout.split("\n").each { |l| logger.debug l } end end def log_stderr results.each do |r| r.stderr.split("\n").each { |l| logger.warn l } end end
Here's the content of the Finish template:
set -ex <%# kind: finish name: Preseed default finish oses: - Debian 6.0 - Debian 7. - Debian 8. - Ubuntu 10.04 - Ubuntu 12.04 - Ubuntu 13.04 - Ubuntu 14.04 -%> <% # safemode renderer does not support unary negation pm_set = @host.puppetmaster.empty? ? false : true puppet_enabled = pm_set || @host.params['force-puppet'] && @host.params['force-puppet'] == 'true' -%> /bin/hostname <%= @host.shortname %> echo "== INFO == Setting hostname resulted in EXIT CODE: $?" /bin/echo <%= @host.shortname %> > /etc/hostname echo "== INFO == Writing to /etc/hostname resulted in EXIT CODE: $?" /bin/sed -i "/127.0.1.1/d" /etc/hosts echo "== INFO == Removing entry from hosts file resulted in EXIT CODE: $?" /bin/echo -e "\n<%= @host.ip %> <%= @host.shortname %>.<%= @host.domain %> <%= @host.shortname %>" >> /etc/hosts echo "== INFO == Adding entry to hosts file resulted in EXIT CODE: $?" if [ -d "/var/lib/puppet/ssl" ]; then # clear out any previous ssl certs find /var/lib/puppet/ssl -type f -print | xargs rm -v echo "== INFO == Removing puppet ssl certs resulted in EXIT CODE: $?" fi <% subnet = @host.subnet -%> <% if subnet.respond_to?(:dhcp_boot_mode?) -%> <% dhcp = subnet.dhcp_boot_mode? && !@static -%> <% else -%> <% dhcp = !@static -%> <% end -%> <% unless dhcp -%> echo "== INFO == DHCP isn't enabled" # host and domain name need setting as these values may have come from dhcp if pxe booting /bin/sed -i "s/^search.*$/search <%= @host.domain %>/g" /etc/resolv.conf echo "== INFO == Updating search in resolv.conf resulted in EXIT CODE: $?" /bin/sed -i "s/.*dns-search.*/\tdns-search <%= @host.domain %>/g" /etc/network/interfaces echo "== INFO == Updating search in /etc/network/interfaces resulted in EXIT CODE: $?" /bin/sed -i "s/^<%= @host.ip %>.*/<%= @host.ip %>\t<%= @host.shortname %>.<%= @host.domain %>\t<%= @host.shortname %>/g" /etc/hosts echo "== INFO == Adding entry to hosts file resulted in EXIT CODE: $?" /bin/echo <%= @host.shortname %> > /etc/hostname echo "== INFO == Setting hostname resulted in EXIT CODE: $?" <% end -%> <% if puppet_enabled %> cat > /etc/puppet/puppet.conf << EOF <%= snippet 'puppet.conf' %> EOF echo "== INFO == Writing puppet.conf resulted in EXIT CODE: $?" if [ -f "/etc/default/puppet" ]; then /bin/sed -i 's/^START=no/START=yes/' /etc/default/puppet echo "== INFO == Making puppet run at startup resulted in EXIT CODE: $?" fi /bin/touch /etc/puppet/namespaceauth.conf echo "== INFO == Creating /etc/puppet/namespaceauth.conf resulted in EXIT CODE: $?" /usr/bin/puppet agent --enable echo "== INFO == Enabling puppet agent resulted in EXIT CODE: $?" /usr/bin/puppet agent --config /etc/puppet/puppet.conf --onetime --tags no_such_tag <%= @host.puppetmaster.blank? ? '' : "--server #{@host.puppetmaster}" %> --no-daemonize echo "== INFO == Syncing puppet client with master resulted in EXIT CODE: $?" /usr/bin/puppet agent echo "== INFO == Daemonizing puppet agent resulted in EXIT CODE: $?" <% end -%> /usr/bin/wget --no-proxy --quiet --output-document=/dev/null --no-check-certificate <%= foreman_url %> echo "== INFO == Making wget request to complete the build resulted in EXIT CODE: $?"
As you can see, I ended up adding an echo statement for each command, so the exit code of each command would be sent to the log file on the host. Here's the content of the log file:
+ /bin/hostname test-ubuntu-on-new-foreman-01 + echo == INFO == Setting hostname resulted in EXIT CODE: 0 == INFO == Setting hostname resulted in EXIT CODE: 0 + /bin/echo test-ubuntu-on-new-foreman-01 + echo == INFO == Writing to /etc/hostname resulted in EXIT CODE: 0 == INFO == Writing to /etc/hostname resulted in EXIT CODE: 0 + /bin/sed -i /127.0.1.1/d /etc/hosts + echo == INFO == Removing entry from hosts file resulted in EXIT CODE: 0 == INFO == Removing entry from hosts file resulted in EXIT CODE: 0 + /bin/echo -e \n10.0.24.251 test-ubuntu-on-new-foreman-01.ta.local test-ubuntu-on-new-foreman-01 + echo == INFO == Adding entry to hosts file resulted in EXIT CODE: 0 == INFO == Adding entry to hosts file resulted in EXIT CODE: 0 + [ -d /var/lib/puppet/ssl ] + cat + echo == INFO == Writing puppet.conf resulted in EXIT CODE: 0 == INFO == Writing puppet.conf resulted in EXIT CODE: 0 + [ -f /etc/default/puppet ] + /bin/sed -i s/^START=no/START=yes/ /etc/default/puppet + echo == INFO == Making puppet run at startup resulted in EXIT CODE: 0 == INFO == Making puppet run at startup resulted in EXIT CODE: 0 + /bin/touch /etc/puppet/namespaceauth.conf + echo == INFO == Creating /etc/puppet/namespaceauth.conf resulted in EXIT CODE: 0 == INFO == Creating /etc/puppet/namespaceauth.conf resulted in EXIT CODE: 0 + /usr/bin/puppet agent --enable + echo == INFO == Enabling puppet agent resulted in EXIT CODE: 0 == INFO == Enabling puppet agent resulted in EXIT CODE: 0 + /usr/bin/puppet agent --config /etc/puppet/puppet.conf --onetime --tags no_such_tag --server foreman-test-01.ta.local --no-daemonize + echo == INFO == Syncing puppet client with master resulted in EXIT CODE: 0 == INFO == Syncing puppet client with master resulted in EXIT CODE: 0 + /usr/bin/puppet agent + echo == INFO == Daemonizing puppet agent resulted in EXIT CODE: 0 == INFO == Daemonizing puppet agent resulted in EXIT CODE: 0 + /usr/bin/wget --no-proxy --quiet --output-document=/dev/null --no-check-certificate http://foreman-test-01.ta.local/unattended/provision?token=bf970197-a74f-436b-8991-21ca19902667 + echo == INFO == Making wget request to complete the build resulted in EXIT CODE: 0 == INFO == Making wget request to complete the build resulted in EXIT CODE: 0
According to the script, all commands return a 0 exit code. So, I'm not sure what the problem is here.
Thanks for any help you can provide. Please let me know if there's any more info I can provide to help.
Notes:- The log level is set to debug
- Our environment uses VMware vSphere images
- The Operating System only has code in a Finish script (documentation)
- The pxe/provision scripts only contain a comment
- Tarball from foreman-debug: http://debugs.theforeman.org/foreman-debug-ZiRQJ.tar.xz
- Should the log_stderr and log_stdout methods result in log entries appearing in the production.log? If so, I'm not seeing any. If they're going somewhere else, then where?
Updated by Dominic Cleal over 8 years ago
- Status changed from New to Feedback
This may be #14957, which will happen on Debian/Ubuntu systems where sh is dash. Try applying the patch linked from that ticket, or reverting the original change.
Updated by Dan Dunckel over 8 years ago
I actually updated to 1.11.2 this morning, which contained #14957. After upgrading, I provisioned the same VM and voila! No issues.
Thanks!
Updated by Dominic Cleal over 8 years ago
- Status changed from Feedback to Resolved
Thanks for confirming!