Ansible stops connecting to the host via ssh

karobar · Aug 3, 2018

Introduction

For over a month I've been running the following command:

ansible-playbook -vvvvi host_test rhel-tests.yml

It connected via SSH and ran the tests on a host without any problems. For the last couple of days, though, I've received the following when running it:

fatal: [10.2.16.2]: UNREACHABLE! => {
    "changed": false, 
    "unreachable": true
}

MSG:

Failed to connect to the host via ssh: OpenSSH_7.6p1, LibreSSL 2.6.2
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 48: Applying options for *
debug1: auto-mux: Trying existing master
debug2: fd 3 setting O_NONBLOCK
debug2: mux_client_hello_exchange: master version 4
debug3: mux_client_forwards: request forwardings: 0 local, 0 remote
debug3: mux_client_request_session: entering
debug3: mux_client_request_alive: entering
debug3: mux_client_request_alive: done pid = 35742
debug3: mux_client_request_session: session request sent
debug1: mux_client_request_session: master session id: 2
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Control master terminated unexpectedly
Shared connection to 10.2.16.2 closed.

This happens even though I can establish a normal SSH connection to 10.2.16.2 from bash on the same machine that runs the playbook.

Details

The contents of host_test are as follows:

[rhel]
10.2.16.2 node_type=xxx

[rhel:vars]
ansible_become=yes
ansible_become_method=su
ansible_become_user=root
ansible_connection=ssh
ansible_user=yyy
node_name=""


[cisco]

[cisco:vars]
node_name=""

[curtiss-wright]

[zzz]

[other]

[nmap:children]
rhel
cisco
curtiss-wright
other
zzz

[password-test]

Here's my ansible.cfg:

[defaults]
ask_vault_pass = True
filter_plugins = filter_plugins
host_key_checking = False
retry_files_enabled = False
inventory = hosts
stdout_callback = debug

[paramiko_connection]
record_host_keys=False

[ssh_connection]
ssh_args = -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null
scp_if_ssh = True

Here's the full -vvvv output of the task which fails:

<10.2.16.2> ESTABLISH SSH CONNECTION FOR USER: yyy
<10.2.16.2> SSH: EXEC sshpass -d51 ssh -vvv -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o User=yyy -o ConnectTimeout=10 -o ControlPath=/Users/presslertj/.ansible/cp/4c003d67e6 10.2.16.2 '/bin/sh -c '"'"'echo ~yyy && sleep 0'"'"''
<10.2.16.2> (0, '/home/yyy\n', 'OpenSSH_7.6p1, LibreSSL 2.6.2\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 48: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 38421\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
<10.2.16.2> ESTABLISH SSH CONNECTION FOR USER: yyy
<10.2.16.2> SSH: EXEC sshpass -d51 ssh -vvv -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o User=yyy -o ConnectTimeout=10 -o ControlPath=/Users/presslertj/.ansible/cp/4c003d67e6 10.2.16.2 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054 `" && echo ansible-tmp-1533309683.05-194983968798054="` echo /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054 `" ) && sleep 0'"'"''
<10.2.16.2> (0, 'ansible-tmp-1533309683.05-194983968798054=/home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054\n', 'OpenSSH_7.6p1, LibreSSL 2.6.2\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 48: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 38421\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
Using module file /usr/local/Cellar/ansible/2.6.2/libexec/lib/python2.7/site-packages/ansible/modules/commands/command.py
<10.2.16.2> PUT /Users/presslertj/.ansible/tmp/ansible-local-38409ihnO5i/tmpWo6ZH_ TO /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054/command.py
<10.2.16.2> SSH: EXEC sshpass -d51 scp -vvv -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o User=yyy -o ConnectTimeout=10 -o ControlPath=/Users/presslertj/.ansible/cp/4c003d67e6 /Users/presslertj/.ansible/tmp/ansible-local-38409ihnO5i/tmpWo6ZH_ '[10.2.16.2]:/home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054/command.py'
<10.2.16.2> (0, '', 'Executing: program /usr/bin/ssh host 10.2.16.2, user (unspecified), command scp -v -t /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054/command.py\nOpenSSH_7.6p1, LibreSSL 2.6.2\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 48: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 38421\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\nSending file modes: C0600 69597 tmpWo6ZH_\nSink: C0600 69597 tmpWo6ZH_\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
<10.2.16.2> ESTABLISH SSH CONNECTION FOR USER: yyy
<10.2.16.2> SSH: EXEC sshpass -d51 ssh -vvv -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o User=yyy -o ConnectTimeout=10 -o ControlPath=/Users/presslertj/.ansible/cp/4c003d67e6 10.2.16.2 '/bin/sh -c '"'"'chmod u+x /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054/ /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054/command.py && sleep 0'"'"''
<10.2.16.2> (0, '', 'OpenSSH_7.6p1, LibreSSL 2.6.2\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 48: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 38421\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
<10.2.16.2> ESTABLISH SSH CONNECTION FOR USER: yyy
<10.2.16.2> SSH: EXEC sshpass -d51 ssh -vvv -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o User=yyy -o ConnectTimeout=10 -o ControlPath=/Users/presslertj/.ansible/cp/4c003d67e6 -tt 10.2.16.2 '/bin/sh -c '"'"'su  root -c '"'"'"'"'"'"'"'"'/bin/sh -c '"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-mmedeuuflgmstddchcdfxcwjjcqpzsam; /usr/bin/python /home/yyy/.ansible/tmp/ansible-tmp-1533309683.05-194983968798054/command.py'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"''"'"'"'"'"'"'"'"' && sleep 0'"'"''
Escalation succeeded

My thoughts

  • Configuration changes are happening constantly on the target, so it's possible something was configured in SSH to limit connections in some way.
  • Tests are being added to rhel-tests.yml, so it's possible some sort of timeout is now being triggered that wasn't before. I've tried reverting the version of rhel7 to one from about a month ago, and the command still fails, so I don't think this is the cause.
  • I was using Ansible 2.5.4 installed via brew. I've tried updating to Ansible 2.6.2, but that made no difference.
  • I've tried several other suggestions found online, including using the paramiko_ssh connection type, which also fails.
  • I can run ansible -i host_test -m ping 10.2.16.2 and get a pong back.
  • This question seems pretty close to my issue, but there aren't any lines in rhel-tests.yml that reboot or shut down the host.

Question

What's causing my playbook to fail and how can I fix it?

Answer

karobar · Aug 3, 2018

I believe the connection is being dropped because the play produces no output for long stretches, which leaves the SSH session idle.

Add the following to your ssh_args in ansible.cfg:

-o ServerAliveInterval=50
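
As a sketch of how that might look, here is the [ssh_connection] section from the question with the option appended. ServerAliveInterval=50 asks the SSH client to send a keepalive message through the encrypted channel after 50 seconds without data from the server, so an idle session is less likely to be dropped. The 50-second value is a judgment call, not something the logs confirm.

```ini
[ssh_connection]
# Existing options from the question, plus a client-side keepalive every 50 seconds.
ssh_args = -o LogLevel=QUIET -o ControlMaster=auto -o ControlPersist=2m -o UserKnownHostsFile=/dev/null -o ServerAliveInterval=50
scp_if_ssh = True
```

Because ControlPersist=2m keeps an existing master connection alive for reuse, the new option may only take effect once that master has exited and a fresh one is started.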