Reputation: 1664
I am running into a weird SSH-related error with Ansible 1.9.4 on Ubuntu 14.04.
In my inventory file I have several servers defined in groups, something like:
[group1]
g1-server1
g1-server2
[group2]
g2-server1
g2-server2
....
[dev]
g1-server1 ....
g2-server1 ...
etc.....
All servers are now pointing to localhost for testing. There are also several variables I am assigning to each server.
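Roughly, each host entry looks something like this (the hostnames, the ansible_ssh_host override and the app_port variable are just placeholders to illustrate the setup, not my real values):
[group1]
g1-server1 ansible_ssh_host=127.0.0.1 app_port=8081
g1-server2 ansible_ssh_host=127.0.0.1 app_port=8082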
I can run the playbook task by task (using tags), and sometimes the same task works, sometimes it doesn't. If I run the entire playbook, it will stop at a random location with this error:
fatal: [hostname] => SSH Error: Shared connection to 127.0.0.1 closed.
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
I suspect it may be an issue with many simultaneous SSH connections from localhost to localhost, but I am not sure how to confirm this. Also, I have much greater success with tasks if they are marked run_once: true.
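For example, a task written roughly like this (the task name and command are only placeholders) almost never fails for me:
- name: example task that only needs to run on one host
  command: /bin/true
  run_once: true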
Does anyone have any ideas on this one?
Note: I tried to find some resources on the web about this; there are several discussions related to server rebooting. However, keep in mind I am not rebooting anything here.
Upvotes: 0
Views: 5640
Reputation: 11
I had a similar issue, and serial: 1 was not an option for me. I updated ansible.cfg and set the following properties under
[ssh_connection]
ssh_args = -o ControlMaster=no -o ControlPersist=60s
The main change is setting ControlMaster to no. This made my playbooks more stable.
Upvotes: 1
Reputation: 12173
If I understand you correctly, you are opening multiple connections that all do the same thing (for example, changing the same files). This will, of course, cause unpredictable results.
Although I do not see much sense in this sort of testing, you can eliminate the error by setting
serial: 1
in your playbook; see http://docs.ansible.com/ansible/playbooks_delegation.html#rolling-update-batch-size.
This will cause the play to run against one host at a time instead of all hosts in parallel.
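For example (the hosts pattern and the task are only illustrative):
- hosts: dev
  serial: 1
  tasks:
    - name: example task
      command: /bin/true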
Upvotes: 2