user18007

Reputation: 2445

Ansible task timeout max length

I run a shell: docker ps ... task in some of my playbooks. This normally works, but sometimes the Docker daemon hangs and docker ps does not return for ~2 hours.

How can I configure Ansible to time out after a reasonable amount of time if docker ps does not return?

Upvotes: 27

Views: 66996

Answers (5)

Charles

Reputation: 4352

This option can be set in your ansible config.

For example: ~/.ansible.cfg

[persistent_connection]
command_timeout=120

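This sets a timeout of 2 minutes for each command. Note that this setting belongs to persistent connections (for example network_cli), so it may not apply to an ordinary shell task run over SSH.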

Hope that helps

(You can also set the environment variable: export ANSIBLE_PERSISTENT_COMMAND_TIMEOUT=30)

Upvotes: 0

Vijesh

Reputation: 1208

A task-level timeout keyword (in seconds) was added in the Ansible 2.10 release, which is useful in such scenarios.

For example, the playbook below fails in 2.10:

---
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
  - shell: |
      while true; do
        sleep 1
      done
    timeout: 5
...

with an error message like:

TASK [shell] **************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "The shell action failed to execute in the expected time frame (5) and was terminated"}
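
Applied to the docker ps case from the question, this could look roughly like the sketch below. The host group docker_host, the 60-second limit, and the variable name docker_ps are assumptions chosen for illustration, and the timeout keyword needs Ansible 2.10 or later. The rescue section is one possible way to react to the failure instead of aborting the play.

```yaml
---
- hosts: docker_host        # assumed inventory group
  gather_facts: false
  tasks:
    - block:
        - name: List containers, giving up after 60 seconds
          ansible.builtin.shell: docker ps
          timeout: 60       # task keyword, in seconds (Ansible 2.10+)
          register: docker_ps
          changed_when: false
      rescue:
        # Runs when the task above fails, including when it hits the timeout
        - name: Report the failure
          ansible.builtin.debug:
            msg: "docker ps failed or timed out: {{ ansible_failed_result.msg | default('no message') }}"
```

Without the block/rescue wrapper, the play simply stops for that host when the timeout is hit, which may be all you need.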

Upvotes: 33

In Ansible 2.9.6 (I think since Ansible 2.4), you can do something else: async.

With async, you provide a time, in theory in seconds, after which the task will time out. I say "in theory" because I found that it does not enforce the limit strictly.

---
- hosts: docker_host
  become: yes
  become_user: jeanedoe

  tasks:
      
    - name: Run docker in the remote machine
      shell: docker ps  # add options if needed
      register: call_stdout
      async: 30  # the task times out after roughly 30 seconds
      
    - debug:
        msg:
           - "The output is {{ call_stdout }}"

The drawback of this approach is that the task is reported as failed when the timeout is hit. If that is not what you want, you will need to find a way to handle the error it reports. For example (output from an unrelated run of mine):

TASK [Run docker in the remote machine] *****************************************************************************************************************
fatal: [docker_host]: FAILED! => {"ansible_async_watchdog_pid": 2220, "ansible_job_id": "701125428265.6104", "changed": false, "finished": 1, "msg": "timed out waiting for module completion", "results_file": ..., "started": 1}

PLAY RECAP *********************************************************************************************************************************************************************
docker_host                       : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

Alternatively, although this may not be what you want, you can avoid reporting a failure and simply launch the command.

I use this when I call someone else's remote service that runs in the background (and where I cannot wrap it in another script on the remote host) and Ansible does not need an answer. This is what Ansible calls "fire and forget", and it is done by specifying a poll value of 0:

---
- hosts: docker_host
  become: yes
  become_user: jeanedoe

  tasks:
      
    - name: Run docker in the remote machine
      shell: docker ps  # add options if needed
      register: call_stdout
      async: 30  # the task times out after roughly 30 seconds
      poll: 0  # default is 10 if not specified
      
    - debug:
        msg:
           - "The output is {{ call_stdout }}"

Now, the play recap should be:

TASK [debug] *******************************************************************************************************************************************************************
ok: [docker_host] => {
    "msg": [
        "Pyton run output is {'started': 1, 'ansible_async_watchdog_pid': 5608, 'results_file': '...', 'finished': 0, 'ansible_job_id': '822653990290.6896', 'failed': False, 'changed': False}"
    ]
}

PLAY RECAP *********************************************************************************************************************************************************************
docker_host                       : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0 
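
If the result is needed later after all, a fire-and-forget task can be followed by a status check with the async_status module. The sketch below assumes the same docker_host group; the variable names docker_ps_job and docker_ps_result and the retry and delay values are illustrative assumptions, not something taken from the runs above.

```yaml
---
- hosts: docker_host        # assumed inventory group
  gather_facts: false
  tasks:
    - name: Start docker ps in the background
      ansible.builtin.shell: docker ps
      async: 30             # upper limit for the background job, in seconds
      poll: 0               # do not wait here
      register: docker_ps_job

    - name: Check on the background job every 5 seconds
      ansible.builtin.async_status:
        jid: "{{ docker_ps_job.ansible_job_id }}"
      register: docker_ps_result
      until: docker_ps_result.finished
      retries: 8            # assumed value; 8 x 5 s covers the 30 s limit
      delay: 5
```

The status check should run as the same user as the task that started the job, so keep any become settings identical on both tasks.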

Upvotes: 0

techraf

Reputation: 68559

When this was written, Ansible had no built-in per-task timeout (a timeout keyword was added later, in 2.10).

You can try a workaround using an asynchronous call, but for this case (clearly a kind of bug) relying on the system might be easier and more appropriate.

See the GNU timeout command (if you run Docker, chances are the command is present on your OS):

shell: timeout 5m docker ps ...
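
GNU timeout exits with status 124 when the command runs past the limit, so a playbook can tell a hang apart from other errors. The following is a sketch under that assumption; the host group docker_host, the variable name docker_ps, and the messages are made up for illustration.

```yaml
---
- hosts: docker_host        # assumed inventory group
  gather_facts: false
  tasks:
    - name: List containers, giving up after 5 minutes
      ansible.builtin.shell: timeout 5m docker ps
      register: docker_ps
      changed_when: false
      failed_when: false    # the return code is inspected in the tasks below

    - name: Fail with a clear message if docker ps timed out
      ansible.builtin.fail:
        msg: "docker ps did not return within 5 minutes"
      when: docker_ps.rc == 124   # GNU timeout's exit status on expiry

    - name: Fail on any other docker ps error
      ansible.builtin.fail:
        msg: "docker ps failed with rc {{ docker_ps.rc }}"
      when: docker_ps.rc not in [0, 124]
```

If the separate messages are not needed, the plain shell line is enough, since any non-zero return code already fails the task.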

Upvotes: 13

jimbo8098

Reputation: 123

Weighing in on this in case someone comes across it: a timeout option was added in a later version, which lets you specify the following variables in your inventory file for WinRM hosts:

ansible_winrm_operation_timeout_sec: 120
ansible_winrm_read_timeout_sec: 150

My use case was a docker swarm init, which is pretty messy on Windows but performs fine on Linux. It didn't resolve my issue, but it may resolve yours depending on your transport.

I did also note https://github.com/ansible/ansible/pull/69284/files but this isn't explained anywhere that I could find.

Upvotes: 2
