Bartosz Bilicki

Reputation: 13235

How to retry Ansible task that may fail?

In my Ansible play I restart a database and then try to run some operations on it. The restart command returns as soon as the restart has started, not when the db is up. The next command tries to connect to the database, and it may fail if the db is not up yet.

I want to retry my second command a few times. If last retry fails, I want to fail my play.

When I do retries as follows

retries: 3
delay: 5

Then the retries are not executed at all, because the first command execution fails the whole play. I could add ignore_errors: yes, but that way the play will pass even if all retries failed. Is there an easy way to retry failures until I have success, but fail when the last retry does not succeed?

Upvotes: 84

Views: 183535

Answers (4)

Mxx

Reputation: 9346

Consider using the wait_for module. It waits for a condition before continuing, for example for a port to become open or closed, for a file to exist or not, or for some content to appear in a file.

Without seeing the rest of your playbook, consider the following example:

- name: Wait for db server to restart
  wait_for:
    host: 192.168.50.4
    port: 3306
    delay: 1
    timeout: 300
  delegate_to: localhost

You can also adapt it as a handler and obviously change this snippet to suit your use-case.
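As a rough sketch of the handler variant: the `dbservers` group and the `mysql` service name below are assumptions for illustration, not taken from your playbook. The restart task notifies the wait, and `meta: flush_handlers` runs it immediately instead of at the end of the play.

```yaml
- hosts: dbservers              # hypothetical inventory group
  tasks:
    - name: Restart database
      service:
        name: mysql             # assumed service name
        state: restarted
      notify: Wait for db server to restart

    # Handlers normally run at the end of the play; flush them here so the
    # wait finishes before any later task that needs the database.
    - meta: flush_handlers

  handlers:
    - name: Wait for db server to restart
      wait_for:
        host: 192.168.50.4
        port: 3306
        delay: 1
        timeout: 300
      delegate_to: localhost
```

Keep in mind that an open port only shows the server is listening. It may still need a moment before it accepts queries.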

Upvotes: 20

Oleksii Zymovets

Reputation: 740

For the following task:

- hosts: all
  become: yes
  tasks:
    - name: create the 'myusername' user
      user: name=myusername append=yes state=present createhome=yes shell=/bin/bash

I was not sure whether the remote was ready yet (because this was a newly spun-up node), so I tried retries and delay, unfortunately with no luck. For now I ended up creating a wrapper in my bash script to achieve the needed behavior.

#!/bin/bash

STATUS_CODE=1
TRY=1
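# HOSTS_FILE must be set by the caller (path to the inventory file).
while [ "$STATUS_CODE" -ge 1 ]
do
  # Give up once 5 attempts have been made.
  if [ "$TRY" -gt 5 ]
  then
    echo "Retried to connect to node 5 times and failed. Exiting"
    exit 1
  fi

  ansible-playbook -i "$HOSTS_FILE" user.yml
  STATUS_CODE=$?
  TRY=$(( TRY + 1 ))

  # Non-zero exit status means the playbook run failed; wait and retry.
  if [ "$STATUS_CODE" -ge 1 ]
  then
    echo "Retry to connect to node in 5 seconds"
    sleep 5
  fi
done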

I am still hoping to do this in a cleaner way using the ansible-playbook YAML. Does anyone have suggestions?
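One direction that might replace the wrapper, as an untested sketch: the wait_for_connection module blocks until Ansible can actually reach the host. Fact gathering is turned off here on the assumption that it would otherwise fail before the first task runs; the `delay` and `timeout` values are arbitrary.

```yaml
- hosts: all
  become: yes
  gather_facts: no              # facts would be gathered before the node is reachable
  tasks:
    - name: Wait until the new node accepts connections
      wait_for_connection:
        delay: 5                # seconds to wait before the first check
        timeout: 300            # give up after this many seconds

    - name: create the 'myusername' user
      user: name=myusername append=yes state=present createhome=yes shell=/bin/bash
```

If facts are needed later in the play, a `setup` task can be added after the wait.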

Upvotes: 1

SerialEnabler

Reputation: 1120

Not sure if this is Ansible Tower specific, but I am using:

- command: /usr/bin/false
  register: result
  retries: 3
  delay: 10
  until: result is not failed
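Applied to the scenario in the question, a sketch could look like the following. The `mysql` service name and the `mysql -e "SELECT 1"` check are assumptions standing in for the real restart and the real database command.

```yaml
- name: Restart database
  service:
    name: mysql                 # assumed service name
    state: restarted

- name: Run a command against the database, retrying while it starts up
  command: mysql -e "SELECT 1"  # placeholder for the real database operation
  register: result
  retries: 5
  delay: 5
  until: result is not failed
  changed_when: false           # a read-only check should not report a change
```

If every attempt fails, the task itself fails and stops the play for that host, so ignore_errors is not needed.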

Upvotes: 42

techraf

Reputation: 68459

I don't understand your claim that the "first command execution fails whole play". It wouldn't make sense if Ansible behaved this way.

The following task:

- command: /usr/bin/false
  retries: 3
  delay: 3
  register: result
  until: result.rc == 0

produces:

TASK [command] ******************************************************************************************
FAILED - RETRYING: command (3 retries left).
FAILED - RETRYING: command (2 retries left).
FAILED - RETRYING: command (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["/usr/bin/false"], "delta": "0:00:00.003883", "end": "2017-05-23 21:39:51.669623", "failed": true, "rc": 1, "start": "2017-05-23 21:39:51.665740", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

which seems to be exactly what you want.

Upvotes: 144
