Luiz Eduardo

Reputation: 54

Ansible - Looking for files and compare their hash

Practice:

I have a file, files.yml, that lists files together with their MD5 hashes as recorded from md5sum output, like this:

files:
  - name: /opt/file_compare1.tar
    hash: 9cd599a3523898e6a12e13ec787da50a  /opt/file_compare1.tar
  - name: /opt/file_compare2.tar.gz
    hash: d41d8cd98f00b204e9800998ecf8427e  /opt/file_compare2.tar.gz

I need a playbook that checks every file in this list and reports, with a debug message, whether its current hash still matches the recorded one or has changed. This is my attempt so far:

---
- hosts: localhost
  connection: local
  vars_files:
    - files.yml 
  tasks: 
    - name: Use md5 to calculate checksum
      stat:
        path: "{{ item.name }}"
        checksum_algorithm: md5
      register: hash_check
      with_items:
        - "{{ files }}"

    - name: Debug files - Different 
      debug:
        msg: | 
          "Hash changed: {{  item.name  }}"
      when: 
        - item.hash != hash_check 
      with_items:
        - "{{ files }}"

    - name: Debug files - Equal
      debug:
        msg: | 
          "Hash NOT changed: {{  item.name  }}"
      when: 
        - item.hash == hash_check 
      with_items:
        - "{{ files }}"
    
    - debug:
        msg: | 
          -  "{{ hash_check }}  {{ item.name }}"
      with_items:
        - "{{ files }}"

Upvotes: 1

Views: 3257

Answers (2)

Luiz Eduardo

Reputation: 54

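I used your suggestion to complete the playbook, and it is working now.

The idea is to loop over the list of files, calculate the current MD5 hash of each one, and compare it with the hash recorded in the list. Note that files3.yml uses the keys file and checksum in place of name and hash.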

---
- hosts: localhost
  connection: local
  gather_facts: false
  vars_files:
    - files3.yml
  tasks:
    - stat:
        path: "{{ item.file }}"
        checksum_algorithm: md5
      loop: "{{ files }}"
      register: stat_results

    - name: NOT changed files
      debug:
        msg: "NOT changed: {{ item.stat.path }}"
      when: item.stat.checksum == item.item.checksum.split()|first
      loop: "{{ stat_results.results }}"
      loop_control:
        label: "{{ item.stat.path }}"

    - name: Changed files
      debug:
        msg: "CHANGED: {{ item.stat.path }}"
      when: item.stat.checksum != item.item.checksum.split()|first
      loop: "{{ stat_results.results }}"
      loop_control:
        label: "{{ item.stat.path }}"

Result:

>> ansible-playbook playbooks/check-file3.yml 

PLAY [localhost] ********************************************************************************************************************************************************************************************************************

TASK [stat] *************************************************************************************************************************************************************************************************************************
ok: [localhost] => (item={'file': '/opt/file_compare1.tar', 'checksum': '9cd599a3523898e6a12e13ec787da50a  /opt/file_compare1.tar'})
ok: [localhost] => (item={'file': '/opt/file_compare2.tar.gz', 'checksum': 'd41d8cd98f00b204e9800998ecf8427e  /opt/file_compare2.tar.gz'})

TASK [NOT changed files] ************************************************************************************************************************************************************************************************************
skipping: [localhost] => (item=/opt/file_compare1.tar) 
ok: [localhost] => (item=/opt/file_compare2.tar.gz) => {
    "msg": "NOT changed: /opt/file_compare2.tar.gz"
}

TASK [Changed files] ****************************************************************************************************************************************************************************************************************
ok: [localhost] => (item=/opt/file_compare1.tar) => {
    "msg": "CHANGED: /opt/file_compare1.tar"
}
skipping: [localhost] => (item=/opt/file_compare2.tar.gz) 

PLAY RECAP **************************************************************************************************************************************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0 
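
One caveat with the playbook above: when a listed file does not exist, `stat` returns `stat.exists: false` and no `checksum`, so the comparisons would fail on an undefined value. Below is a minimal sketch of a guard, assuming the same `files3.yml` layout (keys `file` and `checksum`). The task names and the `MISSING` message are illustrative, not part of the original playbook.

```yaml
---
- hosts: localhost
  connection: local
  gather_facts: false
  vars_files:
    - files3.yml
  tasks:
    - name: Calculate MD5 checksums
      stat:
        path: "{{ item.file }}"
        checksum_algorithm: md5
      loop: "{{ files }}"
      register: stat_results

    # stat.path is not returned for a missing file, so use the loop input instead
    - name: Report missing files
      debug:
        msg: "MISSING: {{ item.item.file }}"
      when: not item.stat.exists
      loop: "{{ stat_results.results }}"
      loop_control:
        label: "{{ item.item.file }}"

    # "and" short-circuits, so checksum is only read when the file exists
    - name: Report changed files
      debug:
        msg: "CHANGED: {{ item.stat.path }}"
      when: item.stat.exists and item.stat.checksum != item.item.checksum.split() | first
      loop: "{{ stat_results.results }}"
      loop_control:
        label: "{{ item.item.file }}"
```

The same `item.stat.exists and ...` prefix can be added to the "NOT changed" task.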

Upvotes: 1

Vladimir Botka

Reputation: 68134

For example, given the files

    files:
      - name: /scratch/file_compare1.tar
        hash: 4f8805b4b64dcc575547ec1c63793aec  /scratch/file_compare1.tar
      - name: /scratch/file_compare2.tar.gz
        hash: 2dc4f1e9ca4081cc49d25195627982ef  /scratch/file_compare2.tar.gz

the tasks below

    - name: Use md5 to calculate checksum
      stat:
        path: "{{ item.name }}"
        checksum_algorithm: md5
      register: hash_check
      loop: "{{ files }}"

    - name: Debug files - Equal
      debug:
        msg: |
          Hash NOT changed: {{ item.0.name }}
          {{ item.0.hash.split()|first }}
          {{ item.1 }}
      with_together:
        - "{{ files }}"
        - "{{ hash_check.results|map(attribute='stat.checksum')|list }}"
      when: item.0.hash.split()|first == item.1

give

  msg: |-
    Hash NOT changed: /scratch/file_compare1.tar
    4f8805b4b64dcc575547ec1c63793aec
    4f8805b4b64dcc575547ec1c63793aec

  msg: |-
    Hash NOT changed: /scratch/file_compare2.tar.gz
    2dc4f1e9ca4081cc49d25195627982ef
    2dc4f1e9ca4081cc49d25195627982ef
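
The task above reports only the files whose hash matches. Below is a sketch of the complementary report for changed files, assuming the same `files` list and the registered `hash_check`. It uses `loop` with the `zip` filter in place of `with_together`; the task name and the `expected`/`current` labels are illustrative.

```yaml
- name: Debug files - Changed
  debug:
    msg: |
      Hash changed: {{ item.0.name }}
      expected: {{ item.0.hash.split()|first }}
      current:  {{ item.1 }}
  # pair each entry of files with the checksum calculated for it
  loop: "{{ files | zip(hash_check.results | map(attribute='stat.checksum') | list) | list }}"
  loop_control:
    label: "{{ item.0.name }}"
  when: item.0.hash.split()|first != item.1
```

With the example files, where both hashes match, this task would skip every item.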

A more robust option would be to create a dictionary with the calculated hashes

    - name: Use md5 to calculate checksum
      stat:
        path: "{{ item.name }}"
        checksum_algorithm: md5
      register: hash_check
      loop: "{{ files }}"

    - set_fact:
        path_hash: "{{ dict(_path|zip(_hash)) }}"
      vars:
        _path: "{{ hash_check.results|map(attribute='stat.path')|list }}"
        _hash: "{{ hash_check.results|map(attribute='stat.checksum')|list }}"

gives

  path_hash:
    /scratch/file_compare1.tar: 4f8805b4b64dcc575547ec1c63793aec
    /scratch/file_compare2.tar.gz: 2dc4f1e9ca4081cc49d25195627982ef

Then use this dictionary to compare the hashes. For example, the task below gives the same results

    - name: Debug files - Equal
      debug:
        msg: |
          Hash NOT changed: {{ item.name }}
          {{ item.hash.split()|first }}
          {{ path_hash[item.name] }}
      loop: "{{ files }}"
      when: item.hash.split()|first == path_hash[item.name]
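
Because the dictionary lookup needs only `item.name`, both outcomes can also be reported from a single task. A minimal sketch, assuming `path_hash` has been set as above; the task name and message wording are illustrative:

```yaml
- name: Debug files - status
  debug:
    # ternary picks the first string when the hashes match, the second otherwise
    msg: "{{ (item.hash.split()|first == path_hash[item.name]) | ternary('Hash NOT changed', 'Hash changed') }}: {{ item.name }}"
  loop: "{{ files }}"
  loop_control:
    label: "{{ item.name }}"
```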

The next option is to create a dictionary that maps the original hashes to the file names, together with two lists: the original hashes and the calculated hashes

    - name: Use md5 to calculate checksum
      stat:
        path: "{{ item.name }}"
        checksum_algorithm: md5
      register: hash_check
      loop: "{{ files }}"

    - set_fact:
        hash_name: "{{ dict(_hash|zip(_name)) }}"
        hash_orig: "{{ _hash }}"
        hash_stat: "{{ hash_check.results|map(attribute='stat.checksum')|list }}"
      vars:
        _hash: "{{ files|map(attribute='hash')|map('split')|map('first')|list }}"
        _name: "{{ files|map(attribute='name')|list }}"

gives

  hash_name:
    2dc4f1e9ca4081cc49d25195627982ef: /scratch/file_compare2.tar.gz
    4f8805b4b64dcc575547ec1c63793aec: /scratch/file_compare1.tar

  hash_orig:
  - 4f8805b4b64dcc575547ec1c63793aec
  - 2dc4f1e9ca4081cc49d25195627982ef

  hash_stat:
  - 4f8805b4b64dcc575547ec1c63793aec
  - 2dc4f1e9ca4081cc49d25195627982ef

Then calculate the difference of the lists and use it to extract both lists of changed and unchanged files

    - set_fact:
        files_diff: "{{ _diff|map('extract', hash_name)|list }}"
        files_orig: "{{ _orig|map('extract', hash_name)|list }}"
      vars:
        _diff: "{{ hash_orig|difference(hash_stat) }}"
        _orig: "{{ hash_orig|difference(_diff) }}"

    - name: Debug files changed
      debug:
        var: files_diff

    - name: Debug files NOT changed
      debug:
        var: files_orig

gives

  files_diff: []

  files_orig:
  - /scratch/file_compare1.tar
  - /scratch/file_compare2.tar.gz
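
If the play should stop when anything has changed, the two lists can feed an `assert` task. This is a sketch, assuming `files_diff` and `files_orig` were set as above; the task name and messages are illustrative. Keep in mind that this option keys the dictionary by hash and compares the hashes as sets, so it relies on every file in the list having a distinct hash.

```yaml
- name: Fail when any file has changed
  assert:
    that: files_diff | length == 0
    fail_msg: "Changed files: {{ files_diff | join(', ') }}"
    success_msg: "All {{ files_orig | length }} files match their recorded hashes"
```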

Upvotes: 4
