Yanis

Reputation: 53

File search between two dates with ansible

I'm looking for a way to search for files modified between two dates, and for a cleaner method than my example below. Note that the example code below works.

I know the Ansible find module exists, but I can't make it search between two dates the way my example does (or at least I didn't succeed).

  1. Goal: find files modified between 2022-09-26 and 2022-10-26.

  2. Create some files for the test: touch -d "35 days ago" /tmp/toto /tmp/tata /tmp/tutu.zip

  3. Run the playbook:

- name: "test find"
  gather_facts: false
  become: yes
  hosts: "localhost"
  tasks:
  - name: "create vars"
    set_fact:
      path_to_find:             "/tmp"
      BEGIN_DATE:               "{{lookup('pipe','date -d \"2 months ago\" -I')}}"
      END_DATE:                 "{{lookup('pipe','date -d \"1 months ago\" -I')}}"
      ZIP_NAME:                 "archive_test_name.zip"

  - name: "find between two dates "
    shell: find "{{ path_to_find }}" -type f ! -name "*.zip" -newermt "{{ BEGIN_DATE }}" ! -newermt "{{ END_DATE }}"
    register: FindFiles

  - debug:
      msg: "{{ FindFiles }}"
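
For reference, the find expression used in the task can be tried outside Ansible. The following is a minimal sketch, assuming GNU find and GNU touch; the function name find_between and the test file names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (-newermt) and GNU touch (-d).

# List regular files under $1 modified after $2 and not after $3, skipping *.zip.
find_between() {
    find "$1" -type f ! -name '*.zip' -newermt "$2" ! -newermt "$3"
}

# Throwaway test directory so nothing existing in /tmp is touched.
workdir=$(mktemp -d)
touch -d '2022-10-01 12:00' "$workdir/toto" "$workdir/tutu.zip"   # inside the window
touch -d '2022-08-01 12:00' "$workdir/too_old"                    # before the window
touch -d '2022-11-15 12:00' "$workdir/too_new"                    # after the window

found=$(find_between "$workdir" '2022-09-26' '2022-10-26')
echo "$found"

rm -rf "$workdir"
```

Only toto should be listed: tutu.zip falls inside the window but is excluded by name.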

I hope someone has an idea or a best practice!

Upvotes: 5

Views: 1043

Answers (2)

U880D

Reputation: 12090

"I'm searching for a method to make this more beautiful"

Depending on your infrastructure and other capabilities, another approach might be possible: creating a custom module written in Bash or shell as a wrapper for the specific find command.

This avoids multiple lookups on the Control Node as well as multiple find runs or searches on the Remote Nodes, since the approach results in a single task. Furthermore, there is much less configuration in the Ansible playbook YAML, and even less variable declaration, Jinja2 templating, filtering and fact setting.

The following concept / example module takes some string parameters (path, begin, end) and returns the files found as a list.

Concept / Example Module between.sh

#!/bin/bash

exec 3>&1 4>&2 > /dev/null 2>&1

# Create functions

function return_json() {

    exec 1>&3 2>&4

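    # --argjson keeps changed, rc and files as JSON values (boolean, number, list) instead of strings
    jq --null-input --monochrome-output \
       --argjson changed "$changed" \
       --argjson rc "$rc" \
       --arg stdout "$stdout" \
       --arg stderr "$stderr" \
       --argjson files "$files" \
       '$ARGS.named'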

    exec 3>&1 4>&2 > /dev/null 2>&1

}

# Set default values

changed="false"
rc=0
stdout=""
stderr=""
files="[]"

source "$1" # to read the tmp file and source the parameters given by Ansible

# Check prerequisites

if ! command -v jq >/dev/null 2>&1; then
    exec 1>&3 2>&4
    # jq is unavailable, so the JSON answer is written by hand
    printf '%s\n' '{"changed": false, "rc": 127, "stdout": "", "stderr": "Command not found: This module requires jq installed on the target host. Exiting ...", "files": []}'
    exit 127
fi

if ! grep -q "Red Hat Enterprise Linux" /etc/redhat-release; then
    stderr="Operation not supported: Red Hat Enterprise Linux not found. Exiting ..."
    rc=95

    return_json

    exit $rc
fi

# Validate parameter key and value

if [ -z "$path" ]; then
    stderr="Invalid argument: Parameter path is not provided. Exiting ..."
    rc=22

    return_json

    exit $rc
fi

if [[ $path == *['!:*']* ]]; then
    stderr="Invalid argument: Value path contains invalid characters. Exiting ..."
    rc=22

    return_json

    exit $rc
fi

if [ ${#path} -ge 127 ]; then
    stderr="Argument too long: Value path has too many characters. Exiting ..."
    rc=7

    return_json

    exit $rc
fi

# Main (logic part)

if [ -d "$path" ]; then
    # Quote "$path" so that directories containing spaces are handled correctly
    files=$(find "$path" -type f -name "file*" -newermt "$begin" ! -newermt "$end" | jq --raw-input . | jq --slurp .)
else
    stderr="Cannot access: No such directory or Permission denied."
    rc=1
fi

# Return

changed="false"

return_json

exit $rc
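
The source "$1" step relies on the way Ansible hands parameters to a non-Python module: the path of a temporary file holding key=value pairs is passed as the first argument. The following minimal sketch mimics this locally. The file content is written by hand here, so the exact quoting Ansible produces is an assumption.

```shell
#!/bin/sh
# Sketch only: simulates the arguments file a module would receive as "$1".
# The exact quoting used by Ansible is an assumption.

args_file=$(mktemp)
cat > "$args_file" <<'EOF'
path=/home/user/files
begin='3 month ago'
end='1 month ago'
EOF

# Sourcing the file turns every key=value pair into a shell variable.
. "$args_file"

echo "path=$path begin=$begin end=$end"

rm -f "$args_file"
```

Because sourcing executes the file as shell code, the parameter values should only come from trusted playbooks.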

Test Playbook between.yml

---
- hosts: test
  become: false
  gather_facts: false

  tasks:

  - name: Get files between
    between:
      path: "/home/{{ ansible_user }}/files"
      begin: "3 month ago"
      end: "1 month ago"
    register: result

  - name: Show result
    debug:
      var: result.files

Executed on a test file set

ls -ogla /home/${ANSIBLE_USER}/files
total 8
drwxr-xr-x.  2 4096 Nov 29 11:41 .
drwx------. 37 4096 Nov 29 14:04 ..
-rw-r--r--.  1    0 Jul 29 12:43 file1.zip
-rw-r--r--.  1    0 Aug 29 12:44 file2.zip
-rw-r--r--.  1    0 Sep 29 12:44 file3.zip
-rw-r--r--.  1    0 Oct 29 12:44 file4.zip
-rw-r--r--.  1    0 Nov 29 11:41 file5.zip

and via sshpass -p ${PASSWORD} ansible-playbook --user ${ANSIBLE_USER} --ask-pass between.yml

Result

TASK [Show result] ***********
ok: [test.example.com] =>
  result.files:
  - /home/user/files/file4.zip
  - /home/user/files/file3.zip

Please take note that within the current example the filenames are hardcoded (-name "file*"). I'll leave the enhancement of this part to the reader.
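
One possible direction for that enhancement is an optional parameter with a default. The sketch below is only an illustration: pattern is a hypothetical parameter that is not part of the module above, and GNU find and GNU touch are assumed.

```shell
#!/bin/sh
# Sketch only: "pattern" is a hypothetical module parameter.
# Assumes GNU find (-newermt) and GNU touch (-d).

# Would normally be set by sourcing the arguments file; empty means "not provided".
pattern=""

search() {
    # Fall back to matching every file name when no pattern was given.
    find "$1" -type f -name "${pattern:-*}" -newermt "$2" ! -newermt "$3"
}

workdir=$(mktemp -d)
touch -d '2022-10-01 12:00' "$workdir/file1.zip" "$workdir/notes.txt"

all=$(search "$workdir" '2022-09-01' '2022-11-01' | sort)
pattern='*.zip'
zips=$(search "$workdir" '2022-09-01' '2022-11-01')

printf 'all:\n%s\nzips:\n%s\n' "$all" "$zips"
rm -rf "$workdir"
```

With an empty pattern both files are returned; with '*.zip' only file1.zip is.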

Upvotes: 2

Vladimir Botka

Reputation: 68074

Given the files for testing

shell> ls -ogla /tmp/test/
total 28
drwxrwxr-x  2  4096 Nov 25 21:23 .
drwxrwxrwt 70 20480 Nov 25 22:28 ..
-rw-rw-r--  1     0 Jul 25 00:00 file1
-rw-rw-r--  1     0 Aug 25 00:00 file2
-rw-rw-r--  1     0 Sep 25 00:00 file3
-rw-rw-r--  1     0 Oct 25 00:00 file4
-rw-rw-r--  1     0 Nov 25 00:00 file5

  1. Find the lists of files older than each date and take the difference between these lists.

In addition to the declarations

  begin_date: "{{ lookup('pipe', 'date -d \"2 months ago\" -I') }}"
  end_date: "{{ lookup('pipe', 'date -d \"1 months ago\" -I') }}"

Declare the variables

  today: "{{ '%Y-%m-%d'|strftime }}"
  begin_days: "{{ ((today|to_datetime('%Y-%m-%d')) -
                   (begin_date|to_datetime('%Y-%m-%d'))).days }}"
  end_days: "{{ ((today|to_datetime('%Y-%m-%d')) -
                 (end_date|to_datetime('%Y-%m-%d'))).days }}"

This gives today's date and the number of days elapsed since begin_date and end_date

  begin_date: 2022-09-25
  end_date: 2022-10-25
  today: 2022-11-25
  begin_days: 61
  end_days: 31
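
The day counts can be cross-checked in a shell. This is a minimal sketch, assuming GNU date; the helper name days_between is made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU date (the -d option). UTC is used so that a
# daylight-saving change between the two dates cannot shift the result.

# Print the number of whole days from date $1 to date $2 (both YYYY-MM-DD).
days_between() {
    from=$(date -u -d "$1" +%s)
    to=$(date -u -d "$2" +%s)
    echo $(( (to - from) / 86400 ))
}

days_between 2022-09-25 2022-11-25   # begin_days, prints 61
days_between 2022-10-25 2022-11-25   # end_days, prints 31
```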

Find and register the files whose modification time is older than begin_date and the files whose modification time is older than end_date

    - find:
        path: /tmp/test
        age: "{{ begin_days }}d"
      register: begin

    - find:
        path: /tmp/test
        age: "{{ end_days }}d"
      register: end

Declare the variables

  begin_files: "{{ begin.files|map(attribute='path')|list }}"
  end_files: "{{ end.files|map(attribute='path')|list }}"
  my_files: "{{ end_files|difference(begin_files) }}"

gives the lists of files whose modification time is older than begin_date and end_date, respectively. The difference between these lists, my_files, is what you're looking for

  begin_files: ['/tmp/test/file3', '/tmp/test/file2', '/tmp/test/file1']
  end_files: ['/tmp/test/file3', '/tmp/test/file4', '/tmp/test/file2', '/tmp/test/file1']
  my_files: ['/tmp/test/file4']
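
The same list difference can be illustrated in a shell, which may help when checking the result by hand. This is a minimal sketch using grep; the helper name difference is made up and the file names are copied from the output above.

```shell
#!/bin/sh
# Sketch only: the file names are copied from the example output above.

begin_files='/tmp/test/file3
/tmp/test/file2
/tmp/test/file1'

end_files='/tmp/test/file3
/tmp/test/file4
/tmp/test/file2
/tmp/test/file1'

# Keep the lines of the first list that do not appear, as whole lines, in the second.
difference() {
    exclude=$(mktemp)
    printf '%s\n' "$2" > "$exclude"
    printf '%s\n' "$1" | grep -vxF -f "$exclude"
    rm -f "$exclude"
}

my_files=$(difference "$end_files" "$begin_files")
echo "$my_files"   # prints /tmp/test/file4
```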

Example of a complete playbook for testing

- hosts: localhost

  vars:

    today: "{{ '%Y-%m-%d'|strftime }}"
    begin_date: "{{ lookup('pipe', 'date -d \"2 months ago\" -I') }}"
    end_date: "{{ lookup('pipe', 'date -d \"1 months ago\" -I') }}"
    begin_days: "{{ ((today|to_datetime('%Y-%m-%d')) -
                     (begin_date|to_datetime('%Y-%m-%d'))).days }}"
    end_days: "{{ ((today|to_datetime('%Y-%m-%d')) -
                   (end_date|to_datetime('%Y-%m-%d'))).days }}"
    begin_files: "{{ begin.files|map(attribute='path')|list }}"
    end_files: "{{ end.files|map(attribute='path')|list }}"
    my_files: "{{ end_files|difference(begin_files) }}"

  tasks:

    - debug:
        msg: |
          today: {{ today }}
          begin_date: {{ begin_date }}
          end_date: {{ end_date }}
          begin_days: {{ begin_days }}
          end_days: {{ end_days }}

    - find:
        path: /tmp/test
        age: "{{ begin_days }}d"
      register: begin

    - find:
        path: /tmp/test
        age: "{{ end_days }}d"
      register: end

    - debug:
        msg: |
          begin_files: {{ begin_files }}
          end_files: {{ end_files }}
          my_files: {{ my_files }}

  2. Collect the status of the files and select files by time.

Change the format of the dates to seconds

  begin_date_cmd: "date -d '2 months ago' '+%s'"
  begin_date: "{{ lookup('pipe', begin_date_cmd) }}"
  end_date_cmd: "date -d '1 months ago' '+%s'"
  end_date: "{{ lookup('pipe', end_date_cmd) }}"

gives

  begin_date: 1664143922
  end_date: 1666735922

Declare the variable

  files_all: "{{ all.files|map(attribute='path')|list }}"

and get the list of all files

    - find:
        path: /tmp/test
      register: all

Collect the status of all files

    - stat:
        path: "{{ item }}"
      register: st
      loop: "{{ files_all }}"

Declare the list of files whose modification time is greater than begin_date and less than end_date

  my_files: "{{ st.results|selectattr('stat.mtime', 'gt', begin_date|float)|
                           selectattr('stat.mtime', 'lt', end_date|float)|
                           map(attribute='item')|list }}"

gives

  my_files:
  - /tmp/test/file4
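
The selection by epoch seconds can also be reproduced in a shell for comparison. This is a minimal sketch, assuming GNU stat, GNU date and GNU touch; the function name select_by_mtime and the bounds are chosen for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU stat (-c %Y), GNU date (-d) and GNU touch (-d).

# Print every regular file directly in $1 whose mtime (epoch seconds)
# is greater than $2 and less than $3.
select_by_mtime() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        mtime=$(stat -c %Y "$f")
        if [ "$mtime" -gt "$2" ] && [ "$mtime" -lt "$3" ]; then
            echo "$f"
        fi
    done
}

workdir=$(mktemp -d)
touch -d '2022-09-25 00:00' "$workdir/file3"
touch -d '2022-10-25 00:00' "$workdir/file4"
touch -d '2022-11-25 00:00' "$workdir/file5"

begin=$(date -d '2022-09-25 22:00' +%s)
end=$(date -d '2022-10-25 22:00' +%s)

selected=$(select_by_mtime "$workdir" "$begin" "$end")
echo "$selected"
rm -rf "$workdir"
```

Only file4 lies strictly between the two bounds, matching the result above.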

Example of a complete playbook for testing

- hosts: localhost

  vars:

    begin_date_cmd: "date -d '2 months ago' '+%s'"
    begin_date: "{{ lookup('pipe', begin_date_cmd) }}"
    end_date_cmd: "date -d '1 months ago' '+%s'"
    end_date: "{{ lookup('pipe', end_date_cmd) }}"
    files_all: "{{ all.files|map(attribute='path')|list }}"
    my_files: "{{ st.results|selectattr('stat.mtime', 'gt', begin_date|float)|
                             selectattr('stat.mtime', 'lt', end_date|float)|
                             map(attribute='item')|list }}"

  tasks:

    - debug:
        msg: |
          begin_date: {{ begin_date }}
          end_date: {{ end_date }}

    - find:
        path: /tmp/test
      register: all

    - debug:
        var: files_all

    - stat:
        path: "{{ item }}"
      register: st
      loop: "{{ files_all }}"

    - debug:
        var: st

    - debug:
        var: my_files

Upvotes: 3
