Kristi Jorgji

Reputation: 1739

How to make Ansible with_fileglob work recursively for all subdirectories or alternative?

I am using Ansible and have a directory structure like the example below:

configs
  something
     files
         1.conf
         2.conf
         // and so on

Those files are templates and I am using Ansible to parse these templates and create them automatically in the destination server.

My problem is that with_fileglob only works on the first-level directory, and I cannot seem to enable a recursive mode.

I have

- name: "Apply templates"
  template:
    src: "{{ item }}"
    dest: "{{ item | replace('.j2', '') }}"
  with_fileglob:
    - "{{ user_configs_path }}/*"

By the way, user_configs_path=configs exists, so that part is fine.

The above does nothing.

If I add a file directly under configs, for example configs/blabla.j2, and re-run the playbook, it is parsed and copied fine.

So it seems that the subdirectories are not searched recursively.

I am not limited to only use the fileglob solution so feel free to suggest anything I can learn to reach my goal.

Basically I want to recursively iterate all directories for files only, and in a loop apply the template module to them and create them in remote server

Upvotes: 4

Views: 12275

Answers (3)

Mohamed Allal

Reputation: 20890

To add to the other answers.

fileglob doesn't support recursion

  • According to the documentation, the fileglob lookup doesn't support recursive matching (unfortunately)

  • You can't use **

  • You have to use something else for recursive matching

  • Doc ref

Matches all files in a single directory, non-recursively, that match a pattern. It calls Python’s “glob” library.
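To see what "a single directory, non-recursively" means in practice, here is a small Python sketch that mimics what the lookup does (a plain glob() call followed by an isfile filter). The helper names and the demo tree are made up for illustration; it is not the plugin code itself.

```python
import glob
import os
import tempfile


def files_matching(pattern):
    """Return regular files matching a glob pattern (glob + isfile filter)."""
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


def build_demo_tree(root):
    """Create configs/blabla.j2 and configs/something/files/1.conf under root."""
    nested = os.path.join(root, "configs", "something", "files")
    os.makedirs(nested)
    for path in (os.path.join(root, "configs", "blabla.j2"),
                 os.path.join(nested, "1.conf")):
        with open(path, "w") as handle:
            handle.write("demo\n")
    return os.path.join(root, "configs")


with tempfile.TemporaryDirectory() as tmp:
    configs = build_demo_tree(tmp)
    found = files_matching(os.path.join(configs, "*"))
    # Only the top-level file is listed; the nested 1.conf is never reached.
    print([os.path.relpath(p, configs) for p in found])
```

The directory `something` matches the `*` pattern but is dropped by the isfile filter, which is why the task in the question appears to do nothing.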

The options

  • The best ready-to-go option => with_community.general.filetree
    • You use regex for matching instead of globs
    • Same for excluding (yes, it is possible, and it's simple)
    • Detailed below.
  • ansible.builtin.find
    • Supports a recursive parameter and pattern matching
    • It works much like the find command.
    • It needs two tasks: one to find the files and one to act on them
  • You can copy the fileglob lookup and build another lookup plugin that extends it with recursion. I don't see a reason why recursion wasn't implemented. After all, it uses Python's glob() under the hood.
    • The source can be found at .venv/lib/python3.11/site-packages/ansible/plugins/lookup/fileglob.py
      • Adjust the path to your installation (here Ansible is installed in a virtual env)
      • There is a GitHub link too
    • Details at the end, with a full implementation.
    • I wonder if there is a reason. I'll try a PR and update the answer if there are any updates.

File matching: globs and regex

  • The main ways of matching files have always been

    • Either globs
    • Or regex
  • You can see that in many tools. Many go with globs only, others go with regex, and many offer both.

With with_community.general.filetree we can use regex to its full extent. I'll show that next.

Regex does have more power

  • Globs are generally simpler and offer a good shorthand.
  • Regexes, on the other hand, are richer and more powerful, and allow more complex patterns.
    • With regex, you can match against basically any pattern you want.
    • You can match against specific characters only
      • [abcd12345]+\.ts
      • [a-z5-9]+\.ts
      • \S+\.ts
      • The patterns are endless, from simple to very complex
      • You can also easily test them on regexr.com or in the VS Code search with regex mode activated (even offline).
      • For every glob operator there is an equivalent in regex, but not the other way around
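The character-class patterns above can be tried out in plain Python, since (as far as I know) Ansible's regex_search filter is backed by Python's re.search. This is only a sketch: the `^` and `$` anchors, the labels, and the sample file names are my additions for the demo.

```python
import re

# Anchored variants of the three patterns, keyed by a descriptive label.
PATTERNS = {
    "only listed chars": r"^[abcd12345]+\.ts$",
    "ranges": r"^[a-z5-9]+\.ts$",
    "any non-space": r"^\S+\.ts$",
}


def matching_patterns(name):
    """Return the labels of all patterns that match the given file name."""
    return [label for label, pattern in PATTERNS.items() if re.search(pattern, name)]


for sample in ("abc12.ts", "xyz789.ts", "my file.ts"):
    print(sample, "->", matching_patterns(sample))
```

Note how `abc12.ts` fails the range pattern (1 and 2 are outside 5-9) and `my file.ts` fails all three because of the space.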

with_community.general.filetree: using regex for matching and excluding

  • Both matching and excluding are possible

  • Both are implemented with the when task attribute

  • Doc

Example from a project I was working on

- name: Render templates in nginx_build folder (locally)
  ansible.builtin.template:
    src: "{{ template_file.root }}/{{ template_file.path }}"
    # Strip the .j2 suffix only at the end of the path
    dest: "{{ (site_ngx_conf_build_dir + '/' + template_file.path) | regex_replace('\\.j2$', '') }}"
    mode: "0754"
  delegate_to: localhost
  # Old `fileglob` lookup that is not recursive (what a shame)
  # loop: "{{ query('fileglob', site_ngx_conf_build_dir + '/**/*.j2') }}"
  with_community.general.filetree: "{{ site_ngx_conf_build_dir }}"
  loop_control:
    loop_var: template_file # only renaming the loop variable from the default `item`
  # Here goes the matching
  when:
    - template_file.state != 'directory'
    # Match with regex

    # is truthy => to match
    - (template_file.path | regex_search('.+?\.j2$')) is truthy # match recursively anything that ends with .j2

    # For union matching
    # - (
    #     ((template_file.path | regex_search(sub_path + '/some/.+?/doe/.+?\.j2$')) is truthy) or
    #     ((template_file.path | regex_search(sub_path + '/another/.+?\.j2$')) is truthy)
    #   )

    # For non-recursive matching, only one folder (you can combine it with the union above if you want)
    # - (template_file.path | regex_search('some/[^\/]+?\.j2$')) is truthy

    # is falsy => to exclude
    - (template_file.path | regex_search('certificates_ref_only')) is falsy
    # In short, regex allows us to match just about anything we want
  • In my real project I needed to recurse across all files that end with .j2 (template files)
    • I only needed
- (item.path | regex_search('.+?\.j2$')) is truthy # match recursively anything that ends with .j2

regex_search() for regex

- (item.path | regex_search('.+?\.j2$')) is truthy
  • you can see how we set the base

when and multiple conditions list

The when list is the equivalent of and, that is, the intersection of the conditions. If any one of them is falsy, the whole list is falsy.

- condition is truthy # matching
- another is truthy # ???
  • A list is basically
(
  (condition is truthy) and
  (another is truthy)
)
  • That is true only if both of them match, which creates the intersection of the two, not the union.

    • true and false = false (not union)
    • only true and true = true
  • With that out of the way, we can establish this

    • Inclusion: you can have only one element in that list
      • For a union, we use a block with or (details below)
    • Exclusion: you can have multiple elements (because of the and)
      • false and true = false
        • a single excluding condition is effective in all cases
        • which makes multiple exclusions a union of exclusions
In short:

- one match condition # [optionally with `or or or ...` construct ]
- exclusion 1 # |
- exclusion 2 # |
- exclusion 3 # |
- exclusion 4 # |_ union of exclusion
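The same logic can be sketched in Python, where a `when` list corresponds to `all()` over the conditions. The function name and the sample paths are hypothetical; the patterns are the ones used in this answer.

```python
import re


def selected(path, include, excludes):
    """Mimic a `when` list: the inclusion test AND every exclusion test must pass."""
    conditions = [re.search(include, path) is not None]
    # Each exclusion passes only when its pattern is absent from the path.
    conditions += [re.search(pattern, path) is None for pattern in excludes]
    return all(conditions)


excludes = ["certificates_ref_only", "drafts/"]
for path in ("conf/site.j2", "certificates_ref_only/a.j2", "drafts/b.j2", "conf/site.conf"):
    print(path, "->", selected(path, r"\.j2$", excludes))
```

A single failing exclusion is enough to reject a path, which is why the exclusions behave as a union.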

Inclusion and exclusion

  • is truthy for inclusion
  • is falsy for exclusion
when:
  # Match with regex
  # is truthy => to match
  - (item.path | regex_search('.+?\.j2$')) is truthy # match recursively anything that end with .j2
  # is falsy => to exclude
  - (item.path | regex_search('certificates_ref_only')) is falsy
  # In short regex allow us to just match anything we want

Inclusion union

- (
    ((item.path | regex_search(sub_path + '/some/.+?/doe/.+?\.j2$')) is truthy) or
    ((item.path | regex_search(sub_path + '/another/.+?\.j2$')) is truthy)
  )
  • Remember that in Jinja2 the logical operators (or, and, ...) have low precedence, lower than filters and tests
    • The parentheses in (firstInclusion) or (secondInclusion) are therefore not strictly required, but they keep the grouping explicit
    • Inclusion union is as simple as that.
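In Python terms, the inclusion union is `any()` over the inclusion patterns. A minimal sketch, assuming a hypothetical `sub_path` value of `sites`:

```python
import re

sub_path = "sites"  # hypothetical value, only for the demo

INCLUDE_PATTERNS = [
    sub_path + r"/some/.+?/doe/.+?\.j2$",
    sub_path + r"/another/.+?\.j2$",
]


def included(path, patterns=INCLUDE_PATTERNS):
    """True when at least one inclusion pattern matches (the `or` block)."""
    return any(re.search(pattern, path) for pattern in patterns)


for path in ("sites/some/x/doe/a.j2", "sites/another/b.j2", "sites/other/c.j2"):
    print(path, "->", included(path))
```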

Regex equiv of globs

  • Many characters are special in regex => you need to escape them

    • . => \. (because . in regex means any character except a newline)
    • ^ => \^
    • $ => \$ ($ in regex means end of the string, or end of a line in multiline mode)
    • [ => \[
    • ] => \]
    • / => generally doesn't need escaping. But if it is used as the regex delimiter, as in /regexPattern/g, you need to escape it as \/
      • Escaping it works in both cases. In a 'some/pattern' context I prefer not to escape it (cleaner).
    • ... (the list can be long)
  • ** or **/* => .+? [matches a run of one or more characters]

    • regex also has .*?, which matches zero or more characters (I wonder how to do that in glob)
    • recursive
    • some/path/**/*.lsm => some/path/.*?\.lsm$
    • some/path/**/some/**/another/**/*.lsm => some/path/.*?/some/.*?/another/.*\.lsm$
    • **/* maps to .*? because ** can match nothing (a sub dir or none)
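A quick way to check such a translation is to run the regex against a few sample paths. This sketch uses the regex equivalent of some/path/**/*.lsm; the leading `^` and the sample paths are my additions.

```python
import re

# Regex counterpart of the glob some/path/**/*.lsm, anchored at both ends.
RECURSIVE_LSM = re.compile(r"^some/path/.*?\.lsm$")

paths = [
    "some/path/a.lsm",        # directly in some/path (** matched nothing)
    "some/path/x/y/b.lsm",    # nested two levels down
    "some/path/x/b.lsm.bak",  # wrong suffix
    "other/some/path/c.lsm",  # wrong prefix
]

matches = [p for p in paths if RECURSIVE_LSM.search(p)]
print(matches)
```

Only the first two paths are kept: the `$` rejects the `.bak` file and the `^` rejects the path that merely contains some/path.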


  • In regex, if you want to match at the end of the path you should use $

  • Otherwise the pattern can match anywhere in the path, not only at the end. So be careful.

    • Generally, you should always use $ (the glob equivalent), unless you know what you are doing.
  • Same for the start, with ^

  • For a path from start to end: ^matching_pattern_in_between$

  • In most cases, use ^some/path/pattern/here$

    • We need both when we want to match an exact path from start to end (which is what globs do by design). Without them, the pattern also matches when it fits only a substring of the path, and you may not want that.
  • We can use only ^ or only $ if we want to match just the start or just the end. For example, (item.path | regex_search('\.j2$')) is truthy only cares about the end: anything ending with .j2 is matched. We don't even need .+.

  • What about the non-recursive *? => [^/]*?

  • See next section
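The effect of the anchors is easy to see with a few sample paths. A small Python sketch (the paths and the helper name are made up):

```python
import re


def matched(pattern, paths):
    """Return the paths in which the pattern is found anywhere (re.search)."""
    return [p for p in paths if re.search(pattern, p)]


paths = ["site.j2", "site.j2.bak", "old/conf/site.j2", "conf/site.j2"]

print(matched(r"\.j2", paths))              # no anchor: also hits site.j2.bak
print(matched(r"\.j2$", paths))             # end anchor: real .j2 files only
print(matched(r"conf/site\.j2$", paths))    # still matches inside old/conf/
print(matched(r"^conf/site\.j2$", paths))   # both anchors: exact path only
```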

Regex (.+ or .*) is recursive by nature. So how do we match non-recursively?

  • We should use
[^/]+?
  • Which means a run of one or more characters other than /
    • In the context of paths, that should suffice
    • You can also exclude whitespace characters if you want, with [^/\s]+? ...
    • If you want to match zero or more characters, use * instead of + => [^/]*?
    • So it matches until it reaches a /. If the rest doesn't fit the pattern, it's not a match. That is just what * does in a glob.

Example

# For non-recursive matching, only one folder (you can combine it with the union above if you want)
# item.path is relative to the base passed to filetree, so the base is not part of the pattern
- (item.path | regex_search('^[^/]+?\.j2$')) is truthy
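To compare the non-recursive class with the recursive dot, here is a short Python sketch. The folder name `some` and the sample paths are placeholders, and the paths are relative, as item.path is.

```python
import re

ONE_LEVEL = r"^some/[^/]+?\.j2$"     # files directly inside some/
ONE_LEVEL_OR_EMPTY = r"^some/[^/]*?\.j2$"  # same, but the name part may be empty
ANY_DEPTH = r"^some/.+?\.j2$"        # files at any depth below some/

paths = ["some/a.j2", "some/sub/b.j2", "some/.j2"]


def matched(pattern):
    """Return the sample paths matched by the pattern."""
    return [p for p in paths if re.search(pattern, p)]


print(matched(ONE_LEVEL))
print(matched(ONE_LEVEL_OR_EMPTY))
print(matched(ANY_DEPTH))
```

`[^/]` cannot cross a slash, so `some/sub/b.j2` is only matched by the `.+?` variant, and `some/.j2` is only matched when `*` allows an empty name part.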

item.path and filetree

  • Remember that item.path is relative to the base you passed to filetree.
    • Everything applies relative to that base
    • Same for ^some/path/pattern/here$
      • When we use ^ and $, the pattern has to describe the relative path from start to end.

I guess that covers the fundamentals of using regex for matching with with_community.general.filetree. If you are already familiar with regex, then you already know all of this.

filetree matches files, directories and links

  • item.state
    • file
    • directory
    • link

Remember to match only files and links and not directories.

Add

- item.state != 'directory'

or

- item.state == 'file'

to the when list. I prefer to add it as the first one.
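For intuition, here is a rough Python analogue of what the loop sees, built on os.walk. It is only a sketch: the real filetree items carry more fields, and the function names here are made up. It keeps just the root, path, src and state keys used in this answer.

```python
import os
import tempfile


def walk_tree(root):
    """Yield dicts loosely shaped like filetree items (root, path, src, state)."""
    for current, dirs, files in os.walk(root):
        for name in sorted(dirs) + sorted(files):
            src = os.path.join(current, name)
            if os.path.islink(src):
                state = "link"
            elif os.path.isdir(src):
                state = "directory"
            else:
                state = "file"
            yield {"root": root, "path": os.path.relpath(src, root),
                   "src": src, "state": state}


def files_only(root):
    """Relative paths of regular files, like `when: item.state == 'file'`."""
    return sorted(item["path"] for item in walk_tree(root) if item["state"] == "file")


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a"))
    for parts in (("a", "b.j2"), ("c.j2",)):
        open(os.path.join(tmp, *parts), "w").close()
    print(files_only(tmp))
```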

Extending fileglob implementation

This is the Python glob definition. Recursive mode is disabled by default

def glob(pathname, *, root_dir=None, dir_fd=None, recursive=False,
        include_hidden=False):

And in the fileglob.py you find glob() used like this

for dwimmed_path in found_paths:
    if dwimmed_path:
        globbed = glob.glob(to_bytes(os.path.join(dwimmed_path, term_file), errors='surrogate_or_strict'))
        term_results = [to_text(g, errors='surrogate_or_strict') for g in globbed if os.path.isfile(g)]
        if term_results:
            ret.extend(term_results)
            break

Simple and maybe efficient to extend

All that needs to be done is:

  • converting
globbed = glob.glob(to_bytes(os.path.join(dwimmed_path, term_file), errors='surrogate_or_strict'))

to

globbed = glob.glob(to_bytes(os.path.join(dwimmed_path, term_file), errors='surrogate_or_strict'), recursive=True)

Note that recursive=True is an argument of glob.glob(), not of to_bytes(). I would maybe even add

include_hidden=True

(include_hidden requires Python 3.11 or later.)

That would give a version that supports recursion and doesn't need any parameters. Just use **/* for recursion.
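The difference the flag makes can be checked with plain Python, outside Ansible. In this sketch the helper names and the demo tree are made up; only glob.glob and its recursive parameter are real.

```python
import glob
import os
import tempfile


def files_matching(pattern, recursive=False):
    """Return regular files matching the glob pattern, sorted."""
    return sorted(p for p in glob.glob(pattern, recursive=recursive) if os.path.isfile(p))


def make_tree(root):
    """Create top.j2, a/mid.j2 and a/b/deep.j2 under root."""
    os.makedirs(os.path.join(root, "a", "b"))
    for parts in (("top.j2",), ("a", "mid.j2"), ("a", "b", "deep.j2")):
        with open(os.path.join(root, *parts), "w") as handle:
            handle.write("demo\n")


def relative_matches(root, recursive):
    """Match <root>/**/*.j2 and return the results relative to root."""
    pattern = os.path.join(root, "**", "*.j2")
    return [os.path.relpath(p, root) for p in files_matching(pattern, recursive=recursive)]


with tempfile.TemporaryDirectory() as tmp:
    make_tree(tmp)
    print(relative_matches(tmp, recursive=False))  # ** behaves like *: one level down only
    print(relative_matches(tmp, recursive=True))   # ** spans zero or more directories
```

Without the flag, `**` silently degrades to `*`, so the pattern still returns something (files exactly one directory down), which can be confusing when debugging.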

Maybe better: extend by adding optional options

I wonder which one is better. But here is how we can give the fileglob lookup optional options that we pass to it.

To set the options we use

class LookupModule(LookupBase):

    def run(self, terms, variables=None, **kwargs):

        self.set_options(var_options=variables, direct=kwargs)

And for getting the options

response = open_url(
    term, validate_certs=self.get_option('validate_certs'),
    use_proxy=self.get_option('use_proxy'),
    url_username=self.get_option('username'),
    url_password=self.get_option('password'),
    headers=self.get_option('headers'),
    force=self.get_option('force'),
    timeout=self.get_option('timeout'),
    http_agent=self.get_option('http_agent'),

And for final lookup usage, it would be something like

- name: url lookup splits lines by default
  ansible.builtin.debug: msg="{{item}}"
  loop: "{{ lookup('ansible.builtin.url', 'https://github.com/gremlin.keys', wantlist=True) }}"

- name: display ip ranges
  ansible.builtin.debug: msg="{{ lookup('ansible.builtin.url', 'https://ip-ranges.amazonaws.com/ip-ranges.json', split_lines=False) }}"

All of that comes from the same url.py lookup file. And we can do the same with fileglob.
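As a sketch of where such an option would end up, here is the search loop rewritten as a standalone Python function with a hypothetical `recursive` keyword. Inside a real plugin the value would presumably come from self.get_option(), and the bytes/text conversion is left out here. The function name and parameters are my own, not part of Ansible.

```python
import glob
import os
import tempfile


def fileglob_terms(term_file, search_dirs, recursive=False):
    """Mirror the lookup loop: the first search directory with matches wins.

    `recursive` is a hypothetical option; False keeps today's behaviour.
    """
    for directory in search_dirs:
        if not directory:
            continue
        globbed = glob.glob(os.path.join(directory, term_file), recursive=recursive)
        results = sorted(path for path in globbed if os.path.isfile(path))
        if results:
            return results  # same effect as the `break` in the original loop
    return []


with tempfile.TemporaryDirectory() as tmp:
    empty, full = os.path.join(tmp, "empty"), os.path.join(tmp, "full")
    os.makedirs(empty)
    os.makedirs(os.path.join(full, "sub"))
    open(os.path.join(full, "sub", "x.j2"), "w").close()
    print(fileglob_terms("*.j2", [empty, full]))
    print(fileglob_terms("**/*.j2", [empty, full], recursive=True))
```

Defaulting the option to False means existing playbooks would behave exactly as before, which is probably the safer design if this were ever proposed upstream.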

  • I'll try to make a PR, and see if it gets accepted.
  • If you add your own plugin, you need to create a folder called lookup_plugins next to your playbook. Otherwise, use the lookup_plugins={{ ANSIBLE_HOME ~ "/plugins/lookup:/usr/share/ansible/plugins/lookup:our_ansible/lookup_plugins" }} setting in ansible.cfg

The full implementations to copy

Notes

  • I haven't checked the implementations yet. I will in the next day and update afterwards
  • I also haven't looked closely enough at why the core team limited it to non-recursive matching. I don't see a reason offhand. I'll evaluate it better, update this, and see about a PR.
  • Same for checking the Ansible issue tracker.

Upvotes: 6

Stephen Ostermiller

Reputation: 25525

Ansible has a filetree lookup that is recursive.

Unlike glob, item is not just a string, but an object. You'd need to use item.src (absolute) and item.path (relative) to get the filename out of it.

Because filetree will also list directories, you need to add a when filter so that it only gives you the files back.

- name: "Apply templates"
  template:
    src: "{{ item.src }}"
    dest: "{{ item.path | replace('.j2', '') }}"
  with_filetree: "{{ user_configs_path }}/"
  when: item.state == 'file'
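One detail about the dest expression: as far as I know, Jinja's replace filter works like Python's str.replace and removes every occurrence of '.j2', not only the suffix. A short Python sketch of the difference (the function names and sample paths are made up):

```python
import re


def strip_j2_everywhere(path):
    """What a plain replace does: drops every '.j2' in the path."""
    return path.replace(".j2", "")


def strip_j2_suffix(path):
    """Drops '.j2' only when it is the final extension."""
    return re.sub(r"\.j2$", "", path)


for path in ("nginx/site.conf.j2", "app.j2.d/site.conf.j2"):
    print(path, "->", strip_j2_everywhere(path), "|", strip_j2_suffix(path))
```

For typical template trees both give the same result. The suffix-only form matters only if '.j2' can appear elsewhere in a directory or file name, in which case regex_replace with an end anchor is the safer choice.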

Upvotes: 8

U880D

Reputation: 12092

Regarding

My problem is that with_fileglob only works on the first-level directory, and I cannot seem to enable a recursive mode.

and according to the fileglob documentation

Matches all files in a single directory, non-recursively, that match a pattern. It calls Python’s “glob” library.

for

How to make Ansible with_fileglob work recursively for all subdirectories

one would need to enhance the lookup plugin code.


Regarding

Basically I want to recursively iterate all directories for files only, and in a loop apply the template module to them and create them in remote server

as solution and depending on your requirements and what you try to achieve, you could use just

Similar Q&A

Upvotes: 1
