Bram Vanroy

Reputation: 28437

Using a variable function with *-operator

I'm messing around with * and **, and figuring out what the use-cases of these operators would be. For this "study", I wrote a function scandir_and_execute that traverses a directory (recursive by default) and executes a function exec_func on each file that is encountered. The function is variable, meaning when calling scandir_and_execute the programmer can indicate which function to run on every file. In addition, to figure out *, I added a func_args variable (defaults to an empty list) that can hold any number of arguments.

The idea is that the programmer can use any exec_func that they have defined (or built-in) to which the file is the first argument, and that they provide the needed arguments themselves, in a list, which is then expanded on the exec_func call.

Note: at least Python 3.6 is required to run this function (os.scandir needs 3.5, and the f-strings need 3.6).

import os

def scandir_and_execute(root, exec_func, func_args=[], recursive=True, verbose=False):
    if verbose:
        print(f"TRAVERSING {root}")

    # Use scandir to return iterator rather than list
    for entry in os.scandir(root):
        if entry.is_dir() and not entry.name.startswith('.'):
            if recursive:
                scandir_and_execute(entry.path, exec_func, func_args, True, verbose)
        elif entry.is_file():
            if verbose:
                print(f"\tProcessing {entry.name}")

            # Unpack (splat) argument list, i.e. turn func_args into separate arguments and run exec_func
            exec_func(entry.path, *func_args)

Is this the correct way to use *, or am I misinterpreting the documentation and the concept of the operator? The function works, as far as I have tested it, but perhaps there are some caveats or non-pythonic things that I did? For instance, would it be better to write the function like this where the unnamed "superfluous" arguments are tupled together (or another way)?

def scandir_and_execute(root, exec_func, recursive=True, verbose=False, *func_args):
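To make the difference concrete, here is a minimal sketch of how that second signature binds its arguments. The functions report_binding and report_binding_kw are hypothetical stand-ins, not part of the code above; they only return what each parameter received.

```python
def report_binding(root, recursive=True, verbose=False, *func_args):
    """Hypothetical stand-in: *func_args placed after the defaults."""
    return {"recursive": recursive, "verbose": verbose, "func_args": func_args}


def report_binding_kw(root, *func_args, recursive=True, verbose=False):
    """Hypothetical stand-in: *func_args first, so the options become keyword-only."""
    return {"recursive": recursive, "verbose": verbose, "func_args": func_args}


# Extra positional arguments fill recursive and verbose before reaching *func_args
after_defaults = report_binding("root", "a", "b", "c")

# With *func_args ahead of the options, all extras are collected in the tuple
before_defaults = report_binding_kw("root", "a", "b", "c")

print(after_defaults)
print(before_defaults)
```

With the first form, the extras cannot be passed without also passing recursive and verbose positionally, which is probably not what a caller expects.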

Upvotes: 0

Views: 72

Answers (1)

Ry-

Reputation: 224913

That is how you use the splat operator, but consider whether it needs to be your function’s responsibility to pass arguments at all. Say you’re using it like this now:

scandir_and_execute(root, foo, (foo_arg1, foo_arg2), recursive=True)

you can rewrite scandir_and_execute to accept a callable taking one argument:

def scandir_and_execute(root, exec_func, recursive=True, verbose=False):
    if verbose:
        print(f"TRAVERSING {root}")

    # Use scandir to return iterator rather than list
    for entry in os.scandir(root):
        if entry.is_dir() and not entry.name.startswith('.'):
            if recursive:
                scandir_and_execute(entry.path, exec_func, True, verbose)
        elif entry.is_file():
            if verbose:
                print(f"\tProcessing {entry.name}")

            exec_func(entry.path)

and let the caller handle its business:

scandir_and_execute(root, lambda path: foo(path, foo_arg1, foo_arg2))
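If the extra arguments can be given by keyword, functools.partial is an alternative to the lambda. The sketch below assumes a hypothetical foo with prefix and suffix keyword parameters; it is only an illustration of the binding, not the asker's actual callback.

```python
from functools import partial


def foo(path, prefix="", suffix=""):
    """Hypothetical per-file callback that returns a decorated path string."""
    return f"{prefix}{path}{suffix}"


# partial pre-binds the keyword arguments; the path is still supplied later
# as the single positional argument, the way scandir_and_execute would call it.
callback = partial(foo, prefix="<", suffix=">")

result = callback("a.txt")
print(result)
```

Note that partial(foo, foo_arg1, foo_arg2) would bind those values before the path, so positional extras that must come after the path still need the lambda.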

Then drop the callback entirely and make a generator:

def scandir(root, recursive=True, verbose=False):
    if verbose:
        print(f"TRAVERSING {root}")

    # Use scandir to return iterator rather than list
    for entry in os.scandir(root):
        if entry.is_dir() and not entry.name.startswith('.'):
            if recursive:
                yield from scandir(entry.path, True, verbose)
        elif entry.is_file():
            if verbose:
                print(f"\tProcessing {entry.name}")

            yield entry.path

for path in scandir(root, recursive=True):
    foo(path, foo_arg1, foo_arg2)

(Close to os.walk, but not quite!) Now the non-recursive version is just this generator:

(entry.path for entry in os.scandir(root) if entry.is_file())

so you may as well provide only the recursive version:

import os


def is_hidden(dir_entry):
    return dir_entry.name.startswith('.')


def scandir_recursive(root, *, exclude_dir=is_hidden):
    for entry in os.scandir(root):
        yield entry

        if entry.is_dir() and not exclude_dir(entry):
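            yield from scandir_recursive(entry.path, exclude_dir=exclude_dir)

import logging

# INFO records are dropped by default; configure the root logger to show them
logging.basicConfig(level=logging.INFO)
logging.info(f'TRAVERSING {root}')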

for entry in scandir_recursive(root):
    if entry.is_dir():
        logging.info(f'TRAVERSING {entry.path}')
    elif entry.is_file():
        logging.info(f'\tProcessing {entry.name}')
        foo(entry.path, foo_arg1, foo_arg2)
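As a rough check of how exclude_dir behaves, here is a self-contained sketch that builds a throwaway directory tree and runs the generator over it twice. The collect_files helper and the file names are made up for the demonstration; it also shows ** forwarding keyword arguments, since that was part of the question.

```python
import os
import tempfile


def is_hidden(dir_entry):
    return dir_entry.name.startswith('.')


def scandir_recursive(root, *, exclude_dir=is_hidden):
    for entry in os.scandir(root):
        yield entry

        if entry.is_dir() and not exclude_dir(entry):
            yield from scandir_recursive(entry.path, exclude_dir=exclude_dir)


def collect_files(root, **kwargs):
    """Hypothetical helper: sorted file paths relative to root.

    **kwargs is unpacked into scandir_recursive's keyword-only parameters.
    """
    return sorted(
        os.path.relpath(entry.path, root)
        for entry in scandir_recursive(root, **kwargs)
        if entry.is_file()
    )


with tempfile.TemporaryDirectory() as root:
    # Layout: a.txt, sub/b.txt, and a hidden directory .git/config
    os.makedirs(os.path.join(root, "sub"))
    os.makedirs(os.path.join(root, ".git"))
    for rel in ("a.txt", os.path.join("sub", "b.txt"), os.path.join(".git", "config")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("x")

    # Default: hidden directories are yielded but not descended into
    default_files = collect_files(root)
    # Custom predicate that excludes nothing, so .git is traversed too
    all_files = collect_files(root, exclude_dir=lambda entry: False)

print(default_files)
print(all_files)
```

Keeping the predicate as a keyword-only parameter means the caller decides what counts as excluded without the generator needing any extra flags.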

Upvotes: 1
