user1561868

Reputation: 841

Regular expression usage in glob.glob?

import glob

list = glob.glob(r'*abc*.txt') + glob.glob(r'*123*.txt') + glob.glob(r'*a1b*.txt')

for i in list:
  print(i)

This code works to list files in the current folder which have 'abc', '123' or 'a1b' in their names.

How would I use one glob to perform this function?

Upvotes: 83

Views: 152144

Answers (5)

Based on the best answer (I just totally don't understand why we can't use both regex and glob):

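import glob
import os
import re

# os.path.join keeps the pattern portable; on Windows it yields r'.\**\*'.
path = os.path.join('.', '**', '*')
# Test only the file name, so a directory called e.g. 'abc' does not cause false matches.
res = [f for f in glob.glob(path, recursive=True)
       if re.search(r'(abc|123|a1b).*\.txt$', os.path.basename(f))]
for f in res:
    print(f)

With recursive=True, the '**' in path makes the glob module search the current directory and all of its subdirectories.

Because '**' also matches zero directories, the same pattern still finds files when the folder has no subdirectories. To skip subdirectories altogether, use '*' as the pattern instead.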
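
A small sketch to check how the recursive pattern behaves, assuming Python 3.5+ (where glob accepts recursive=True). The directory layout and file names are made up for the demonstration. It creates a temporary tree and shows that '**' picks up files both at the top level and inside a subdirectory:

```python
import glob
import os
import re
import tempfile

# Throwaway tree: one matching file at the top level, one in a subdirectory,
# and one file that should not match. All names are hypothetical.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("top_abc.txt", os.path.join("sub", "deep_123.txt"), "other.txt"):
        open(os.path.join(root, rel), "w").close()

    pattern = os.path.join(root, "**", "*")
    wanted = re.compile(r"(abc|123|a1b).*\.txt$")
    # Apply the regex to the file name only, then keep just the names for display.
    found = sorted(
        os.path.basename(f)
        for f in glob.glob(pattern, recursive=True)
        if wanted.search(os.path.basename(f))
    )

print(found)  # ['deep_123.txt', 'top_abc.txt']
```

Matching on os.path.basename(f) rather than on the full path is a design choice here: it keeps directory names from accidentally satisfying the regex.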

Upvotes: 1

Evan

Reputation: 2301

I'm surprised that no answers here used filter.

import os
import re

def glob_re(pattern, strings):
    return filter(re.compile(pattern).match, strings)

filenames = glob_re(r'.*(abc|123|a1b).*\.txt', os.listdir())

This accepts any iterable of strings, including lists, tuples, dicts (if all keys are strings), etc. Note that .match only anchors at the start of the string, which is why the pattern begins with .*; if you want to match anywhere in the name, you could change .match to .search. In Python 3, filter returns a lazy iterator rather than a list, so if you need to index the results or iterate over them more than once, convert the result to a list yourself, or wrap the return statement with list(...).
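
To make the .match versus .search difference and the lazy result concrete, here is a minimal sketch assuming Python 3. The file names are hypothetical stand-ins for os.listdir() output, and glob_re is repeated only so the snippet runs on its own:

```python
import re

def glob_re(pattern, strings):
    # Lazily keep the strings that the regex matches from their first character.
    return filter(re.compile(pattern).match, strings)

# Hypothetical file names standing in for os.listdir() output.
names = ["abc.txt", "x_123_y.txt", "notes.txt", "a1b.csv"]

# Without a leading '.*', .match only accepts names that START with a keyword.
anchored = list(glob_re(r"(abc|123|a1b).*\.txt", names))

# .search finds the keyword anywhere in the name.
anywhere = [n for n in names if re.search(r"(abc|123|a1b).*\.txt", n)]

# The filter object can be consumed only once.
lazy = glob_re(r".*(abc|123|a1b).*\.txt", names)
first_pass = list(lazy)
second_pass = list(lazy)

print(anchored)     # ['abc.txt']
print(anywhere)     # ['abc.txt', 'x_123_y.txt']
print(first_pass)   # ['abc.txt', 'x_123_y.txt']
print(second_pass)  # []
```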

Upvotes: 46

R.Camilo

Reputation: 21

import glob

path_to_directory = "./"  # assumed placeholder: directory to search, with trailing separator
for filename in glob.iglob(path_to_directory + "*.txt"):
    if "abc" in filename or "123" in filename or "a1b" in filename:
        print(filename)

Upvotes: 1

Schnouki

Reputation: 7707

The easiest way would be to filter the glob results yourself. Here is how to do it using a simple list comprehension:

import glob
res = [f for f in glob.glob("*.txt") if "abc" in f or "123" in f or "a1b" in f]
for f in res:
    print(f)

You could also use a regexp and no glob:

import os
import re
path = '.'  # assumed placeholder: directory to list
res = [f for f in os.listdir(path) if re.search(r'(abc|123|a1b).*\.txt$', f)]
for f in res:
    print(f)

(By the way, naming a variable list is a bad idea, since it shadows Python's built-in list type...)
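
A quick sketch of what goes wrong when the name list is reused, assuming nothing beyond the standard library. Once the name is rebound, calling list(...) hits the new object instead of the built-in type:

```python
import builtins

# Rebinding the name hides the built-in type for the rest of this scope.
list = ["abc.txt", "123.txt"]

try:
    list("abc")  # tries to call the list object, not the type
    error = None
except TypeError as exc:
    error = exc

# The built-in is still reachable through the builtins module.
restored = builtins.list("abc")

print(type(error).__name__)  # TypeError
print(restored)              # ['a', 'b', 'c']
```

Picking a different name, such as res in the snippets above, avoids the problem entirely.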

Upvotes: 117

SleepyCal

Reputation: 5993

Here is a ready-to-use way of doing this, based on the other answers. It is not optimized for performance, but it works as described:

import os
import re

def reglob(path, exp, invert=False):
    """glob.glob() style searching which uses regex

    :param path: Directory to list
    :param exp: Regex expression for filename
    :param invert: Invert match to non matching files
    """
    m = re.compile(exp)
    # Keep a name when "it matched" differs from `invert`.
    names = [f for f in os.listdir(path) if bool(m.search(f)) != invert]
    # Prefix each name with the directory, as glob.glob() would.
    return [os.path.join(path, f) for f in names]

Upvotes: 18
