Using pure Python over grep?

Question

I am not familiar with grep, as I've always been on a Windows system, so when someone suggested I add these lines to my code, I was a little confused:

grep = 'grep -E \'@import.*{}\' * -l'.format(name)
proc = Popen(grep, shell=True, cwd=info['path'], stdout=PIPE, stderr=PIPE)

From my understanding, this tries to find all files in cwd that contain @import given_file_name, right?
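
For comparison, here is a minimal pure-Python sketch of roughly what that command does: `-E` selects extended regular expressions, `-l` prints only the names of files that match, and the shell expands `*` to the non-hidden entries of the working directory (no recursion). The helper name grep_l is hypothetical, and Python's re syntax is not identical to grep's, so treat this as an approximation rather than an exact equivalent.

```python
import os
import re


def grep_l(pattern, directory):
    """Roughly mimic `grep -E PATTERN * -l` run inside `directory`.

    grep_l is a hypothetical helper name used only for illustration.
    """
    regex = re.compile(pattern)
    matches = []
    for entry in sorted(os.listdir(directory)):
        if entry.startswith('.'):
            continue  # the shell's `*` does not expand to hidden names
        path = os.path.join(directory, entry)
        if not os.path.isfile(path):
            continue  # without -r, grep does not search inside directories
        with open(path, 'r', errors='replace') as f:
            # any() stops at the first matching line, like -l does
            if any(regex.search(line) for line in f):
                matches.append(entry)
    return matches
```

Note that, like the grep command, this only looks at the top level of the directory; searching a whole project tree needs something like os.walk().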

If this is how grep works, I would need to write something in pure Python that does the same thing. However, I'm worried about how long that might take.

The script is part of a Sublime Text 3 plugin that uses the sublime_plugin.EventListener method on_post_save to find all files that import the just-saved file and build a list of file names to compile.

def files_that_import(filename, project_root):
    files = []
    for root, dirnames, files in os.walk(project_root):
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                with open(fn, 'r') as f:
                    data = f.read()
                if re.search(r'@import.*["\']{}["\'];'.format(fn), data):
                    files.append(fn)
    return files

Not knowing exactly how grep works, this was the best I could think of. However, as I said, I'm worried about the time it would take to scan through all the .scss and .sass files. There shouldn't be a ton of them, but reading the contents of each one seems more complicated than it needs to be.

Update

I updated the code using @nneonneo's corrections. I also noticed that my original code searched each file for an @import of its own name (fn) rather than the name of the saved file (filename).

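import os
import re

def files_that_import(filename, project_root):
    # Escape the name so characters such as '.' are matched literally.
    pattern = re.compile(r'''@import.*["']{}["'];'''.format(re.escape(filename)))
    found = []
    for root, dirnames, files in os.walk(project_root):
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                # os.walk() yields bare names, so join them with root before opening.
                path = os.path.join(root, fn)
                with open(path, 'r') as f:
                    if any(pattern.search(line) for line in f):
                        found.append(path)
    return found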

Update: If anyone finds this useful and wants to use the code, note that I changed files = [] to found = [], because files is reassigned by the for loop over os.walk(), which caused an error.

nneonneo · Accepted Answer

You've mostly got it. You can make it a bit more efficient by doing the following:

import_pattern = re.compile(r'''@import.*["']{}["'];'''.format(fn))
with open(fn, 'r') as f:
    for line in f:
        # search() rather than match(), so an indented @import is still found
        if import_pattern.search(line):
            files.append(fn)
            break

This will scan through each line, and break as soon as it finds what it is looking for. It should be faster than reading the whole file.
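
To illustrate the early exit, here is a small sketch that counts how many lines are examined before the scan stops. The scan helper and the sample data are made up for the demonstration, and the imported name 'base' is just an assumed example.

```python
import io
import re


def scan(lines, pattern):
    """Return (found, lines_read), stopping at the first matching line.

    scan is a hypothetical helper used only for this demonstration.
    """
    lines_read = 0
    for line in lines:
        lines_read += 1
        if pattern.search(line):
            return True, lines_read
    return False, lines_read


# An in-memory "file": one @import line followed by 1000 ordinary lines.
sample = io.StringIO("@import 'base';\n" + ".rule { color: red; }\n" * 1000)
pattern = re.compile(r'''@import.*["']base["'];''')

print(scan(sample, pattern))  # (True, 1)
```

The file object still reads from disk in buffered chunks, so the saving is mostly in regex work and in not holding the whole file in memory; for small stylesheets the difference is likely to be modest.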
