bnlucas

Reputation: 1764

Using pure Python over grep?

I am not familiar with grep because I have always worked on Windows, so when someone suggested I add these lines to my code, I was a little confused:

grep = 'grep -E \'@import.*{}\' * -l'.format(name)
proc = Popen(grep, shell=True, cwd=info['path'], stdout=PIPE, stderr=PIPE)

From my understanding, this looks for every file in cwd that contains @import followed by the given file name, right?
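For readers who have not used grep: -E selects extended regular expressions, -l prints only the names of files that contain a match, and the shell expands * to the non-hidden entries of the current directory (without -r, grep does not descend into subdirectories). A rough pure-Python stand-in for that one command might look like the sketch below. The helper name grep_l is made up for illustration, and Python's re syntax is only assumed to be close enough to grep's extended syntax for a simple pattern like this one.

```python
import os
import re


def grep_l(pattern, directory):
    """Return names of files directly inside `directory` that have a matching line.

    Hypothetical helper: a rough stand-in for running
    `grep -E PATTERN * -l` with cwd=directory.
    """
    regex = re.compile(pattern)
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # The shell's `*` skips dotfiles, and grep without -r skips directories.
        if name.startswith('.') or not os.path.isfile(path):
            continue
        with open(path, 'r', errors='ignore') as f:
            # -l only reports the file name, so stop at the first matching line.
            if any(regex.search(line) for line in f):
                matches.append(name)
    return matches
```

Calling grep_l(r'@import.*colors', some_dir) would then return the list of file names that the grep command would have printed one per line.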

If that is how grep works, I would need to write something in pure Python that does the same thing, but I'm worried about how long it might take.

The script is part of a Sublime Text 3 plugin. It uses the sublime_plugin.EventListener method on_post_save to find all files that import the just-saved file and build a list of file names to compile.

def files_that_import(filename, project_root):
    files = []
    for root, dirnames, files in os.walk(project_root):
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                with open(fn, 'r') as f:
                    data = f.read()
                if re.search(r'@import.*["\']{}["\'];'.format(fn), data):
                    files.append(fn)
    return files

Not knowing exactly how grep works, this was the best I could think of. However, as I said, I'm worried about the time it would take to scan through all .scss and .sass files. There shouldn't be a ton of them, but reading the contents of each one seems more complicated than it needs to be.

Update

I updated the code using @nneonneo's corrections. I also noticed that the code I had been using checked each file for an @import of itself rather than of the saved file.

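import os
import re

def files_that_import(filename, project_root):
    # Escape the name so characters such as "." are matched literally.
    pattern = re.compile(r'''@import.*["']{}["'];'''.format(re.escape(filename)))
    found = []
    for root, dirnames, files in os.walk(project_root):
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                # os.walk yields bare names, so join with root before opening.
                path = os.path.join(root, fn)
                with open(path, 'r') as f:
                    if any(pattern.search(line) for line in f):
                        found.append(path)
    return found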

Update: if anyone finds this useful and wants to use the code, note that I changed files = [] to found = []. The os.walk() loop reassigns files on every iteration, which caused an error.

Upvotes: 1

Views: 118

Answers (1)

nneonneo

Reputation: 179602

You've mostly got it. You can make it a bit more efficient by doing the following:

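# Compile once, before the os.walk loop; filename is the file that was just saved.
import_pattern = re.compile(r'''@import.*["']{}["'];'''.format(re.escape(filename)))
# Inside the loop, root and fn come from os.walk.
with open(os.path.join(root, fn), 'r') as f:
    for line in f:
        # search() rather than match(), so indented @import lines are found too.
        if import_pattern.search(line):
            found.append(fn)
            break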

This scans the file one line at a time and breaks as soon as it finds a match, so the whole file is never held in memory and the rest of the file is skipped once the import is found. It should be faster than reading the whole file.
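To make the early exit concrete, here is a minimal sketch that counts how many lines are actually consumed before the search stops. The helper first_match_line and the sample stylesheet are invented for illustration, and an in-memory io.StringIO stands in for a real file.

```python
import io
import re


def first_match_line(pattern, lines):
    """Return (matched, lines_read), stopping at the first matching line."""
    lines_read = 0
    for line in lines:
        lines_read += 1
        if pattern.search(line):
            return True, lines_read
    return False, lines_read


pattern = re.compile(r'''@import.*["']colors["'];''')
# Hypothetical stylesheet: the import is on line 2 of 1002 lines.
source = io.StringIO('// header\n@import "colors";\n' + 'a { b: c; }\n' * 1000)
print(first_match_line(pattern, source))  # prints (True, 2)
```

With a real file the operating system still reads in buffered chunks, so the saving is in regex work and memory rather than a guarantee that only two lines are fetched from disk. A file that does not contain the import is still scanned to the end.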

Upvotes: 3
