Nikit Parakh
Nikit Parakh

Reputation: 75

Optimizing a function that returns paths of a folder in python

I am looking to write a function that looks into a given directory and gives me a list of paths of all folders of a particular name.

Let's say I need to search the Desktop and all its subdirectories for folders named "Test". The original code to do it is this:

def finder():
    lst = []
    for root, dir, files in os.walk(r'C:\Users\username\Desktop'):
        for i in dir:
            if i == 'Test':
                lst.append(os.path.join(root,i))
    return lst

I looked online and found that list comprehensions can be much faster in such cases and came up with this function:

def finder2():
    lst = [i[0] for i in os.walk(r'C:\Users\username\Desktop') if i[0][-4:]=='Test']
    return lst

I timed both functions using timeit for 100 repetitions and found that they take a similar amount of time.

  1. Why is the list comprehension not faster?
  2. How can I make it faster?
  3. Is there any other faster way to do the same thing?

Thanks!

Upvotes: 1

Views: 144

Answers (1)

alani
alani

Reputation: 13069

The task is probably mainly I/O limited so you are unlikely to achieve much speed-up however it is performed.

List comprehensions are still effectively loops in the Python level, and may be slightly faster than a for loop because the append attribute does not need to be looked up each time, but the difference is not normally very significant.

To take a more radical comparison, on a Linux system I compared the timings of your Python code with the equivalent find command (find /starting/directory -type d -name Test). Here, find is an executable compiled from C code, so for CPU limited tasks would be expected to be significantly quicker than any explicit loops in Python (including list comprehensions). In fact, I found that running find was only on average about 25% quicker than the Python code. This is indicative of the fact the task is I/O limited, and you are unlikely to achieve significant speed-up by changing the algorithm.

Upvotes: 1

Related Questions