tvr2006
tvr2006

Reputation: 23

Python Search for File Unknown Extension

New to python, apologies if this is a simple question. I've searched around a bit and found plenty on searching for files with an unknown name and known file extension, but not a known name and unknown extension, and if nobody minds I could use a little help getting my code to run correctly.

What I'm attempting to write is a Python function that accepts a directory and name, and then returns a list with the path to all files (with any file extension) and directories with that name. The directory parameter will be a computer drive (such as C or F), and the name parameter is the name (without extension) of the file to be searched for.

The following is the code that I have:

import os
import glob
def search_directory(directory,name):
    result = []
    for root,dirs,files in os.walk(directory,topdown=True):
        files_lower = []
        dirs_lower = []
        for i in files:
            files_lower.append(i.lower())
        for i in dirs:
            dirs_lower.append(i.lower())
        for i in glob.glob(name + '.*'):
            if i.lower() in files_lower:
                result.append(root + "\\" + files[files_lower.index(i.lower())])
        if name.lower() in dirs_lower:
            result.append(root + "\\" + dirs[dirs_lower.index(name.lower())])
    if (len(result) == 0):
        result.append("fileNotFound")
    return result

Currently, I am only able to find results if a copy of the file is in the directory of my program. If there's not a copy there, it doesn't find the file, even though there are two copies on my drive.

I was hoping somebody could explain to me why this is the case and how to correct it so that it always finds the files that I'm searching for.

Upvotes: 0

Views: 1874

Answers (1)

ShadowRanger
ShadowRanger

Reputation: 155363

Why are you reglobbing to search? It means you end up rescanning directories repeatedly, when os.walk is giving you the names, so you can just check them directly using os.path.splitext to do extension splitting. You can also simplify the logic by making it a generator function, so you yield files as you find them, getting results faster and avoiding unnecessary state when you are processing each file name and throwing it away:

def search_directory(directory,name):
    name = name.lower()  # Convert up front in case it's pass mixed case
    for root, dirs, files in os.walk(directory,topdown=True):
        for e in files + dirs:
            if os.path.splitext(e)[0].lower() == name:
                yield os.path.join(root, e)

This makes it a generator (if you want a list, you'd wrap the call in the list constructor to realize the generator), so it doesn't tell you if there were no hits, but the caller (or a wrapping function that converts to list) can determine this themselves. If you needed to, a simple boolean initialized to False that gets set to True before yielding could let you make the same check, though usually a utility function doesn't need to worry itself with stuff like that.

Upvotes: 2

Related Questions