pipocaDourada
pipocaDourada

Reputation: 117

Is there a more efficient way instead of multiple FOR loops and IF statements?

I have the below code snippet:

for item1 in filesName :
    for item2 in finalName :
        if item1.split('.',1)[0].split('--', 1)[0] == item2.split('.',1)[0].split('--', 1)[0] :
            for eachFile in os.listdir(src) :
                if eachFile == item1 :
                
                    # rename files
                    os.rename(os.path.join(src, item1), os.path.join(src, item2))

                    # move files
                    shutil.move(os.path.join(src, item2), os.path.join(dst, item2))
            
                else :
                    logger.error(error_message)
                    raise ValueError(error_message)

FilesName = ['01-01-2001 Active File Name.xlsx', '01-01-2001 Inactive File Name.xlsx']

FinalName = ['01-01-2001 Active File Name--CORRECTION1.xlsx']

What it intends to do is rename an existing file in case of a correction - for example, I have a file called 'File Name.xlsx', but I want to rename it to 'File Name --CORRECTION1.xlsx' - it matches the file to each other and renames it.

The code works as intended, however my limited experience tells me that I'm using too many FOR loops and IF statements, and there's probably a better way [performance wise] to do it.

So what's my question - is there a better solution?

Upvotes: 1

Views: 97

Answers (2)

GuillemB
GuillemB

Reputation: 621

FilesName = ['01-01-2001 Active File Name.xlsx', '01-01-2001 Inactive File Name.xlsx']
FinalName = ['01-01-2001 Active File Name--CORRECTION1.xlsx']

# Helper function to clean up the splitting
get_base_name = lambda x: x.split('.',1)[0].split('--', 1)[0]

# Remove the two first loops by checking for base name matches
# using set intersection
filesName_base = {get_base_name(x) for x in FilesName}
finalName_base = {get_base_name(x) for x in FinalName}
files_to_change_base = filesName_base.intersection(finalName_base)

# Potential candidates of files to be changed
files_to_change = [x for x in FilesName if get_base_name(x) in files_to_change_base]

# No need to call this on each iteration of the inner loop
files_in_dir = set(os.listdir(src))

for file_to_change in files_to_change:
    if file_to_change in files_in_dir:
        # rename files
        os.rename(os.path.join(src, item1), os.path.join(src, item2))
        # move files
        shutil.move(os.path.join(src, item2), os.path.join(dst, item2))

    else :
        logger.error(error_message)
        raise ValueError(error_message)

Edit: Removed another for loop by just looping once and checking if the files are in the directory. files_in_dir is moved into a set because set membership is a O(1) operation as opposed to O(n).

Upvotes: 2

tripleee
tripleee

Reputation: 189407

There is absolutely no need to loop over all the files in the directory to find out if a file with a particular name already exists.

If I can guess at all as to what you are actually trying to accomplish, the code which writes a new file might currently look like

### FIXME: don't overwrite if outputfile exists
with open(outputfile, "w") as output:
    output.write(data)

and you'd change that to, say,

idx = 1
# base is outputfile without .xslx
base = outputfile[:-5]
while os.path.exists(outputfile):
    outputfile = "%s--CORRECTION%i.xlsx" % (base, idx)
    idx += 1

with open(outputfile, "w" ...

The loop updates the number after --CORRECTION until it lands on a number which isn't already taken by an existing file. (If you delete a file somewhere in the middle of that sequence, the code will obviously reuse that number rather than increment past the last numbered file in the numbered sequence.)

Perhaps a better design altogether would be to use version control. Replacing the Excel file format with something less braindead would also simplify your life tremendously.

Upvotes: 0

Related Questions