junk fod

Reputation: 1

Trying to delete some files with Python

I am trying to delete the files that are not in a list (list_of_labeled), but the condition is never true. For some reason the condition is also checked only 4010 times, when it should be checked 4010 * number of files (10000) times.

c = 0
with open("C:\_base\MyCode\Song_genre_classification\MillionSongSubset\list_of_labeled.txt") as list_of_labeled:
    for path, subdirs, files in os.walk("C:\_base\MyCode\Song_genre_classification\MillionSongSubset\data"):
        for name in files:
            for line in list_of_labeled:
                c += 1
                if name == line[:21]:
                    # os.remove(path)
                    print(name)
print(c)

Upvotes: 0

Views: 51

Answers (1)

Mark Tolonen

Reputation: 178115

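You open the file once, outside the loops. The first time the inner loop (for line in list_of_labeled) runs, it reads the file to the end. Every later run of that loop reads nothing because the file position is still at the end, which is why the condition is checked only once per line of the file rather than once per line for every file name. Either rewind the file before each inner loop (for example with list_of_labeled.seek(0)), or load the file into a list once and reuse the list.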
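To see the exhaustion in isolation, here is a minimal sketch. It uses an in-memory io.StringIO as a stand-in for list_of_labeled.txt, and the three file names in it are made up for illustration; a file opened with open() behaves the same way when iterated.

```python
import io

# In-memory stand-in for list_of_labeled.txt (contents are hypothetical).
list_of_labeled = io.StringIO("a.h5\nb.h5\nc.h5\n")


def count_lines(f):
    """Iterate over the file object from its current position and count lines."""
    return sum(1 for _ in f)


first_pass = count_lines(list_of_labeled)   # reads every line
second_pass = count_lines(list_of_labeled)  # position is at the end: reads nothing

list_of_labeled.seek(0)                     # rewind to the start of the file
third_pass = count_lines(list_of_labeled)   # reads every line again

print(first_pass, second_pass, third_pass)  # prints: 3 0 3
```

Rewinding works, but it still rereads the whole file for every file name, so loading the lines once is usually the better fix.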

This algorithm is also really slow, because it scans the whole list for every file name. Instead of rereading the file and slicing each line for each name, read the file once, slice each line, and store the results in a set for fast lookups. Something like (untested):

import os

# Read the list once and keep the first 21 characters of each line in a set.
with open(r"C:\_base\MyCode\Song_genre_classification\MillionSongSubset\list_of_labeled.txt") as list_of_labeled:
    lines = {line[:21] for line in list_of_labeled}

for path, subdirs, files in os.walk(r"C:\_base\MyCode\Song_genre_classification\MillionSongSubset\data"):
    for name in files:
        # Set membership is a constant-time lookup on average.
        if name in lines:
            print(name)
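The snippet above prints the names that are in the list, while the question asks to delete the ones that are not. A possible sketch of that step follows. The function names find_unlabeled and delete_unlabeled, the prefix_len parameter, and the dry_run flag are hypothetical and not from the original code. It also assumes the list file is not stored inside the data directory. Note that os.remove needs the full file path, built with os.path.join(path, name); the commented-out os.remove(path) in the question would target the directory instead.

```python
import os


def find_unlabeled(list_path, data_dir, prefix_len=21):
    """Return full paths of files under data_dir whose names are not in the list."""
    with open(list_path) as list_of_labeled:
        # Strip the newline first so lines shorter than prefix_len still match.
        labeled = {line.rstrip("\n")[:prefix_len] for line in list_of_labeled}

    unlabeled = []
    for path, subdirs, files in os.walk(data_dir):
        for name in files:
            if name not in labeled:
                # os.walk yields the directory in `path`; join it with the name.
                unlabeled.append(os.path.join(path, name))
    return unlabeled


def delete_unlabeled(list_path, data_dir, dry_run=True):
    """Delete unlabeled files, or only report them when dry_run is True."""
    targets = find_unlabeled(list_path, data_dir)
    for full_path in targets:
        if dry_run:
            print("would delete", full_path)
        else:
            os.remove(full_path)
    return targets
```

Running it with dry_run=True first and checking the printed paths is a cheap safeguard before deleting anything for real.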

Upvotes: 1
