Reputation: 35
I have a list of filenames, but the files in the directory are named slightly differently. I want to print the filenames from my list that are not in the directory. Example of a file name in the directory:
FOO_BAR_524B_023D9B01_2021-157T05-34-31__00001_2021-08-30T124702.130.tgz
import os
missing = ['FOO_BAR_524B_023D9B01_2021-157T05-34-31__00001', 'dfiknvbdjfhnv']
for fileName in missing:
    for fileNames in next(os.walk('C:\\Users\\foo\\bar'))[2]:
        if fileName not in fileNames:
            print(fileName)
I can't figure out what I'm doing wrong...
Upvotes: 0
Views: 700
Reputation: 2602
The problem is that you iterate over every file in the directory (for fileNames in next(os.walk(...))[2]) and check if fileName is in each of those file names. For every file in the folder where fileName not in fileNames, fileName is printed, resulting in it being printed many times.
This can be fixed by doing a single check to see if all files in the folder do not contain the target file name.
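import os
missing = ['FOO_BAR_524B_023D9B01_2021-157T05-34-31__00001', 'dfiknvbdjfhnv']
# List the files in the top-level directory once, outside the loop.
fileNames = next(os.walk('C:\\Users\\foo\\bar'))[2]
for missingfileName in missing:
    # Print only if no file name in the folder contains the target name.
    if all(missingfileName not in fileName for fileName in fileNames):
        print(missingfileName)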
If you want it to be more efficient and you only need to check whether each name in missing is a prefix of some file name, then you can use a data structure called a trie. For example, if missing equals ['bcd'] and there is a file called abcde, and these should not be considered a match, then a trie is appropriate here.
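A minimal sketch of the trie idea, assuming a prefix match is all you need. The helper names build_trie, has_prefix and find_missing are made up for illustration, and the file list is hard-coded here instead of being read with os.walk so the snippet runs anywhere:

```python
def build_trie(names):
    """Build a nested-dict trie from an iterable of file names."""
    root = {}
    for name in names:
        node = root
        for ch in name:
            # Reuse the child node for this character, or create it.
            node = node.setdefault(ch, {})
    return root


def has_prefix(trie, prefix):
    """Return True if at least one inserted name starts with prefix."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return False
        node = node[ch]
    return True


def find_missing(wanted, file_names):
    """Return the entries of wanted that are not a prefix of any file name."""
    trie = build_trie(file_names)
    return [name for name in wanted if not has_prefix(trie, name)]


# Hypothetical directory listing; in practice this could come from
# next(os.walk(...))[2] as in the code above.
file_names = [
    'FOO_BAR_524B_023D9B01_2021-157T05-34-31__00001_2021-08-30T124702.130.tgz',
    'abcde',
]
missing = ['FOO_BAR_524B_023D9B01_2021-157T05-34-31__00001', 'dfiknvbdjfhnv', 'bcd']
print(find_missing(missing, file_names))  # ['dfiknvbdjfhnv', 'bcd']
```

Building the trie costs time proportional to the total length of all file names, and each lookup afterwards only costs the length of the name being checked, regardless of how many files are in the folder. Note that 'bcd' is reported as missing even though abcde contains it, because only prefixes count as a match here.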
Upvotes: 1