Reputation: 71
I have files of several formats in a directory. I am trying to build a list or dictionary that groups files with the same format (extension) in Python using a for loop, but it is not working.
Here is my sample code:
extension = ['pdf','xlsx','doc']
file_name_path=[]
file_dict ={}
for i in range(len(extension)):
    for file_name in filelst:
        if os.path.splitext(file_name)[-1] == extension[i]:
            file_name_path.append(file_name)
            file_dict[str(extension[i])]= file_name_path
file_name_path
file_dict
where filelst is a list of all the file names, for example:
filelst = ['PD_CFS_PLL_OnMonSummary_2017Q2.xlsx',
           'PD_Detailed_OMR_PLL_Lines_2017Q2.xlsx',
           'PD_Detailed_OMR_PLL_Loans_2017Q2.xlsx',
           'regexp-tip-sheet.pdf',
           'SAS statistical-business-analyst certification .pdf']
Upvotes: 0
Views: 104
Reputation: 36763
An alternative way to get a dictionary with the extensions as keys:
import os
extension = ['.pdf','.xlsx','.doc']
filelist = ['one.pdf','two.pdf','three.doc','four.xlsx'] #just for example
d = dict()
for i in extension:
    # keep every file whose lowercased extension matches the current key
    d[i] = [j for j in filelist if os.path.splitext(j)[-1].lower()==i]
print(d)
output:
{'.doc': ['three.doc'], '.xlsx': ['four.xlsx'], '.pdf': ['one.pdf', 'two.pdf']}
Note that I used dots in the extension list because os.path.splitext returns a tuple whose last element is the extension with its leading dot ('.extension'). .lower() makes this solution case-insensitive, so the strings in the extension list must contain lowercase characters only.
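To make the point about the leading dot concrete, here is a minimal sketch of what os.path.splitext returns. The first file name comes from the question; 'REPORT.PDF' is a made-up name used only to illustrate .lower().

```python
import os

# os.path.splitext returns a (root, ext) tuple; ext keeps its leading dot.
root, ext = os.path.splitext('regexp-tip-sheet.pdf')
print(root, ext)

# Comparing against a dot-less string never matches, comparing with the dot does.
matches_without_dot = (ext == 'pdf')
matches_with_dot = (ext == '.pdf')
print(matches_without_dot, matches_with_dot)

# .lower() normalises an upper-case extension (hypothetical file name).
normalised = os.path.splitext('REPORT.PDF')[-1].lower()
print(normalised)
```

This is why a list such as ['pdf','xlsx','doc'] never matches anything when compared directly against the result of splitext.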
Upvotes: 2
Reputation: 36756
You are overwriting the dictionary's value each time the same key is seen.
Instead, use a list as the value and append to it. This is what defaultdict
is for.
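import os
from collections import defaultdict
# os.path.splitext returns the extension with its leading dot, so include it here
extension = ['.pdf','.xlsx','.doc']
file_dict = defaultdict(list)
for file_name in filelst:
    ext = os.path.splitext(file_name)[-1].lower()
    if ext in extension:
        # defaultdict creates the empty list the first time a key is used
        file_dict[ext].append(file_name)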
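As a usage sketch, the defaultdict approach can be run against the sample filelst from the question. The helper name group_by_extension is hypothetical (it is not part of the original answer), and the extensions are written with a leading dot on the assumption that they are compared directly against the result of os.path.splitext.

```python
import os
from collections import defaultdict


def group_by_extension(file_names, extensions):
    """Group file names by lowercased extension (hypothetical helper)."""
    wanted = {e.lower() for e in extensions}
    groups = defaultdict(list)
    for name in file_names:
        # splitext returns (root, ext); ext includes the leading dot
        ext = os.path.splitext(name)[-1].lower()
        if ext in wanted:
            groups[ext].append(name)
    # convert to a plain dict so missing keys raise KeyError again
    return dict(groups)


# Sample data copied from the question.
filelst = ['PD_CFS_PLL_OnMonSummary_2017Q2.xlsx',
           'PD_Detailed_OMR_PLL_Lines_2017Q2.xlsx',
           'PD_Detailed_OMR_PLL_Loans_2017Q2.xlsx',
           'regexp-tip-sheet.pdf',
           'SAS statistical-business-analyst certification .pdf']

file_dict = group_by_extension(filelst, ['.pdf', '.xlsx', '.doc'])
print(file_dict)
```

With this data only the '.xlsx' and '.pdf' keys appear, since no '.doc' file is present; extensions that never occur are simply absent rather than mapped to an empty list.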
Upvotes: 1