Reputation: 961
I have a set of files saved on my laptop. The folder structure looks like this:
Part1 (folder)
    Part1 (subfolder)
        awards_1990 (subfolder)
            awards_1990_00 (subfolder)
                (files)
            awards_1990_01
                (files)
            ...
            ...
            ...
        awards_1991
            awards_1991_01
                (files)
            awards_1991_01
            awards_1991_01
            ...
            ...
            ...
        awards_1992
            ...
            ...
            ...
        awards_1993
            ...
            ...
            ...
        awards_1994
            ...
            ...
            ...
I am trying to collect the list of file paths with os.walk. The code I have is this:
import os
matches=[]
for root, dirnames, dirname in os.walk('E:\\Grad\\LIS\\LIS590 Text mining\\Part1\\Part1'):
    for dirname in dirnames:
        for filename in dirname:
            if filename.endswith(('.txt','.html','.pdf')):
                matches.append(os.path.join(root,filename))
When I call matches, it returns [].
I tried another approach:
import os
dirnames=os.listdir('E:\\Grad\\LIS\\LIS590 Text mining\\Part1\\Part1')
for filenames in dirnames:
    for filename in filenames:
        path=os.path.join(filename)
        print (os.path.abspath(path))
This one gives me the following result:
C:\Python32\a
C:\Python32\w
C:\Python32\a
C:\Python32\r
C:\Python32\d
C:\Python32\s
C:\Python32\_
C:\Python32\1
...
I am still researching this error. Any idea how to fix it?
Upvotes: 1
Views: 4078
Reputation: 414129
The loop "for filename in dirname:" enumerates the individual characters of the dirname string. Try:
#!/usr/bin/env python
import os
topdir = r'E:\Grad\LIS\LIS590 Text mining\Part1\Part1'
matches = []
for root, dirnames, filenames in os.walk(topdir):
    for filename in filenames:
        if filename.endswith(('.txt','.html','.pdf')):
            matches.append(os.path.join(root, filename))
print("\n".join(matches))
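You don't need the for-loop over dirnames here: os.walk descends into every subdirectory by itself and yields the files of each directory it visits as filenames.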
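Both points can be checked on a throwaway tree. The following is a minimal sketch, not the asker's real data: the temporary directory and the file names a.txt, b.html and c.jpg are hypothetical, created only for the demonstration.

```python
import os
import tempfile

# Iterating over a string yields its characters, not directory entries.
print(list("awards")[:3])  # ['a', 'w', 'a']

# Hypothetical tree: <tmp>/awards_1990/awards_1990_00/{a.txt, b.html, c.jpg}
with tempfile.TemporaryDirectory() as topdir:
    nested = os.path.join(topdir, "awards_1990", "awards_1990_00")
    os.makedirs(nested)
    for name in ("a.txt", "b.html", "c.jpg"):
        open(os.path.join(nested, name), "w").close()

    matches = []
    # No loop over dirnames: os.walk visits the nested folders on its own.
    for root, dirnames, filenames in os.walk(topdir):
        for filename in filenames:
            if filename.endswith((".txt", ".html", ".pdf")):
                matches.append(os.path.join(root, filename))

    # Paths relative to the top folder, sorted for stable output.
    found = sorted(os.path.relpath(m, topdir) for m in matches)
    print(found)
```

The two matching files are found two levels below the top folder even though the code never touches dirnames, and c.jpg is skipped because its suffix is not in the tuple.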
Upvotes: 1
Reputation: 1648
The str.endswith method takes suffix[, start[, end]], so if you want to test more than one suffix you must pass them as a single tuple, which is why the extra parentheses are needed:
if filename.endswith(('.txt','.html','.pdf')):
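A short sketch of the difference, using made-up file names: with a tuple, endswith checks each suffix in turn; without the inner parentheses, the second and third strings are taken as the start and end positions and Python raises a TypeError.

```python
# Hypothetical file names, for illustration only.
names = ["report.txt", "index.html", "paper.pdf", "photo.jpg"]

# One tuple argument: matches if the name ends with any of the suffixes.
suffixes = (".txt", ".html", ".pdf")
kept = [n for n in names if n.endswith(suffixes)]
print(kept)

# Three separate arguments: '.html' and '.pdf' land in the start/end slots.
error = None
try:
    "report.txt".endswith(".txt", ".html", ".pdf")
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```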
Upvotes: 3