Reputation: 75
This is for Python 2.
I have a chunk of code that creates an object (dtry) containing three identical lists. Each list holds all of the files (excluding folders) within a folder. This works, but I want to extend it to also cover subfolders.
My working code is as follows:
import os
fldr = "C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing"
dtry[:] = [] # clear list
for i in range(3):
    dtry.append([tup for tup in os.listdir(fldr)
                 if os.path.isfile(os.path.join(fldr, tup))])
This successfully creates the three lists containing the names but not full paths of files (and only files, not folders) inside fldr.
I want this to also search within the subfolders of fldr.
Unfortunately I can't figure out how to get it to do so.
I have cobbled together another piece of code that does list all of the files in the subfolders as well (so it partly works), but it returns the full paths rather than just the file names. This is as follows:
import os
fldr = "C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing"
dtry[:] = [] # clear list
for i in range(3):
    dtry.append([os.path.join(root, name)
                 for root, dirs, files in os.walk(fldr)
                 for name in files
                 if os.path.isfile(os.path.join(root, name))])
I have tried changing the line:
dtry.append([os.path.join(root, name)
to
tup for tup in os.listdir(fldr)
but this is not working for me.
Can anyone tell me what I am missing here?
Again, I am trying to get dtry to be three lists, each list being all of the files within fldr and the files within all of its subfolders.
Upvotes: 0
Views: 1023
Reputation: 31354
You're making an easy problem very hard. On Python 3.5 or later (glob's recursive argument does not exist in Python 2), this works:
from glob import glob
files = glob(r'C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing\**\*', recursive=True)
result = [files for _ in range(3)]
Note that this produces a list with three references to the original list. If you need three identical copies:
from glob import glob
files = glob(r'C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing\**\*', recursive=True)
result = [files.copy() for _ in range(3)]
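One caveat with the pattern above: `**\*` also matches the subfolders themselves, and glob returns paths rather than bare names. Here is a minimal sketch of narrowing that down to file names only, assuming Python 3.5 or later. The helper name `file_names_under` is made up for illustration.

```python
import os
from glob import glob


def file_names_under(folder):
    """Return the bare names of all files under folder, including subfolders."""
    # '**' combined with recursive=True matches any number of nested directories.
    pattern = os.path.join(folder, '**', '*')
    return [os.path.basename(path)
            for path in glob(pattern, recursive=True)
            if os.path.isfile(path)]  # skip the directories glob also returns
```

Keep in mind that glob does not match names starting with a dot by default, so hidden files would be left out of this list.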
Upvotes: 0
Reputation: 23129
Here's the simplest way I can think of to get all of the filenames without any subpaths, using just os.listdir():
import os
from pprint import pprint

def getAllFiles(dir, result = None):
    if result is None:
        result = []
    for entry in os.listdir(dir):
        entrypath = os.path.join(dir, entry)
        if os.path.isdir(entrypath):
            getAllFiles(entrypath, result)
        else:
            result.append(entry)
    return result

def main():
    result = getAllFiles("/tmp/foo")
    pprint(result)

main()
This uses the recursion idea I mentioned in my comment.
With test directory structure:
/tmp/foo
├── D
│   ├── G
│   │   ├── h
│   │   └── i
│   ├── e
│   └── f
├── a
├── b
└── c
I get:
['a', 'c', 'i', 'h', 'f', 'e', 'b']
If I change this line:
result.append(entry)
to:
result.append(entrypath)
then I get:
['/tmp/foo/a',
'/tmp/foo/c',
'/tmp/foo/D/G/i',
'/tmp/foo/D/G/h',
'/tmp/foo/D/f',
'/tmp/foo/D/e',
'/tmp/foo/b']
To get the exact result you wanted, you can do:
dtry = [getAllFiles("/tmp/foo")]
dtry.append(list(dtry[0]))
dtry.append(list(dtry[0]))
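The `list(...)` calls above matter: appending `dtry[0]` itself would add another reference to the same list rather than a copy. A small illustration with made-up data (the variable names are illustrative only):

```python
files = ['a', 'b', 'c']

# Three references to one list: a change through any of them shows up in all.
shared = [files for _ in range(3)]
shared[0].append('d')

# Three independent copies: list(...) builds a new list each time.
names = ['a', 'b', 'c']
copies = [list(names) for _ in range(3)]
copies[0].append('d')

print(shared[1])  # ['a', 'b', 'c', 'd']
print(copies[1])  # ['a', 'b', 'c']
```

If the three lists are only ever read, sharing one list is harmless; the copies only matter once any of them is modified.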
And if you want to use os.walk, which is more compact, here are the two flavors of that:
def getAllFiles2(dir):
    result = []
    for root, dirs, files in os.walk(dir):
        result.extend(files)
    return result

def getAllFilePaths2(dir):
    result = []
    for root, dirs, files in os.walk(dir):
        result.extend([os.path.join(root, f) for f in files])
    return result
These produce the same results (order aside) as the recursive versions.
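Putting the os.walk approach together with the "three lists" requirement from the question could look like the sketch below. The function name `build_dtry` and its `copies` parameter are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
import os


def build_dtry(folder, copies=3):
    """Return `copies` independent lists of every file name under folder."""
    names = []
    for root, dirs, files in os.walk(folder):
        # os.walk has already separated files from directories,
        # so the bare names can be collected directly.
        names.extend(files)
    # list(names) makes a fresh copy for each slot.
    return [list(names) for _ in range(copies)]
```

Calling `dtry = build_dtry(fldr)` would then give three lists of file names covering fldr and all of its subfolders. Because os.walk already puts non-directory entries in `files`, the extra `os.path.isfile` check in the question's second snippet is largely redundant.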
Upvotes: 2