Reputation: 75
My code should find the newest and oldest files in a folder and its subfolders. It works for the top-level folder but it doesn't include files within subfolders.
import os
import glob
mypath = 'C:/RDS/*'
print(min(glob.glob(mypath), key=os.path.getmtime))
print(max(glob.glob(mypath), key=os.path.getmtime))
How do I make it recurse into the subfolders?
Upvotes: 0
Views: 2440
Reputation: 123531
Here's a fairly efficient way of doing it. It determines the oldest and newest files by iterating through them all once. Since it uses iteration, there's no need to first create a list of them and go through it twice to determine the two extremes.
import os
import pathlib

def max_min(iterable, keyfunc=None):
    """Return the (largest, smallest) items of iterable in a single pass."""
    if keyfunc is None:
        keyfunc = lambda x: x  # Identity.
    iterator = iter(iterable)
    most = least = next(iterator)  # Raises StopIteration if iterable is empty.
    mostkey = leastkey = keyfunc(most)
    for item in iterator:
        key = keyfunc(item)
        if key > mostkey:
            most = item
            mostkey = key
        elif key < leastkey:
            least = item
            leastkey = key
    return most, least

mypath = '.'
files = (f for f in pathlib.Path(mypath).resolve().glob('**/*') if f.is_file())
# max_min returns (most, least): the largest mtime belongs to the newest file.
newest, oldest = max_min(files, keyfunc=os.path.getmtime)
print(f'oldest file: {oldest}')
print(f'newest file: {newest}')
Upvotes: 0
Reputation: 1205
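Pay attention to the OS path separator: "/" on Unix vs. "\" on Windows. If you write the pattern with backslashes, use a raw string so Python doesn't treat them as escape sequences. You can try something like the code below. It saves the list of matches in a variable, which is faster than traversing the file system twice. One line is there only for debugging; comment it out in production.
import os
import glob
# Raw string: the backslashes are kept literally.
mypath = r'D:\RDS\**'
# With recursive=True, '**' matches files and directories at any depth,
# so the list contains folders as well as files.
allFilesAndFolders = glob.glob(mypath, recursive=True)
# just for debugging
print(allFilesAndFolders)
print(min(allFilesAndFolders, key=os.path.getmtime))
print(max(allFilesAndFolders, key=os.path.getmtime))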
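Since a recursive glob returns folders too, here is a minimal sketch that keeps only regular files before taking the min and max. The helper name `oldest_and_newest` is made up for illustration, and it assumes you want modification time as the sort key.

```python
import glob
import os


def oldest_and_newest(root):
    """Return (oldest, newest) regular files under root, by modification time."""
    # glob.escape protects any glob metacharacters in the folder name itself.
    pattern = os.path.join(glob.escape(root), '**', '*')
    # recursive=True lets '**' match any number of nested directories.
    matches = glob.glob(pattern, recursive=True)
    files = [p for p in matches if os.path.isfile(p)]
    if not files:
        raise ValueError(f'no files found under {root!r}')
    return min(files, key=os.path.getmtime), max(files, key=os.path.getmtime)
```

You would call it as `oldest, newest = oldest_and_newest(r'D:\RDS')`. The list is built once and scanned twice in memory, so the file system is still only walked a single time.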
Upvotes: 0
Reputation: 2129
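Try using pathlib. Also note that getmtime gives the last-modified time. If you want the time the file was created, use getctime: on Windows it returns the creation time, but on Unix it returns the time of the last metadata change, not the creation time.
If you strictly want only files: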
import os
import pathlib
mypath = 'your path'
taggedrootdir = pathlib.Path(mypath)
print(min([f for f in taggedrootdir.resolve().glob('**/*') if f.is_file()], key=os.path.getctime))
print(max([f for f in taggedrootdir.resolve().glob('**/*') if f.is_file()], key=os.path.getctime))
If results may include folders:
import os
import pathlib
mypath = 'your path'
taggedrootdir = pathlib.Path(mypath)
print(min(taggedrootdir.resolve().glob('**/*'), key=os.path.getctime))
print(max(taggedrootdir.resolve().glob('**/*'), key=os.path.getctime))
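Because getctime means different things on different platforms, it can help to look at the raw timestamps for one file before deciding which key to sort by. A small sketch, with `describe_times` as a hypothetical helper name; `st_birthtime` is only present on some platforms, so it is read defensively.

```python
import os


def describe_times(path):
    """Return the timestamps os.stat reports for path, as seconds since the epoch."""
    st = os.stat(path)
    return {
        # Last time the file's contents were modified (what getmtime returns).
        'modified': st.st_mtime,
        # What getctime returns: creation time on Windows,
        # last metadata change on Unix.
        'ctime': st.st_ctime,
        # Only available on some platforms; None where it is missing.
        'birth': getattr(st, 'st_birthtime', None),
    }
```

If 'modified' and 'ctime' differ for your files, whichever one you pass as the key decides what "oldest" and "newest" mean.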
Upvotes: 1
Reputation: 957
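As the docs show, you can pass the recursive=True keyword argument to glob.glob(). It only has an effect when the pattern contains "**", so the pattern has to change as well. Your code becomes:
import os
import glob
mypath = 'C:/RDS/**/*'
print(min(glob.glob(mypath, recursive=True), key=os.path.getmtime))
print(max(glob.glob(mypath, recursive=True), key=os.path.getmtime))
This should give you the oldest and newest entries in your folder and all its subfolders. The matches include directories too, so filter with os.path.isfile if you only want files.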
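A quick way to check what recursive=True actually does is to try both patterns on a throwaway folder. The sketch below builds a temporary directory (the file names are made up for the demo) and compares a plain "*" pattern with a "**/*" pattern, both with recursive=True.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Layout: root/top.txt and root/sub/nested.txt
    os.makedirs(os.path.join(root, 'sub'))
    for name in ('top.txt', os.path.join('sub', 'nested.txt')):
        open(os.path.join(root, name), 'w').close()

    base = glob.escape(root)
    # '*' alone matches one level only, even with recursive=True.
    shallow = glob.glob(os.path.join(base, '*'), recursive=True)
    # '**' is what recursive=True expands to any number of directories.
    deep = glob.glob(os.path.join(base, '**', '*'), recursive=True)

    shallow_names = sorted(os.path.relpath(p, root) for p in shallow)
    deep_names = sorted(os.path.relpath(p, root) for p in deep)

print('shallow:', shallow_names)
print('deep:   ', deep_names)
```

Only the second list contains the nested file, which is why the flag on its own doesn't change the result for a pattern like 'C:/RDS/*'.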
Upvotes: 1