Reputation: 5203
I have this code which I was hoping would work for a list of files in a filesystem. The file names in the directory look like this:
directory/
./file-2014-7-8.info
./file-2014-7-9.info
./file-2014-7-10.info
The relevant code is this:
filetype = '.info'
dir_list = os.listdir(directory)
try:
    latest_file = sorted([i for i in dir_list if i.endswith(filetype)])[-1]
    return latest_file
except Exception as e:
    logging.error("could not find any %s files in the directory: %s" % (filetype, e))
This code returns the 7-9.info file instead of the 7-10.info file.
How do I get it to return the 7-10 without altering the names of the files themselves? Is there an easy way?
Upvotes: 0
Views: 481
Reputation: 308
Build the list of string filenames into a data structure that can be easily sorted. For example, if the date components were treated as ints rather than strs, you'd get what you want. Perhaps something along the lines of:
[
((2014,7,8), './file-2014-7-8.info'),
((2014,7,9), './file-2014-7-9.info'),
((2014,7,10), './file-2014-7-10.info'),
]
There are many ways to get just the date component from the file. Here's one crude way of doing it:
>>> def get_date(f):
...     # list() keeps the output below the same on Python 3, where map() is lazy
...     return list(map(int, f.replace('./file-', '').replace('.info', '').split('-')))
...
>>> get_date('./file-2014-7-10.info')
[2014, 7, 10]
Now that you have a function to get the date components of each filename, you just have to apply it to all of them (contents here is the list of filenames):
>>> import pprint
>>> result = [ (get_date(f), f) for f in contents ]
>>> pprint.pprint(result)
[([2014, 7, 8], './file-2014-7-8.info'),
([2014, 7, 9], './file-2014-7-9.info'),
([2014, 7, 10], './file-2014-7-10.info')]
If you call sorted on the result with default options, it'll output the list in date-ascending order and you can just grab the last item.
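Putting the steps above together, a minimal sketch might look like the following. The names DATE_RE, date_key and latest are made up for illustration, and the sketch assumes every name follows the file-YYYY-M-D.info pattern from the question.

```python
import re

# Hypothetical pattern for names such as './file-2014-7-10.info'
DATE_RE = re.compile(r"file-(\d+)-(\d+)-(\d+)\.info$")

def date_key(name):
    """Return (year, month, day) as ints so the tuples compare numerically."""
    match = DATE_RE.search(name)
    if match is None:
        raise ValueError("unexpected file name: %r" % name)
    return tuple(int(part) for part in match.groups())

def latest(names):
    """Sort (date, name) pairs and return the name with the newest date."""
    pairs = sorted((date_key(n), n) for n in names)
    return pairs[-1][1]

contents = ['./file-2014-7-8.info', './file-2014-7-9.info', './file-2014-7-10.info']
print(latest(contents))
```

Raising on an unexpected name is a design choice in this sketch; silently skipping such files would also be reasonable.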
Upvotes: 0
Reputation: 5203
This was answered using ideas from the comments on the original question. The credit goes to cox, who suggested I look on PyPI for natsort. Here is the code changed to work properly:
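from natsort import natsorted

filetype = '.info'
dir_list = os.listdir(directory)
try:
    # natsorted sorts ascending, so the newest file is last. This assumes a natsort
    # release that reads digits as unsigned integers; older releases that treated
    # '-' as a minus sign may order these names the other way.
    latest_file = natsorted([i for i in dir_list if i.endswith(filetype)])[-1]
    return latest_file
except Exception as e:
    logging.error("could not find any %s files in the directory: %s" % (filetype, e))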
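If adding a third-party dependency is not an option, a similar ordering can be approximated with the standard library alone. This is only a rough sketch of natural-order sorting, not how natsort is implemented, and natural_key is a hypothetical helper name.

```python
import re

def natural_key(text):
    """Split text into non-digit and digit runs; digit runs become ints.

    re.split with a capturing group always alternates non-digit, digit,
    non-digit, ..., so odd positions hold the digit runs.
    """
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = ['file-2014-7-8.info', 'file-2014-7-9.info', 'file-2014-7-10.info']
latest_file = sorted(names, key=natural_key)[-1]
print(latest_file)
```

Because strings and integers always land in the same positions of the key, two keys can be compared without mixing types.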
Upvotes: 0
Reputation: 1343
You could use a lambda function to parse out the datetime part of the file names while sorting.
import datetime

filetype = '.info'
dir_list = [i for i in os.listdir(directory) if i.endswith(filetype)]
try:
    # x[5:-5] strips the leading 'file-' and the trailing '.info'
    sorted_files = sorted(dir_list, key=lambda x: datetime.datetime.strptime(x[5:-5], "%Y-%m-%d"))
    return sorted_files[-1]
except Exception as e:
    logging.error("could not find any %s files in the directory: %s" % (filetype, e))
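The try/except above mainly guards against a directory with no matching files. As an alternative sketch, that case can be handled explicitly instead of through an exception. The helper names parse_date and latest_by_date are hypothetical, and the format string assumes the file-YYYY-M-D.info naming from the question.

```python
import datetime

def parse_date(name):
    """Return the date encoded in 'file-YYYY-M-D.info', or None if it does not parse."""
    try:
        return datetime.datetime.strptime(name, "file-%Y-%m-%d.info")
    except ValueError:
        return None

def latest_by_date(names):
    """Return the name with the newest date, or None when nothing matches."""
    dated = [(parse_date(n), n) for n in names]
    dated = [(d, n) for d, n in dated if d is not None]
    # default= (Python 3.4+) avoids a ValueError on an empty list
    return max(dated, default=(None, None))[1]

names = ['file-2014-7-8.info', 'file-2014-7-9.info', 'file-2014-7-10.info', 'notes.info']
print(latest_by_date(names))
```

Using max avoids sorting the whole list when only the newest entry is needed.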
Upvotes: 0
Reputation: 113940
import os
import time

fname_2_ts = lambda fname: time.strptime(os.path.basename(fname), "file-%Y-%m-%d.info")
latest_file = sorted([i for i in dir_list if i.endswith(filetype)], key=fname_2_ts)[-1]

The problem was that you were comparing the names as strings, and "1" (the first character of "10") is less than both "8" and "9".
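The difference between the two comparisons can be seen in a small, self-contained sketch. The helper numeric_key is made up for illustration and assumes the naming scheme from the question.

```python
names = ['file-2014-7-8.info', 'file-2014-7-9.info', 'file-2014-7-10.info']

# A plain string sort compares character by character, so '1' < '8' < '9'
by_string = sorted(names)

def numeric_key(name):
    """'file-2014-7-10.info' -> [2014, 7, 10]"""
    stem = name[len('file-'):-len('.info')]
    return [int(part) for part in stem.split('-')]

# Comparing lists of ints puts 10 after 8 and 9
by_number = sorted(names, key=numeric_key)

print(by_string[-1])  # file-2014-7-9.info
print(by_number[-1])  # file-2014-7-10.info
```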
Upvotes: 1