Eric

Reputation: 295

Get the latest file based on the date in the filename (Python)

starting a new thread from this

I have a directory with files in this format:

Report_Test-01-16-2014.09_42-en.zip
Another Report_Test-01-16-2014.09_42-en.zip
Report_Holiday-01-16-2014.09_42-en.zip
Report_Weekday-01-16-2014.09_42-en.zip
Report_Special-01-16-2014.09_42-en.zip

Report_Test-12-16-2013.10_52-en.zip
Another Report_Test-12-16-2013.10_52-en.zip
Report_Holiday-12-16-2013.10_52-en.zip
Report_Weekday-12-16-2013.10_52-en.zip
Report_Special-12-16-2013.10_52-en.zip

I have no control over the file naming, and the filename pattern stays consistent. I've tried everything in the previous thread.

I need to be able to return the latest file, and the latest two files, based on the date in the filename. Unfortunately, the %m-%d-%Y format of the date is throwing me off: I end up with the 2013 files because the 12 in 12-16-2013 sorts higher than the 01 in 01-16-2014.

Any advice would be very much appreciated. Thanks.

Upvotes: 0

Views: 5500

Answers (3)

Elisha

Reputation: 4951

You can use your own compare function to sort according to your logic:

from functools import cmp_to_key

filenames = ["Report_Test-01-16-2014.09_42-en.zip",
             "Report_Special-12-16-2013.10_52-en.zip"]

def date_parts(fn):
    # parse the date information; the filename order is month-day-year
    month, day, year = fn.split(".")[0].split("-")[-3:]
    return year, month, day

def compare_dates(fn1, fn2):
    # tuples compare element by element: years first, then months, then days
    # (the zero-padded strings compare the same way the numbers would)
    parts1, parts2 = date_parts(fn1), date_parts(fn2)
    return (parts1 > parts2) - (parts1 < parts2)

# Python 3 removed cmp() and the cmp= argument, so wrap the function with cmp_to_key
filenames.sort(key=cmp_to_key(compare_dates))

Now 2013 sorts before 2014:

>>> filenames
['Report_Special-12-16-2013.10_52-en.zip', 'Report_Test-01-16-2014.09_42-en.zip']
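A key function is an alternative to a compare function for this kind of sort. The sketch below assumes every filename contains the timestamp as MM-DD-YYYY.HH_MM, as in the question's examples; the regular expression and the helper name `timestamp_key` are illustrative assumptions, not part of the original answer.

```python
import re
from datetime import datetime

# Assumed pattern: MM-DD-YYYY.HH_MM somewhere in the name, as in the question.
STAMP = re.compile(r"(\d{2}-\d{2}-\d{4}\.\d{2}_\d{2})")

def timestamp_key(filename):
    """Return the datetime embedded in the filename (hypothetical helper)."""
    match = STAMP.search(filename)
    if match is None:
        raise ValueError("no timestamp in %r" % filename)
    return datetime.strptime(match.group(1), "%m-%d-%Y.%H_%M")

names = ["Report_Test-01-16-2014.09_42-en.zip",
         "Report_Special-12-16-2013.10_52-en.zip"]

names_sorted = sorted(names, key=timestamp_key)  # oldest first
newest = max(names, key=timestamp_key)           # single latest file
```

Parsing into a real `datetime` also lets the time of day break ties between files from the same date.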

Upvotes: 0

falsetru

Reputation: 369094

  • Extract the date string from each filename.
  • Convert it to a date object.
  • Find the latest date.
  • Filter the filenames using that date.

filenames = [
    'Report_Test-01-16-2014.09_42-en.zip',
    'Another Report_Test-01-16-2014.09_42-en.zip',
    'Report_Holiday-01-16-2014.09_42-en.zip',
    'Report_Weekday-01-16-2014.09_42-en.zip',
    'Report_Special-01-16-2014.09_42-en.zip',
    'Report_Test-12-16-2013.10_52-en.zip',
    'Another Report_Test-12-16-2013.10_52-en.zip',
    'Report_Holiday-12-16-2013.10_52-en.zip',
    'Report_Weekday-12-16-2013.10_52-en.zip',
    'Report_Special-12-16-2013.10_52-en.zip',
] # Used in place of `os.listdir(....)`

import re
import datetime

date_pattern = re.compile(r'\b(\d{2})-(\d{2})-(\d{4})\b')
def get_date(filename):
    matched = date_pattern.search(filename)
    if not matched:
        return None
    m, d, y = map(int, matched.groups())
    return datetime.date(y, m, d)

dates = (get_date(fn) for fn in filenames)
dates = (d for d in dates if d is not None)
last_date = max(dates)
last_date = last_date.strftime('%m-%d-%Y')
filenames = [fn for fn in filenames if last_date in fn]
for fn in filenames:
    print(fn)

Output:

Report_Test-01-16-2014.09_42-en.zip
Another Report_Test-01-16-2014.09_42-en.zip
Report_Holiday-01-16-2014.09_42-en.zip
Report_Weekday-01-16-2014.09_42-en.zip
Report_Special-01-16-2014.09_42-en.zip
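The question also asks for the last two files. One possible extension of the approach above is sketched here, assuming "last two" means the files from the two most recent dates. `files_for_latest_dates` is a hypothetical helper name, and `get_date` is repeated so the block runs on its own.

```python
import datetime
import re

date_pattern = re.compile(r'\b(\d{2})-(\d{2})-(\d{4})\b')

def get_date(filename):
    matched = date_pattern.search(filename)
    if not matched:
        return None
    m, d, y = map(int, matched.groups())
    return datetime.date(y, m, d)

def files_for_latest_dates(filenames, n=2):
    """Group filenames by embedded date; return the groups for the n newest dates."""
    by_date = {}
    for fn in filenames:
        date = get_date(fn)
        if date is not None:  # skip names without a date
            by_date.setdefault(date, []).append(fn)
    newest = sorted(by_date, reverse=True)[:n]
    return [(date, by_date[date]) for date in newest]

sample = [
    'Report_Test-01-16-2014.09_42-en.zip',
    'Report_Holiday-12-16-2013.10_52-en.zip',
    'Report_Test-12-16-2013.10_52-en.zip',
    'Report_Test-11-16-2013.10_52-en.zip',
]
for date, group in files_for_latest_dates(sample):
    print(date, group)
```

Passing `n=1` gives only the group for the most recent date.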

Upvotes: 3

Sugar

Reputation: 519

Use the .split("-") method for this, like so:

x="Report_Test-01-16-2014.09_42-en.zip"
y=x.split("-") #['Report_Test', '01', '16', '2014.09_42', 'en.zip']

Then sort on the date parts (year first, then month, then day) and take the latest.
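The sorting step could be sketched as follows. This assumes the dash-separated fields, counted from the right, are always the language suffix, year.time, day, and month, as in the question's filenames. `split_key` is a hypothetical helper name.

```python
def split_key(filename):
    """Build a sortable (year, month, day, time) tuple (hypothetical helper)."""
    # ['Report_Test', '01', '16', '2014.09_42', 'en.zip']: index from the right,
    # so extra dashes in the report name would not shift the fields.
    month, day, year_time = filename.split("-")[-4:-1]
    year, time = year_time.split(".", 1)
    # All fields are zero-padded, so comparing them as strings matches numeric order.
    return (year, month, day, time)

names = [
    "Report_Test-12-16-2013.10_52-en.zip",
    "Report_Test-01-16-2014.09_42-en.zip",
]
latest = max(names, key=split_key)
```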

Upvotes: 0
