user11530392

How to extract dates from filename in Python when there are two dates? And how do I convert to day of year?

I have a list of files that I'm looping through, and each filename contains two dates. How do I extract each date and store it in a variable? And how do I convert it to day of year?

The code I have isn't working because I can't extract the month/day due to some months leading with zero. The code I have is:

sstDataFile='AQUA_MODIS.20100101_20100131.L3m.MO.SST.sst.4km.nc'

    year1=int(sstDataFile[11:15])
    day1=int(sstDataFile[15:19])
    year2=int(sstDataFile[20:24])
    day2=int(sstDataFile[24:27])
    
    print(year1)
    print(year2)
    print (day1)
    print(day2)
    if year1%4==0:
        decYr1 = year1 + day1/366.0
    else:   
        decYr1 = year1 + day1/365.0

    if year2%4==0:
        decYr2 = year2 + day2/366.0
    else:    
        decYr2 = year2 + day2/365.0
    
    decYr=0.5*(decYr1+decYr2) #decimal year between the composite start and end points

Upvotes: 1

Views: 940

Answers (1)

Djib2011

Reputation: 7442

You can use the datetime module to parse dates and perform operations on them in Python.

First you need to manipulate the string a bit to extract the part that has to do with the date. Assuming that all your strings come in the same format (i.e. 'SOMETHING.date1_date2.SOMETHINGELSE...'):

full_string = 'AQUA_MODIS.20100101_20100131.L3m.MO.SST.sst.4km.nc'
date1, date2 = full_string.split('.')[1].split('_')
print(date1, date2)

20100101 20100131

Now that we have extracted the dates, we need to convert them to datetime objects. Your two strings are in the format YYYYMMDD, so we need to tell datetime to parse them with that format. The available format codes are listed in the strftime() and strptime() section of the datetime documentation. In our case it should look like this:

from datetime import datetime

fmt='%Y%m%d'  # %Y --> YYYY
              # %m --> MM
              # %d --> DD

date1 = datetime.strptime(date1, fmt)
date2 = datetime.strptime(date2, fmt)

print(date1, date2)

2010-01-01 00:00:00 2010-01-31 00:00:00
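
The question also asks for the day of year. Once a date is parsed, that value is available from the standard library, so no slicing of month and day digits is needed. A minimal sketch, using the two dates from the example filename (the names doy1 and doy2 are just illustrative):

```python
from datetime import datetime

date1 = datetime.strptime('20100101', '%Y%m%d')
date2 = datetime.strptime('20100131', '%Y%m%d')

# tm_yday is the 1-based day of year (1 January is day 1)
doy1 = date1.timetuple().tm_yday
doy2 = date2.timetuple().tm_yday

print(doy1, doy2)  # 1 31
```

The same value is available as a zero-padded string through date1.strftime('%j').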

From what I understand you want to find the "average" year in decimals. Datetime allows you to perform operations on dates. I think the best way to do this in your case is:

avg_date = datetime.fromtimestamp((date1.timestamp() + date2.timestamp()) / 2)

print(avg_date)

2010-01-16 00:00:00
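
An alternative that avoids converting to timestamps (which depend on the local time zone for naive datetimes) is to do the arithmetic directly on the datetime objects. This is only a sketch of the same midpoint calculation, with the example dates hard-coded:

```python
from datetime import datetime

date1 = datetime(2010, 1, 1)
date2 = datetime(2010, 1, 31)

# Subtracting two datetimes gives a timedelta; half of it added to the
# start gives the midpoint between the two dates.
midpoint = date1 + (date2 - date1) / 2

print(midpoint)  # 2010-01-16 00:00:00
```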

Finally, you want to represent the date as a fraction of the year. A rough approximation treats every month as 1/12 of a year and every day as 1/365 of a year:

y = avg_date.year
m = (avg_date.month - 1) / 12
d = (avg_date.day - 1) / 365

result = y + m + d
print(result)
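
If the month/12 approximation is too coarse, the fraction can instead be computed from the time elapsed since 1 January, divided by the actual length of that year. The following is a sketch under stated assumptions: decimal_year is a hypothetical helper name, and it uses the convention that 1 January at midnight maps to exactly year.0 (the code in the question counts 1 January as 1/365 instead).

```python
import calendar
from datetime import datetime

def decimal_year(dt):
    """Hypothetical helper: return dt as a decimal year, e.g. 2010.04."""
    # calendar.isleap applies the full Gregorian rule, not just year % 4
    days_in_year = 366 if calendar.isleap(dt.year) else 365
    start_of_year = datetime(dt.year, 1, 1)
    # Days elapsed since the start of the year, including any time of day
    elapsed_days = (dt - start_of_year).total_seconds() / 86400
    return dt.year + elapsed_days / days_in_year

print(decimal_year(datetime(2010, 1, 16)))
```

For the midpoint 2010-01-16 this evaluates to 2010 + 15/365.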

Upvotes: 1
