Reputation: 89
I have the following series:
import datefinder
import pandas as pd

myseries = pd.Series([' Period : From 1 February 2020 to 31 January 2021',
                      ' Period : 1 January 2020 to 31 December 2020',
                      ' Period 67 months',
                      ' Period: 8 Months'])
I want to convert the entries that contain two dates (only the first 2 here) into datetime format, while keeping the others in their original form,
i.e. [('02-01-2020', '01-31-2021'), ('01-01-2020', '12-31-2020'), 'Period: 67 Months', 'Period: 8 Months']
I tried the following, but datefinder also returns a datetime object for the entries that don't contain a proper date.
for i, v in myseries.items():  # iteritems() was removed in pandas 2.0
    matches = list(datefinder.find_dates(v))
    if len(matches) > 0:
        print(matches)
I've tried using the source=True argument in datefinder's find_dates(), which gives me the following. I can work with this; however, I'm unable to extract the objects I require.
[(datetime.datetime(2020, 2, 1, 0, 0), '1 February 2020'), (datetime.datetime(2021, 1, 31, 0, 0), '31 January 2021')]
[(datetime.datetime(2020, 1, 1, 0, 0), '1 January 2020'), (datetime.datetime(2020, 12, 31, 0, 0), '31 December 2020')]
[(datetime.datetime(2067, 4, 4, 0, 0), '67 mon')]
[(datetime.datetime(2020, 4, 8, 0, 0), '8 Mon')]
My desired output is 2 lists:
date_1 = ['1 February 2020', '1 January 2020', '67 mon', '8 Mon']
date_2 = ['31 January 2021', '31 December 2020', '67 mon', '8 Mon']
Upvotes: 2
Views: 184
Reputation: 4618
IIUC:
myseries.apply(lambda x: [m[1] for m in datefinder.find_dates(x, source=True)][:2] if not pd.isna(x) else [])
Basically, use the source parameter to get the original text of each matched date; then, if more than 2 dates are found, keep only the first 2.
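To see what that slicing does without installing datefinder, here is a minimal sketch. It hard-codes the (datetime, source) tuples from the question's output, so the sample data stands in for a real find_dates(..., source=True) call, and the helper name first_two_sources is made up for illustration:

```python
import datetime

# Tuples copied from the question's output; they stand in for
# list(datefinder.find_dates(text, source=True)).
matches = [
    (datetime.datetime(2020, 2, 1, 0, 0), '1 February 2020'),
    (datetime.datetime(2021, 1, 31, 0, 0), '31 January 2021'),
]

def first_two_sources(found):
    """Hypothetical helper: keep the matched text of at most two matches."""
    # Index 1 of each tuple is the original substring that was parsed.
    return [source for _, source in found][:2]

sources = first_two_sources(matches)
print(sources)  # ['1 February 2020', '31 January 2021']
```

An empty list of matches simply gives back an empty list, which is why the loop further down has to handle len(date) == 0.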
If you want date_1 and date_2:
import numpy as np

date_1 = []
date_2 = []
dates = myseries.apply(lambda x: [m[1] for m in datefinder.find_dates(x, source=True)][:2])
for date in dates:
    if len(date) == 0:
        # No date found at all: pad both lists
        date_1.append(np.nan)
        date_2.append(np.nan)
    if len(date) > 0:
        date_1.append(date[0])
        if len(date) > 1:
            date_2.append(date[1])
        elif len(date) > 0:
            # Only one match: reuse it as the second date
            date_2.append(date[0])
Upvotes: 2