eduardo2111

Reputation: 379

Convert dataframe column into date type with last business day of the month

I have the following dataframe df:

       Date     number
0    AUG 17        1.0
1    AUG 17        1.6
2    FEB 18        1.0
3    MAR 18        1.7
4    APR 18        6.0
5    Jan 19        1.0
6    Apr 19        2.0
7    Jun 19        7.1
8    Jan 20        5.5
9    Feb 20        8.6

I would like to convert the Date column to a date type holding the last business day (Monday to Friday) of each month, so that I get the following output:

            Date     number
0    2017-08-31        1.0
1    2017-08-31        1.6
2    2018-02-28        1.0
3    2018-03-30        1.7
4    2018-04-30        6.0
5    2019-01-31        1.0
6    2019-04-30        2.0
7    2019-06-28        7.1
8    2020-01-31        5.5
9    2020-02-28        8.6

NOTICE that some of my months are in CAPS.

I tried:

date = [datetime.datetime.strptime(x,'%b%Y').date() for x in df['Date']]

But it keeps giving me a matching error. I assume this is because some months are in CAPS.

Upvotes: 0

Views: 100

Answers (2)

FObersteiner

Reputation: 25564

Is this what you are looking for? Use the capitalize method (also available in pandas as Series.str.capitalize) to normalize the month name before parsing, then add an offset from pd.offsets to move each date to the last business day of its month:

import pandas as pd
# example df:
df = pd.DataFrame({'Date': ['AUG 17', 'aug 17', 'FEB 18', 'MAR 18'], 
                   'number': [1, 1.6, 1, 1.7]})
# convert to datetime after capitalizing the month name, add offset so you can get last business day of month
df['Date'] = (pd.to_datetime(df['Date'].str.capitalize(), format='%b %y') + 
              pd.offsets.BMonthEnd(1))
# df
#         Date  number
# 0 2017-08-31     1.0
# 1 2017-08-31     1.6
# 2 2018-02-28     1.0
# 3 2018-03-30     1.7
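
As a sanity check, here is a minimal sketch that applies the same idea to the full data from the question and compares it with the expected output given there. The helper name to_last_business_day is hypothetical and only used for illustration; it assumes every label has the form "Mon yy" with an English month abbreviation.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Date": ["AUG 17", "AUG 17", "FEB 18", "MAR 18", "APR 18",
             "Jan 19", "Apr 19", "Jun 19", "Jan 20", "Feb 20"],
    "number": [1.0, 1.6, 1.0, 1.7, 6.0, 1.0, 2.0, 7.1, 5.5, 8.6],
})


def to_last_business_day(labels: pd.Series) -> pd.Series:
    """Hypothetical helper: parse 'Mon yy' labels, return last business day of each month."""
    # Parsing yields the first day of the month, e.g. 2017-08-01.
    month_start = pd.to_datetime(labels.str.capitalize(), format="%b %y")
    # The 1st is never the last business day, so this stays within the same month.
    return month_start + pd.offsets.BMonthEnd(1)


df["Date"] = to_last_business_day(df["Date"])
result = df["Date"].dt.strftime("%Y-%m-%d").tolist()
print(df)
```

Note that BMonthEnd only skips weekends; public holidays are not taken into account.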

Upvotes: 2

eduardo2111

Reputation: 379

I figured out that my mistake was in the format string: the year is abbreviated to two digits, so it should be %y instead of %Y, and there is a space between the month and the year, so the format should be %b %y.

So to achieve the output I wanted:

import pandas as pd
import datetime

# convert the string dates into date type
df['Date'] = [datetime.datetime.strptime(x,'%b %y').date() for x in df['Date']]

# move each date to the last business day (Monday-Friday) of its month
df = df.assign(Date=df['Date'] + pd.offsets.BMonthEnd(1)) 
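
To double-check that the format string, and not the capitalization, was the cause, here is a small sketch using only the standard library. The helper parses is hypothetical, and the example assumes an English (or C) locale for the month abbreviations.

```python
from datetime import datetime


def parses(text: str, fmt: str) -> bool:
    """Hypothetical helper: report whether strptime accepts text with fmt."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


# Upper-case month names are accepted once the format matches the input.
upper_ok = parses("AUG 17", "%b %y")      # True
# The original format fails: no space in the format, and %Y expects four digits.
original_ok = parses("AUG 17", "%b%Y")    # False

# A two-digit year of 17 is read as 2017; the day defaults to the 1st.
parsed = datetime.strptime("AUG 17", "%b %y")
print(upper_ok, original_ok, parsed.date())
```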

Upvotes: 0
