Thomas Amal
Thomas Amal

Reputation: 23

How can I convert a string to a required date format in a pandas data frame

My pandas dataframe has a column with string representation of dates, which are in different formats as follows: 23MAR where 23 is the year, MAR is the month, and the end day of the month is presumed, 23309 where 23 is the year, 3 is the month and 09 is the day. I need to convert these into date formats : yyyy-mm-dd (2023-03-31, 2023-03-09)

I tried converting these to date format using: df['Date'] = df['Date'].fillna(pd.to_datetime(df['Date'], format='%Y-%m-%d', errors='coerce') but the resultant df['Date'] is unchanged as '23MAR', '23309'

Help in this direction is highly appreciated

Upvotes: 0

Views: 595

Answers (2)

Jamiu S.
Jamiu S.

Reputation: 5741

Try this custom function using pd.offsets.MonthEnd() and pd.dt.strftime()

def parse_date(date_str):
    if '-' in date_str:
        return pd.to_datetime(date_str, format='%Y-%m-%d')
    try:
        return pd.to_datetime(date_str, format='%y%b') + pd.offsets.MonthEnd(1)
    except ValueError:
        return pd.to_datetime(date_str, format='%y%m%d')

df['Date'] = df['Date'].apply(parse_date).dt.strftime('%Y-%m-%d')
print(df)

         Date
0  2023-03-31
1  2023-03-09

Upvotes: 0

fbend
fbend

Reputation: 1

You could try this:

import pandas as pd
from pandas.tseries.offsets import MonthEnd

# convert the 'date' column to datetime format with the correct format string
df['date'] = pd.to_datetime(df['date'], format='%y%b', errors='coerce').fillna(pd.to_datetime(df['date'], format='%y%m%d', errors='coerce'))

# set the day of the month to the last day of the month only for the format '23MAR'
df['date'] = df.apply(lambda row: row['date'] + MonthEnd(1)  if row['date'].strftime('%d') == '01' else row['date'], axis=1)

# convert the date column to the desired format 'yyyy-mm-dd'
df['date'] = df['date'].dt.strftime('%Y-%m-%d')

print(df)

Upvotes: 0

Related Questions