Wendy Velasquez

Reputation: 181

How to convert a string that only has the month and year in a Pandas column into a datetime with another format in Python?

I have a dataset extracted from a PDF file that contains a date column in the format JAN2021. I want to convert it into a datetime that displays only the month and year, formatted as JAN-2021. I tried the statement below, but it did not work. Is there any way to do this without altering the data (that is, without adding a day for that month)?

df2['MONTH1'] = pd.to_datetime(df2['MONTH1'], format="%mmm%YY")

Any feedback or tips will be greatly appreciated.

Upvotes: 0

Views: 1973

Answers (2)

SeaBean

Reputation: 23207

You can use pd.to_datetime to parse the month-year string (e.g. JAN2021) into a datetime. Then use dt.strftime() to format the datetime values as strings in the required layout, %b-%Y (e.g. Jan-2021), as follows:

(Assuming your dataframe is called df and column name is Col1):

df['Col_new'] = pd.to_datetime(df['Col1'], format='%b%Y').dt.strftime('%b-%Y')

Result:

print(df)

      Col1   Col_new
0  JAN2021  Jan-2021
1  FEB2021  Feb-2021
2  MAR2021  Mar-2021
3  APR2021  Apr-2021

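The format string "%mmm%YY" in your attempt is invalid: the directive for an abbreviated month name is %b, and the directive for a four-digit year is %Y. For a complete list of valid directives and their meanings, with examples, see the strftime() and strptime() format codes section of the official Python documentation.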
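If the upper-case layout from the question (JAN-2021) is needed, a minimal sketch building on the same approach might look like the following. The names df, Col1 and Col_new are only illustrative, and the month abbreviations assume an English locale. Note that the result is a column of strings, not datetimes.

```python
import pandas as pd

# Sample data in the layout described in the question (illustrative only).
df = pd.DataFrame({"Col1": ["JAN2021", "FEB2021", "MAR2021"]})

# %b = abbreviated month name, %Y = four-digit year.
# Parsing fills in the first day of the month.
parsed = pd.to_datetime(df["Col1"], format="%b%Y")

# Format back to text and upper-case it to get JAN-2021 rather than Jan-2021.
df["Col_new"] = parsed.dt.strftime("%b-%Y").str.upper()

print(df)
```

Keeping the parsed datetime values in their own column as well can be handy if the data later needs sorting or filtering by date.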

Upvotes: 2

MDR

Reputation: 2670

Converting to a datetime will always add a day component.

https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html

pd.to_datetime looks for standard designations of the datetime component in the column names, including:

required: year, month, day

optional: hour, minute, second, millisecond, microsecond, nanosecond

Hence when converting you get a day (time is optional):

pd.to_datetime('JAN2021', format='%b%Y')
>> Timestamp('2021-01-01 00:00:00')
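If the goal is a value that represents a whole month rather than a specific day, one possible alternative (not part of the answers above) is a pandas Period with monthly frequency. The sketch below assumes the same JAN2021 input layout and an English locale; the variable names are illustrative.

```python
import pandas as pd

# Illustrative input in the layout from the question.
s = pd.Series(["JAN2021", "FEB2021"])

# Parse to datetime first, then drop the day by converting to monthly periods.
months = pd.to_datetime(s, format="%b%Y").dt.to_period("M")
print(months.dtype)  # period[M]

# Periods can still be formatted as text when a display label is needed.
labels = months.dt.strftime("%b-%Y").str.upper()
print(labels.tolist())
```

A monthly period sorts and compares chronologically, which plain strings such as JAN-2021 do not.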

Upvotes: 1
