van dam
van dam

Reputation: 69

Cleaning up date data

I have a dataset where the value for date is either looking like "updated 11 April 2022" or "11-Apr-22". anyway to unify the dates and make them all the exact same format?

I know I can remove the "update" with regex but how do I turn "11 April 2022" to "11-Apr-22" or vice versa?

Upvotes: 1

Views: 226

Answers (1)

mozway
mozway

Reputation: 260890

IIUC, you can use pandas.to_datetime combined with str.replace:

df['col'] = (pd.to_datetime(df['col'].replace(r'updated\s*', '', regex=True))
               .dt.strftime('%d-%b-%y')
             )

For 11 April 2022 use %d %B %Y as strftime format output:

0    11-Apr-22
1    11-Apr-22
Name: col, dtype: object

Input data:

df = pd.DataFrame({'col': ["updated 11 April 2022", "11-Apr-22"]})

Upvotes: 2

Related Questions