Reputation: 485
I have a pandas dataframe like this:
date_1 | date_2 | date_3 |
---|---|---|
0020-01-31 | 0020-01-31 | 2020-01-31 |
0021-01-01 | 0021-12-31 | 2021-02-28 |
0021-01-01 | 0021-12-31 | 2021-02-28 |
AS the date_1 is string, so I have to replace 0020 to 2020 first as it is year but I am getting no change in the dataframe
my_df = my_df.replace(regex=['0020'], value='2020')
my_df.head()
Thanks for Help in advance!
Upvotes: 1
Views: 453
Reputation: 3379
Well in theory you'd like to correct all the bad dates in every column, not just 2020
:
import pandas as pd
df = pd.DataFrame({"date_1":["0020-01-01", "0021-01-02", "0000-01-03"], "date_2":["0020-01-01", "0021-01-02", "0000-01-03"], "date_3":["2020-01-01", "2021-01-02", "2000-01-03"]})
df.apply(lambda x:x.str.replace("^00", "20", regex=True))
FOLLOW UP
If some of columns are made of Timestamp
and not str
, the method above will fail. The following will avoid that problem but it needs the help of library re
(as far as I know, you cannot use regex with normal str replace
):
import re
import pandas as pd
df["date_3"] = pd.to_datetime(df["date_3"])
def swap(x): return re.sub("^00", "20", x) if type(x) is str else x
df.applymap(swap)
OUTPUT
date1 date2 date3
0 2020-01-01 2020-01-01 2020-01-01
1 2021-01-02 2021-01-02 2021-01-02
2 2000-01-03 2000-01-03 2000-01-03
If you check .dtypes
:
date_1 object
date_2 object
date_3 object
dtype: object
Upvotes: 1