Reputation: 51
I am new to Python. I have a DataFrame with a date column whose values come in different formats. I would like to check whether each value follows a particular date format and, if it does not, drop that row. I have tried iterating over the rows with try/except, but I am looking for a faster way to check the column against a given format and drop the rows that do not match. Is there a faster way to do this, perhaps using the datetime library?
My code:
Date_format = '%Y%m%d'
df =
Date abc
0 2020-03-22 q
1 03-12-2020 w
2 55552020 e
3 25122020 r
4 12/25/2020 r
5 1212202033 y
Expected output:
Date abc
0 2020-03-22 q
Upvotes: 1
Views: 2276
Reputation: 8906
You could try
pd.to_datetime(df.Date, errors='coerce')
0 2020-03-22
1 2020-03-12
2 NaT
3 NaT
4 2020-12-25
5 NaT
The null (NaT) values are then easy to drop.
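As a minimal sketch of that drop step, the snippet below rebuilds the question's frame and keeps only the rows whose Date parsed. The names `parsed` and `cleaned` are illustrative, not from the original post. Which rows parse without an explicit format can differ between pandas versions, so treat the surviving set as something to check on your own installation.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["2020-03-22", "03-12-2020", "55552020",
             "25122020", "12/25/2020", "1212202033"],
    "abc": ["q", "w", "e", "r", "r", "y"],
})

# Unparseable values become NaT instead of raising an exception.
parsed = pd.to_datetime(df["Date"], errors="coerce")

# Keep only the rows whose Date could be parsed.
cleaned = df[parsed.notna()]
print(cleaned)
```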
EDIT:
For a given format you can still leverage pd.to_datetime:
datetimes = pd.to_datetime(df.Date, format='%Y-%m-%d', errors='coerce')
datetimes
0 2020-03-22
1 NaT
2 NaT
3 NaT
4 NaT
5 NaT
df.loc[datetimes.notnull()]
Also note that I am using the format %Y-%m-%d, which I think is the one you want based on your expected output (not the one you gave as Date_format).
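Putting the format-specific approach together, here is a small self-contained sketch. The helper name `filter_by_date_format` and its parameters are hypothetical, introduced only for illustration; the underlying calls are the same `pd.to_datetime(..., format=..., errors='coerce')` and boolean mask used above.

```python
import pandas as pd


def filter_by_date_format(frame, column, fmt):
    """Return only the rows of `frame` whose `column` value matches `fmt`.

    Hypothetical helper: values that do not match the format are coerced
    to NaT by pd.to_datetime and then filtered out with a boolean mask.
    """
    parsed = pd.to_datetime(frame[column], format=fmt, errors="coerce")
    return frame.loc[parsed.notna()]


# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["2020-03-22", "03-12-2020", "55552020",
             "25122020", "12/25/2020", "1212202033"],
    "abc": ["q", "w", "e", "r", "r", "y"],
})

result = filter_by_date_format(df, "Date", "%Y-%m-%d")
print(result)
```

This is vectorized over the whole column, so it avoids the row-by-row try/except loop mentioned in the question. The original Date strings are left untouched; only the mask is derived from the parsed values.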
Upvotes: 2