Tech Fukrey

Reputation: 39

Check if pandas dataframe date column is in correct date format or not?

I have a dataframe with multiple columns. One of the columns contains dates in the format %m/%d/%Y, or null values. I need a check that makes sure the date column only contains dates in that format.

What I am trying to do is:

pd.to_datetime(df['DOB'], format='%m/%d/%Y', errors='coerce').all(skipna=True)

to check that the column has the correct date format while ignoring empty values, but I get this error:

TypeError: invalid_op() got an unexpected keyword argument 'skipna'

How can I do this, or what other logic could I apply?

EDIT 1: Suppose the data has 3 DOBs and 1 null value:

data = {"Name": ["James", "Alice", "Phil", "Jacob"], "DOB": ["07-01-1997", "06-02-1995", "", "03-07-2002"]}

I modify the DOB column to convert the dates to my format and replace empty fields with NaN:

df['DOB']=pd.to_datetime(df['DOB']).apply(lambda cell: cell.strftime(DATE_IN_MDY) if not pd.isnull(cell) else np.nan)

In this case I want the result to be True.

Upvotes: 2

Views: 6236

Answers (1)

jezrael

Reputation: 862671

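The idea is to build one mask for values that are empty strings or (|) missing (via Series.isna), then compare it with the mask of values that to_datetime turned into missing values because of errors='coerce'. If the two masks are equal, every non-empty value was parsed as a date: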

import pandas as pd

data = {"Name": ["James", "Alice", "Phil", "Jacob"],
        "DOB": ["07-01-1997", "06-02-1995", "", "03-07-2002"]}

df = pd.DataFrame(data)

m1 = df['DOB'].eq('') | df['DOB'].isna()
m2 = pd.to_datetime(df['DOB'], errors='coerce').isna()

print (m1.eq(m2).all())
True

A sample that returns False, because one value is not a valid date:

data = {"Name": ["James", "Alice", "Phil", "Jacob"],
        "DOB": ["07-01-1997", "06-02-1995", "", "03-97-2002"]}

df = pd.DataFrame(data)

m1 = df['DOB'].eq('') | df['DOB'].isna()
m2 = pd.to_datetime(df['DOB'], errors='coerce').isna()

print (m1.eq(m2).all())
False
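
The comparison above lets to_datetime infer the format. If the check should be tied to one explicit format, as in the question, a possible variation is to drop the missing and empty values first and parse the rest with format=. This is only a sketch: the helper name dates_match_format and the sample Series are made up for illustration, and treating whitespace-only strings as empty is an assumption.

```python
import pandas as pd


def dates_match_format(series, fmt="%m/%d/%Y"):
    """Return True if every non-empty value in `series` parses with `fmt`.

    Hypothetical helper, not part of pandas.
    """
    # Treat NaN/None and empty (or whitespace-only) strings as missing.
    missing = series.isna() | series.astype(str).str.strip().eq("")
    # Parse only the remaining values; anything that does not match
    # `fmt` becomes NaT because of errors="coerce".
    parsed = pd.to_datetime(series[~missing], format=fmt, errors="coerce")
    return bool(parsed.notna().all())


# Illustrative data: empty string and None are ignored by the check.
good = pd.Series(["07/01/1997", "06/02/1995", "", None, "03/07/2002"])
# Second value uses the wrong separator, last value has an invalid day.
bad = pd.Series(["07/01/1997", "06-02-1995", "", "03/97/2002"])

print(dates_match_format(good))
print(dates_match_format(bad))
```

With this sample data the first call should print True and the second False. To see which rows fail instead of a single boolean, the NaT positions in parsed can be used to index the original column.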

Upvotes: 2
