Pierre Corbin

Reputation: 62

Pandas find values other than date

I have a pandas DataFrame in which I have columns containing dates.

I need to make sure that these columns contain nothing but dates. Does anyone have any advice on how to do this? I thought about simply finding which rows contain data of a type other than date, although I don't know how to code this.

Any help would be appreciated!

Upvotes: 0

Views: 1506

Answers (1)

jezrael

Reputation: 863611

If the data contains no NaN or None values and you need to check whether all string values can be converted to datetimes, use apply with to_datetime and the parameter errors='coerce', which returns NaT for any value that cannot be parsed. Then chain notnull with all to build a mask of the columns that parse completely, and select those columns with loc:

import pandas as pd

df = pd.DataFrame({'a':['2015-02-04','2015-02-05','2015-02-06'],
                   'b':['2015-02-06','2015-02-06', 'u'],
                   'c':['2015-01-01','d','2015-02-06']})
print (df)
            a           b           c
0  2015-02-04  2015-02-06  2015-01-01
1  2015-02-05  2015-02-06           d
2  2015-02-06           u  2015-02-06

cols = ['a','b','c']
mask = df[cols].apply(pd.to_datetime, errors='coerce').notnull().all()
print (mask)
a     True
b    False
c    False
dtype: bool

print (df.loc[:, mask])
            a
0  2015-02-04
1  2015-02-05
2  2015-02-06
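
The question also asks which rows hold the non-date values. A minimal sketch of one way to get them, reusing the same sample data: keep the coerced frame and look for NaT per row instead of per column. The names parsed, bad_cells and bad_rows are only illustrative, and the extra notna() check is an assumption about how missing values should be treated (here they are not reported as bad).

```python
import pandas as pd

df = pd.DataFrame({'a': ['2015-02-04', '2015-02-05', '2015-02-06'],
                   'b': ['2015-02-06', '2015-02-06', 'u'],
                   'c': ['2015-01-01', 'd', '2015-02-06']})

cols = ['a', 'b', 'c']

# Values that cannot be parsed become NaT
parsed = df[cols].apply(pd.to_datetime, errors='coerce')

# A cell is "bad" if it was present originally but failed to parse;
# cells that were already missing are not flagged (assumption)
bad_cells = parsed.isna() & df[cols].notna()

# Rows with at least one bad cell
bad_rows = df[bad_cells.any(axis=1)]
print(bad_rows)
```

bad_cells keeps the same shape as df[cols], so it can also be used to see which column caused a row to be flagged.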

Or, if you need to check whether a column already has a datetime dtype, use DataFrame.select_dtypes:

df['a'] = pd.to_datetime(df['a'])
print (df)
           a           b           c
0 2015-02-04  2015-02-06  2015-01-01
1 2015-02-05  2015-02-06           d
2 2015-02-06           u  2015-02-06

print (df.dtypes)
a    datetime64[ns]
b            object
c            object
dtype: object

print (df.select_dtypes(include=['datetime']))
           a
0 2015-02-04
1 2015-02-05
2 2015-02-06
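
If only a True/False answer per column is needed rather than the selected data, the dtype can be tested directly. This is a small sketch using pandas.api.types.is_datetime64_any_dtype on a reduced two-row sample; the dictionary name is_date_col is just illustrative.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

df = pd.DataFrame({'a': pd.to_datetime(['2015-02-04', '2015-02-05']),
                   'b': ['2015-02-06', 'u']})

# Map each column name to whether its dtype is a datetime dtype
is_date_col = {col: is_datetime64_any_dtype(df[col]) for col in df.columns}
print(is_date_col)
```

Note that this only inspects the column dtype: an object column such as b is reported as False even if every string in it happens to look like a date.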

Upvotes: 2
