Reputation: 62
I have a pandas DataFrame in which I have columns containing dates.
I need to make sure that these columns contain nothing but dates. Does anyone have advice on how to do this? I thought about simply finding which rows contain data of a type other than date, but I don't know how to code this.
Any help would be appreciated!
Upvotes: 0
Views: 1506
Reputation: 863611
If the data contains no NaN or None values and you need to check whether all string values can be converted to datetime, use apply with to_datetime and the parameter errors='coerce', which returns NaT if a value cannot be parsed. Then add notnull with all to build a mask, and select the matching columns with loc:
import pandas as pd
df = pd.DataFrame({'a':['2015-02-04','2015-02-05','2015-02-06'],
                   'b':['2015-02-06','2015-02-06', 'u'],
                   'c':['2015-01-01','d','2015-02-06']})
print (df)
a b c
0 2015-02-04 2015-02-06 2015-01-01
1 2015-02-05 2015-02-06 d
2 2015-02-06 u 2015-02-06
cols = ['a','b','c']
mask = df[cols].apply(pd.to_datetime, errors='coerce').notnull().all()
print (mask)
a True
b False
c False
dtype: bool
print (df.loc[:, mask])
a
0 2015-02-04
1 2015-02-05
2 2015-02-06
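The question also asks which rows hold non-date values. A minimal sketch of one way to find them, assuming the same coerce-based parsing as above: compare the parsed result with the original so that cells that were already missing are not reported as invalid. The None in column c and the names converted, bad_cells and bad_rows are illustrative additions, not part of the original example.

```python
import pandas as pd

# Variation of the example frame; the None in column 'c' is an
# illustrative addition to show how missing values are treated.
df = pd.DataFrame({
    'a': ['2015-02-04', '2015-02-05', '2015-02-06'],
    'b': ['2015-02-06', '2015-02-06', 'u'],
    'c': ['2015-01-01', 'd', None],
})
cols = ['a', 'b', 'c']

# Parse every column; values that cannot be parsed become NaT.
converted = df[cols].apply(pd.to_datetime, errors='coerce')

# A cell is flagged only if it held a value originally but failed to parse,
# so cells that were missing from the start are not counted.
bad_cells = converted.isna() & df[cols].notna()

# Rows with at least one unparseable value.
bad_rows = df[bad_cells.any(axis=1)]
print(bad_rows)
```

Here rows 1 and 2 are reported because of 'd' and 'u', while the None in row 2 of column c is not flagged on its own. If missing values should also count as invalid, converted.isna() alone can serve as the mask.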
Or, if you need to check whether a column already has a datetime dtype, use DataFrame.select_dtypes:
df['a'] = pd.to_datetime(df['a'])
print (df)
a b c
0 2015-02-04 2015-02-06 2015-01-01
1 2015-02-05 2015-02-06 d
2 2015-02-06 u 2015-02-06
print (df.dtypes)
a datetime64[ns]
b object
c object
dtype: object
print (df.select_dtypes(include=['datetime']))
a
0 2015-02-04
1 2015-02-05
2 2015-02-06
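The same dtype check can be turned around to list the columns that are expected to hold dates but are not stored as datetime. This is only a sketch; expected_date_cols is a hypothetical list of the columns you want to verify.

```python
import pandas as pd

df = pd.DataFrame({
    'a': pd.to_datetime(['2015-02-04', '2015-02-05', '2015-02-06']),
    'b': ['2015-02-06', '2015-02-06', 'u'],
})

# Hypothetical list of columns that are supposed to contain only dates.
expected_date_cols = ['a', 'b']

# Columns whose dtype is already datetime64.
datetime_cols = df.select_dtypes(include=['datetime']).columns

# Expected date columns that are still stored as some other dtype.
not_datetime = [c for c in expected_date_cols if c not in datetime_cols]
print(not_datetime)
```

A column with a datetime dtype can only hold timestamps or NaT, so any column left in not_datetime is a candidate for the to_datetime check shown earlier. Note that timezone-aware columns are selected with 'datetimetz' rather than 'datetime'.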
Upvotes: 2