Reputation: 25
I have a dataframe with a column of timestamp data that contains some nulls. I am trying to replace the nulls with the earliest date in the column using np.where.
The dataframe looks like this:
index date
1 2019-06-30 22:40:25.799000+00:00
2 2019-06-30 22:40:25.799000+00:00
3 NaN
I'm writing my code as:
mini = df['date'].min()
df['date'] = np.where(df['date'].isnull(), mini, df['date'])
but the resulting date column shows the original values as Unix timestamps (integer nanoseconds), while the NaNs are filled correctly:
index date
1 1552685510470841000
2 1555706405810536000
3 2015-05-07 13:49:51.072000+00:00
Why does this happen and how do I get it to all be timestamps?
Upvotes: 1
Views: 161
Reputation: 93181
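np.where works on NumPy arrays, not on pandas Series, so the pandas datetime dtype is lost along the way. The existing values in df['date'] most likely reach NumPy as datetime64[ns], while mini is a Timestamp object, so the only common type is object. When NumPy casts nanosecond-precision datetime64 values to object it produces plain integers (nanoseconds since the epoch), which is why the original rows show up as large numbers and only the filled row is still a Timestamp.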
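The integer output can be reproduced with NumPy alone. This is a minimal sketch, assuming the column reaches NumPy as a datetime64[ns] array; the sample values and the names stamps, as_objects, and restored are made up for illustration, and the exact path through np.where may differ between pandas and NumPy versions.

```python
import numpy as np

# Hypothetical sample values, stored with nanosecond precision
stamps = np.array(
    ["2019-06-30T22:40:25.799", "2015-05-07T13:49:51.072"],
    dtype="datetime64[ns]",
)

# Casting nanosecond datetime64 to object yields Python ints
# (nanoseconds since the epoch), not datetime objects
as_objects = stamps.astype(object)
print([type(v).__name__ for v in as_objects])

# The integers still carry the full information and can be viewed
# as datetime64[ns] again
restored = as_objects.astype("int64").view("datetime64[ns]")
print(restored)
```

If a column has already been turned into integers this way, the last step suggests one way to recover the timestamps, though avoiding the round trip through NumPy is simpler.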
You can use fillna instead:
df['date'] = df['date'].fillna(mini)
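As a rough end-to-end check, here is a small sketch of the fillna approach on a timezone-aware column. The sample rows are invented for illustration, and the result is assigned back to the column rather than modified in place, since in-place calls on a selected column are unreliable in recent pandas versions.

```python
import pandas as pd

# Hypothetical data modelled on the question: UTC timestamps with one null
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-06-30 22:40:25.799", "2015-05-07 13:49:51.072", None],
        utc=True,
    )
})

# min() skips the missing value and returns a tz-aware Timestamp
mini = df["date"].min()

# fillna stays inside pandas, so the datetime dtype is preserved
df["date"] = df["date"].fillna(mini)

print(df["date"].dtype)
print(df)
```

Because nothing is handed to NumPy here, the column keeps its timezone-aware datetime dtype and no integers appear.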
Upvotes: 2