Reputation: 626
Let's start with a dataframe that looks like this:
import datetime
import numpy as np
from dateutil import tz
from pandas import DataFrame
date1 = datetime.datetime(2021, 4, 1, 9, 15, 0, 0, tzinfo=tz.tzoffset(None, -5 * 60 * 60))
df = DataFrame({"date": [date1], "a": [1.0], "b": [2.0]})
In this dataframe, the dtypes are `Timestamp`, `float64` and `float64`. Now I need to insert a row for another date with `NaN` values. I did something like this:
date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": date2, "a": np.nan, "b": np.nan}
df = df.append(row, ignore_index=True)
My problem is that this new row was inserted with `NaT`s instead of `NaN`s. The dataframe dtypes become `Timestamp`, `object` and `object`: the `NaN`s have been converted to `Timestamp`s (`NaT`), which is not what I expected.
Any ideas why this is happening and how to avoid it? I want the `NaN`s to remain `float` in my dataframe.
I also tried the following:
date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": np.nan, "a": np.nan, "b": np.nan}
df = df.append(row, ignore_index=True)
df.iloc[1, 0] = date2
Doing that, my dataframe dtypes become `object`, `float64` and `float64`. The `NaN`s remain numeric, but now the dates are plain `object`s.
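If you do go this second route, one workaround (a sketch, not necessarily the cleanest fix) is to convert the column back afterwards with `pd.to_datetime`, which restores a `datetime64` dtype while leaving the numeric columns alone:

```python
import datetime

import numpy as np
import pandas as pd
from dateutil import tz

# Rebuild a dataframe whose "date" column has degraded to object dtype,
# as in the second attempt above.
date2 = datetime.datetime(2021, 4, 1, 9, 20, tzinfo=tz.tzoffset(None, -5 * 3600))
df = pd.DataFrame({"date": [date2, np.nan], "a": [1.0, np.nan], "b": [2.0, np.nan]})
df["date"] = df["date"].astype(object)  # force object dtype for the demo

# Converting back restores a proper datetime64 dtype; the missing value
# becomes NaT in the date column, while "a" and "b" stay float64.
df["date"] = pd.to_datetime(df["date"])
```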
To give some context, this dataframe is built by another module that connects to a database and extracts time series between two dates. Dates in these series are 5 minutes apart, but data may be missing at a given date. I have to insert rows of `NaN`s into this dataframe for the missing dates.
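For that use case, an alternative to appending rows one by one is to reindex against the full 5-minute grid, which inserts all the missing rows in one step. A minimal sketch with a made-up series (the column names and timestamps are placeholders, not from the actual database module):

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical 5-minute series with one timestamp missing, standing in
# for the database extract described above.
tzinfo = datetime.timezone(datetime.timedelta(hours=-5))
idx = pd.date_range("2021-04-01 09:15", periods=4, freq="5min", tz=tzinfo)
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
                  index=idx.delete(1))  # drop 09:20 to simulate a gap

# Reindexing against the full grid inserts NaN rows for the missing
# dates and keeps "a" and "b" as float64.
df = df.reindex(idx)
```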
Thanks in advance.
Upvotes: 0
Views: 342
Reputation: 1151
Another way to get around this issue would be to convert your row to a dataframe first:
import pandas as pd

date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": date2, "a": np.nan, "b": np.nan}
df = df.append(pd.DataFrame([row]), ignore_index=True)
I don't know why this issue occurs in the first place, though.
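For what it's worth, `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, but the same one-row-DataFrame trick works with `pd.concat`. A sketch reproducing the question's setup:

```python
import datetime

import numpy as np
import pandas as pd
from dateutil import tz

date1 = datetime.datetime(2021, 4, 1, 9, 15, tzinfo=tz.tzoffset(None, -5 * 3600))
df = pd.DataFrame({"date": [date1], "a": [1.0], "b": [2.0]})

date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": date2, "a": np.nan, "b": np.nan}

# Wrapping the row in a one-row DataFrame first lets each column keep
# its own dtype; pd.concat then aligns columns, so "a" and "b" stay float64.
df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
```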
Upvotes: 1