Reputation: 626
Let's start with a dataframe that looks like this:
import datetime
import numpy as np
from dateutil import tz
from pandas import DataFrame
date1 = datetime.datetime(2021, 4, 1, 9, 15, 0, 0, tzinfo=tz.tzoffset(None, -5 * 60 * 60))
df = DataFrame({"date": [date1], "a": [1.0], "b": [2.0]})
In this dataframe, the dtypes are `Timestamp`, `float64` and `float64`. Now I need to insert a row for another date with `NaN` values. I did something like this:
date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": date2, "a": np.nan, "b": np.nan}
df = df.append(row, ignore_index=True)
My problem is that this new row was inserted with `NaT`s instead of `NaN`s. The dataframe dtypes become `Timestamp`, `object` and `object`: the `NaN`s have been converted to `Timestamp`s (`NaT`), which is not what I expected.
Any ideas why this is happening and how to avoid it? I want the `NaN`s to remain `float` in my dataframe.
I also tried the following:
date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": np.nan, "a": np.nan, "b": np.nan}
df = df.append(row, ignore_index=True)
df.iloc[1, 0] = date2
Doing that, my dataframe dtypes become `object`, `float64` and `float64`. The `NaN`s remain numeric, but now the dates are plain `object`s.
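If you do go this second route, one workaround (a sketch, not necessarily the cleanest fix) is to convert the column back afterwards with `pd.to_datetime`, which restores a `datetime64` dtype while leaving the numeric columns alone:

```python
import datetime

import numpy as np
import pandas as pd
from dateutil import tz

# Rebuild a dataframe whose "date" column has degraded to object dtype,
# as in the second attempt above.
date2 = datetime.datetime(2021, 4, 1, 9, 20, tzinfo=tz.tzoffset(None, -5 * 3600))
df = pd.DataFrame({"date": [date2, np.nan], "a": [1.0, np.nan], "b": [2.0, np.nan]})
df["date"] = df["date"].astype(object)  # force object dtype for the demo

# Converting back restores a proper datetime64 dtype; the missing value
# becomes NaT in the date column, while "a" and "b" stay float64.
df["date"] = pd.to_datetime(df["date"])
```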
To give some context, this dataframe is built by another module that connects to a database and extracts time series between two dates. Dates in these series are 5 minutes apart, but data may be missing at a given date. I have to insert rows of `NaN`s into this dataframe for the missing dates.
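For that use case, an alternative to appending rows one by one is to reindex against the full 5-minute grid, which inserts all the missing rows in one step. A minimal sketch with a made-up series (the column names and timestamps are placeholders, not from the actual database module):

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical 5-minute series with one timestamp missing, standing in
# for the database extract described above.
tzinfo = datetime.timezone(datetime.timedelta(hours=-5))
idx = pd.date_range("2021-04-01 09:15", periods=4, freq="5min", tz=tzinfo)
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
                  index=idx.delete(1))  # drop 09:20 to simulate a gap

# Reindexing against the full grid inserts NaN rows for the missing
# dates and keeps "a" and "b" as float64.
df = df.reindex(idx)
```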
Thanks in advance.
Upvotes: 0
Views: 342
Reputation: 1151
Another way to get around this issue would be to convert your row to a dataframe first:
import pandas as pd

date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": date2, "a": np.nan, "b": np.nan}
df = df.append(pd.DataFrame([row]), ignore_index=True)
I don't know why this issue occurs in the first place, though.
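For what it's worth, `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, but the same one-row-DataFrame trick works with `pd.concat`. A sketch reproducing the question's setup:

```python
import datetime

import numpy as np
import pandas as pd
from dateutil import tz

date1 = datetime.datetime(2021, 4, 1, 9, 15, tzinfo=tz.tzoffset(None, -5 * 3600))
df = pd.DataFrame({"date": [date1], "a": [1.0], "b": [2.0]})

date2 = date1 + datetime.timedelta(seconds=300)
row = {"date": date2, "a": np.nan, "b": np.nan}

# Wrapping the row in a one-row DataFrame first lets each column keep
# its own dtype; pd.concat then aligns columns, so "a" and "b" stay float64.
df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
```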
Upvotes: 1