Sid

Reputation: 4055

Creating timedelta column from a datetime64[ns] column which has NaT values?

I am reading in a CSV file.

df = pd.read_csv('xyz.csv',parse_dates=['last_time'])

The dtype of the last_time column is datetime64[ns].

The column currently contains only one datetime64[ns] value; the rest are all NaT.

df

     last_time
0      NaT
1      NaT
2      NaT
3      NaT
4      2020-07-07 15:53:26.798844

I want to make a new column time_since.

df['time_since'] = df[df['last_time'] - datetime.datetime.now()]

I read through a bunch of questions but was unable to figure out the issue.

I get the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    df['trial'] = df[df['last_time'] - datetime.datetime.now()]
  File "/home/xxx/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 2806, in getitem
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/xxx/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1553, in _get_listlike_indexer
    keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
  File "/home/xxx/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1640, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [TimedeltaIndex([NaT, NaT, NaT, NaT, NaT, NaT, NaT, NaT,\n '-1 days +23:06:31.564892', NaT, NaT],\n dtype='timedelta64[ns]', freq=None)] are in the [columns]"

What am I doing wrong? I assumed that either the NaT values would be ignored in the calculation or I would get a timedelta column containing mostly NaT.

Upvotes: 1

Views: 527

Answers (1)

jezrael

Reputation: 863226

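Remove the outer df[...]. Indexing with df[...] expects column labels or a boolean mask, so pandas tries to look up the timedelta values as column names and raises the KeyError. The subtraction alone already returns a timedelta column, with NaT wherever last_time is NaT: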

df['time_since'] = df['last_time'] - datetime.datetime.now()
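
A minimal self-contained sketch of the same fix, assuming a small hand-built frame in place of xyz.csv and a fixed timestamp (the hypothetical `now` below) standing in for datetime.datetime.now() so the result is reproducible:

```python
import pandas as pd

# Hand-built stand-in for the CSV: four missing values and one timestamp.
df = pd.DataFrame({
    "last_time": pd.to_datetime(
        pd.Series([None, None, None, None, "2020-07-07 15:53:26.798844"])
    )
})

# Fixed reference time (an assumption for this sketch; the original code
# uses datetime.datetime.now()).
now = pd.Timestamp("2020-07-08 00:00:00")

# Plain column arithmetic, no outer df[...]: NaT rows stay NaT.
df["time_since"] = df["last_time"] - now

# Swapping the operands gives a positive elapsed time instead.
elapsed = now - df["last_time"]

print(df)
print(elapsed)
```

As written in the question, the subtraction yields negative timedeltas for past timestamps; if a positive "time since" is wanted, subtracting the column from the current time (as in `elapsed`) is probably the intended order.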

Upvotes: 1
