Reputation: 4055
I am reading in a CSV file.
df = pd.read_csv('xyz.csv',parse_dates=['last_time'])
The dtype of the last_time column is datetime64[ns].
The column contains only one datetime64[ns] value; the rest are all NaT for now.
df
last_time
0 NaT
1 NaT
2 NaT
3 NaT
4 2020-07-07 15:53:26.798844
I want to make a new column, time_since.
df['time_since'] = df[df['last_time'] - datetime.datetime.now()]
I read through a bunch of questions but was unable to figure out the issue.
I get the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    df['trial'] = df[df['last_time'] - datetime.datetime.now()]
  File "/home/xxx/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/xxx/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1553, in _get_listlike_indexer
    keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
  File "/home/xxx/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1640, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [TimedeltaIndex([NaT, NaT, NaT, NaT, NaT, NaT, NaT, NaT,\n '-1 days +23:06:31.564892', NaT, NaT],\n dtype='timedelta64[ns]', freq=None)] are in the [columns]"
What am I doing wrong? I assumed that the NaTs would be ignored for the calculation, or that I would get a timedelta column with a bunch of NaTs.
Upvotes: 1
Views: 527
Reputation: 863226
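Remove the outer df[]. That syntax is for selecting columns or for boolean indexing by a mask, so pandas tries to look up the timedelta values as column labels and raises the KeyError. Plain subtraction already returns a timedelta column in which the NaT rows stay NaT: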
df['time_since'] = df['last_time'] - datetime.datetime.now()
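As a rough, self-contained sketch of the same fix: the DataFrame below is a hypothetical stand-in for xyz.csv, and the fixed `now` value is an assumption made only so the result is reproducible (in practice you would use pd.Timestamp.now() or datetime.datetime.now()). It also subtracts in the opposite order, now - last_time, on the assumption that "time since" should come out positive; the line above gives the same magnitude with a negative sign.

```python
import pandas as pd

# Hypothetical stand-in for xyz.csv: four missing values and one timestamp.
df = pd.DataFrame({
    "last_time": pd.to_datetime(
        [None, None, None, None, "2020-07-07 15:53:26.798844"]
    )
})

# Fixed reference time (an assumption, for reproducibility only).
now = pd.Timestamp("2020-07-07 16:46:55")

# Plain Series arithmetic, with no outer df[...] around it.
# Rows where last_time is NaT produce NaT in the result.
df["time_since"] = now - df["last_time"]

print(df.dtypes)
print(df)
```

The new column has a timedelta dtype, and the missing rows remain NaT rather than raising an error or being dropped.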
Upvotes: 1