user3217125

Reputation: 649

Pandas DatetimeIndex indexing dtype: datetime64 vs Timestamp

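Indexing a pandas DatetimeIndex (with dtype numpy datetime64[ns]) returns either a pandas Timestamp (for a scalar index such as [0]) or another DatetimeIndex (for a slice such as [0:1]).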

The confusing part is that Timestamps do not equal np.datetime64, so that:

import numpy as np
import pandas as pd

a_datetimeindex = pd.date_range('1/1/2016', '1/2/2016', freq='D')
print(np.in1d(a_datetimeindex[0], a_datetimeindex))

This returns False. But:

print(np.in1d(a_datetimeindex[0:1], a_datetimeindex))
print(np.in1d(np.datetime64(a_datetimeindex[0]), a_datetimeindex))

Both of these return the expected result.

I guess that is because np.datetime64[ns] has nanosecond precision, but the Timestamp is truncated?

My question is, is there a way to create the DatetimeIndex so that it always indexes to the same (or comparable) data type?

Upvotes: 2

Views: 2496

Answers (1)

ptrj

Reputation: 5222

You are using numpy functions to manipulate pandas types. They are not always compatible.

The function np.in1d first converts both of its arguments to ndarrays. A DatetimeIndex has a built-in conversion, which returns an array of dtype np.datetime64 (it's DatetimeIndex.values). A Timestamp has no such facility, so it is not converted.
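The difference in conversion can be checked directly. The following is a minimal sketch, not a description of np.in1d's internals: it assumes that np.asarray mirrors the conversion np.in1d applies, and the names idx, as_array, scalar and as_scalar_array are made up for the illustration.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", "2016-01-02", freq="D")

# The whole index converts to a native NumPy datetime64 array.
as_array = np.asarray(idx)
print(as_array.dtype.kind)  # "M" is NumPy's kind code for datetime64

# A single element is a pandas Timestamp; NumPy has no datetime64
# conversion for it and wraps it as a generic Python object instead.
scalar = idx[0]
as_scalar_array = np.asarray(scalar)
print(type(scalar).__name__, as_scalar_array.dtype)
```

With one side holding datetime64 values and the other holding a generic object, the membership test is comparing values of different kinds, which is why the result can be surprising.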

Instead, you can use, for example, the Python keyword in (the most natural way):

a_datetimeindex[0] in a_datetimeindex

or the Index.isin method for a collection of elements:

a_datetimeindex.isin(a_list_or_index)
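As a small self-contained illustration of the two pandas-native checks, here is a sketch with made-up data; the names idx, candidates, first_is_member and mask are only for the example.

```python
import pandas as pd

idx = pd.date_range("2016-01-01", "2016-01-03", freq="D")

# Scalar membership: the `in` operator looks the Timestamp up in the index.
first_is_member = idx[0] in idx

# Element-wise membership: isin returns one boolean per entry of idx,
# telling whether that entry appears among the candidates.
candidates = [pd.Timestamp("2016-01-02"), pd.Timestamp("2016-02-01")]
mask = idx.isin(candidates)

print(first_is_member)
print(mask)
```

Note that isin answers "which entries of the index are in the collection", so the mask has the length of the index, not of the candidate list.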

If you want to use np.in1d, explicitly convert both arguments to numpy types. Or call it on the underlying numpy arrays:

np.in1d(a_datetimeindex.values[0], a_datetimeindex.values)

Alternatively, it's probably safe to use np.in1d with two collections of the same type:

np.in1d(a_datetimeindex, another_datetimeindex)

or even

np.in1d(a_datetimeindex[[0]], a_datetimeindex)
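If a single Timestamp has to go through NumPy, one option is to convert it explicitly first. The sketch below assumes Timestamp.to_datetime64() for that conversion and uses np.isin in place of np.in1d (recent NumPy releases steer users toward np.isin); the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", "2016-01-02", freq="D")

# Convert the Timestamp to a NumPy datetime64 scalar before handing it to NumPy.
ts_np = idx[0].to_datetime64()

# Both arguments are now NumPy datetime64 data, so the comparison is like-for-like.
scalar_hit = np.isin(ts_np, idx.values)

# Indexing with a list ([[0]]) keeps the result a DatetimeIndex,
# so both sides go through the same conversion.
index_hit = np.isin(idx[[0]], idx)

print(bool(scalar_hit), index_hit)
```

Unlike np.in1d, np.isin preserves the shape of its first argument, so the scalar case yields a zero-dimensional boolean array rather than a length-one array.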

Upvotes: 2
