deep_butter

Reputation: 215

`in` operator doesn't work as expected when checking whether a pandas Series contains a given value

I have the following series zar, which contains timestamps:

In [743]: zar
Out[743]: 
0   2019-01-01
1   2019-03-21
2   2019-04-19
3   2019-04-22
4   2019-04-27
5   2019-05-01
6   2019-06-17
7   2019-08-09
8   2019-09-24
9   2019-12-16
Name: zar, dtype: datetime64[ns]

In [744]: zar[5]
Out[744]: Timestamp('2019-05-01 00:00:00')

In [745]: j
Out[745]: Timestamp('2019-05-01 00:00:00')

In [746]: j in zar.values
Out[746]: False

Since both are timestamps, why is the result False? I want to get True when a timestamp matches a value in the series.

Upvotes: 1

Views: 356

Answers (1)

cs95

Reputation: 402663

j in zar tests whether j is one of zar's index labels, not whether it is one of its values.

For example,

0 in zar
# True

0 in zar.index
# True

This is consistent with the behaviour of DataFrames, for which in performs a membership test on the column labels by default.

df = pd.DataFrame(columns=['a', 'b', 'c'])
'a' in df
# True

'd' in df
# False

To test the values instead, use Series.eq (the == operator) or Series.isin, combined with Series.any.

(zar == j).any()
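
For completeness, here is a minimal self-contained sketch that puts the index check next to the two value checks. The series is rebuilt from the dates shown in the question, and the variable names in_index, eq_match and isin_match are only illustrative:

```python
import pandas as pd

# Rebuild a series like the one in the question (default 0..9 index).
zar = pd.Series(
    pd.to_datetime([
        "2019-01-01", "2019-03-21", "2019-04-19", "2019-04-22", "2019-04-27",
        "2019-05-01", "2019-06-17", "2019-08-09", "2019-09-24", "2019-12-16",
    ]),
    name="zar",
)
j = pd.Timestamp("2019-05-01")

# `in` on a Series looks at the index labels (0..9 here), not the values.
in_index = j in zar

# Element-wise comparison, then reduce with any().
eq_match = bool((zar == j).any())

# isin takes a list-like, so wrap the single timestamp in a list.
isin_match = bool(zar.isin([j]).any())

print(in_index, eq_match, isin_match)
```

isin is the more convenient of the two when you want to test against several timestamps at once, since you can pass them all in one list.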

Details
zar == j returns a Series of bools:

(zar == j)

0    False
1    False
2    False
3    False
4    False
5     True
6    False
7    False
8    False
9    False
Name: zar, dtype: bool

You then call any, which returns True if any element is True. If you want the position of the True value, use np.flatnonzero (with the default integer index used here, the position is also the index label):

np.flatnonzero(zar == j)
# array([5])
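
Note that np.flatnonzero gives integer positions. If the series has a non-default index and you want the labels, you can index with the boolean mask instead. A small sketch, assuming a made-up series s with hypothetical string labels:

```python
import numpy as np
import pandas as pd

# Hypothetical series with string labels instead of the default 0..n-1 index.
s = pd.Series(
    pd.to_datetime(["2019-01-01", "2019-05-01", "2019-12-16"]),
    index=["x", "y", "z"],
)
j = pd.Timestamp("2019-05-01")

mask = (s == j).to_numpy()          # plain boolean NumPy array

positions = np.flatnonzero(mask)    # integer positions of the matches
labels = s.index[mask].tolist()     # index labels of the matches

print(positions, labels)
```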

Upvotes: 2
