Reputation: 43
While writing a program involving numpy, I found that the membership test doesn't work as expected for numpy dtype objects. Specifically, the result is unexpected for set, but not for list or tuple.
import numpy as np
x = np.arange(5).dtype
y = np.int64
print(x in {y}, x in (y,), x in [y])
The result is False True True.
I found this in both Python 2.7 and 3.6, with numpy 1.12.x installed.
Any idea why?
UPDATE
It looks like dtype objects don't respect some of Python's assumptions about hashing:
http://www.asmeurer.com/blog/posts/what-happens-when-you-mess-with-hashing-in-python/
and https://github.com/numpy/numpy/issues/5345
Thanks @ser2357112 and @Fabien
Upvotes: 1
Views: 91
Reputation: 280733
The __hash__ and __eq__ implementations of dtype objects were pretty poorly thought out. Among other problems, the __hash__ and __eq__ implementations aren't consistent with each other. You're seeing the effects of that here.
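The inconsistency can be made visible with a small check. This is only a sketch: it uses np.dtype("int64") in place of np.arange(5).dtype so the result doesn't depend on the platform's default integer, and the comments about how numpy compares the two objects describe observed behavior rather than a documented guarantee.

```python
import numpy as np

x = np.dtype("int64")  # a dtype instance
y = np.int64           # the scalar type (a class), not a dtype instance

equal = (x == y)                  # dtype equality accepts things convertible to a dtype
same_hash = (hash(x) == hash(y))  # the two hashes are computed independently

in_list = x in [y]  # list membership relies on identity and == only
in_set = x in {y}   # set membership compares hashes before trying ==

print(equal, same_hash, in_list, in_set)
```

Objects that compare equal are supposed to hash equal. When they don't, the set lookup never gets as far as calling ==.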
Some other problems with dtype __hash__ and __eq__ are that
- dtype objects are mutable in ways that affect __hash__ and __eq__, something that should never be true of a hashable object. (Specifically, you can reassign the names of a structured dtype.)
- dtype equality isn't transitive. With the x and y in your question, we have x == y and x == 'int64', but y != 'int64'.
- dtype.__eq__ raises TypeError when it should return NotImplemented.

You could submit a bug report, but looking at existing bug reports relating to those methods, it's unlikely to be fixed. The design is too much of a mess, and people are already relying on the broken parts.
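The three comparisons involving x, y, and 'int64' can be reproduced directly. A minimal sketch, again assuming np.dtype("int64") as a platform-independent stand-in for the question's x:

```python
import numpy as np

x = np.dtype("int64")
y = np.int64

x_eq_y = (x == y)          # dtype compared with the scalar type
x_eq_s = (x == "int64")    # dtype compared with a string naming it
y_eq_s = (y == "int64")    # plain class compared with a string

print(x_eq_y, x_eq_s, y_eq_s)  # True True False
```

Since y is an ordinary class, y == 'int64' falls back to Python's default comparison and is False, even though both sides are equal to x.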
Upvotes: 2
Reputation: 4972
The difference lies in how sets implement the in keyword in Python.
Lists simply examine each object, checking for equality. Sets first hash the objects.
This is because sets must ensure uniqueness. But your objects are not equivalent:
>>> x
dtype('int64')
>>> y
<class 'numpy.int64'>
Hashing them probably delivers different results.
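The same effect can be reproduced without numpy. The Label class below is hypothetical and written only for illustration: its __eq__ ignores case while its __hash__ does not, so two instances can be equal yet hash differently.

```python
class Label:
    """Hypothetical class whose __eq__ and __hash__ disagree on purpose."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if isinstance(other, Label):
            # Case-insensitive comparison
            return self.text.lower() == other.text.lower()
        return NotImplemented

    def __hash__(self):
        # Case-sensitive hash: inconsistent with __eq__ above
        return hash(self.text)


a = Label("Int64")
b = Label("int64")

in_list = a in [b]  # list: checks identity, then ==, for each element
in_set = a in {b}   # set: looks up by hash first, only then tries ==

print(a == b, in_list, in_set)
```

The list finds a match because it calls == on every element. The set only finds one if the hashes happen to agree.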
Upvotes: 0