roippi
roippi

Reputation: 25964

Why are (some) dict views hashable?

In python 3, the keys(), values(), and items() methods provide dynamic views of their respective elements. These were backported to python 2.7 and are available there as viewkeys, viewvalues, and viewitems. I'm referring to them interchangeably here.

Is there any reasonable explanation for this:

#!/usr/bin/python3.4
In [1]: hash({}.keys())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-3727b260127e> in <module>()
----> 1 hash({}.keys())

TypeError: unhashable type: 'dict_keys'

In [2]: hash({}.items())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-decac720f012> in <module>()
----> 1 hash({}.items())

TypeError: unhashable type: 'dict_items'

In [3]: hash({}.values())
Out[3]: -9223363248553358775

I found this rather surprising.


The python docs glossary on "hashable" says:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.

Okay, the first part actually checks out; it doesn't appear that the hash of a dict_values object will change over its lifetime - even though its underlying values certainly can.

In [11]: d = {}

In [12]: vals = d.values()

In [13]: vals.__hash__()
Out[13]: -9223363248553358718

In [14]: d['a'] = 'b'

In [15]: vals
Out[15]: dict_values(['b'])

In [16]: vals.__hash__()
Out[16]: -9223363248553358718

But the part about __eq__()... well, it doesn't have one of those, actually.

In [17]: {'a':'a'}.values().__eq__('something else')
Out[17]: NotImplemented

So... yeah. Can someone make any sense of this? Is there a reason for this asymmetry that of the three viewfoo methods, only dict_values objects are hashable?

Upvotes: 6

Views: 549

Answers (1)

dano
dano

Reputation: 94901

I believe this occurs because viewitems and viewkeys provide custom rich comparison functions, but viewvalues does not. Here are the definitions of each view type:

PyTypeObject PyDictKeys_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "dict_keys",                                /* tp_name */
    sizeof(dictviewobject),                     /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    (destructor)dictview_dealloc,               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)dictview_repr,                    /* tp_repr */
    &dictviews_as_number,                       /* tp_as_number */
    &dictkeys_as_sequence,                      /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
    0,                                          /* tp_doc */
    (traverseproc)dictview_traverse,            /* tp_traverse */
    0,                                          /* tp_clear */
    dictview_richcompare,                       /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    (getiterfunc)dictkeys_iter,                 /* tp_iter */
    0,                                          /* tp_iternext */
    dictkeys_methods,                           /* tp_methods */
    0,
};

PyTypeObject PyDictItems_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "dict_items",                               /* tp_name */
    sizeof(dictviewobject),                     /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    (destructor)dictview_dealloc,               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)dictview_repr,                    /* tp_repr */
    &dictviews_as_number,                       /* tp_as_number */
    &dictitems_as_sequence,                     /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
    0,                                          /* tp_doc */
    (traverseproc)dictview_traverse,            /* tp_traverse */
    0,                                          /* tp_clear */
    dictview_richcompare,                       /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    (getiterfunc)dictitems_iter,                /* tp_iter */
    0,                                          /* tp_iternext */
    dictitems_methods,                          /* tp_methods */
    0,
};

PyTypeObject PyDictValues_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "dict_values",                              /* tp_name */
    sizeof(dictviewobject),                     /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    (destructor)dictview_dealloc,               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)dictview_repr,                    /* tp_repr */
    0,                                          /* tp_as_number */
    &dictvalues_as_sequence,                    /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
    0,                                          /* tp_doc */
    (traverseproc)dictview_traverse,            /* tp_traverse */
    0,                                          /* tp_clear */
    0,                                          /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    (getiterfunc)dictvalues_iter,               /* tp_iter */
    0,                                          /* tp_iternext */
    dictvalues_methods,                         /* tp_methods */
    0,
};

Notice that tp_richcompare is defined as dictview_richcompare for items and keys, but not values. Now, the documentation for __hash__ says this:

A class that overrides __eq__() and does not define __hash__() will have its __hash__() implicitly set to None.

...

If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__.

If a class that does not override __eq__() wishes to suppress hash support, it should include __hash__ = None in the class definition.`

So, because items/keys are overriding __eq__() (by providing a tp_richcompare function), they would need to explicitly define __hash__ as being equal to the parent's to retain an implementation for it. Because values doesn't override __eq__(), it inherits the __hash__ from object, because tp_hash and tp_richcompare get inherited from the parent if they're both NULL:

This field is inherited by subtypes together with tp_richcompare: a subtype inherits both of tp_richcompare and tp_hash, when the subtype’s tp_richcompare and tp_hash are both NULL.

The fact that the impelmentation for dict_values isn't preventing this automatic inheritence would probably be considered a bug.

Upvotes: 8

Related Questions