Reputation: 362766
Given a list iterator, you can find the original list via the pickle protocol:
>>> L = [1, 2, 3]
>>> Li = iter(L)
>>> Li.__reduce__()[1][0] is L
True
Given a dict iterator, how can you find the original dict? I could only find a hacky way using CPython implementation details (via garbage collector):
>>> def get_dict(dict_iterator):
... [d] = gc.get_referents(dict_iterator)
... return d
...
>>> d = {}
>>> get_dict(iter(d)) is d
True
Upvotes: 7
Views: 693
Reputation: 1122152
There is no API to find the source iterable object from an iterator. This is intentional, iterators are seen as single-use objects; iterate and discard. A such, they often drop their iterable reference once they have reached the end; what's the point of keeping it if you can't get more elements, anyway?
You see this in both the list and dict iterators, the hacks you found either produce empty objects or None
once you are done iterating. List iterators use an empty list when pickled:
>>> l = [1]
>>> it = iter(l)
>>> it.__reduce__()[1][0] is l
True
>>> list(it) # exhaust the iterator
[1]
>>> it.__reduce__()[1][0] is l
False
>>> it.__reduce__()[1][0]
[]
and the dictionary iterator just sets the pointer to the original dictionary to null, so there are no referents left after that:
>>> import gc
>>> it = iter({'foo': 42})
>>> gc.get_referents(it)
[{'foo': 42}]
>>> list(it)
['foo']
>>> gc.get_referents(it)
[]
Both your hacks are just that: hacks. They are implementation dependent and can and probably will change between Python releases. Currently, using iter(dictionary).__reduce__()
gets you the equivalent of iter, list(copy(self))
and rather than access to the dictionary because that's deemed a better implementation, but future versions might use something different altogether, etc.
For dictionaries, the only other option currently available is to access the di_dict
pointer in the dictiter
struct, with ctypes:
import ctypes
class PyObject_HEAD(ctypes.Structure):
_fields_ = [
("ob_refcnt", ctypes.c_ssize_t),
("ob_type", ctypes.c_void_p),
]
class dictiterobject(ctypes.Structure):
_fields_ = [
("ob_base", PyObject_HEAD),
("di_dict", ctypes.py_object),
("di_used", ctypes.c_ssize_t),
("di_pos", ctypes.c_ssize_t),
("di_result", ctypes.py_object), # always NULL for dictkeys_iter
("len", ctypes.c_ssize_t),
]
def dict_from_dictiter(it):
di = dictiterobject.from_address(id(it))
try:
return di.di_dict
except ValueError: # null pointer
return None
This is just as much of a hack as relying on gc.get_referents()
:
>>> d = {'foo': 42}
>>> it = iter(d)
>>> dict_from_dictiter(it)
{'foo': 42}
>>> dict_from_dictiter(it) is d
True
>>> list(it)
['foo']
>>> dict_from_dictiter(it) is None
True
For now, at least in CPython versions up to and including Python 3.8, there are no other options available.
Upvotes: 7