Reputation: 1094
I came across a very strange behaviour in Python.
Using a class derived from UserDict, the iterator a.items() behaves differently in a for loop than a.data.items(), even though the two are identical:
Python 3.3.1 (default, Apr 17 2013, 22:32:14)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datastruct import QueueDict
>>> a=QueueDict(maxsize=1700)
>>> for i in range(1000):
... a[str(i)]=1/(i+1)
...
>>> a.items()
ItemsView(OrderedDict([('991', 0.0010080645161290322), ('992', 0.0010070493454179255), ('993', 0.001006036217303823), ('994', 0.0010050251256281408), ('995', 0.001004016064257028), ('996', 0.0010030090270812437), ('997', 0.001002004008016032), ('998', 0.001001001001001001), ('999', 0.001)]))
>>> a.data.items()
ItemsView(OrderedDict([('991', 0.0010080645161290322), ('992', 0.0010070493454179255), ('993', 0.001006036217303823), ('994', 0.0010050251256281408), ('995', 0.001004016064257028), ('996', 0.0010030090270812437), ('997', 0.001002004008016032), ('998', 0.001001001001001001), ('999', 0.001)]))
>>> a.items()==a.data.items()
True
>>> # nevertheless:
...
>>> for item in a.items(): print(item)
...
('992', 0.0010070493454179255)
>>> for item in a.data.items(): print(item)
...
('993', 0.001006036217303823)
('994', 0.0010050251256281408)
('995', 0.001004016064257028)
('996', 0.0010030090270812437)
('997', 0.001002004008016032)
('998', 0.001001001001001001)
('999', 0.001)
('991', 0.0010080645161290322)
('992', 0.0010070493454179255)
>>>
The class definition is as follows:
import collections, sys

class QueueDict(collections.UserDict):
    def __init__(self, maxsize=1*((2**10)**2), *args, **kwargs ):
        self._maxsize=maxsize
        super().__init__(*args, **kwargs)
        self.data=collections.OrderedDict(self.data)
    def __getitem__(self, key):
        self.data.move_to_end(key)
        return super().__getitem__(key)
    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._purge()
    def _purge(self):
        while sys.getsizeof(self.data) > self._maxsize:
            self.data.popitem(last=False)
This is quite disturbing. Any idea how what appears to be the same object (by "visual" inspection, and also because a.items()==a.data.items() is True) can behave differently in the for loop, and why it does?
Thanks for your help and ideas!
Upvotes: 2
Views: 149
Reputation: 180927
Changing a collection while iterating can have (and in this case has) some unexpected consequences.
Your getter:
def __getitem__(self, key):
    self.data.move_to_end(key)
    return super().__getitem__(key)
...moves the current key to the end of the collection. The for loop over a.items() looks up each value through this getter, so the collection is reordered mid-iteration and the loop stops because it thinks it has reached the end of the collection.
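To see that the inherited items() view really goes through the getter, here is a minimal sketch. LoggingDict is a hypothetical helper written for this illustration, not part of your code; it only records which keys are passed to __getitem__.

```python
import collections


class LoggingDict(collections.UserDict):
    """Hypothetical helper that records every key passed to __getitem__."""

    def __init__(self, *args, **kwargs):
        self.lookups = []
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.lookups.append(key)
        return super().__getitem__(key)


d = LoggingDict(a=1, b=2)

list(d.items())                # inherited ItemsView fetches values via d[key]
via_wrapper = list(d.lookups)  # one recorded lookup per key

d.lookups.clear()
list(d.data.items())           # the underlying dict's view bypasses our getter
via_data = list(d.lookups)     # nothing recorded
```

In QueueDict each of those lookups also calls move_to_end, which is what disturbs the iteration.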
Commenting out the move_to_end line allows the iteration to run as expected.
When you're iterating over a.data.items(), your getter is never invoked, so there is no problem there.
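If you want to keep move_to_end for explicit lookups, one possible workaround is to let items() and values() delegate to the underlying OrderedDict. This is only a sketch, and it assumes that merely iterating should not count as "using" a key; the two overrides are my addition, the rest is your class unchanged.

```python
import collections
import sys


class QueueDict(collections.UserDict):
    """The question's QueueDict, with views that bypass __getitem__."""

    def __init__(self, maxsize=1 * ((2 ** 10) ** 2), *args, **kwargs):
        self._maxsize = maxsize
        super().__init__(*args, **kwargs)
        self.data = collections.OrderedDict(self.data)

    def __getitem__(self, key):
        # An explicit lookup still marks the key as most recently used.
        self.data.move_to_end(key)
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._purge()

    def _purge(self):
        while sys.getsizeof(self.data) > self._maxsize:
            self.data.popitem(last=False)

    def items(self):
        # Assumption: iteration should not reorder entries, so read straight
        # from the OrderedDict instead of going through self[key].
        return self.data.items()

    def values(self):
        return self.data.values()


a = QueueDict()
for i in range(5):
    a[str(i)] = i

pairs = list(a.items())       # visits every entry, order left untouched
first = a["0"]                # explicit lookup moves "0" to the end
order_after_lookup = list(a)
```

With this change a.items() and a.data.items() are the same view, so the for loop sees every entry, while a[key] keeps its reordering behaviour.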
Upvotes: 2