Reputation: 992
I've been reading into how super() works. I came across this recipe that demonstrates how to create an ordered counter:
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first seen'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__,
                           OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)
For example:
oc = OrderedCounter('adddddbracadabra')
print(oc)
OrderedCounter(OrderedDict([('a', 5), ('d', 6), ('b', 2), ('r', 2), ('c', 1)]))
Is someone able to explain how this magically works?
This also appears in the Python documentation.
Upvotes: 27
Views: 17277
Reputation: 11
I found this to be the easiest way to create an ordered counter in Python 3.
After converting the Counter to a dict, print uses the __repr__ method of dict, which shows the items in insertion order (dicts preserve insertion order from Python 3.7 on):
from collections import Counter
c = Counter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
OC = dict(c)
print(OC)
Output:
{'apple': 2, 'banana': 1, 'cherry': 1, 'mango': 2, 'pie': 1}
Upvotes: 1
Reputation: 41
I think we need to define the __repr__ and __reduce__ methods in the class when words are given as input.
Without __repr__ and __reduce__:
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    pass
oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
print(oc)
Output:
OrderedCounter({'apple': 2, 'mango': 2, 'banana': 1, 'cherry': 1, 'pie': 1})
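The order in the above example is not preserved in the output: the inherited Counter.__repr__ lists the items from most common to least common.
With __repr__ and __reduce__: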
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)
oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
print(oc)
Output:
OrderedCounter(OrderedDict([('apple', 2), ('banana', 1), ('cherry', 1), ('mango', 2), ('pie', 1)]))
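To make the role of __reduce__ a bit more concrete, here is a minimal sketch that calls it by hand and rebuilds the object from the result, which is roughly what pickle and copy do with that tuple. The names rebuild_cls, rebuild_args and clone are only for illustration.

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

oc = OrderedCounter('adddddbracadabra')

# __reduce__ returns (callable, args); calling callable(*args) recreates the object.
rebuild_cls, rebuild_args = oc.__reduce__()
clone = rebuild_cls(*rebuild_args)

print(list(oc))                 # keys in first-seen order
print(clone == oc)              # the counts match
print(list(clone) == list(oc))  # the key order matches as well
```

Because the argument is an OrderedDict built from the counter, the rebuilt object receives its keys in the same first-seen order.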
Upvotes: 3
Reputation: 4418
OrderedCounter is given as an example in the OrderedDict documentation, and works without needing to override any methods:
class OrderedCounter(Counter, OrderedDict):
    pass
When a method is called on an instance, Python has to find the correct implementation to execute. It searches the class hierarchy in a defined order called the "method resolution order", or mro. The mro is stored in the attribute __mro__:
OrderedCounter.__mro__
(<class '__main__.OrderedCounter'>, <class 'collections.Counter'>, <class 'collections.OrderedDict'>, <class 'dict'>, <class 'object'>)
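As a rough illustration of that lookup, the sketch below walks __mro__ and reports the first class whose own namespace defines a given name. The helper defining_class is a hypothetical name introduced here, not part of collections, and the real attribute lookup has more to it (descriptors, instance attributes) than this loop shows.

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    pass

def defining_class(cls, name):
    """Return the first class in cls.__mro__ whose own namespace defines name."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None

print([k.__name__ for k in OrderedCounter.__mro__])
# ['OrderedCounter', 'Counter', 'OrderedDict', 'dict', 'object']

print(defining_class(OrderedCounter, 'update').__name__)       # Counter
print(defining_class(OrderedCounter, '__setitem__').__name__)  # OrderedDict
print(defining_class(OrderedCounter, '__getitem__').__name__)  # dict
```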
When an instance of OrderedCounter calls __setitem__(), Python searches the classes in order: OrderedCounter, Counter, OrderedDict (where it is found). So a statement like oc['a'] = 0 ends up calling OrderedDict.__setitem__().
In contrast, __getitem__ is not overridden by any of the subclasses in the mro, so count = oc['a'] is handled by dict.__getitem__().
oc = OrderedCounter()
oc['a'] = 1 # this call uses OrderedDict.__setitem__
count = oc['a'] # this call uses dict.__getitem__
A more interesting call sequence occurs for a statement like oc.update('foobar').
First, Counter.update() gets called. The code for Counter.update() uses self[elem], which gets turned into a call to OrderedDict.__setitem__(). And the code for that calls dict.__setitem__().
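One way to watch that chain is to slip a logging layer into the hierarchy. This is only a sketch: TracingOrderedDict and the calls list are hypothetical names made up for the demonstration, standing in for OrderedDict so that each item assignment made by Counter.update() gets recorded.

```python
from collections import Counter, OrderedDict

calls = []  # records every (key, value) passed to __setitem__

class TracingOrderedDict(OrderedDict):
    def __setitem__(self, key, value):
        calls.append((key, value))
        super().__setitem__(key, value)  # continue along the mro to OrderedDict

class OrderedCounter(Counter, TracingOrderedDict):
    pass

oc = OrderedCounter()
oc.update('foobar')  # found on Counter first, which assigns item by item

print(calls)
# [('f', 1), ('o', 1), ('o', 2), ('b', 1), ('a', 1), ('r', 1)]
print(list(oc))
# ['f', 'o', 'b', 'a', 'r']
```

Each assignment made inside Counter.update() shows up in calls, so the counting comes from Counter while the storing goes through the ordered-dict side of the hierarchy.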
If the base classes are reversed, it no longer works, because the mro is different and the wrong methods get called.
class OrderedCounter(OrderedDict, Counter): # <<<== doesn't work
    pass
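A small sketch of the difference, comparing which implementations each ordering resolves to. The class names CounterFirst and OrderedDictFirst are made up here for the comparison.

```python
from collections import Counter, OrderedDict

class CounterFirst(Counter, OrderedDict):      # the ordering from the docs
    pass

class OrderedDictFirst(OrderedDict, Counter):  # the reversed ordering
    pass

print([k.__name__ for k in OrderedDictFirst.__mro__])
# ['OrderedDictFirst', 'OrderedDict', 'Counter', 'dict', 'object']

# With Counter first, construction and update() use Counter's counting logic.
print(CounterFirst.__init__ is Counter.__init__)  # True
print(CounterFirst.update is Counter.update)      # True

# Reversed, OrderedDict's versions are found first, so nothing gets counted.
print(OrderedDictFirst.__init__ is OrderedDict.__init__)  # True
print(OrderedDictFirst.update is OrderedDict.update)      # True
```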
More info on mro can be found in the Python 2.3 documentation.
Upvotes: 41