Python: OOP overhead?

Question

I've been working on a real-time application and noticed that some OOP design patterns introduce considerable overhead in Python (tested with 2.7.5).

To be direct: why does reading dictionary values through simple accessor methods take roughly 4x to 5.5x longer when the dictionary is encapsulated by another object?

For instance, running the code below, I got:

Dict Access: 0.167706012726
Attribute Access: 0.191128969193
Method Wrapper Access: 0.711422920227
Property Wrapper Access: 0.932291030884

Executable code:

class Wrapper(object):
    def __init__(self, data):
        self._data = data

    @property
    def id(self):
        return self._data['id']

    @property
    def name(self):
        return self._data['name']

    @property
    def score(self):
        return self._data['score']


class MethodWrapper(object):
    def __init__(self, data):
        self._data = data

    def id(self):
        return self._data['id']

    def name(self):
        return self._data['name']

    def score(self):
        return self._data['score']


class Raw(object):
    def __init__(self, id, name, score):
        self.id = id
        self.name = name
        self.score = score


data = {'id': 1234, 'name': 'john', 'score': 90}
wp = Wrapper(data)
mwp = MethodWrapper(data)
obj = Raw(data['id'], data['name'], data['score'])


def dict_access():
    for _ in xrange(100):
        uid = data['id']
        name = data['name']
        score = data['score']


def method_wrapper_access():
    for _ in xrange(100):
        uid = mwp.id()
        name = mwp.name()
        score = mwp.score()


def property_wrapper_access():
    for _ in xrange(100):
        uid = wp.id
        name = wp.name
        score = wp.score


def object_access():
    for _ in xrange(100):
        uid = obj.id
        name = obj.name
        score = obj.score


import timeit
print 'Dict Access:', timeit.timeit("dict_access()", setup="from __main__ import dict_access", number=10000)
print 'Attribute Access:', timeit.timeit("object_access()", setup="from __main__ import object_access", number=10000)
print 'Method Wrapper Access:', timeit.timeit("method_wrapper_access()", setup="from __main__ import method_wrapper_access", number=10000)
print 'Property Wrapper Access:', timeit.timeit("property_wrapper_access()", setup="from __main__ import property_wrapper_access", number=10000)

Jason S · Accepted Answer

This comes from the dynamic lookups that the Python interpreter (CPython) performs to dispatch every call, attribute access, and indexing operation. Dynamic lookups give the language a great deal of flexibility, but at a performance cost. When you use the "Method Wrapper", at least the following happens:

1. Look up mwp.id. It happens to be a method, but it is also just an object assigned to an attribute, so it has to be looked up like any other.
2. Call mwp.id().
3. Inside the method, look up self._data.
4. Look up the __getitem__ of self._data.
5. Call the __getitem__ (this, at least, is a C function, but you still had to go through all those dynamic lookups to get here).

The "Property Wrapper" goes through the same steps, except that the getter is called by the descriptor machinery during the attribute lookup rather than by an explicit call in your code.

By comparison, your "Dict Access" test case only has to look up the __getitem__ and then invoke it.
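The extra work in the steps above can be made visible with the standard dis module. The following is a minimal sketch, written for Python 3 (the question used 2.7.5, where opcode names differ). The class and function names mirror the question, but the helper opnames is a hypothetical name introduced here, and the exact instruction names printed depend on the interpreter version.

```python
import dis


class MethodWrapper(object):
    def __init__(self, data):
        self._data = data

    def id(self):
        return self._data['id']


data = {'id': 1234, 'name': 'john', 'score': 90}
mwp = MethodWrapper(data)


def dict_access():
    # Global lookup of `data`, then a single subscript operation.
    return data['id']


def method_wrapper_access():
    # Global lookup of `mwp`, attribute lookup of `id`, then a call
    # that runs a second function body (MethodWrapper.id).
    return mwp.id()


def opnames(func):
    """Hypothetical helper: list the bytecode instruction names of func."""
    return [ins.opname for ins in dis.get_instructions(func)]


# Print the instructions for each function; output varies by Python version.
for func in (dict_access, method_wrapper_access, MethodWrapper.id):
    print(func.__name__, opnames(func))
```

The wrapper path contains a call instruction that the plain dictionary path lacks, and the called method body then performs its own attribute load and subscript, so the interpreter executes two function bodies instead of one.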

As Matteo Italia pointed out in a comment, this is implementation-specific. The Python ecosystem now also includes PyPy (which uses a JIT compiler and runtime optimization), Cython (which compiles to C, with optional static type annotations), Nuitka (which compiles to C++ and is meant to accept code as-is), and several other implementations.

One way to reduce these lookups in "pure" Python on CPython is to obtain direct references to the objects, assign them to local variables outside the loop, and then use those local variables inside the loop. This optimization can come at the cost of cluttering the code, breaking encapsulation, or both.
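A minimal sketch of that idea, assuming the MethodWrapper class from the question and written for Python 3 (use xrange in place of range on 2.7). The function names hoisted_method_access and hoisted_dict_access are hypothetical, introduced only for this illustration. No timings are quoted because the gain depends on the interpreter and version.

```python
import timeit


class MethodWrapper(object):
    def __init__(self, data):
        self._data = data

    def id(self):
        return self._data['id']

    def name(self):
        return self._data['name']

    def score(self):
        return self._data['score']


data = {'id': 1234, 'name': 'john', 'score': 90}
mwp = MethodWrapper(data)


def method_wrapper_access():
    # Baseline: global lookup of mwp and attribute lookup on every iteration.
    for _ in range(100):
        uid = mwp.id()
        name = mwp.name()
        score = mwp.score()
    return uid, name, score


def hoisted_method_access():
    # Look up the bound methods once and keep them in local variables.
    get_id, get_name, get_score = mwp.id, mwp.name, mwp.score
    for _ in range(100):
        uid = get_id()
        name = get_name()
        score = get_score()
    return uid, name, score


def hoisted_dict_access():
    # Goes further but breaks encapsulation: reads the private dict directly.
    d = mwp._data
    for _ in range(100):
        uid = d['id']
        name = d['name']
        score = d['score']
    return uid, name, score


for func in (method_wrapper_access, hoisted_method_access, hoisted_dict_access):
    print(func.__name__, timeit.timeit(func, number=1000))
```

The hoisted references are snapshots: if mwp is rebound or its _data attribute is replaced while the loop runs, the locals keep pointing at the old objects, which is part of the encapsulation cost mentioned above.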
