Reputation: 32429
I have a lot of classes that all do the same: They receive an identifier (the PK in the DB) during construction and then are loaded from the DB. I am trying to cache the instances of these classes in order to minimize calls down to the DB. When the cache reaches a critical size, it should discard those cached objects that have been accessed least recently.
The caching actually seems to work fine, but somehow I cannot determine the memory usage of the cache (in the line after the comment #Next line doesn't do what I expected).
My code so far:
#! /usr/bin/python3.2
from datetime import datetime
import random
import sys

class Cache:
    instance = None

    def __new__ (cls):
        if not cls.instance:
            cls.instance = super ().__new__ (cls)
            cls.instance.classes = {}
        return cls.instance

    def getObject (self, cls, ident):
        if cls not in self.classes: return None
        cls = self.classes [cls]
        if ident not in cls: return None
        return cls [ident]

    def cache (self, object):
        #Next line doesn't do what I expected
        print (sys.getsizeof (self.classes) )
        if object.__class__ not in self.classes:
            self.classes [object.__class__] = {}
        cls = self.classes [object.__class__]
        cls [object.ident] = (object, datetime.now () )

class Cached:
    def __init__ (self, cache):
        self.cache = cache

    def __call__ (self, cls):
        cls.cache = self.cache
        oNew = cls.__new__
        def new (cls, ident):
            cached = cls.cache ().getObject (cls, ident)
            if not cached: return oNew (cls, ident)
            cls.cache ().cache (cached [0] )
            return cached [0]
        cls.__new__ = new

        def init (self, ident):
            if hasattr (self, 'ident'): return
            self.ident = ident
            self.load ()
        cls.__init__ = init

        oLoad = cls.load
        def load (self):
            oLoad (self)
            self.cache ().cache (self)
        cls.load = load
        return cls

@Cached (Cache)
class Person:
    def load (self):
        print ('Expensive call to DB')
        print ('Loading Person {}'.format (self.ident) )
        #Just simulating
        self.name = random.choice ( ['Alice', 'Bob', 'Mallroy'] )

@Cached (Cache)
class Animal:
    def load (self):
        print ('Expensive call to DB')
        print ('Loading Animal {}'.format (self.ident) )
        #Just simulating
        self.species = random.choice ( ['Dog', 'Cat', 'Iguana'] )
sys.getsizeof returns funny values.
How can I determine the actual memory usage of all cached objects?
Upvotes: 1
Views: 644
Reputation: 16212
getsizeof is pretty tricksy; here's an illustration of that fact:
getsizeof([]) # returns 72 ------------A
getsizeof([1,]) # returns 80 ------------B
getsizeof(1) # returns 24 ------------C
getsizeof([[1,],]) # returns 80 ------------D
getsizeof([[1,],1]) # returns 88 ------------E
Here's some stuff worth noting: line B is only 8 bytes more than line A, even though line C shows that 1 is not 8 bytes. The reason for this weirdness is that 1 exists separately to the list as a unique entity, so line C returns the size of that entity while B returns the size of an empty list plus a reference to it. Likewise, D is the same size as B, and E only adds another 8 bytes: every element in a list costs one reference, regardless of how big the element itself is.
What I'm trying to get at here is that getsizeof only gives you the size of an object itself, not of the objects it refers to. You need to get the sizes of things as well as the sizes of the things those things refer to. This smells like recursion.
Check out this recipe, it might help you out: http://code.activestate.com/recipes/546530/
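The recursive idea can be sketched like this (total_size is a name I made up for illustration; the linked recipe is far more thorough and handles cases like __slots__ that this sketch ignores):

```python
import sys

def total_size(obj, seen=None):
    """Sketch: sum getsizeof over an object and everything it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Shared objects are counted only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    elif hasattr(obj, '__dict__'):
        # Covers user-defined instances like the cached Person/Animal objects
        size += total_size(obj.__dict__, seen)
    return size
```

With your cache, something like total_size(Cache().classes) should give a much better approximation of the memory actually held than getsizeof alone, since it walks into the per-class dicts and the cached instances themselves.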
Upvotes: 1