Reputation: 4244
I'm building a client library for a web API that exposes some objects like this:
# objs/foo/foo_id_1
{"id": "foo_id_1",
 "name": "your momma",
 "bar": "bar_id_2"}

# objs/bar/bar_id_2
{"id": "bar_id_2",
 "name": "puh-lease",
 "foo": "foo_id_1"}
So, objects of type foo have a reference to objects of type bar, and vice versa.
In my Python client library, I build Python objects out of this API data. A Foo instance will have a bar attribute containing a Bar instance instead of just the id of the bar, and a Bar instance has a reference to a Foo instance in its foo attribute.
My Python classes have save and refresh methods that POST to or GET from the web API, and they do this for sub-objects as well. For example, if I call a_foo.refresh(), this will also automatically call a_foo.bar.refresh().
The above is a very simplified example; in practice there might be many different classes all referring to the same instance of bar or foo, or to any of the many other types of objects we get.
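Roughly, a stripped-down version of what these classes look like (the API helper here is only a stub):

def api_get(path):
    """Stub for the real HTTP layer; imagine it returning the JSON shown above."""
    raise NotImplementedError("GET " + path)


class Foo:
    def __init__(self, id, name, bar=None):
        self.id = id
        self.name = name
        self.bar = bar                      # a Bar instance, not just "bar_id_2"

    def refresh(self):
        data = api_get("objs/foo/" + self.id)
        self.name = data["name"]
        if self.bar is not None:
            self.bar.refresh()              # cascades into the sub-object...


class Bar:
    def __init__(self, id, name, foo=None):
        self.id = id
        self.name = name
        self.foo = foo                      # a Foo instance, not just "foo_id_1"

    def refresh(self):
        data = api_get("objs/bar/" + self.id)
        self.name = data["name"]
        if self.foo is not None:
            self.foo.refresh()              # ...which calls Foo.refresh() again, forever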
I think my question is actually two questions:
1) How do I make sure that all the objects referring to the same foo or bar share a single Python instance, instead of each holding its own copy?
2) When I call save or refresh, what's a good design or strategy to prevent an infinite loop when two or more objects refer to each other?
Upvotes: 0
Views: 60
Reputation: 32870
For #1, you can use a two-level dictionary holding the fetched objects:
object_cache
|- foo
| |- foo_id_1 : <obj>
| |- foo_id_2 : <obj>
| |- ...
|
|- bar
| |- bar_id_1: <obj>
| |- bar_id_2: <obj>
| |- ...
|
|- baz
| |- baz_id_1: <obj>
| |- baz_id_2: <obj>
| |- ...
|
|- ...
You'd use it like this:
object_cache = {}   # {class: {id: instance}}, as sketched above

def get_or_make_fetched_object(cls, id):
    # Note: cls() is built even on a cache hit, so empty construction should be cheap.
    return object_cache.setdefault(cls, {}).setdefault(id, cls())

my_foo = get_or_make_fetched_object(Foo, 'foo_id_1')
The tricky part here is how to get rid of objects that are no longer referenced elsewhere (for example, if foo_id_1 switches from bar_id_1 to bar_id_35), in order to avoid memory leaks: the object_cache dictionary will keep a reference to the object indefinitely unless it is explicitly removed from the cache.
A possible approach to the memory usage problem would be a cleanup function that checks how many references each cached object still has, e.g. with sys.getrefcount() (gc.get_referrers() can similarly list what still refers to it). Objects whose only remaining reference is the cache entry itself can be removed from the cache. This will not work for circular references, though, since mutually referencing objects keep each other alive.
As for #2, you can stamp each object with the timestamp of the save operation; objects already stamped with the current save's timestamp are skipped, which avoids infinite recursion. However, it would make sense to save only subordinate objects in the first place.
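A hedged sketch of that idea (the method names _post_to_api() and _sub_objects() are made up for illustration): each top-level save() generates one token, and objects already stamped with it are skipped:

import time

class ApiObject:
    def __init__(self):
        self._save_token = None

    def _post_to_api(self):
        raise NotImplementedError            # Foo, Bar, ... POST their own fields here

    def _sub_objects(self):
        return []                            # Foo would return [self.bar], Bar [self.foo], ...

    def save(self, _token=None):
        if _token is None:
            _token = time.monotonic()        # one fresh token per top-level save()
        if self._save_token == _token:
            return                           # already saved in this pass: break the cycle
        self._save_token = _token
        self._post_to_api()
        for child in self._sub_objects():
            child.save(_token)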
Upvotes: 1
Reputation: 6326
This is a pretty broad question.
1) One way would be to use the factory method pattern. Example:
def get_foo(_id):
    if _id in get_foo.foos:
        return get_foo.foos[_id]     # reuse the instance everyone else already holds
    return real_get_foo(_id)

get_foo.foos = dict()                # shared cache, created once, not on every call

def real_get_foo(_id):
    foo = ...                        # get foo from the API and build the Foo instance
    get_foo.foos[_id] = foo
    return foo
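Used that way, repeated lookups for the same id hand back one shared instance:

a = get_foo('foo_id_1')
b = get_foo('foo_id_1')
assert a is b   # both names point at the single cached object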
2) I don't think it's a good idea to make nested saves. If I write foo.bar.x = 5 followed by foo.save(), I wouldn't expect bar to get saved. Why? Because I called save() on foo, and I shouldn't have to worry about unwanted saves on related objects.
Upvotes: 1