Dustin Wyatt

Reputation: 4244

How to handle python objects built from a web API that have references to each other?

I'm building a client library for a web API that exposes some objects like this:

# objs/foo/obj_id_1
{id: "foo_id_1"
 name: "your momma"
 bar: "bar_id_2"}

# objs/bar/obj_id_1
{id: "bar_id_2"
 name: "puh-lease"
 foo: "foo_id_1"}

So, objects of type foo have a reference to objects of type bar and vice versa.

In my python client library, I build python objects out of this API data. A Foo instance will have a bar attribute containing a Bar instance instead of just the id of the bar and a Bar instance has a reference to a Foo instance in the foo attribute.

My python classes have save and refresh methods that POST or GET from the web API, and they do this for sub-objects. For example, if I call a_foo.refresh(), this will also automatically call a_foo.bar.refresh().

The above is a very simplified example, and there might be many different classes all referring to the same instance of bar or foo or any of the many other types of objects we get.

I think my question is actually two questions:

  1. What is a good design or strategy to ensure that, when I build an object from API data, all of its references to other objects point to the existing instance if I've already built that object from a previous API request?
  2. When I call save or refresh, what's a good design or strategy to prevent an infinite loop when two or more objects refer to each other?

Upvotes: 0

Views: 60

Answers (2)

Cristik

Reputation: 32870

For #1, you can use a 2-level dictionary holding the fetched objects, keyed first by type and then by id:

object_cache
    |- foo 
    |   |- foo_id_1 : <obj>
    |   |- foo_id_2 : <obj>
    |   |- ...
    |
    |- bar
    |   |- bar_id_1: <obj>
    |   |- bar_id_2: <obj>
    |   |- ...
    |   
    |- baz
    |   |- baz_id_1: <obj>
    |   |- baz_id_2: <obj>
    |   |- ...
    |
    |- ...

You'd use it like this:

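object_cache = {}  # {class: {id: instance}}

def get_or_make_fetched_object(cls, id):
    by_id = object_cache.setdefault(cls, {})
    # Build only on a cache miss; setdefault(id, cls()) would call cls() on every lookup.
    return by_id[id] if id in by_id else by_id.setdefault(id, cls())

my_foo = get_or_make_fetched_object(Foo, 'foo_id_1')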
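As a rough sketch of how such a cache could resolve the cross references from the question, the classes below look up related objects through the cache instead of constructing them directly. The names get_or_make and update_from_api, and the constructor that takes an id, are assumptions made for illustration, not part of the original API.

```python
object_cache = {}  # {class: {id: instance}}

def get_or_make(cls, obj_id):
    """Return the one shared instance for (cls, obj_id), creating it on a miss."""
    by_id = object_cache.setdefault(cls, {})
    if obj_id not in by_id:
        by_id[obj_id] = cls(obj_id)
    return by_id[obj_id]

class Foo:
    def __init__(self, obj_id):
        self.id = obj_id
        self.name = None
        self.bar = None

    def update_from_api(self, data):
        # Hypothetical helper: `data` is the decoded JSON for this object.
        self.name = data["name"]
        # Resolve the id to the shared Bar instance instead of building a new one.
        self.bar = get_or_make(Bar, data["bar"])

class Bar:
    def __init__(self, obj_id):
        self.id = obj_id
        self.name = None
        self.foo = None

    def update_from_api(self, data):
        self.name = data["name"]
        self.foo = get_or_make(Foo, data["foo"])

foo = get_or_make(Foo, "foo_id_1")
foo.update_from_api({"id": "foo_id_1", "name": "your momma", "bar": "bar_id_2"})
bar = get_or_make(Bar, "bar_id_2")
bar.update_from_api({"id": "bar_id_2", "name": "puh-lease", "foo": "foo_id_1"})
```

Because every lookup goes through the cache, foo.bar and bar end up being the same object, and so do bar.foo and foo.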

The tricky part here is how to get rid of objects that are no longer referenced (for example, if foo_id_1 switches from bar_id_1 to bar_id_35). The object_cache dictionary keeps a reference to every object indefinitely, so unless an object is removed from the cache it is never freed and memory usage keeps growing.

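A possible approach to the memory usage problem would be a cleanup function that uses gc.get_referrers() to check what still refers to each cached object. Note that gc.get_referrers() returns a list of referrers rather than a count; sys.getrefcount() returns the count, which is generally one higher than expected because the call itself holds a temporary reference. Basically, objects that are referenced only by the cache (plus whatever temporary references the cleanup code holds) can be removed from it. This will not work for circular references though: two cached objects that refer to each other always have a referrer besides the cache.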
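A different technique from the manual cleanup described above, offered here only as a sketch, is to let the standard library do the bookkeeping with weakref.WeakValueDictionary, which drops an entry once nothing else holds a strong reference to the value. The Bar class and get_or_make_bar function are hypothetical stand-ins.

```python
import gc
import weakref

class Bar:
    def __init__(self, obj_id):
        self.id = obj_id

# Values are held weakly: an entry disappears once nothing else references the object.
bar_cache = weakref.WeakValueDictionary()

def get_or_make_bar(obj_id):
    bar = bar_cache.get(obj_id)
    if bar is None:
        bar = Bar(obj_id)
        bar_cache[obj_id] = bar
    return bar

a = get_or_make_bar("bar_id_1")
b = get_or_make_bar("bar_id_1")
same_instance = a is b          # both names refer to the cached object

del a, b                        # drop the last strong references
gc.collect()                    # also reclaims objects caught in reference cycles
still_cached = "bar_id_1" in bar_cache
```

In CPython the entry is usually gone as soon as the last strong reference is dropped. Objects that refer to each other are only reclaimed when the cycle collector runs, so their entries may linger for a while.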

As for #2, you can stamp each object with the timestamp of the save operation that is currently running. An object whose stamp already matches the current save's timestamp is skipped, which stops the infinite recursion. However, it would make sense to save only subordinate objects.
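A minimal sketch of that idea follows. It assumes a hypothetical ApiObject base class with a related list, and it uses a counter as the per-save token instead of a real timestamp so that two saves can never share a value. The save_count attribute merely stands in for the actual POST.

```python
import itertools

_save_tokens = itertools.count(1)  # assumed stand-in for a timestamp

class ApiObject:
    def __init__(self, obj_id):
        self.id = obj_id
        self.related = []             # other ApiObject instances
        self._last_save_token = None
        self.save_count = 0

    def save(self, token=None):
        if token is None:
            token = next(_save_tokens)    # the outermost call starts a new pass
        if self._last_save_token == token:
            return                        # already saved during this pass
        self._last_save_token = token     # mark before recursing
        self.save_count += 1              # placeholder for the real POST
        for other in self.related:
            other.save(token)

foo = ApiObject("foo_id_1")
bar = ApiObject("bar_id_2")
foo.related.append(bar)
bar.related.append(foo)

foo.save()  # visits foo, then bar, then stops when it reaches foo again
```

Marking the object before recursing is what breaks the cycle: when bar calls back into foo with the same token, foo returns immediately.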

Upvotes: 1

Ionut Hulub

Reputation: 6326

This is a pretty broad question.

1) One way would be to use the factory method pattern. Example:

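def get_foo(_id):
    def real_get_foo(_id):
        # get foo from the API (fetch_foo_from_api is a hypothetical helper)
        foo = fetch_foo_from_api(_id)
        get_foo.foos[_id] = foo
        return foo
    if _id in get_foo.foos:
        return get_foo.foos[_id]
    return real_get_foo(_id)

# Create the cache once, outside the function, so it is not reset on every call.
get_foo.foos = dict()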

2) I don't think it's a good idea to make nested saves. If I write foo.bar.x = 5 followed by foo.save() I wouldn't expect bar to get saved. Why? Because I called save() on foo, and I shouldn't have to worry about unwanted saves on related objects.

Upvotes: 1
