geoffjentry
geoffjentry

Reputation: 4754

Efficiently search through a tree of ForeignKeys in Django

I have a tree like structure created out of models using ForeignKey linkages. As an example:

Model Person:  
   name = CharField  

Model Book:  
   name = CharField  
   author = FK(Person)  

Model Movie:  
   name = CharField  
   director = FK(Person)  

Model Album:  
   name = CharField  
   director = FK(Person)  

Model Chapter:  
   name = CharField  
   book = FK(Book)  

Model Scene:  
   name = CharField  
   movie = FK(Movie)  

Model Song:  
   name = CharField  
   album = FK(Album)  

The caveat here is the real structure is both deeper and broader, and a node might have multiple non-FK fields (ie not just 'name').

What I'd like to do is have a search function such that there is a string supplied which will return any Person that either matches a Person object itself, or a field in any of the child nodes. IOW, if "beat it" is the string, and the name of a song associated with an album associated w/ the person matches, the person will be returned.

What I've done so far is the following:

For any leaf node, have a Manager object w/ a search method which does something like:

return Song.objects.filter(name__icontains=search_string)

Then for the root node (Person) and any interior nodes, there is also a Manager object w/ a search() method which looks something like:

class AlbumManager(models.Model):  
    def search(self, search_string):  
        from_songs = Album.objects.filter(song__in=Song.objects.search(search_string))  
        return Album.objects.filter(name__icontains=search_string)|from_songs  

As you might imagine, once you get to the root node, this unleashes a massive number of queries and is really inefficient. I'd be pretty surprised if there wasn't a better way to do this ... one idea is to just have one search() method at the top of the tree that manually searches through everything, but a) that seems very messy (albeit probably more efficient) and b) it would be nice to be able to search individual nodes arbitrarily.

So with all of this said, what would be a more efficient method of getting where I want to be instead of my bonehead method here?

Upvotes: 1

Views: 432

Answers (4)

Skylar Saveland
Skylar Saveland

Reputation: 11464

Well, if you don't want to set-up a search engine like whoosh then you could use Q

http://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects

This would involve hardcoding the fields a bit and not simply going through all of them.

>>> q = Q()
>>> q = q | Q( fk__fk__field__icontains = searchTerm )
>>> q = q | ...
...
>>> qs = Model.objects.filter(q)

Upvotes: 2

M. Ryan
M. Ryan

Reputation: 7192

I don't think anything you can do with the basic ORM is going to be truly efficient, you are basically going to be running a series of filter() queries on different models with a hell of a lot of JOINs.

If you were aiming at high performance, long term solutions, your best bet would be to create an object repository in memory. I don't see any way in basic SQL you could make those queries efficient, especially if you are working with a ton of rows.

Creating a denormalized table for your child objects (chapter, song, etc) that contains all your Person and Book/Album data might be a potential mid-term solution, and that will require some fancy SQL footwork that is probably not part of Django Core.

Upvotes: 1

tosh
tosh

Reputation: 315

Django Solr is another option.

Upvotes: 1

Benjamin Wohlwend
Benjamin Wohlwend

Reputation: 31868

Perhaps it would be better to outsource the search functionality to a dedicated tool like Woosh or Sphinx. There are projects for both search frameworks that integrate them with Django (django-haystack and django-sphinx, respectively). You will get much better performance than with complicated SQL queries with many joins and subqueries.

Upvotes: 2

Related Questions