noisy
noisy

Reputation: 6783

How to load at once all objects related to specified in SQLAlchemy?

This is simplified project of my database.

enter image description here

My model is created by SQLAlchemy and it looks like this

#!/usr/bin/python

class Book(Base):
    id = Column(Integer, primary_key = True)
    title = Column(Unicode(512))
    sentenses = relationship("Sentense", backref = backref("book", uselist = False))
    isbns = relationship("ISBN", secondary = books_isbns, backref = "book")
    authors = relationship("Author", secondary = books_authors, backref = "book")

class Sentense(Base):
    id = Column(Integer, primary_key = True)
    content = Column(Unicode(512))
    words = relationship("Word", secondary = sentenses_words, backref = "sentense")

class Word(Base):
    id = Column(Integer, primary_key = True)
    content = Column(Unicode(32), index = True, unique = True)
    soundex_id = Column(Integer, ForeignKey('Soundex.id'))

class Soundex(Base):
    id = Column(Integer, primary_key = True)
    code = Column(Unicode(5), index = True, unique = True)
    words = relationship("Word", backref = backref("soundex", uselist = False))

The problem is with time of object loading. Using great profiler I got this:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   111                                               @staticmethod
   112                                               @profile
   113                                               def getBooksWithSimilarWordsLikeInThisSentence(session, sentense):
   114        16           51      3.2      0.0          s = set()
   115        89       116294   1306.7      0.1          for word in sentense.words:
   116      4200       712414    169.6      0.5              for word in word.soundex.words:
   117     33690     13370590    396.9      8.7                  for sentense in word.sentense:
   118     29563       130437      4.4      0.1                      if sentense.id != sentense.id:
   119     18732     44930792   2398.6     29.3                          s.add(sentense.book)
   120                                           
   121        16          709     44.3      0.0          list_of_other_books = list(s)
   122                                           
   123
   124     18748        25865      1.4      0.0          for book in list_of_other_books:
   125
   126     39016     48461924   1242.1     31.6              for authors in book.authors:
   127     20284       564884     27.8      0.4                  print authors.name
   128                                           
   129     33896     44392639   1309.7     29.0              for isbn in book.isbns:
   130     15164       421289     27.8      0.3                  print isbn.raw
   131                                           
   132     18732       133320      7.1      0.1              books.add(book)
   133                                           
   134        16          926     57.9      0.0          return list(books)

Is there a way to load all content related to book object at once? I have tried use session.refresh() object, but it doesn't give any results.

Upvotes: 0

Views: 345

Answers (1)

davidism
davidism

Reputation: 127180

The code you posted only deals with the result of a query - passing sentense to the function. The issue is that all the relationships are lazy by default, so they require more SQL queries to work, which can be slow.

The solution is to eager load all desired relationships. Something like this will get you there:

# import sqlalchemy as sa
sentense = Sentense.query.options(sa.joinedload_all(
        "words.soundex.words.sentense.book.authors"
    ), sa.joinedload_all(
        "words.soundex.words.sentense.book.isbns"
    )).filter(<some filters here>).first()

Note that this could still be pretty slow, I don't know the details of your database and data, but it will result in one large query getting sent at once.

Also note there are other problems with your code. The "sides" of the relationships seem to be random, and the pluralization of names is not consistent, which makes it hard to follow. In the profiled code, you overwrite the passed in sentense during the for loop, so sentense.id != sentense.id will always evaluate False. You also overwrite the word from an outer for loop with a word in an inner for loop.

Upvotes: 2

Related Questions