Reputation: 6783
This is simplified project of my database.
My model is created by SQLAlchemy and it looks like this
#!/usr/bin/python
class Book(Base):
id = Column(Integer, primary_key = True)
title = Column(Unicode(512))
sentenses = relationship("Sentense", backref = backref("book", uselist = False))
isbns = relationship("ISBN", secondary = books_isbns, backref = "book")
authors = relationship("Author", secondary = books_authors, backref = "book")
class Sentense(Base):
id = Column(Integer, primary_key = True)
content = Column(Unicode(512))
words = relationship("Word", secondary = sentenses_words, backref = "sentense")
class Word(Base):
id = Column(Integer, primary_key = True)
content = Column(Unicode(32), index = True, unique = True)
soundex_id = Column(Integer, ForeignKey('Soundex.id'))
class Soundex(Base):
id = Column(Integer, primary_key = True)
code = Column(Unicode(5), index = True, unique = True)
words = relationship("Word", backref = backref("soundex", uselist = False))
The problem is with time of object loading. Using great profiler I got this:
Line # Hits Time Per Hit % Time Line Contents
==============================================================
111 @staticmethod
112 @profile
113 def getBooksWithSimilarWordsLikeInThisSentence(session, sentense):
114 16 51 3.2 0.0 s = set()
115 89 116294 1306.7 0.1 for word in sentense.words:
116 4200 712414 169.6 0.5 for word in word.soundex.words:
117 33690 13370590 396.9 8.7 for sentense in word.sentense:
118 29563 130437 4.4 0.1 if sentense.id != sentense.id:
119 18732 44930792 2398.6 29.3 s.add(sentense.book)
120
121 16 709 44.3 0.0 list_of_other_books = list(s)
122
123
124 18748 25865 1.4 0.0 for book in list_of_other_books:
125
126 39016 48461924 1242.1 31.6 for authors in book.authors:
127 20284 564884 27.8 0.4 print authors.name
128
129 33896 44392639 1309.7 29.0 for isbn in book.isbns:
130 15164 421289 27.8 0.3 print isbn.raw
131
132 18732 133320 7.1 0.1 books.add(book)
133
134 16 926 57.9 0.0 return list(books)
Is there a way to load all content related to book object at once? I have tried use session.refresh() object, but it doesn't give any results.
Upvotes: 0
Views: 345
Reputation: 127180
The code you posted only deals with the result of a query - passing sentense to the function. The issue is that all the relationships are lazy by default, so they require more SQL queries to work, which can be slow.
The solution is to eager load all desired relationships. Something like this will get you there:
# import sqlalchemy as sa
sentense = Sentense.query.options(sa.joinedload_all(
"words.soundex.words.sentense.book.authors"
), sa.joinedload_all(
"words.soundex.words.sentense.book.isbns"
)).filter(<some filters here>).first()
Note that this could still be pretty slow, I don't know the details of your database and data, but it will result in one large query getting sent at once.
Also note there are other problems with your code. The "sides" of the relationships seem to be random, and the pluralization of names is not consistent, which makes it hard to follow. In the profiled code, you overwrite the passed in sentense during the for loop, so sentense.id != sentense.id
will always evaluate False
. You also overwrite the word
from an outer for loop with a word
in an inner for loop.
Upvotes: 2