Reputation: 866
I am using a Flask for one application I am working on, together with sqlalchemy and MySQL database for storing data. Jinja2 is used for templating. While developing, everything was OK, but now when the larger dataset is used, I am getting very slow rendering of some of the webpages that contain a lot of data. For example, a list of users is handled by this code:
def users():
q = Users.query.all()
out = []
for i in q:
try:
something_i_need = Table.query.filter_by(email=i.email).order_by(Table.date).first().id
except:
something_i_need = 0
out.append({
'id': i.id,
'last_name': i.last_name,
'first_name': i.first_name,
'middle_name': i.middle_name,
'phone': i.phone,
'team': i.team,
'status': i.status,
'needed': something_i_need,
'some_more_data': i.some_more_data
})
return render_template('users.html', list_of_users=out)
The data is a bit simplified, since theres more fields in the list. The call to this location, which queries around 2000 users at a time takes more than 10 seconds, and then the load of the page takes 10-20 as well. The table is wrapped by the following theme functions, first table from this link, and although disabling sorting functions on some columns helped, its still very slow.
So, I am wondering, how can I optimize this process or the generation of the template in order to make this faster? I am running this on an Amazon EC2 instance, with Python 2.7.3 and mod_wsgi, Flask 0.9, Flask-SQLAlchemy=0.16, Jinja2==2.7, MySQL-python=1.2.4, SQLAlchemy=0.8.1
. The one "obvious" solution, making a pagination and returning 100 or so records at a time would really work since the list has to contain all of the users for sorting purposes (date mostly), so that solution would the last resort.
Thanks in advance!
Upvotes: 1
Views: 1462
Reputation: 159905
Your issue is that you are running N+1 queries (where N
is the total number of users in the User
table) where 1 would do far better:
User.query \
.outerjoin(Table, User.email == Table.email) \
.order_by(User.email, Table.date)
will get you all users with all of their entries in Table
ordered by each user's email address and then by the date column in Table
. We use an outerjoin
rather than a join
to ensure that we get ever user, even if they don't have an entry in Table
.
Now, this is not quite what we need - we really want to get the latest entry in Table
if one exists. There is probably a better way to do this, but here's a simple example with a subquery:
last_entry = Table.query \
.group_by(Table.email) \
.order_by(Table.date) \
.subquery()
# Assuming that db is your Flask-SQLAlchemy extension
results = db.session.query(User, last_entry.c.id) \
.outerjoin(last_entry, User.email == last_entry.c.email)
Then you can just materialize the query with a call to all
and you'll be off to the races:
return render_template('users.html', list_of_users=results.all())
Upvotes: 5
Reputation: 6714
Firstly do the querying Asynchronously. Secondly maintain a conditional flow, where you should display the returned results in chunks. for example:
//get lenght of returning cursor
if(//length of returning cursor is greator thatn 10 or something..)
{
//Run a loop here and update your UI on UI thread for every 10 values.
}
Upvotes: 0