wont_compile
wont_compile

Reputation: 866

Improve page speed when querying database

I am using a Flask for one application I am working on, together with sqlalchemy and MySQL database for storing data. Jinja2 is used for templating. While developing, everything was OK, but now when the larger dataset is used, I am getting very slow rendering of some of the webpages that contain a lot of data. For example, a list of users is handled by this code:

def users():
    q = Users.query.all()
    out = []
    for i in q: 
        try:
            something_i_need = Table.query.filter_by(email=i.email).order_by(Table.date).first().id
        except:
            something_i_need = 0
        out.append({
                'id': i.id,
                'last_name': i.last_name,
                'first_name': i.first_name,
                'middle_name': i.middle_name,
                'phone': i.phone,
                'team': i.team,
                'status': i.status,
                'needed': something_i_need,
                'some_more_data': i.some_more_data
            })
    return render_template('users.html', list_of_users=out)

The data is a bit simplified, since theres more fields in the list. The call to this location, which queries around 2000 users at a time takes more than 10 seconds, and then the load of the page takes 10-20 as well. The table is wrapped by the following theme functions, first table from this link, and although disabling sorting functions on some columns helped, its still very slow.

So, I am wondering, how can I optimize this process or the generation of the template in order to make this faster? I am running this on an Amazon EC2 instance, with Python 2.7.3 and mod_wsgi, Flask 0.9, Flask-SQLAlchemy=0.16, Jinja2==2.7, MySQL-python=1.2.4, SQLAlchemy=0.8.1. The one "obvious" solution, making a pagination and returning 100 or so records at a time would really work since the list has to contain all of the users for sorting purposes (date mostly), so that solution would the last resort.

Thanks in advance!

Upvotes: 1

Views: 1462

Answers (2)

Sean Vieira
Sean Vieira

Reputation: 159905

Your issue is that you are running N+1 queries (where N is the total number of users in the User table) where 1 would do far better:

User.query \
        .outerjoin(Table, User.email == Table.email) \
        .order_by(User.email, Table.date)

will get you all users with all of their entries in Table ordered by each user's email address and then by the date column in Table. We use an outerjoin rather than a join to ensure that we get ever user, even if they don't have an entry in Table.

Now, this is not quite what we need - we really want to get the latest entry in Table if one exists. There is probably a better way to do this, but here's a simple example with a subquery:

last_entry = Table.query \
                  .group_by(Table.email) \
                  .order_by(Table.date) \
                  .subquery()

# Assuming that db is your Flask-SQLAlchemy extension
results = db.session.query(User, last_entry.c.id)  \
    .outerjoin(last_entry, User.email == last_entry.c.email)

Then you can just materialize the query with a call to all and you'll be off to the races:

return render_template('users.html', list_of_users=results.all())

Upvotes: 5

Salman Khakwani
Salman Khakwani

Reputation: 6714

Firstly do the querying Asynchronously. Secondly maintain a conditional flow, where you should display the returned results in chunks. for example:

//get lenght of returning cursor
if(//length of returning cursor is greator thatn 10 or something..)
 {
  //Run a loop here and update your UI on UI thread for every 10 values.
 }

Upvotes: 0

Related Questions