Son

Reputation: 1863

How to handle long SQL query in Flask?

I have a Flask web app that might need to execute some heavy SQL queries via SQLAlchemy, depending on user input. I would like to set a timeout for the query, let's say 20 seconds, so that if a query takes more than 20 seconds the server displays an error message to the user, who can then try again later or with smaller inputs.

I have tried with both the multiprocessing and threading modules, with both the Flask development server and Gunicorn, without success: the server keeps blocking and no error message is returned. You'll find an excerpt of the code below.

How do you handle slow SQL queries in Flask in a user-friendly way?

Thanks.

from multiprocessing import Process

@app.route("/long_query")
def long_query():
    query = db.session.query(User)

    def run_query():
        nonlocal query
        query = query.all()

    p = Process(target=run_query)
    p.start()

    p.join(20) # timeout of 20 seconds
    if p.is_alive():
        p.terminate()
        return render_template("error.html", message="please try with smaller input")

    return render_template("result.html", data=query)

Upvotes: 2

Views: 3559

Answers (2)

Giannis Spiliopoulos

Reputation: 2698

As leovp mentioned, Celery is the way to go if you are working on a long-term project. However, if you are working on a small project where you want something easy to set up, I would suggest going with RQ and its Flask plugin. Also, think very seriously about terminating processes while they are querying the database, since they might not be able to clean up after themselves (e.g. release locks they hold on the db).
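
A minimal sketch of how the RQ version could look, assuming plain RQ rather than the Flask plugin, a running Redis server, and a worker started with "rq worker" from a module where the task function is importable. app, db, User and the templates come from the question, and the route/function names are only placeholders; note that job_timeout is called timeout in older RQ releases.

from redis import Redis
from rq import Queue
from rq.job import Job

redis_conn = Redis()
q = Queue(connection=redis_conn)

def run_long_query():
    # Executed in the worker process, so it needs its own application context
    with app.app_context():
        return [u.id for u in db.session.query(User).all()]

@app.route("/long_query")
def long_query():
    # Hand the query to the worker and show a "pending" page immediately
    job = q.enqueue(run_long_query, job_timeout=20)
    return render_template("pending.html", job_id=job.get_id())

@app.route("/long_query/<job_id>")
def long_query_result(job_id):
    job = Job.fetch(job_id, connection=redis_conn)
    if job.is_finished:
        return render_template("result.html", data=job.result)
    if job.is_failed:
        return render_template("error.html", message="please try with smaller input")
    return render_template("pending.html", job_id=job_id)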


Well, if you really want to terminate queries on a timeout, I would suggest you use a database that supports it (PostgreSQL is one). I will assume you are using PostgreSQL for this section.

from sqlalchemy.interfaces import ConnectionProxy

class ConnectionProxyWithTimeouts(ConnectionProxy):

    def cursor_execute(self, execute, cursor, statement, parameters, context, executemany):

        timeout = context.execution_options.get('timeout', None)

        if timeout:
            # Set a per-statement timeout (in milliseconds) on the same connection
            c = cursor.connection.cursor()
            c.execute('SET statement_timeout TO %d;' % int(timeout * 1000))
            c.close()

            ret = execute(cursor, statement, parameters, context)

            # Reset the timeout so the connection can be reused normally
            c = cursor.connection.cursor()
            c.execute('SET statement_timeout TO 0')
            c.close()
            return ret
        else:
            return execute(cursor, statement, parameters, context)

Then, when you create the engine, you would pass in your own connection proxy:

engine = create_engine(URL, proxy=ConnectionProxyWithTimeouts(), pool_size=1, max_overflow=0)

And you could then query like this:

 User.query.execution_options(timeout=20).all()

If you want to use the code above, use it only as a base for your own implementation, since I am not 100% sure it is bug-free.
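
One detail worth spelling out: when PostgreSQL cancels the statement, psycopg2 raises an error that SQLAlchemy wraps as OperationalError, so the Flask view still has to catch it to show the friendly message the question asks for. A rough sketch, reusing app, db, User and the templates from the question:

from sqlalchemy.exc import OperationalError

@app.route("/long_query")
def long_query():
    try:
        data = User.query.execution_options(timeout=20).all()
    except OperationalError:
        # psycopg2 reports "canceling statement due to statement timeout"
        # as an operational error; roll back so the session stays usable
        db.session.rollback()
        return render_template("error.html", message="please try with smaller input")
    return render_template("result.html", data=data)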

Upvotes: 3

leovp

Reputation: 4758

I would recommend using Celery or something similar (people use python-rq for simple workflows). Take a look at the Flask documentation regarding Celery: http://flask.pocoo.org/docs/0.12/patterns/celery/

As for dealing with the results of a long-running query: you can create an endpoint for requesting task results and have the client application periodically check this endpoint until the results are available.
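
A rough sketch of that polling pattern with Celery, assuming a celery app wired to Flask as in the linked docs (so tasks run inside an application context); app, db, User and the templates come from the question, and the task and route names are only placeholders:

from celery.result import AsyncResult

@celery.task
def run_long_query():
    # Runs in a Celery worker; return something serialisable for the result backend
    return [u.id for u in db.session.query(User).all()]

@app.route("/long_query")
def long_query():
    task = run_long_query.delay()
    return render_template("pending.html", task_id=task.id)

@app.route("/long_query/result/<task_id>")
def long_query_result(task_id):
    result = AsyncResult(task_id, app=celery)
    if result.ready():
        return render_template("result.html", data=result.get())
    # The client keeps polling this endpoint until the result is available
    return render_template("pending.html", task_id=task_id)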

Upvotes: 5
