Reputation: 83
I have a Flask application that starts long-running Celery tasks (~10-120 min per task, sometimes with slow queries). I use Flask-SQLAlchemy for the ORM and connection management. My app looks like this:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
db = SQLAlchemy(app)
celery = make_celery(app)  # the usual Flask/Celery factory helper

@app.route('/start_job')
def start_job():
    # Fire off the task and return immediately.
    task = job.delay()
    return 'Async job started', 202

@celery.task(bind=True)
def job(self):
    db.session.query(... something ...)
    ... do something for hours ...
    db.session.add(... something ...)
    db.session.commit()
    return
Unfortunately, the MySQL server I have to use likes to close connections after a few minutes of inactivity, and the Celery tasks can't handle this, so after a lot of waiting I get (2006, 'MySQL server has gone away') errors. AFAIK connection pooling should take care of closed connections. I read the docs, but they only mention the SQLALCHEMY_POOL_TIMEOUT and SQLALCHEMY_POOL_RECYCLE parameters, so based on some random internet article I tried changing the recycle value to 3 minutes, but that didn't help.
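For completeness, this is roughly how I set the recycle value (3 minutes = 180 seconds), in case I got that part wrong:

app = Flask(__name__)
# Recycle pooled connections older than 180 seconds (checked at checkout time)
app.config['SQLALCHEMY_POOL_RECYCLE'] = 180
db = SQLAlchemy(app)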
How does connection (session?) handling work with this configuration? What should I do to avoid such errors?
Upvotes: 0
Views: 2773
Reputation: 83
I am not entirely sure this is the best solution, but it seems to solve the problem.
The session initializes a connection before the first query (or insert) statement and starts a transaction. It then holds that connection open until a commit or rollback, but because of the inactivity the MySQL server closes the connection after a few minutes. (I believe this is also why SQLALCHEMY_POOL_RECYCLE didn't help: recycling happens when a connection is checked out from the pool, not while the session is already holding one.) The solution is to close the session when you won't need it for a long time; SQLAlchemy will check out a new connection for the next transaction.
@celery.task(bind=True)
def job(self):
    db.session.query(... something ...)
    # Release the connection back to the pool before the long idle period.
    db.session.close()
    ... do something for hours ...
    # The session transparently acquires a fresh connection here.
    db.session.add(... something ...)
    db.session.commit()
    return
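If the initial query can fail, a slightly more defensive variant ensures the session is closed even on errors. This is just a sketch; long_computation is a stand-in name for the hours of work:

@celery.task(bind=True)
def job(self):
    try:
        data = db.session.query(... something ...).all()
    finally:
        # Return the connection to the pool even if the query raises.
        db.session.close()
    result = long_computation(data)  # stand-in for the long-running work
    # The next session operation checks out a fresh connection.
    db.session.add(result)
    db.session.commit()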
Upvotes: 2