Reputation: 22875
I have a Flask app with an endpoint/resource that inserts a record into the candidate_device table, using Flask-SQLAlchemy as the ORM. The problem is that when I run my tests on Jenkins, it runs them in 48 parallel processes (pytest -n 48 some_service/tests). About 13 of those tests call the mentioned resource and create a new record in the candidate_device table. Not every time, but sometimes, my tests fail with this error:
{"asciTime":"2016-07-19 07:59:43,093", "fileName":"app.py", "logRecordCreationTime":"1468915183.093476", "functionName":"log_exception", "levelNo":"40", "lineNo":"1423", "time":"93", "levelName":"ERROR", "message":"Exception on /v1/candidates/1641/devices [POST]" }#012Traceback (most recent call last):#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/flask_restful/__init__.py", line 268, in error_router#012 return self.handle_error(e)#012 File "/root/base_service_container/candidate_service/common/talent_api.py", line 33, in handle_error#012 app_name, url, user_id, user_email = get_request_info(current_app)#012 File "/root/base_service_container/candidate_service/common/error_handling.py", line 175, in get_request_info#012 user_id = request.user.id#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/attributes.py", line 237, in __get__#012 return self.impl.get(instance_state(instance), dict_)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/attributes.py", line 573, in get#012 value = state._load_expired(state, passive)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/state.py", line 480, in _load_expired#012 self.manager.deferred_scalar_loader(self, toload)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 664, in load_scalar_attributes#012 only_load_props=attribute_names)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 219, in load_on_ident#012 return q.one()#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2473, in one#012 ret = list(self)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2516, in __iter__#012 return self._execute_and_instances(context)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2529, in 
_execute_and_instances#012 close_with_result=True)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2520, in _connection_from_session#012 **kw)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 882, in connection#012 execution_options=execution_options)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 887, in _connection_for_bind#012 engine, execution_options)#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 313, in _connection_for_bind#012 self._assert_active()#012 File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 214, in _assert_active#012 % self._rollback_exception#012InvalidRequestError:
This Session's transaction has been rolled back due to a previous exception during
flush. To begin a new transaction with this Session, first issue Session.rollback().
Original exception was: (_mysql_exceptions.OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction')
[SQL: u'INSERT INTO candidate_device (one_signal_device_id, candidate_id, registered_at_datetime) VALUES (%s, %s, %s)'] [parameters: ('b702183d-4541-4ec0-a977-82d2a18ff681', 1641L, datetime.datetime(2016, 7, 19, 7, 58, 52, 49878))]
The main error is this:
Original exception was: (_mysql_exceptions.OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction')
So I thought: why not add an error handler that catches such exceptions and rolls the session back, so that later code does not fail because of the earlier exception?
My idea may be completely wrong, so I just want to ask you brilliant guys: is there a way to handle this problem? I could not find a way to fix this error through SQLAlchemy configuration etc.
Your help is most appreciated. Thanks in advance.
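The rollback idea is roughly what I have in mind. A minimal sketch using plain SQLAlchemy is below; the function name insert_with_retry and the retry count are my own placeholders, not anything from Flask-SQLAlchemy:

```python
from sqlalchemy.exc import OperationalError


def insert_with_retry(session, stmt, params, retries=1):
    """Execute an INSERT; on OperationalError (e.g. MySQL error 1205,
    lock wait timeout) roll back and retry a bounded number of times."""
    for _ in range(retries + 1):
        try:
            session.execute(stmt, params)
            session.commit()
            return True
        except OperationalError:
            # Without this rollback, every later query on this session
            # raises "This Session's transaction has been rolled back...".
            session.rollback()
    return False
```

The rollback is the important part: it clears the failed transaction so the session stays usable for subsequent requests instead of propagating InvalidRequestError.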
Upvotes: 2
Views: 555
Reputation: 451
Your problem is most likely the connection limit. Depending on how long your queries take to execute, you can exceed this limit (the same queries do not always take the same time, which is why the tests sometimes fail and sometimes don't).
First execute this sql query:
show variables like "max_connections";
This number is the maximum number of simultaneous connections your MySQL server allows. More connections require more memory, so choose the value wisely.
How do you raise the MySQL connection limit? Easy: open your configuration file (usually /etc/my.cnf, but it depends on your installation) and set the attribute
max_connections = 200
Please note that although you launch only 48 processes, the number of connections used will be greater: SQLAlchemy uses a pool of connections, so every process may hold several connections at once. The exact behaviour depends on your SQLAlchemy configuration.
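You can also bound the pool explicitly so the arithmetic is predictable. The sketch below uses plain create_engine with a SQLite placeholder URI; with Flask-SQLAlchemy the same knobs are typically exposed through config keys (e.g. SQLALCHEMY_POOL_SIZE), but the exact keys depend on your version:

```python
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Placeholder URI; in the real app this would be the MySQL connection string.
engine = create_engine(
    "sqlite:///:memory:",
    poolclass=QueuePool,   # MySQL engines use QueuePool by default
    pool_size=5,           # steady-state connections held per process
    max_overflow=2,        # extra burst connections beyond pool_size
    pool_timeout=30,       # seconds to wait for a free connection
    pool_recycle=3600,     # recycle stale connections before MySQL drops them
)

# Worst case with 48 test processes: 48 * (5 + 2) = 336 connections,
# so the server's max_connections must be at least that.
```

If max_connections on the server is below that worst case, either raise it or shrink pool_size/max_overflow per process.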
Upvotes: 1
Reputation: 142218
48 is too high. Earlier versions of MySQL could handle only about 8 parallel connections before actually slowing down. The latest version may be able to handle 48, but only with carefully chosen queries.
innodb_lock_wait_timeout defaults to 50 seconds. This sounds like the 48 processes were really stumbling over each other; that INSERT should take only a fraction of a second.
I suggest you lower the 48 to the number of CPUs or 16, whichever is smaller. Then cautiously raise it to see if you can improve the speed.
If you want to discuss the particular query, please provide
SHOW CREATE TABLE candidate_device;
SHOW VARIABLES LIKE 'innodb%';
There are many ways to speed up INSERTs. You can easily get a 10x improvement by batching them. Is that feasible for your application?
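Batching boils down to a single executemany: pass a list of parameter dicts to one execute call so all the rows go in one statement and one transaction. A sketch against a throwaway SQLite schema (the columns mirror the INSERT in the question, minus the datetime; the device IDs are made up):

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")  # placeholder for the MySQL URI

with engine.begin() as conn:  # one transaction for the whole batch
    conn.execute(text(
        "CREATE TABLE candidate_device ("
        "one_signal_device_id TEXT, candidate_id INTEGER)"
    ))
    rows = [{"dev": "device-%d" % i, "cand": 1641} for i in range(13)]
    # Passing a list of dicts makes this a single executemany:
    # one round trip and one commit instead of 13.
    conn.execute(
        text("INSERT INTO candidate_device "
             "(one_signal_device_id, candidate_id) VALUES (:dev, :cand)"),
        rows,
    )
```

Fewer, shorter transactions also mean fewer chances for two workers to sit on each other's row locks long enough to hit the 1205 timeout.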
Upvotes: 1