wuz

Reputation: 513

Good practice to avoid sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached

I encountered the following error after inserting 95 rows of video metadata out of 4000:

sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)

Base:

import sqlalchemy as db
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

engine = db.create_engine(connect_string)
con = engine.connect()  # checked out at import time and never released back to the pool
_SessionFactory = sessionmaker(bind=engine)
Base = declarative_base()

def session_factory():
    # runs a schema check against the database on every call
    Base.metadata.create_all(engine)
    return _SessionFactory()

Video object:

from sqlalchemy import Table, Column, String, Integer, Date, DECIMAL, ForeignKey
from sqlalchemy.orm import relationship

class Video(Base):

    __tablename__ = 'video'

    # association table for the many-to-many Video <-> Person relationship
    video_to_person = Table('video_to_person', Base.metadata,
                            Column('video_id', String, ForeignKey('video.vid')),
                            Column('person_id', Integer, ForeignKey('person.id'))
                            )

    _vid = Column("vid", String, primary_key=True)
    _webpage_url = Column("webpage_url", String)
    _upload_date = Column("upload_date", Date)
    _uploader = Column("uploader", String)
    _view_count = Column("view_count", DECIMAL)
    _like_count = Column("like_count", DECIMAL)
    _dislike_count = Column("dislike_count", DECIMAL)
    _format = Column("format", String)
    _duration = Column("duration", DECIMAL)
    _title = Column("title", String)
    _description = Column("description", String)
    persons = relationship("Person", secondary=video_to_person)

Video repository:

class VideoRepository:

    def create_video(self, vid: Video):
        # a brand-new session (plus a create_all round trip) per insert
        session = session_factory()
        session.add(vid)
        session.commit()
        session.close()  # skipped entirely if commit() raises

How can I improve the connection management?

UPDATE: Thanks for the responses so far. One of the challenges is that all my model classes (e.g. Video) inherit from Base, and the module that defines Base also creates the engine, so every import path into the models touches that engine setup. I will look into further refactoring.

Upvotes: 10

Views: 13940

Answers (1)

bagerard

Reputation: 6374

As explained in the documentation for that particular error, you get it because your application exceeded the number of connections it is allowed to open/use in parallel, as set by the limits on the create_engine call. In this case it is using the defaults: pool_size=5, max_overflow=10 and pool_timeout=30.

This means that with a single engine you can use up to 15 concurrent connections to the database. Once that limit is reached, the next request for a connection waits up to 30 seconds; if none of the 15 already-established connections is released in the meantime, the error is raised.
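For reference, these limits are ordinary keyword arguments on create_engine. A minimal sketch (reusing connect_string from the question) that just makes the defaults explicit:

from sqlalchemy import create_engine

engine = create_engine(
    connect_string,   # e.g. "postgresql://user:password@host/dbname"
    pool_size=5,      # connections kept open in the pool
    max_overflow=10,  # extra connections allowed beyond pool_size
    pool_timeout=30,  # seconds to wait for a free connection before TimeoutError
)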

As explained in the doc, this could have different causes:

  • The application is fielding too many concurrent requests to do work based on the configured value for the pool
  • The application is not returning connections to the pool
  • The application is attempting to run long-running transactions
  • The application is deadlocking

Given the information provided, my guess is that either:

  • You use too many threads (i.e. >> 15) and at some point your engine isn't able to provide new connections

  • You use a limited number of threads (maybe 16) and you have a deadlock in your code.

My recommendations:

  • Check your slow query logs and look for queries that locked the table for a long period

  • Refactor the code to avoid calling Base.metadata.create_all(engine) every time you get a session. This is usually called once when the application starts, not every time you insert a record. If you do need it there, rely on checkfirst=True (the default) so that it won't re-issue CREATE TABLE statements; even so, it is a potential source of deadlock (see the sketch after this list)

  • If possible, use bulk inserts, as shown below. You'll get a massive performance boost for free and make better use of your connection pool.

  • Once you have identified the root cause of your problem (and only after that), you can adapt the two parameters pool_size and max_overflow. You can easily increase pool_size to 15 or 25 and max_overflow to 15
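Putting the create_all and bulk-insert points together, here is a minimal sketch of that refactoring, assuming the engine, Base and _SessionFactory from the question (videos stands in for your list of 4000 Video objects):

from contextlib import contextmanager

# Run once at application startup, not on every session
Base.metadata.create_all(engine)

@contextmanager
def session_scope():
    # Transactional scope that always returns the connection to the pool
    session = _SessionFactory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()

# One session (hence one pooled connection) for all the inserts,
# instead of a fresh session per row
with session_scope() as session:
    session.add_all(videos)

session_scope is the classic pattern from the SQLAlchemy documentation; the finally clause guarantees the connection goes back to the pool even when commit() fails.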

Upvotes: 14
