Andrew Ray
Andrew Ray

Reputation: 186

SQLAlchemy causing memory leaks

App startup memory

Partition of a set of 249162 objects. Total size = 28889880 bytes.
 Index  Count   %     Size   % Cumulative  % Referrers by Kind (class / dict of class)
     0  77463  31  5917583  20   5917583  20 types.CodeType
     1  30042  12  3774404  13   9691987  34 function
     2  51799  21  3070789  11  12762776  44 tuple
     3  15106   6  2061017   7  14823793  51 dict of type
     4   5040   2  1928939   7  16752732  58 function, tuple
     5   6627   3  1459448   5  18212180  63 type
     6   5227   2  1346136   5  19558316  68 dict of module
     7  16466   7  1026538   4  20584854  71 dict (no owner)
     8    734   0   685897   2  21270751  74 dict of module, tuple
     9    420   0   626760   2  21897511  76 function, module

App memory after 100 subsequent calls (interacting with SQLAlchemy)

Partition of a set of 628910 objects. Total size = 107982928 bytes.
 Index  Count   %     Size   % Cumulative  % Referrers by Kind (class / dict of class)
     0  23373   4 27673632  26  27673632  26 sqlalchemy.sql.schema.Column
     1 141175  22 20904408  19  48578040  45 dict of sqlalchemy.sql.schema.Column
     2  78401  12  5984371   6  54562411  51 types.CodeType
     3  34133   5  4239726   4  58802137  54 function
     4  64371  10  3661978   3  62464115  58 tuple
     5  20034   3  2971710   3  65435825  61 dict of sqlalchemy.sql.schema.Table
     6  13356   2  2297232   2  67733057  63 sqlalchemy.sql.base.ColumnCollection
     7  15924   3  2133374   2  69866431  65 dict of type
     8   5095   1  1946855   2  71813286  67 function, tuple
     9   8714   1  1793696   2  73606982  68 type

Helper function that detects memory usage by rcs

def heap_results():
    from guppy import hpy
    hp = hpy()
    h = hp.heap()
    return Response(response=str(h.bytype),
                    status=200,
                    mimetype='application/json')

The implementation of SQLAlchemy is fairly straightforward. Using the db.Model, we are creating a class for ORM, and breaking the tables into subfunctions of the ORM class.

We are gc.collect() just before returning the final response to the user. We are also using db.session.flush(), db.session.expunge_all(), and db.session.close().

We have tried to remove the db.session.* commands, as well as the gc.collect(). Nothing changes.

Here is a timeseries graph of our app's memory usage, the application being restarted is where you see the memory cap reset back to a stable state: enter image description here

Code to simulate an HAProxy.

def reconnect():
    hostnames = [Settings.SECRETS.get('PATRONI_HOST_C', ''), Settings.SECRETS.get('PATRONI_HOST_E', '')]

    try:
        master_node = HAProxy(hostnames=hostnames)

    except (ValueError, TypeError, BaseException) as e:
        # send an alert here though, use the informant!
        raise e

    else:
        if master_node in ['None', None]:
            raise ValueError("Failed to determined which server is acting as the master node")

    my_app.config['SQLALCHEMY_DATABASE_URI'] = "postgresql://{}:{}@{}/{}".format(Settings.SECRETS['PATRONI_USER'],
                                                                                 Settings.SECRETS['PATRONI_PASSWORD'],
                                                                                 master_node,
                                                                                 Settings.SECRETS['PATRONI_DB'])

    my_app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {
    'pool_recycle': 1800
    }

    new_db = SQLAlchemy(my_app)
    new_db_orm = DBORM(new_db)
    return new_db_orm

What the DBORM (modified to hide full functionality) looks like:

class DBORM(object):
    def __init__(self, database):
        self.database = database
        self.owner_model = self.owner_model()

    def create_owner_model(self):
        db = self.database

        class OwnerModel(db.Model):
            __tablename__ = "owners"
            owner_id = db.Column(UUID(as_uuid=True), unique=True,
                                      nullable=False, primary_key=True)
  
            client_owner = db.Column(db.String(255), unique=False, nullable=False)
            admin_owner = db.Column(db.String(255), unique=False, nullable=False)

            @staticmethod
            def owner_validation(owner_model, owner=None):
                if owner is not None:
                    owner_model = OwnerModel.get_owner_by_id(owner_id=owner_id,
                                                                  return_as_model=True)
                    if owner_model is not None:
                        client_owner = owner_model.client_owner
                        admin_owner = owner_model.admin_owner

                        if client_owner is None and admin_owner is None:
                            return False

                        elif client_owner.lower() == owner.lower():
                            return True

                        elif admin_owner.lower() == owner.lower():
                            return True

                        else:
                            return False

                    else:
                        return None

                else:
                    return None

Example of Using OwnerModel from API

@api.route('/owners/{owner_id}')
def my_function(owner_id):
   try:
      dborm = reconnect()

   except (AttributeError, KeyError, ValueError, BaseException) as e:
      logger.error(f'Unable to get an owner model.')
      logger.info(f'Garbage collector, collected: {gc.collect()}')
      return Response(response=Exception.database_reconnect_failure(),
                      status=503,
                      mimetype='application/json')

   else:
      response = dborm.get_owner_by_id(owner_id=owner_id)

   logger.info(f'Garbage collector, collected: {gc.collect()}')
   return Response(response=json.dumps(response),
                   status=200,
                   mimetype='application/json')

Upvotes: 2

Views: 6538

Answers (1)

Ilja Everilä
Ilja Everilä

Reputation: 52929

SQLAlchemy MetaData holds references to Table objects and the Declarative base class also has an internal registry for lookups, for example for use as context in relationship() lazily evaluated arguments. When you repeatedly create new versions of model classes, which also creates the required metadata like Table, you likely consume more and more memory, if the references are kept around. The fact that Column objects dominate your memory usage supports this in my view.

You should aim to create your models and their metadata just once during your application's life cycle. You only need to be able to change the connection parameters dynamically. SQLAlchemy versions up to 1.3 provide the creator argument of Engine for exactly this, and version 1.4 introduced DialectEvents.do_connect() event hook for even finer control.

Using creator:

import psycopg2

db = SQLAlchemy()
dborm = DBORM(db)


def initialize(app):
    """
    Setup `db` configuration and initialize the application. Call this once and
    once only, before your application starts using the database.

    The `creator` creates a new connection every time the connection pool
    requires one, due to all connections being in use, or old ones having been
    recycled, etc.
    """
    # Placeholder that lets the Engine know which dialect it will be speaking
    app.config['SQLALCHEMY_DATABASE_URI'] = "postgresql+psycopg2://"
    app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {
        'pool_recycle': 1800,
        'creator': lambda: psycopg2.connect(
            dbname=Settings.SECRETS['PATRONI_DB'],
            user=Settings.SECRETS['PATRONI_USER'],
            password=Settings.SECRETS['PATRONI_PASSWORD'],
            host=HAProxy([
                Settings.SECRETS.get('PATRONI_HOST_C', ''),
                Settings.SECRETS.get('PATRONI_HOST_E', ''),
            ]))
    }

    db.init_app(app)


class OwnerModel(db.Model):
    __tablename__ = "owners"
    ...

Note that you need to change DBORM to use the global db and model classes, and that your controllers do not call reconnect()—that does not exist—anymore, but just use db, dborm, and the classes directly as well.

Upvotes: 1

Related Questions