Reputation: 186
App startup memory
Partition of a set of 249162 objects. Total size = 28889880 bytes.
Index Count % Size % Cumulative % Referrers by Kind (class / dict of class)
0 77463 31 5917583 20 5917583 20 types.CodeType
1 30042 12 3774404 13 9691987 34 function
2 51799 21 3070789 11 12762776 44 tuple
3 15106 6 2061017 7 14823793 51 dict of type
4 5040 2 1928939 7 16752732 58 function, tuple
5 6627 3 1459448 5 18212180 63 type
6 5227 2 1346136 5 19558316 68 dict of module
7 16466 7 1026538 4 20584854 71 dict (no owner)
8 734 0 685897 2 21270751 74 dict of module, tuple
9 420 0 626760 2 21897511 76 function, module
App memory after 100 subsequent calls (interacting with SQLAlchemy)
Partition of a set of 628910 objects. Total size = 107982928 bytes.
Index Count % Size % Cumulative % Referrers by Kind (class / dict of class)
0 23373 4 27673632 26 27673632 26 sqlalchemy.sql.schema.Column
1 141175 22 20904408 19 48578040 45 dict of sqlalchemy.sql.schema.Column
2 78401 12 5984371 6 54562411 51 types.CodeType
3 34133 5 4239726 4 58802137 54 function
4 64371 10 3661978 3 62464115 58 tuple
5 20034 3 2971710 3 65435825 61 dict of sqlalchemy.sql.schema.Table
6 13356 2 2297232 2 67733057 63 sqlalchemy.sql.base.ColumnCollection
7 15924 3 2133374 2 69866431 65 dict of type
8 5095 1 1946855 2 71813286 67 function, tuple
9 8714 1 1793696 2 73606982 68 type
Helper function that detects memory usage by rcs
def heap_results():
from guppy import hpy
hp = hpy()
h = hp.heap()
return Response(response=str(h.bytype),
status=200,
mimetype='application/json')
The implementation of SQLAlchemy is fairly straightforward. Using the db.Model, we are creating a class for ORM, and breaking the tables into subfunctions of the ORM class.
We are gc.collect()
just before returning the final response to the user. We are also using db.session.flush()
, db.session.expunge_all()
, and db.session.close()
.
We have tried to remove the db.session.*
commands, as well as the gc.collect()
. Nothing changes.
Here is a timeseries graph of our app's memory usage, the application being restarted is where you see the memory cap reset back to a stable state:
Code to simulate an HAProxy.
def reconnect():
hostnames = [Settings.SECRETS.get('PATRONI_HOST_C', ''), Settings.SECRETS.get('PATRONI_HOST_E', '')]
try:
master_node = HAProxy(hostnames=hostnames)
except (ValueError, TypeError, BaseException) as e:
# send an alert here though, use the informant!
raise e
else:
if master_node in ['None', None]:
raise ValueError("Failed to determined which server is acting as the master node")
my_app.config['SQLALCHEMY_DATABASE_URI'] = "postgresql://{}:{}@{}/{}".format(Settings.SECRETS['PATRONI_USER'],
Settings.SECRETS['PATRONI_PASSWORD'],
master_node,
Settings.SECRETS['PATRONI_DB'])
my_app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {
'pool_recycle': 1800
}
new_db = SQLAlchemy(my_app)
new_db_orm = DBORM(new_db)
return new_db_orm
What the DBORM (modified to hide full functionality) looks like:
class DBORM(object):
def __init__(self, database):
self.database = database
self.owner_model = self.owner_model()
def create_owner_model(self):
db = self.database
class OwnerModel(db.Model):
__tablename__ = "owners"
owner_id = db.Column(UUID(as_uuid=True), unique=True,
nullable=False, primary_key=True)
client_owner = db.Column(db.String(255), unique=False, nullable=False)
admin_owner = db.Column(db.String(255), unique=False, nullable=False)
@staticmethod
def owner_validation(owner_model, owner=None):
if owner is not None:
owner_model = OwnerModel.get_owner_by_id(owner_id=owner_id,
return_as_model=True)
if owner_model is not None:
client_owner = owner_model.client_owner
admin_owner = owner_model.admin_owner
if client_owner is None and admin_owner is None:
return False
elif client_owner.lower() == owner.lower():
return True
elif admin_owner.lower() == owner.lower():
return True
else:
return False
else:
return None
else:
return None
Example of Using OwnerModel from API
@api.route('/owners/{owner_id}')
def my_function(owner_id):
try:
dborm = reconnect()
except (AttributeError, KeyError, ValueError, BaseException) as e:
logger.error(f'Unable to get an owner model.')
logger.info(f'Garbage collector, collected: {gc.collect()}')
return Response(response=Exception.database_reconnect_failure(),
status=503,
mimetype='application/json')
else:
response = dborm.get_owner_by_id(owner_id=owner_id)
logger.info(f'Garbage collector, collected: {gc.collect()}')
return Response(response=json.dumps(response),
status=200,
mimetype='application/json')
Upvotes: 2
Views: 6538
Reputation: 52929
SQLAlchemy MetaData
holds references to Table
objects and the Declarative base class also has an internal registry for lookups, for example for use as context in relationship()
lazily evaluated arguments. When you repeatedly create new versions of model classes, which also creates the required metadata like Table
, you likely consume more and more memory, if the references are kept around. The fact that Column
objects dominate your memory usage supports this in my view.
You should aim to create your models and their metadata just once during your application's life cycle. You only need to be able to change the connection parameters dynamically. SQLAlchemy versions up to 1.3 provide the creator
argument of Engine
for exactly this, and version 1.4 introduced DialectEvents.do_connect()
event hook for even finer control.
Using creator
:
import psycopg2
db = SQLAlchemy()
dborm = DBORM(db)
def initialize(app):
"""
Setup `db` configuration and initialize the application. Call this once and
once only, before your application starts using the database.
The `creator` creates a new connection every time the connection pool
requires one, due to all connections being in use, or old ones having been
recycled, etc.
"""
# Placeholder that lets the Engine know which dialect it will be speaking
app.config['SQLALCHEMY_DATABASE_URI'] = "postgresql+psycopg2://"
app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {
'pool_recycle': 1800,
'creator': lambda: psycopg2.connect(
dbname=Settings.SECRETS['PATRONI_DB'],
user=Settings.SECRETS['PATRONI_USER'],
password=Settings.SECRETS['PATRONI_PASSWORD'],
host=HAProxy([
Settings.SECRETS.get('PATRONI_HOST_C', ''),
Settings.SECRETS.get('PATRONI_HOST_E', ''),
]))
}
db.init_app(app)
class OwnerModel(db.Model):
__tablename__ = "owners"
...
Note that you need to change DBORM
to use the global db
and model classes, and that your controllers do not call reconnect()
—that does not exist—anymore, but just use db
, dborm
, and the classes directly as well.
Upvotes: 1