Reputation: 2319
I've a web app which lets user run some ETL and also exposes the final table via SQLAlchemy models in a UI. The models use the auto_load feature of SQLAlchemy:
Base = declarative_base(name='Model')
class Product(Base):
__table__ = Table('product', Base.metadata,
Column('product_number', Text, primary_key=True),
autoload=True, autoload_with=engine, schema='data')
Sometimes the ETL results in a different table structure (new/removed/renamed columns, table name is still the same) and up to now I have to restart the whole web app to update the autoloaded SQLAlchemy models. Is there a way to rerun the autoload process on my models?
Upvotes: 2
Views: 464
Reputation: 2319
I've now build a accessor class which can be used similar to a module:
from sqlalchemy.ext.declarative import declarative_base
# in my case this saves the last timestamp when the db was
# updated into a memcache
from app.elsewhere import get_last_database_change_time
class Models():
"""A module-like accessor which allows to reset the models.
"""
_models = {}
def __init__(self):
self.Base = declarative_base(name='Base')
self._last_reset = get_last_database_change_time()
def reset(self):
"""Resets all models so that they are autoloaded again the next time they are accessed via the properties"""
self._models.clear()
# without resetting Base, we get errors that the same table is declared twice...
# extend_existing = True was not enough :-(
self.Base = declarative_base(name='Base')
def __reset_if_needed(self):
"""Resets all models after the database changed"""
# this is backed by a memcached value which gets reset when the ETL finishes
# needed as we do not have direct access to this object when we run the ETL
last_reset = get_last_database_change_time()
if last_reset != self._last_reset:
self.reset()
self._last_reset = last_reset
@property
def Model1(self):
self.__reset_if_needed()
if 'Model1' not in self._models:
class Model1(self.Base):
__tablename__ = 'table_name'
__table_args__ = dict(schema='schema_name', autoload=True, autoload_with=some_engine,
extend_existing=True)
id = Column(Text, primary_key=True)
self._models['Model1'] = Model1
return self._models['Model1']
def __declare_main_models(self):
"""Helper to declare the main models in one go due to dependency issues"""
class Model2(self.Base):
# model with dependency on/relationship with Model3
class Model3(self.Base):
# model with dependency on/relationship with Model2
self._models['Model2'] = Model2
self._models['Model3'] = Model3
@property
def Model2(self):
self.__reset_if_needed()
if 'Model2' not in self._models:
self.__declare_main_models()
return self._models['Model2']
models = Models()
Use like:
from app.database import models as m
p = m.Model1(...)
For some models, I had to declare them in one go as they had relationships with each other and it was a pain to sort that out.
For a 'read only' case this works nicely in a gunicorn with multiple workers (=processes). It's probably not safe to use in a multi-threading app...
Upvotes: 1