Reputation: 331
I'm trying to copy rows from one DB instance to a another DB with an identical schema in a different environment. Two tables within this schema are linked in such a way that they result in mutually dependent rows. When these rows are inserted, the post_update
runs afterward as expected, but the update statement sets the value of the ID field to None
instead of the expected ID.
This only happens when using objects that have been expunged from a session. When using newly created objects, the post_update
behaves exactly as expected.
Examples
I have a relationship set up that looks like this:
class Category(Base):
__tablename__ = 'categories'
id = Column(Integer, primary_key=True)
top_product_id = Column(Integer, ForeignKey('products.id'))
products = relationship('Product', primaryjoin='Product.category_id == Category.id', back_populates='category', cascade='all', lazy='selectin')
top_product = relationship('Product', primaryjoin='Category.top_product_id == Product.id', post_update=True, cascade='all', lazy='selectin')
class Product(Base):
__tablename__ = 'products'
id = Column(Integer, primary_key=True)
category_id = Column(Integer, ForeignKey('categories.id'))
category = relationship('Category', primaryjoin='Product.category_id == Category.id', back_populates='products', cascade='all', lazy='selectin')
If I query a category and its related products from one DB and try to write them to another, the update of top_product_id
doesn't behave as expected, and sets the value to None
instead. The following code:
category = source_session.query(Category).filter(Category.id == 99).one()
source_session.expunge(category)
make_transient(category)
for products in category.products:
make_transient(product)
# this step is necessary to prevent a foreign key error on the initial category insert
category.top_product_id = None
dest_session.add(category)
results in SQLAlchemy generating the following SQL:
INSERT INTO categories (name, top_product_id) VALUES (%s, %s)
('SomeCategoryName', None)
INSERT INTO products (name, category_id) VALUES (%s, %s)
('SomeProductName', 99)
UPDATE categories SET top_product_id=%s WHERE categories.id = %s
(None, 99)
But if I use newly created objects, everything works as expected.
category = Category()
product = Product()
category.name = 'SomeCategoryName'
product.name = 'SomeProductName'
product.category = category
category.top_product = product
dest_session.add(category)
results in:
INSERT INTO categories (name, top_product_id) VALUES (%s, %s)
('SomeCategoryName', None)
INSERT INTO products (name, category_id) VALUES (%s, %s)
('SomeProductName', 99)
UPDATE categories SET top_product_id=%s WHERE categories.id = %s
(1, 99)
Aside from this difference, everything behaves in the same way between these two actions. All other relationship are created properly, IDs and foreign keys are set as expected. Only the top_product_id
set in the update clause created by the post_update
fails to behave as expected.
As an additional troubleshooting step, I tried:
On the first flush to the DB, the top_product_id
is set properly. On the second, it's set to None
. So this confirms that the issue is not with differences in the sessions, but something to do with expunging objects from sessions and making them transient. There must be something that does/doesn't happen during the expunge/make transient process that leaves these objects in a fundamentally different state and prevents post_update
from behaving the way it should.
Any ideas on where to go from here would be appreciated.
Upvotes: 0
Views: 683
Reputation: 26
I assume your Base
class mixes in the name
column?
Your goal is to make inspect(category).committed_state
look like it does for newly created objects (except maybe for id
attribute). Same for each product object.
In your "newly created objects" example, category
's committed_state
looks like this before flushing the session:
{'id': symbol('NEVER_SET'),
'name': symbol('NO_VALUE'),
'products': [],
'top_product': symbol('NEVER_SET')}
while product
's committed_state
looks like this:
{'category': symbol('NEVER_SET'),
'id': symbol('NEVER_SET'),
'name': symbol('NO_VALUE')}
To get the post-update behavior, you need to both expire category.top_product_id
(to prevent it from being included in the INSERT
) and fudge category.top_product
's committed_state
(to make SQLAlchemy believe that the value has changed and therefore needs to cause an UPDATE
).
First, expire category.top_product_id
before making category
transient:
source_session.expire(category, ["top_product_id"])
Then fudge category.top_product
's committed_state
(this can happen before or after making category
transient):
from sqlalchemy import inspect
from sqlalchemy.orm.base import NEVER_SET
inspect(category).committed_state.update(top_product=NEVER_SET)
Full example:
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, inspect
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, make_transient, relationship
from sqlalchemy.orm.base import NEVER_SET
class Base(object):
name = Column(String(50), nullable=False)
Base = declarative_base(cls=Base)
class Category(Base):
__tablename__ = 'categories'
id = Column(Integer, primary_key=True)
top_product_id = Column(Integer, ForeignKey('products.id'))
products = relationship('Product', primaryjoin='Product.category_id == Category.id', back_populates='category', cascade='all', lazy='selectin')
top_product = relationship('Product', primaryjoin='Category.top_product_id == Product.id', post_update=True, cascade='all', lazy='selectin')
class Product(Base):
__tablename__ = 'products'
id = Column(Integer, primary_key=True)
category_id = Column(Integer, ForeignKey('categories.id'), nullable=False)
category = relationship('Category', primaryjoin='Product.category_id == Category.id', back_populates='products', cascade='all', lazy='selectin')
source_engine = create_engine('sqlite:///')
dest_engine = create_engine('sqlite:///', echo=True)
def fk_pragma_on_connect(dbapi_con, con_record):
dbapi_con.execute('pragma foreign_keys=ON')
from sqlalchemy import event
for engine in [source_engine, dest_engine]:
event.listen(engine, 'connect', fk_pragma_on_connect)
Base.metadata.create_all(bind=source_engine)
Base.metadata.create_all(bind=dest_engine)
source_session = Session(bind=source_engine)
dest_session = Session(bind=dest_engine)
source_category = Category(id=99, name='SomeCategoryName')
source_product = Product(category=source_category, id=100, name='SomeProductName')
source_category.top_product = source_product
source_session.add(source_category)
source_session.commit()
source_session.close()
# If you want to test UPSERTs in dest_session.
# dest_category = Category(id=99, name='PrevCategoryName')
# dest_product = Product(category=dest_category, id=100, name='PrevProductName')
# dest_category.top_product = dest_product
# dest_session.add(dest_category)
# dest_session.commit()
# dest_session.close()
category = source_session.query(Category).filter(Category.id == 99).one()
# Ensure relationship attributes are initialized before we make objects transient.
_ = category.top_product
# source_session.expire(category, ['id']) # only if you want new IDs in dest_session
source_session.expire(category, ['top_product_id'])
for product in category.products:
# Ensure relationship attributes are initialized before we make objects transient.
_ = product.category
# source_session.expire(product, ['id']) # only if you want new IDs in dest_session
# Not strictly needed as long as Product.category is not a post-update relationship.
source_session.expire(product, ['category_id'])
make_transient(category)
inspect(category).committed_state.update(top_product=NEVER_SET)
for product in category.products:
make_transient(product)
# Not strictly needed as long as Product.category is not a post-update relationship.
inspect(product).committed_state.update(category=NEVER_SET)
dest_session.add(category)
# Or, if you want UPSERT (must retain original IDs in this case)
# dest_session.merge(category)
dest_session.flush()
Which produces this DML in dest_session
:
INSERT INTO categories (name, id, top_product_id) VALUES (?, ?, ?)
('SomeCategoryName', 99, None)
INSERT INTO products (name, id, category_id) VALUES (?, ?, ?)
('SomeProductName', 100, 99)
UPDATE categories SET top_product_id=? WHERE categories.id = ?
(100, 99)
It seems like make_transient
should reset committed_state
to be as if it were a new object, but I guess not.
Upvotes: 1