Kevin
Kevin

Reputation: 33

SQLAlchemy association table (association object pattern) raises IntegrityError

I'm using SQLAlchemy release 0.8.2 (tried python 2.7.5 and 3.3.2)

I've had to use the association object pattern (for a many-to-many relationship) in my code, but whenever I've been adding an association, it has been raising an IntegrityError exception. This is because instead of executing "INSERT INTO association (left_id, right_id, extra_data) [...]", it instead executes "INSERT INTO association (right_id, extra_data) [...]", which is going to raise an IntegrityError exception since it's missing a primary key.

After trying to narrow down the problem for a while and simplifying the code as much as possible, I found the culprit(s?), but I don't understand why it's behaving this way.

I included my complete code so the reader can test it as is. The class declarations are exactly the same as in the documentation (with backrefs).

#!/usr/bin/env python2
import sqlalchemy
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship, backref


Base = declarative_base()

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
    right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child", backref="parent_assocs")

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)

    children = relationship("Association", backref="parent")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)



def main():
    engine = sqlalchemy.create_engine('sqlite:///:memory:', echo=True)
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()

    # populate old data
    session.add(Child()) 

    # new data
    p = Parent()
    session.add(p) # Commenting this fixes the error. 
    session.flush()

    # rest of new data
    a = Association(extra_data="some data")
    a.child = session.query(Child).one()
    # a.child = Child() # Using this instead of the above line avoids the error - but that's not what I want. 
    p.children.append(a)
    # a.parent = p # Using this instead of the above line fixes the error! They're logically equivalent. 

    session.add(p)
    session.commit()

if __name__ == '__main__':
    main()

So, as mentioned in the comments in the code above, there are three ways to fix/avoid the problem.

  1. Don't add the parent to the session before declaring the association
  2. Create a new child for the association instead of selecting an already existing child.
  3. Use the backref on the association

I don't understand the behaviour of all three cases.

The second case does something different, so it's not a possible solution. I don't understand the behaviour however, and would appreciate an explanation of why the problem is avoided in this case.

I'm thinking the first case may have something to do with "Object States", but I don't know exactly what's causing it either. Oh, and adding session.autoflush=False just before the first occurrence of session.add(p) also fixes the problem which adds to my confusion.

For the third case, I'm drawing a complete blank since they should be logically equivalent.

Thanks for any insight!

Upvotes: 3

Views: 1677

Answers (1)

zzzeek
zzzeek

Reputation: 75317

what happens here is that when you call upon p.children.append(), SQLAlchemy can't append to a plain collection without loading it first. As it goes to load, autoflush kicks in - you know this because in your stack trace you will see a line like this:

File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 1183, in _autoflush
  self.flush()

Your Association object is then flushed here in an incomplete state; it's in the session in the first place because when you say a.child = some_persistent_child, an event appends a to the parent_assocs collection of Child which then cascades the Association object into the session (see Controlling Cascade on Backrefs for some background on this, and one possible solution).

But without affecting any relationships, the easiest solution when you have this chicken/egg sort of problem is to temporarily disable autoflush using no_autoflush:

with session.no_autoflush:
    p.children.append(a)

by disabling the autoflush when p.children is loaded, your pending object a is not flushed; it is then associated with the already persistent Parent (because you've added and flushed that already) and is ready for INSERT.

this allows your test program to succeed.

Upvotes: 3

Related Questions