Reputation: 6982
This is the first time I've used ORM, so I'm not sure the best way to handle this. I have a one-to-many relationship where each Parent can have many Children:
class Parent(Base):
__tablename__ = 'Parent'
name = Column(String(50))
gid = Column(String(16), primary_key = True)
lastUpdate = Column(DateTime)
def __init__(self,name, gid):
self.name = name
self.gid = gid
self.lastUpdate = datetime.datetime.now()
class Child(Base):
__tablename__ = 'Child'
id = Column(Integer, primary_key = True)
loc = Column(String(50))
status = Column(String(50))
parent_gid = Column(String(16), ForeignKey('Parent.gid'))
parent = relationship("Parent", backref=backref('children'))
Now, updates are coming in over the network. When an update comes in, I want to UPDATE the appropriate Parent row (updating lastUpdate column) and INSERT new children rows into the database. I don't know how to do that with ORM. Here is my failed attempt:
engine = create_engine('sqlite+pysqlite:///file.db',
module=dbapi2)
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
def addChildren(parent):
p = session.query(Parent).filter(Parent.gid == p1.gid).all()
if len(p) == 0:
session.add(p1)
session.commit()
else:
updateChildren = parent.children[:]
parent.chlidren = []
for c in updateChildren:
c.parent_gid = parent.gid
session.add_all(updateChildren)
session.commit()
if __name__ == '__main__':
#first update from the 'network'
p1 = Parent(name='team1', gid='t1')
p1.children = [Child(loc='x', status='a'), Child(loc='y', status='b')]
addChildren(p1)
import time
time.sleep(1)
#here comes another network update
p1 = Parent(name='team1', gid='t1')
p1.children = [Child(loc='z', status='a'), Child(loc='k', status='b')]
#this fails
addChildren(p1)
I initially tried to do a merge, but that caused the old children to be disassociated with the parent (the foreign IDs were set to null). What is the best way to approach this with ORM? Thanks
EDIT
I guess it doesn't really make sense to create entirely new objects when updates come in over the network. I should just query the session for the appropriate parent, then create new children if necessary and merge? E.g.
def addChildren(pname, pid, cloc, cstat):
p = session.query(Parent).filter(Parent.gid == pid).all()
if len(p) == 0:
p = Parent(pname, pid)
p.children = [Child(loc=cloc, status=cstat)]
session.add(p)
session.commit()
else:
p = p[0]
p.children.append(Child(loc=cloc, status=cstat))
session.merge(p)
session.commit()
Upvotes: 32
Views: 51262
Reputation: 77002
You are right - you should not create the same parent twice. In terms of adding children, ... well, you really need only to add them and you do not care about the existing ones... So your edited code should do the job just fine. You can make it shorter and more readable though:
def addChildren(pname, pid, cloc, cstat):
p = session.query(Parent).get(pid) # will give you either Parent or None
if not(p):
p = Parent(pname, pid)
session.add(p)
p.children.append(Child(loc=cloc, status=cstat))
session.commit()
The disadvantage of this way is that for existing Parent the whole collection of Children will be loaded into memory before a new Child is added and later saved to the database. If this is the case (many and increasing number of children for each parent), then the lazy='noload'
might be useful:
parent = relationship("Parent", backref=backref('children', lazy='noload'))
This might dramatically improve the speed of inserts, but in this case the access to p.children
will never load the existing objects from the database. In such scenarios it is enough to define another relationship. In these situations I prefer to use Building Query-Enabled Properties, so you end up with one property only for adding objects, and the other only for quering persisted results, which often are used by different parts of the system.
Upvotes: 49