technomalogical
technomalogical

Reputation: 3002

Iterating over a SQLAlchemy Collection Efficiently

I have a SQLAlchemy ORM model that I'm using to transfer specific rows (full object graphs) from our production database to our test and development databases. This is working really well, until I get a collection with lot of child dependencies, and I run into a MemoryError. I have already set up dynamic loading at key points in the object hierarchy and load the children separately with different queries, but there are still collections that have enough child data that I'm running into memory issues.

What's the best way to load only one element of a collection at a time, so I can copy single objects (and all their child rows) at a time?

Upvotes: 3

Views: 2044

Answers (1)

van
van

Reputation: 76992

Try to clean-up the session after each Node (with children) processed using Session.expunge. The sample code below prints the number of instances in the session:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node.id'))
    name = Column(String(50))
    children = relationship("Node",
            backref=backref("parent", remote_side=[id],)
            )

def process_node(node, expunge=False, ident=1):
    print "Node: ", "-" * ident, node, " --> ", len(session.identity_map)
    for child in node.children:
        process_node(child, expunge, ident + 4)
        session.expunge(child)

roots = session.query(Node).filter(Node.parent == None)
for root in roots:
    process_node(root, True)

Upvotes: 1

Related Questions