Sean Payne
Sean Payne

Reputation: 1961

Deleting millions of rows with Many-To-Many Relationship SQLAlchemy

I have a couple of tables with the following many-to-many relationship

def TableOne(db.Model):
    __tablename__ = "table_one"
    id = db.Column(db.Integer, primary_key=True)
    table_twos = db.relationship(
        "TableTwo", secondary=relationship_table, lazy="subquery"
    )
    # Some other attributes

def TableTwo(db.Model):
    __tablename__ = "table_two"
    id = db.Column(db.Integer, primary_key=True)

relationship_table = db.Table(
    "relationship_table",
    db.Column("table_one_id", db.Integer, db.ForeignKey("table_one.id"), primary_key=True),
    db.Column(
        "table_two_id",
        db.Integer,
        db.ForeignKey("table_two.id"),
        primary_key=True,
    ),
)

Normally, I've worked on small projects, and I can delete all the relationships as follows

tables = db.session.query(TableOne).all()
for t in tables:
    t.table_twos = []
    db.session.flush()
db.session.commit()

table_twos = db.session.query(TableTwo).all()
for t in table_twos:
    db.session.delete(t)
    db.session.flush()
db.session.commit()

However, since I am working with millions of rows I can't load them all into memory. If I try to just delete all the TableTwo rows, it gives me an error about foreign keys.

How can I delete all of the relationships at once and then delete the TableTwo rows all at once?

Thank you

Upvotes: 0

Views: 162

Answers (1)

bobflux
bobflux

Reputation: 11581

If you want to delete all rows in a table, it's much faster to use TRUNCATE, that simply trash the table files on disk instead of deleting every row one by one. It will also reclaim disk space, unlike DELETE which will only create free space in the table file.

If there are foreign keys:

You can TRUNCATE the referencing table (TableTwo).

But you cannot truncate the referenced table because that would break the foreign key references. But if want to delete all rows in the referencing and referenced table, just truncate both:

TRUNCATE table1, table2;

If the referencing and referenced table are listed in the same truncate command, it will work. Do not use two independent TRUNCATE commands, or postgres will refuse to break your foreign keys (as it should!).

Note if you want row deletions in table1 to also delete the referencing rows in table2, you must set your foreign key to "ON DELETE CASCADE". Then you could use DELETE on table1, but to delete all rows TRUNCATE is much faster.

Upvotes: 2

Related Questions