Sqlalchemy - How to properly bulk insert data into a database when the data has relationships

Question

I have datafiles of rows of things. Each thing has a gene listed in it. It's a one-many relationship because each gene can be part of multiple things but each thing can only have one gene.

Imagine models roughly like these:

class Gene(db.Model):

    __tablename__ = "gene"

    id          = db.Column(db.Integer, primary_key=True)

    name1   = db.Column(db.Integer, index=True, unique=True, nullable=False)  # nullable might be not right
    name2   = db.Column(db.String(120), index=True, unique=True)

    things = db.relationship("Thing", back_populates='gene')

    def __init__(self, name1, name2=None):
        self.name1 = name1
        self.name2 = name2

    @classmethod
    def find_or_create(cls, name1, name2=None):
        record = cls.query.filter_by(name1=name1).first()
        if record != None:
            if record.name2 == None and name2 != None:
                record.name2 = name2
        else:
            record = cls(name1, name2)
            db.session.add(record)
        return record


class Thing(db.Model):

    __tablename__ = "thing"

    id          = db.Column(db.Integer, primary_key=True)

    gene_id     = db.Column(db.Integer, db.ForeignKey("gene.id"), nullable=False, index=True)
    gene        = db.relationship("Gene", back_populates='thing')

    data    = db.Column(db.Integer)

I'd like to bulk-insert many things, but I'm afraid that by using

    db.engine.execute(Thing.__table__.insert(), things)

I won't have the relationships in the database. Is there some way of preserving the relationships with a bulk add, or somehow adding these sequentially and then establishing the relationships at a later point? All the documentation about bulk adding seems to assume that you want to insert extremely simple models and I'm a little lost as to how to do this when your models are more complex (the example above is a dumbed down version).

-- Update 1 --

This answer seems to indicate that there isn't really a solution to this.

This answer seems to confirm that.

Sqlalchemy - How to properly bulk insert data into a database when the data has relationships

Answers (1)

Related Questions