ju.
ju.

Reputation: 344

Python SQLAlchemy: Filter joined tables with subqueries

In sqlalchemy I defined a model for my database, in this case two tables "sample" and "experiment", which are linked over a many-to-many relationship to each other:

class Sample(Base):
    __tablename__ = 'sample'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    date = Column(String)

class Experiment(Base):
    __tablename__ = 'experiment'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    description = Column(String)
    result = Column(String)

    samples = relationship('Sample', secondary='sample_has_experiment', backref="experiments")

t_sample_has_experiment = Table(
    'sample_has_experiment', metadata,
    Column('sample_id', ForeignKey('sample.id'), primary_key=True, nullable=False, index=True),
    Column('experiment_id', ForeignKey('experiment.id'), primary_key=True, nullable=False, index=True)
)

In my database, I have a sample "Dilithium", and in experiments two experiments "Cold fusion" and "Worm hole".

I try to query the "Worm hole" experiment over a join to the sample table:

samples = s.query(Obj.Sample).join(Obj.Sample.experiments).\
    filter(Obj.Experiment.name == "Worm hole").all()


for sample in samples:
    for experiment in sample.experiments:
        print(experiment.name)

But as a result I still get both experiments "Worm hole" AND "Cold Fusion". So it looks like the filtering is not applied. How can I filter that way that I only receive the "Worm Hole" Experiment object? Thank you.

@Dublicate: Indeed it looks like the same question, but the given answer does not solve my problem. In my opinion, my suggested query does exactly the same like in the proposed answer.

Upvotes: 0

Views: 3831

Answers (2)

ju.
ju.

Reputation: 344

Edit: Actually I realized what my mistake in the first place was. I have to query all tables and then apply joins and filters:

qry = s.query(Obj.Sample, Obj.Experiment).\ # Query both tables!
    filter(Obj.Sample.name == "...").\
    filter(Obj.Experiment.name == "...").\
    join(Obj.Experiment, Obj.Sample.experiments).all()

Alternative way: The use of a subquery leads to the desired behaviour:

wormhole_exp = s.query(Obj.Experiment).filter(Obj.Experiment.name == "wormhole").subquery()

print(s.query(Obj.Sample, wormhole_exp).join(wormhole_exp, Obj.Sample.experiments).one())

Upvotes: 0

Sam Hartman
Sam Hartman

Reputation: 6489

What your code says is:

  • For all samples that were part of the wormhole experiment

    • Print all experiments that sample is part of

That is, given a particular sample, sample.experiments is always all the experiments that sample belongs to not just the experiment you got to that sample through.

You can see this if you go add a new sample that does not belong to the wormhole experiment. It should not appear in your query.

So, my answer to "why isn't my query filtering on the join," is that I strongly suspect it is.

If you want the sample objects that were in the wormhole experiment then something like

samples = session.query(Experiments).filter_by(name = 'wormhole').one().samples

Upvotes: 1

Related Questions