Demyanov
Demyanov

Reputation: 921

SQLAlchemy: filter model object with deep relationships?

I have three model objects with two one-to-many relationships: Match <-> Slot <-> Player Relationships are defined with backrefs 'player' and 'match'. So what I'm trying to do is find all matches that satisfy match_filter filter:

slot_filter = or_(Slot.account_id==None,
                  Slot.additional_units!=None)
                # func.count(Slot.player.slots)<=1)

match_filter = and_(Match.human_players==10,
                    Match.game_mode==2,
                   ~Match.slots.any(slot_filter)) 

query = session.query(Match).join(Slot)
query = query.filter(match_filter)
matches = query.all()

The problem is that I don't really know where should I add Player.slots in my query and how can I filter out matches in which at least one player has less (or equal) than n matches played. I already read something about group_by, having, and func.count methods, but I still don't understand how to use them in my case.

Edit I found almost the same question: Filter by grandchildren count in SQLAlchemy But I still have not figure out how to apply this to my relationships

Edit Here is my solution:

subquery = session.query(Player.id).\
           join(Slot).\
           group_by(Player.id).\
           having(func.count(Player.slots) > 1).\
           subquery()

matches =  session.query(Match).\
           join(Slot).\
           join(Player).\
           filter(Player.id.in_(subquery)).\
           filter(match_filter).all()

Upvotes: 1

Views: 2039

Answers (1)

pi.
pi.

Reputation: 21532

Preface: You have to think more like you are working within SQL and less like within an ORM.

If you want to filter by a grandparent, then - if working in SQL - you obviously would have to do a JOIN or use a sub-SELECT.

Having said that, filtering in SQLAlchemy is a very close match to doing a SELECT in pure SQL.

HAVING is something which only applies after your result-set has been generated by the RDBMS, so if your result-set is rather large, using HAVING will be very slow. So use with care.

As for the other options:

Using JOIN

Pro:

  • If you have to access joined objects through a relationship you can use this to pre-populate these objects using joinedload, etc.

Con:

  • One thing to have in mind is that if you do a many-to-many JOIN then your result-set will of course multiply to have more rows. SQLAlchemy therefore will have to read more data (and possibly discard it). Depending on your data this can take a hit on performance.
  • This is also the reason why you have to use group_by to arrive at a sane count again. Using GROUP BY is normally less performing than something which uses indices sanely.

Using sub-SELECTs with correlate

These have to be written using SQLAlchemy-CORE. For example:

subselect = (select([Player.id])
            .select_from(Player.__table__.join(Slot.__table__)
            .where(Player.slot_id == Slot.id)
            .correlate(Slot.__table__))

This is not a dis-advantage. In fact, I prefer using SQLAlchemy this way, as I get more predictable performance out of my queries.

Pro:

  • No performance hit like when using JOINs.

Con:

  • No possibility to pre-load attributes on joined relationships.

Upvotes: 1

Related Questions