Verbal_Kint
Verbal_Kint

Reputation: 1416

hybrid attribute expression: aggregate attribute on many side of one to many

assuming these models:

class A(Base):
    ...
    targets = relationship("B", back_populates='a')
    ...

class B(Base):
    ...
    a_id = Column(Integer, ForeignKey('a.id'))
    a = relationship("A", back_populates='targets')
    attr = Column(ENUM('a', 'b', 'c', name='options'), default='a')
    ...

How could I write a hybrid_attribute.expression to return True for all As where any related Bs have an attr attribute value in ('b', 'c')? (This should also return False if there are no related Bs.)

This gets me halfway there:

# hybrid_attribute expression in A model
@example_attr.expression
def example_attr(cls):
    return case(
        [
            (
                B.attr.in_(('b','c')),
                True
            )
        ],
        else_=False
    )

But then how do I group them by A.id where if any related Bs are True then the example_attr column value for that A row is True?

I want top be able to do: session.query(A).filter(A.example_attr.is_(True)).all()

EDIT 1:

This SQL seems to provide the desired result:

select a.id, coalesce(bool_or(a_id_b_attr.status), false)
    from a left outer join (
        select b.a_id as a_id, b.attr in ('b', 'c') as status from b
    ) as a_id_b_attr
    on a.id = a_id_b_attr.a_id group by a.id;

But I am having a slight issue referring to the aliased fields in the subquery:

sub = db.session.query(B.a_id.label('a_id'), B.attr.in_(('b', 'c')).label('status')).subquery() 
db.session.query( 
    A.id,  
    db.func.coalesce( 
        db.func.bool_or(sub.status),  
        False 
    ) 
).\ 
outerjoin(sub, A.id == sub.key_id).\ 
group_by(A.id) 

Which throws an AttribubteError seeing as sub doesn't have any of the aliased attributed.

Upvotes: 1

Views: 389

Answers (1)

Ilja Everilä
Ilja Everilä

Reputation: 52967

How could I write a hybrid_attribute.expression to return True for all As where any related Bs have an attr attribute value in ('b', 'c')? (This should also return False if there are no related Bs.)

You could use an EXISTS subquery expression:

class A(Base):
    __tablename__ = 'a'
    id = Column(Integer, primary_key=True)
    targets = relationship("B", back_populates='a')

    @hybrid_property
    def example_attr(self):
        return any(b.attr in {'b', 'c'} for b in self.targets)

    @example_attr.expression
    def example_attr(cls):
        return exists().\
            where(and_(B.a_id == cls.id,
                       B.attr.in_(['b', 'c']))).\
            correlate(cls)

class B(Base):
    __tablename__ = 'b'
    id = Column(Integer, primary_key=True)
    a_id = Column(Integer, ForeignKey('a.id'))
    a = relationship("A", back_populates='targets')
    attr = Column(ENUM('a', 'b', 'c', name='options'), default='a')

This does not rely on any grouping, but correlation, and can be used directly in your desired query:

# Note that comparing booleans with booleans is a bit redundant
session.query(A).filter(A.example_attr).all()

Postgresql is able to rewrite (some) queries using correlated EXISTS subquery expressions to semijoins, so the usual performance notes about correlation may not apply.

If you'd like to approach this using grouping, you'd use bool_or and explicit joins:

session.query(A).\
    join(B).\
    group_by(A.id).\
    having(func.bool_or(B.attr.in_(['b', 'c']))).\
    all()

Postgresql allows selecting all columns of A in this grouped query even though they are not aggregate expressions, because they are functionally dependent to the grouping column(s) – in other words A.id determines the selected columns.

Upvotes: 4

Related Questions