Reputation: 365

How do I add an aggregate as "virtual column" in a result?

I've got a query that normally looks like

def get_models_with_children(ids):
   query = MyModel.query.filter(MyModel.id.in_(ids))
           .join(Child, Child.parent_id = Child.id)
           .groupBy(MyModel.id)
           .having(func.count(Child.id) > 0)

   return query.all()

Sometimes, I want to actually retrieve the count, as well. I can make that happen easily enough:

def get_models_with_children(ids, return_count):
   query = MyModel.query

   if return_count:
       query = query.add_columns(func.count(Child.id).label("child_count"))

   query = query.filter(MyModel.id.in_(ids))
           .join(Child, Child.parent_id = Child.id)
           .groupBy(MyModel.id)
           .having(func.count(Child.id) > 0)

   return query.all()

This works fine, but now, instead of a List[MyModel] coming back, I've got a differently shaped result with MyModel and child_count keys. If I want the MyModel's id, I do result[0].id if I didn't add the count, and result[0].MyModel.id if I did.

Is there any way I can flatten the result, so that the thing that's returned looks like a MyModel with an extra child_count column?

def do_stuff_with_models():
    result = get_models_with_children([1, 2, 3], True)
    for r in result:
         # can't do this, but I want to:
         print(r.id)
         print(r.child_count)

         # instead I have to do this:
         print(r.MyModel.id)
         print(r.child_count)

Upvotes: 2

Answers (1)

saaj

Reputation: 25273

sqlalchemy.util.KeyedTuple is the type ^* of differently shaped result with MyModel and child_count keys:

Result rows returned by Query that contain multiple ORM entities and/or column expressions make use of this class to return rows.

You can effectively flatten them by explictly specifying the columns for your query. Here follows a complete example (tested on SQLAlchemy==1.3.12).

Plain table column attribute

Models:

import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()

class User(Base):

    __tablename__ = 'user'

    user_id = sa.Column(sa.Integer, sa.Sequence('user_id_seq'), primary_key=True)
    username = sa.Column(sa.String(80), unique=True, nullable=False)

    def __repr__(self):
        return f'User({self.user_id!r}, {self.username!r})'

class Token(Base):

    __tablename__ = 'token'

    token_id = sa.Column(sa.Integer, sa.Sequence('token_id_seq'), primary_key=True)
    user_id = sa.Column(sa.Integer, sa.ForeignKey('user.user_id'), nullable=False)
    user = sa.orm.relationship('User')
    value = sa.Column(sa.String(120), nullable=False)

    def __repr__(self):
        return f'Token({self.user.username!r}, {self.value!r})'

Connect and fill some data:

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sa.orm.sessionmaker(bind=engine)
session = Session()

user1 = User(username='joe')
user2 = User(username='john')
token1 = Token(user=user1, value='q1w2e3r4t56')

session.add_all([user1, user2, token1])
session.commit()

Now, let's define the "virtual" column as whether user has a token:

query = session.query(User)
exists = (
    sa.exists()
    .where(User.user_id == Token.user_id)
    .correlate(User)
    .label("has_token")
)
query = query.add_columns(exists)
query.all()  # [(User(1, 'joe'), True), (User(2, 'john'), False)]

It's the undesired shape. And here's how to flatten it:

query = session.query(*[getattr(User, n) for n in User.__table__.columns.keys()])
query = query.add_columns(exists)
query.all()  # [(1, 'joe', True), (2, 'john', False)]

It's all possible to define columns for an existing query, given that you know the model:

query = session.query(User)
# later down the line
query = query.with_entities(*[
    getattr(User, n) for n in User.__table__.columns.keys()])
query = query.add_columns(exists)
query.all()  # [(1, 'joe', True), (2, 'john', False)]

Column bundle

The same can be achieved with sqlalchemy.orm.Bundle and passing single_entity to it.

bundle = sa.orm.Bundle(
    'UserBundle', User.user_id, User.username, exists, single_entity=True)
query = session.query(bundle)
query.all()  # [(1, 'joe', True), (2, 'john', False)]

Issue with relationship attribute

With complex models it gets complicated. It's possible to inspect the model (mapped class) attributes with sqlalchemy.orm.mapper.Mapper.attrs and take class_attribute:

# replace
[getattr(User, n) for n in User.__table__.columns.keys()]
# with
[mp.class_attribute for mp in sa.inspect(User).attrs]

But in this case relationship attributes turn into their target tables in FROM clause of the query without ON clause, effectively producing a cartesian product. And the "joins" have to be defined manually, so it's not a good solution. See this answer and a SQLAlchemy user group discussion.

Query expression attribute

Myself I ended up using query expressions, because of the issues with relationships in existing code. It's possible to get away with minimal modification of the model, with query-time SQL expressions as mapped attributes.

User.has_tokens = sa.orm.query_expression()
...
query = query.options(sa.orm.with_expression(User.has_tokens, exists))
query.all()  # [User(1, 'joe'), User(2, 'john')]
[u.has_tokens for u in query.all()]  # [True, False]

_{* Actually it's generated on-the-fly sqlalchemy.util._collections.result with MRO of sqlalchemy.util._collections.result, sqlalchemy.util._collections._LW, class sqlalchemy.util._collections.AbstractKeyedTuple, tuple, object, but that's details. More details on how the class is created with lightweight_named_tuple are available in this answer.}

Upvotes: 1

How do I add an aggregate as &quot;virtual column&quot; in a result?