SQLAlchemy using load_only with distinct

I'm trying to use the function distinct with SQLAlchemy but it doesn’t seem to work. I prepared a little example where you can see my problem:

#-*- coding: utf-8 -*-
from sqlalchemy import create_engine,Column, Integer
from sqlalchemy.orm import sessionmaker,load_only
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite:///:memory:')
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base()

class my_class(Base):
    __tablename__ = 'my_table'
    id= Column(Integer, primary_key=True)
    data= Column(Integer)

Base.metadata.create_all(engine)

for i in range(10):
    p=my_class()
    p.id=i
    p.data=55
    session.add(p)
session.commit()

s=session.query(my_class).distinct(my_class.data).options(load_only(my_class.data))

print (s)
for a in s.all():
    print (a.id,a.data)

Executing this I would expect an output like this:

SELECT my_table.data AS my_table_data 
FROM my_table
None 55

But instead I’m getting this:

SELECT DISTINCT my_table.id AS my_table_id, my_table.data AS my_table_data 
FROM my_table
0 55
1 55
2 55
3 55
4 55
5 55
6 55
7 55
8 55
9 55

What I'm doing wrong?

Upvotes: 0

Views: 2683

Answers (1)

Ilja Everilä
Ilja Everilä

Reputation: 52929

Deferring the primary key would not make sense, if querying complete ORM entities, because an entity must have an identity so that a unique row can be identified in the database table. So the query includes the primary key though you have your load_only(). If you want the data only, you should query for that specifically:

In [12]: session.query(my_class.data).distinct().all()
2017-06-30 12:31:49,200 INFO sqlalchemy.engine.base.Engine SELECT DISTINCT my_table.data AS my_table_data 
FROM my_table
2017-06-30 12:31:49,200 INFO sqlalchemy.engine.base.Engine ()
Out[12]: [(55)]

There actually was an issue where having load_only() did remove the primary key from the select list, and it was fixed in 0.9.5:

[orm] [bug] Modified the behavior of orm.load_only() such that primary key columns are always added to the list of columns to be “undeferred”; otherwise, the ORM can’t load the row’s identity. Apparently, one can defer the mapped primary keys and the ORM will fail, that hasn’t been changed. But as load_only is essentially saying “defer all but X”, it’s more critical that PK cols not be part of this deferral.

Upvotes: 3

Related Questions