crooksey
crooksey

Reputation: 8809

SQLalchemy distinct, order_by different column

I am querying two table with SQLalchemy, I want to use the distinct feature on my query, to get a unique set of customer id's

I have the following query:

orders[n] = DBSession.query(Order).\
          join(Customer).\
          filter(Order.oh_reqdate == date_q).\
          filter(Order.vehicle_id == vehicle.id).\
          order_by(Customer.id).\
          distinct(Customer.id).\
          order_by(asc(Order.position)).all() 

If you can see what is going on here, I am querying the Order table for all orders out for a specific date, for a specific vehicle, this works fine. However some customers may have more than one order for a single date. So I am trying to filter the results to only list each customer once. This work fine, however In order to do this, I must first order the results by the column that has the distinct() function on it. I can add in a second order_by to the column I want the results ordered by, without causing a syntax error. But it gets ignored and results are simply ordered by the Customer.id.

I need to perform my query on the Order table and join to the customer (not the other way round) due to the way the foreign keys have been setup.

Is what I want to-do possible within one query? Or will I need to re-loop over my results to get the data I want in the right order?

Upvotes: 4

Views: 10092

Answers (1)

zzzeek
zzzeek

Reputation: 75127

you never need to "re-loop" - if you mean load the rows into Python, that is. You probably want to produce a subquery and select from that, which you can achieve using query.from_self().order_by(asc(Order.position)). More specific scenarios you can get using subquery().

In this case I can't really tell what you're going for. If a customer has more than one Order with the requested vehicle id and date, you'll get two rows, one for each Order, and each Order row will refer to the Customer. What exactly do you want instead ? Just the first order row within each customer group ? I'd do that like this:

highest_order = s.query(Order.customer_id, func.max(Order.position).label('position')).\
                    filter(Order.oh_reqdate == date_q).\
                    filter(Order.vehicle_id == vehicle.id).\
                    group_by(Order.customer_id).\
                    subquery()

s.query(Order).\
    join(Customer).\
    join(highest_order, highest_order.c.customer_id == Customer.id).\
    filter(Order.oh_reqdate == date_q).\
    filter(Order.vehicle_id == vehicle.id).\
    filter(Order.position == highest_order.c.position)

Upvotes: 3

Related Questions