saifuddin778
saifuddin778

Reputation: 7277

Get the most recent record for a user

Say I have records like this:

user_id  user_data_field  created
-------  ---------------  -------
1        some data        date_a
2        some data        date_b
1        some data        date_c
1        some data        date_d
2        some data        date_e

What should I do to get all the user records with most recent dates only (assuming that most recent dates are not the ones at the bottom...can be anywhere)? I am using an in_ clause to fetch the bulk users in sqlalchemy:

session.query(Table).filter(Table.user_id.in_([1,2])).order_by(Table.created.desc())

which just sorts them by the created order. Moreover, if I just add the first() clause at the end of this statement, it gets me just 1 row. So, do I have a way in sqlalchemy to get it done? Thanks.

Upvotes: 2

Views: 3044

Answers (3)

van
van

Reputation: 76992

@khan: you solution is almost working, but the flaw is described in the comment to your answer.

The code below solves this particular issue (but still relies on the fact that the would not be duplicate created values for the same user_id):

subq = (
    session
    .query(MyTable.user_id, func.max(MyTable.created).label("max_created"))
    .filter(MyTable.user_id.in_([1, 2]))
    .group_by(MyTable.user_id)
    .subquery()
)

q = (
    session.query(MyTable)
    .join(subq, and_(MyTable.user_id == subq.c.user_id,
                     MyTable.created == subq.c.max_created))
)

Upvotes: 3

saifuddin778
saifuddin778

Reputation: 7277

Okay, I found a way out, using subquery and a max function:

session.query(Table).filter(Table.user_id.in_([1,4])).\
filter(Table.created.in_(session.query(func.max(Table.created)).\
filter(Table.user_id.in_([1,4])).group_by(Table.user_id).subquery()))

In a more cleaner way, we can write out as:

subquery = session.query(func.max(Table.created)).\
           filter(Table.user_id.in_([1,4])).\
           group_by(Table.user_id).\
           subquery()

session.query(Table).filter(Table.user_id.in_([1,4])).filter(Table.created.in_(subq))

Upvotes: 0

Seb
Seb

Reputation: 61

It sounds to me that the SQL query you're looking for would be something like:

SELECT user_id, MAX(created) FROM Table WHERE user_id IN (1, 2) GROUP BY user_id;

So now the deal is to translate it using sqlalchemy, I'm guessing something like that would do:

session.query(Table.user_id, func.max(Table.created)).filter(Table.user_id.in_([1,2])).group_by(Table.user_id).all()

http://sqlalchemy.readthedocs.org/en/rel_1_0/core/functions.html?highlight=max#sqlalchemy.sql.functions.max

Upvotes: 1

Related Questions