AnApprentice
AnApprentice

Reputation: 110950

How to query for records based on a group and ordering?

Given the following models:

User
    id
UserPulses
    id, user_id, group_id, created_at

What I would like to do is obtain all of a user's UserPulses grouped by (group_id) and only obtain the most recent UserPulse per group_id. I've been able to do this by looping through group by group, but that takes a large number of queries. Is this possible with one query?

Something like: user.user_pulses.group("group_id)")

Any ideas? Thanks

Upvotes: 0

Views: 128

Answers (3)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656481

You can use the PostgreSQL specific extension of the SQL feature DISTINCT: DISTINCT ON

SELECT DISTINCT ON (group_id)
       id, user_id, group_id, created_at
FROM   user_pulses
WHERE  user_id = :user_id
ORDER  BY group_id, created_at DESC, id; -- id just to break ties

Simpler than window functions (but not as portable) and probably fastest.
More details under this related question.

Upvotes: 2

mu is too short
mu is too short

Reputation: 434635

You can't do this reliably through the usual ActiveRecord interface but you can do it through SQL using a window function. You want some SQL like this:

select id, user_id, group_id, created_at
from (
    select id, user_id, group_id, created_at, 
           row_number() over (partition by group_id order by created_at desc, id desc) as r
    from user_pulses
    where user_id = :user_id
) dt
where r = 1

and then wrap that in find_by_sql:

pulses = UserPulses.find_by_sql([%q{
    select id, user_id, group_id, created_at
    from (
        select id, user_id, group_id, created_at, 
               row_number() over (partition by group_id order by created_at desc, id desc) as r
        from user_pulses
        where user_id = :user_id
    ) dt
    where r = 1
}, :user_id => user.id])

The window function part essentially does a local GROUP BY with each group_id, sorts them (with id desc as the secondary sort key as a "just in case" tie breaker), and tacks the per-group row number in r. Then the outer query filters out the first in each group (where r = 1)and peels off the originaluser_pulses` columns.

Upvotes: 2

Glenn
Glenn

Reputation: 9150

Something like this, perhaps. But there could be multiple records for a user/group_id combo if they share the same date.

SELECT p.id, p.user_id, p.group_id, p.created_at
  FROM UserPulses p
      ,( SELECT user_id, group_id, MAX(created_at) as max_date
           FROM UserPulses
           GROUP BY user_id, group_id ) m
  WHERE u.user_id = m.user_id
    AND u.group_id = m.group_id
    AND u.created_at = m.max_date

Upvotes: 0

Related Questions