Nick
Nick

Reputation: 2911

How to get the most recent rows in a group

I have a Rails 4.2.5.x project running PostGres. I have a table with a similar structure to this:

id, contact_id, date,     domain, f1, f2, f3, etc
1,  ABC,        01-01-16, abc.com, 1,  2,  3, ...
2,  ABC,        01-01-15, abc.com, 1,  2,  3, ...
3,  ABC,        01-01-14, abc.com, 1,  2,  3, ...
4,  DEF,        01-01-15, abc.com, 1,  2,  3, ...
5,  DEF,        01-01-14, abc.com, 1,  2,  3, ...
6,  GHI,        01-11-16, abc.com, 1,  2,  3, ...
7,  GHI,        01-01-16, abc.com, 1,  2,  3, ...
8,  GHI,        01-01-15, abc.com, 1,  2,  3, ...
9,  GHI,        01-01-14, abc.com, 1,  2,  3, ...
...
...
99, ZZZ,        01-01-16, xyz.com, 1,  2,  3, ...

I need to query to find:

In MySQL, this would be a simple SELECT * FROM table WHERE domain='abc.com' GROUP BY contact_id ORDER BY date DESC, however, PostGres complains, in this case, that:

ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column "table.id" must appear in the GROUP BY clause or be used in an aggregate function

I expect to get back 3 rows; 1, 4 and 6. Ideally, I'd like to get back the full rows in a single query... but I accept that I may need to do one query to get the IDs first, then another to find the items I want.

This is the closest I have got:

ExampleContacts
  .select(:contact_id, 'max(date) AS max_date')
  .where(domain: 'abc.com')
  .group(:contact_id)
  .order('max_date desc')
  .limit(3)

However... this returns the contact_id, not the id. I cannot add the ID for the row.

EDIT:

Essentially, I need to get the primary key back for the row which is grouped on the non-primary key and sorted by another field.

Upvotes: 2

Views: 102

Answers (2)

Nick
Nick

Reputation: 2911

Just to clarify @murad-yusufov's accepted answer, I ended up doing this:

subquery = ExampleContacts.select('DISTINCT ON (contact_id) *')
                          .where(domain: 'abc.com')
                          .order(contact_id)
                          .order(date: :desc)

ExampleContacts.from("(#{subquery.to_sql}) example_contacts")
               .order(date: :desc)

Upvotes: 0

Murad Yusufov
Murad Yusufov

Reputation: 552

If you want the rows, you don't need grouping. It's simply Contact.select('DISTINCT ON (contact_id)').where(domain: 'abc.com').order(date: :desc).limit(3)

Upvotes: 2

Related Questions