Stephen
Stephen

Reputation: 3982

Use GROUP_BY with extra columns in postgreSQL

Let's suppose we have a simple table users:

id  | name | company |  other columns.......
----+------+---------+-------------------------
 1  | A    | A1
 2  | A    | A2
 3  | B    | B1
 4  | C    | C1
 5  | C    | C2
 6  | C    | C3
 ....

I want to group by name, and chose the latest value for id and company. The result I expect is three column table:

id  | name | company |
----+------+---------+
 2  | A    | A2
 3  | B    | B1
 6  | C    | C3
 ....

I am trying to use GROUP_BY, but don't know how to include company column:

SELECT
  max(id),
  name,
  ? # I don't know how to include company
FROM users
GROUP_BY name

Does anyone have better idea?

Upvotes: 2

Views: 991

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270873

Use distinct on:

select distinct on (name) u.*
from users u
order by name, id desc;

distinct on is a very convenient Postgres extension. It returns the first row from a group of rows. The "grouping" is based on the columns after distinct on. The ordering is based on the order by clause.

There are two other common ways to address this. One method uses window functions:

select u.*
from (select u.*,
             row_number() over (partition by name order by id desc) as seqnum
      from users u
     ) u
where seqnum = 1;

Or a correlated subquery:

select u.*
from users u
where u.id = (select max(u2.id) from users u2 where u2.name -= u.name);

There is even a "clever" way of doing this using group by. Postgres doesn't have a "first" or "last" aggregation function. But you can use arrays:

select name, max(id),
       (array_agg(company order by id desc))[1] as country
from users u
group by name;

Upvotes: 5

Related Questions