Reputation: 471

SQL group by a column yet ignore the duplicated item of another column

I am trying to solve a problem with data similar to the following:

company customer pay sequence
a       x        5    1
a       x        6    2
b       x        3    3
b       y        4    4
a       y        2    5

I want to sum "pay" grouped by company, but each customer should be counted only once per company, using the row with the latest sequence. So the expected result is:

company sum(pay)
a        8
b        7

That is, the 8 for company "a" comes from the 2nd row plus the last row, and the 7 for company "b" comes from the 3rd and 4th rows. The 1st row is ignored.
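To reproduce this, the sample data can be loaded with something like the script below. It is only a sketch: the column types are an assumption, since only the values are shown above.

```sql
-- Assumed schema: types are a guess based on the sample values.
CREATE TABLE mytable (
    company  text,
    customer text,
    pay      integer,
    sequence integer
);

-- The five sample rows from the table above.
INSERT INTO mytable (company, customer, pay, sequence) VALUES
    ('a', 'x', 5, 1),
    ('a', 'x', 6, 2),
    ('b', 'x', 3, 3),
    ('b', 'y', 4, 4),
    ('a', 'y', 2, 5);
```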

I currently use this query (PostgreSQL) to solve the problem:

select t2.company,sum(t2.pay) from mytable t2 join (
    select company,customer,max(sequence) as sequence 
    from mytable group by company,customer
) t1 on t1.sequence=t2.sequence group by t2.company

Is there an easier way to write this query, without a subquery like the one above?

Upvotes: 0

Answers (3)

krokodilko

Reputation: 36137

Using a window function is another option,
though I am not sure it is any easier or faster.
In my opinion it can't be done without a subquery.

SELECT company,sum(pay)
FROM (
   SELECT *,
          max( sequence ) OVER (partition by company,customer ) max_seq
   FROM table1
) AS t
WHERE sequence = max_seq
GROUP BY company

Demo: http://www.sqlfiddle.com/#!15/f66b3/4

Upvotes: 0

Lamak

Reputation: 70678

Another way to do this would be to use ROW_NUMBER, but I don't know if you can say that it is an "easier" way:

WITH CTE AS
(
  SELECT *, 
         ROW_NUMBER() OVER(PARTITION BY company, customer 
                           ORDER BY sequence DESC) AS RN
  FROM YourTable
)
SELECT company, 
       SUM(pay) AS pay
FROM CTE
WHERE RN = 1
GROUP BY company
ORDER BY company

The results are:

╔═════════╦═════╗
║ COMPANY ║ PAY ║
╠═════════╬═════╣
║ a       ║   8 ║
║ b       ║   7 ║
╚═════════╩═════╝

And here is an sqlfiddle with a demo for you to try.

Upvotes: 0

Gordon Linoff

Reputation: 1271151

In Postgres, you can do this without the join, by using the distinct on syntax:

select t.company, sum(t.pay)
from (select distinct on (company, customer) company, customer, pay
      from mytable
      order by company, customer, sequence desc
     ) t
group by t.company;

I prefer the method in the OP because it is standard SQL. The distinct on clause is a Postgres-specific extension.
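For illustration, the inner query can be run on its own to see which rows distinct on keeps. This is a sketch against the five sample rows from the question; the output in the comment is what those rows should produce.

```sql
-- DISTINCT ON keeps the first row of each (company, customer) group
-- in ORDER BY order, which here is the row with the highest sequence.
SELECT DISTINCT ON (company, customer) company, customer, pay
FROM mytable
ORDER BY company, customer, sequence DESC;

-- With the sample data this returns:
--  a | x | 6
--  a | y | 2
--  b | x | 3
--  b | y | 4
```

Summing pay per company over these four rows gives 8 for "a" and 7 for "b", which matches the expected result.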

Upvotes: 2
