Reputation: 125
I'm having trouble finding the right sql query. I want to select all the rows with a unique x value and if there are rows with the same x value, then I want to select the row with the greatest y value. As an example I've put a part of my database below.
ID x y
1 2 3
2 1 5
3 4 6
4 4 7
5 2 6
The selected rows should then be those with ID 2, 4 and 5.
This is what I've got so far
SELECT *
FROM base
WHERE x IN
(
SELECT x
FROM base
HAVING COUNT(*) > 1
)
But this only results in the rows that occur more than once. I've added the tags R, postgresql and sqldf because I'm working in R with those packages.
Upvotes: 1
Views: 80
Reputation: 87064
You could try this query:
select x, max(y) from base group by x;
And, if you'd also like the id
column in the result:
select base.*
from base join (select x, max(y) from base group by x) as maxima
on (base.x = maxima.x and base.y = maxima.max);
Upvotes: 1
Reputation: 674
Example:
CREATE TABLE tmp(id int, x int ,y int);
INSERT INTO .....
test=# SELECT x, max(y) AS y FROM tmp GROUP BY x; x | y ---+--- 4 | 7 1 | 5 2 | 6
Upvotes: 0
Reputation: 1269633
Here is a typical way to formulate the query in ANSI SQL:
select b.*
from base b
where not exists (select 1
from base b2
where b2.x = b.x and
b2.y > b.y
);
In Postgres, you would use distinct on
for performance:
select distinct on (x) b.*
from base b
order by x, y desc;
Upvotes: 1