Reputation: 1309
I have a table which has its primary key as the combination of 4 columns say column_1, column_2, column_3, column_4
. There are some records in this table which has multiple values for the same combination of <column_1, column_2>
values. I want to write a SQL query so as to select distinct column_1 and column_2 combinations with any one random column_3 and column_4 combination.
Example: Table
column_1 column_2 column_3 column_4
column_1_value_1 column_2_value_1 2000 america
column_1_value_1 column_2_value_1 1000 europe
column_1_value_2 column_2_value_2 3000 asia
I want the output to be
column_1_value_1 column_2_value_1 2000 america
column_1_value_2 column_2_value_2 3000 asia
OR
column_1_value_1 column_2_value_1 1000 europe
column_1_value_2 column_2_value_2 3000 asia
I tried the query select column_1, column_2, max(column_3), max(column_4) from table group by column_1, column_2
but this provides the output:
column_1_value_1 column_2_value_1 2000 europe
column_1_value_2 column_2_value_2 3000 asia
This output is incorrect because column_3 value 2000
does not match with column_4 value europe
.
Can someone please help me formulate the query as per my requirement?
Upvotes: 0
Views: 1431
Reputation: 1309
Little modification to the answer posted by Tim.
SELECT t1.column_1, t1.column_2, t1.column_3, max(t1.column_4)
FROM yourTable t1
INNER JOIN
(
SELECT column_1, column_2, MAX(column_3) AS max_column_3
FROM yourTable
GROUP BY column_1, column_2
) t2
ON t1.column_1 = t2.column_1 AND
t1.column_2 = t2.column_2 AND
t1.column_3 = t2.max_column_3
GROUP BY t1.column_1, t1.column_2, t1.column_3;
Upvotes: 0
Reputation: 521249
We can use ROW_NUMBER
here for a MySQL 8+ option:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY column_1, column_2 ORDER BY RAND()) rn
FROM yourTable
)
SELECT column_1, column_2, column_3, column_4
FROM cte
WHERE rn = 1;
A solution involving joins, and avoiding analytic functions, might look like:
SELECT t1.column_1, t1.column_2, t1.column_3, t1.column_4
FROM yourTable t1
INNER JOIN
(
SELECT column_1, column_2, MAX(column_3) AS max_column_3
FROM yourTable
GROUP BY column_1, column_2
) t2
ON t1.column_1 = t2.column_1 AND
t1.column_2 = t2.column_2 AND
t1.column_3 = t2.max_column_3;
This second approach arbitrarily chooses the (column_1, column_2)
single record as the one which has the highest value for column_3
.
Upvotes: 2