Reputation: 861
I am using PostgreSQL 9.4.
I have a table of workouts
. Users can create multiple results
for each workout
, and a result
has a score
.
Given a list of workout_ids and two user_ids, I want to return the best score for each workout for each user. If the user does not have a result for that workout, I want to return a padded/null result.
SELECT "results".*, "workouts".*
FROM "results" LEFT JOIN "workouts" ON "workouts"."id" = "results"."workout_id"
WHERE (
(user_id, workout_id, score) IN
(SELECT user_id, workout_id, MAX(score)
FROM results WHERE user_id IN (1, 2) AND workout_id IN (1, 2, 3)
GROUP BY user_id, workout_id)
)
In this query, the left join is acting as an inner join; I'm not getting any padding if the user has not got a result for the workout. This query should always return six rows, regardless of how many results exist.
Example data:
results
user_id | workout_id | score
-----------------------------
1 | 1 | 10
1 | 3 | 10
1 | 3 | 15
2 | 1 | 5
Desired result:
results.user_id | results.workout_id | max(results.score) | workouts.name
-------------------------------------------------------------------------
1 | 1 | 10 | Squat
1 | 2 | null | Bench
1 | 3 | 15 | Deadlift
2 | 1 | 5 | Squat
2 | 2 | null | Bench
2 | 3 | null | Deadlift
Upvotes: 5
Views: 15950
Reputation: 656291
SELECT DISTINCT ON (1, 2)
u.user_id
, w.id AS workout_id
, r.score
, w.name AS workout_name
FROM workouts w
CROSS JOIN (VALUES (1), (2)) u(user_id)
LEFT JOIN results r ON r.workout_id = w.id
AND r.user_id = u.user_id
WHERE w.id IN (1, 2, 3)
ORDER BY 1, 2, r.score DESC NULLS LAST;
Form a complete Cartesian product of given workouts and users.
Assuming the given workouts always exist.
Assuming that not all given users have results for all given workouts.
LEFT JOIN
to results
. All conditions go into the ON
clause of the LEFT JOIN
, not into the WHERE
clause, which would exclude (workout_id, user_id)
combinations that have no result. See:
Finally pick the best result per (user_id, workout_id)
with DISTINCT ON
. While being at it, produce the desired sort order. See:
Depending on the size of tables and data distribution there may be faster solutions. See:
If all you want is the maximum score
for each (user_id, workout_id)
combination, there is simple version:
SELECT user_id, workout_id, max(r.score) AS score
FROM unnest('{1,2}'::int[]) u(user_id)
CROSS JOIN unnest('{1,2,3}'::int[]) w(workout_id)
LEFT JOIN results r USING (user_id, workout_id)
GROUP BY 1, 2
ORDER BY 1, 2;
db<>fiddle here
Old sqlfiddle.
Upvotes: 2
Reputation: 1269483
How about using distinct on
or row_number()
?
SELECT DISTINCT ON (r.user_id, r.workout_id) r.*, w.*
FROM "results" r LEFT JOIN
"workouts" w
ON "w."id" = r."workout_id"
WHERE r.user_id IN (1, 2) AND r.workout_id IN (1, 2, 3)
ORDER BY r.user_id, r.workout_id, score desc;
The row_number()
equivalent requires a subquery:
SELECT rw.*
FROM (SELECT r.*, w.*,
row_number() over (partition by user_id, workout_id order by score desc) as seqnum
FROM "results" r LEFT JOIN
"workouts" w
ON "w."id" = r."workout_id"
WHERE r.user_id IN (1, 2) AND r.workout_id IN (1, 2, 3)
) rw
WHERE seqnum = 1;
You should choose the columns more judiciously than using a *
. The subquery might return errors in the case of duplicate column names.
EDIT:
You need to generate the rows first, and then the results for each. Here is one method, building on the second query:
SELECT u.user_id, w.workout_id, rw.score, rw.name
FROM (SELECT 1 as user_id UNION ALL SELECT 2) u CROSS JOIN
(SELECT 1 as workout_id UNION ALL SELECT 2 UNION ALL SELECT 3) w LEFT JOIN
(SELECT r.*, w.*,
row_number() over (partition by user_id, workout_id order by score desc) as seqnum
FROM "results" r LEFT JOIN
"workouts" w
ON "w."id" = r."workout_id"
WHERE r.user_id IN (1, 2) AND r.workout_id IN (1, 2, 3)
) rw
ON rw.user_id = u.user_id and rw.workout_id = w.workout_id and
rw.seqnum = 1;
Upvotes: 1
Reputation: 6084
The where filters out your NULL values, so that is why the result is not what you expect.
Joinint the WHERE clause results instead of filter the where clause results.
SELECT "results".*, "workouts".*,"max_score".*
FROM "results"
LEFT JOIN "workouts" ON "workouts"."id" = "results"."workout_id"
LEFT JOIN (SELECT user_id, workout_id, MAX(score)
FROM results WHERE user_id IN (1, 2) AND workout_id IN (1, 2, 3)
GROUP BY user_id, workout_id) max_score ON workouts.workout_id=max_score.workout_id;
You need to alter the SELECT to get the correct columns.
Upvotes: 4