Reputation: 399
I have 2 tables called Players
and Teams
. There are about 100 rows of data.
Players
columns: Player_ID, Player_Name, Team_ID, Country_ID, Captain_ID, Matches_Played
Teams
columns: Team_ID, Team_Name, Manager_ID, Matches_Won, Matches_Lost, Country_ID
Players
table:
--------------------------------------------------------------------------
| Player_ID Player_Name Team_Id Country_ID Captain_ID Matches_Played|
--------------------------------------------------------------------------
| 1 Ronaldo 1 1 1 250 |
| 2 Messi 2 2 2 220 |
| 3 Marcelo 1 1 1 185 |
| 4 Suarez 2 2 2 193 |
--------------------------------------------------------------------------
I want to find the player in each team who has played the most games, using an INNER JOIN.
Desired result:
--------------------------------------------------------------------------
| Player_ID Player_Name Team_Id Country_ID Captain_ID Matches_Played|
--------------------------------------------------------------------------
| 1 Ronaldo 1 1 1 250 |
| 2 Messi 2 2 2 220 |
--------------------------------------------------------------------------
The query I tried using:
SELECT
p.Player_Name, t.Team_Name, src.Matches_Played AS Matches_Played
FROM
Players p
INNER JOIN
Teams t ON p.Team_ID = t.Team_ID
INNER JOIN
(SELECT Team_ID, MAX(Matches_Played) AS Matches_Played
FROM Players
GROUP BY Team_ID) src ON t.Team_ID = src.Team_ID
AND p.Team_ID = src.Team_ID;
This query returns the whole table, with the same MAX
value of Matches_Played
next to each player.
How would I go about fixing my query to get the desired result?
Upvotes: 4
Views: 81
Reputation: 126
I think this situation would be a good place to use ROW_NUMBER.
Base Data:
CREATE TABLE #Players (Player_ID INT
,Player_Name VARCHAR(50)
,Team_ID INT
,Country_ID INT
,Captain_ID INT
,Matches_Played INT)
INSERT INTO #Players (Player_ID, Player_Name, Team_ID, Country_ID, Captain_ID, Matches_Played)
VALUES (1, 'Renaldo', 1, 1, 1, 250)
,(2, 'Messi', 2, 2, 2, 220)
,(3, 'Marcelo', 1, 1, 1, 185)
,(4, 'Suarez', 2, 2, 2, 193);
Then I used a basic Select statement and joined it to another select statement that uses ROW_NUMBER()
SELECT p.Player_ID
,p.Player_Name
,p.Team_ID
,p.Country_ID
,p.Captain_ID
,p.Matches_Played
FROM #Players p
INNER JOIN (SELECT Player_ID
,ROW_NUMBER() OVER (PARTITION BY Team_ID ORDER BY Matches_Played DESC) AS rnk
FROM #Players) AS p1 ON p1.Player_ID = p.Player_ID AND rnk = 1
What this does is use ROW_NUMBER() to assign a position to each player within the team. If you had 5 players on the same team, it would number them 1 to 5 with 1 having the most games and 5 having the least. Then when you join it on rnk=1, you are only joining the player with the most games played for each team.
If that is confusing to you, to put the statement into the JOIN, you can also do this other ways.
With a CTE(Common Table Expression):
WITH CTE (Player_ID, Rnk) AS
(SELECT Player_ID
,ROW_NUMBER() OVER (PARTITION BY Team_ID ORDER BY Matches_Played DESC)
FROM #Players)
SELECT p.Player_ID
,p.Player_Name
,p.Team_ID
,p.Country_ID
,p.Captain_ID
,p.Matches_Played
FROM #Players p
INNER JOIN CTE ON cte.Player_ID = p.Player_ID AND rnk = 1
With a Temp Table:
SELECT Player_ID
,ROW_NUMBER() OVER (PARTITION BY Team_ID ORDER BY Matches_Played DESC) AS rnk
INTO #RankTable
FROM #Players
SELECT p.Player_ID
,p.Player_Name
,p.Team_ID
,p.Country_ID
,p.Captain_ID
,p.Matches_Played
FROM #Players p
INNER JOIN #RankTable r ON r.Player_ID = p.Player_ID AND rnk = 1
Upvotes: 0
Reputation: 415880
You don't really need to join at all to do this. As of Sql Server 2005, there is something called the APPLY
operator that can work better for this query:
SELECT p.Player_Name, t.Team_Name, p.Matches_Played
FROM Teams t
CROSS APPLY (
SELECT TOP 1 Player_Name, Matches_Played
FROM Players p
WHERE p.Team_ID = t.Team_ID
ORDER BY Matches_Played DESC
) p
But if this is an assignment where you are required to use a JOIN for some reason, you need to do it in two steps. First find the number of matches of the target player, and then get the full row for that record:
SELECT p.Player_Name, t.Team_Name, p.Matches_Played
FROM Teams t
INNER JOIN (
SELECT Team_ID, MAX(Matches_Played) as Max_Played
FROM Players
GROUP BY Team_ID
) played ON played.Team_ID = t.Team_ID
INNER JOIN Players p ON p.Team_ID = played.Team_ID AND p.Matches_Played = played.Max_Played
Note this might show more than one row per team in the case of a tie, but the question doesn't really indicate what to do in that situation.
Also note that for both queries I start from the Teams table rather than the Players table. The query optimizer should be able to work it out either way, but I think for this query it makes for logical sense for a programmer to begin thinking in terms of finding the match for each Team record, especially when we see the APPLY
option never uses the Players table at the root of the query at all.
Finally, I suspect there is yet a third solution that would use a windowing function (ordered row_number + parition by) that might be even better still.
Upvotes: 3
Reputation: 5060
If I understood your question, I think you can try:
SELECT p.Player_Name, t.Team_Name, src.Matches_Played AS Matches_Played
FROM Players p
INNER JOIN Teams t
ON p.Team_ID = t.Team_ID
INNER JOIN (
SELECT Team_ID, MAX(Matches_Played) AS Matches_Played
FROM Players
GROUP BY Team_ID)src
ON p.Team_ID = src.Team_ID
AND p.Matches_Played = src.Matches_Played;
Upvotes: 3