CrashBandicoot
CrashBandicoot

Reputation: 399

How to use an Inner Join to get desired result in SQL Server 2008

I have 2 tables called Players and Teams. There are about 100 rows of data.

Players table:

  --------------------------------------------------------------------------
  | Player_ID  Player_Name  Team_Id  Country_ID  Captain_ID  Matches_Played|
  -------------------------------------------------------------------------- 
  |    1         Ronaldo       1         1           1            250      |
  |    2         Messi         2         2           2            220      |
  |    3         Marcelo       1         1           1            185      |
  |    4         Suarez        2         2           2            193      |
  --------------------------------------------------------------------------

I want to find the player in each team who has played the most games, using an INNER JOIN.

Desired result:

  --------------------------------------------------------------------------
  | Player_ID  Player_Name  Team_Id  Country_ID  Captain_ID  Matches_Played|
  -------------------------------------------------------------------------- 
  |    1         Ronaldo       1         1           1            250      |
  |    2         Messi         2         2           2            220      |
  --------------------------------------------------------------------------  

The query I tried using:

SELECT 
    p.Player_Name, t.Team_Name, src.Matches_Played AS Matches_Played
FROM 
    Players p
INNER JOIN 
    Teams t ON p.Team_ID = t.Team_ID
INNER JOIN 
    (SELECT Team_ID, MAX(Matches_Played) AS Matches_Played
     FROM Players
     GROUP BY Team_ID) src ON t.Team_ID = src.Team_ID
                           AND p.Team_ID = src.Team_ID;

This query returns the whole table, with the same MAX value of Matches_Played next to each player.

How would I go about fixing my query to get the desired result?

Upvotes: 4

Views: 81

Answers (3)

Nick A
Nick A

Reputation: 126

I think this situation would be a good place to use ROW_NUMBER.

Base Data:

CREATE TABLE #Players (Player_ID INT
                      ,Player_Name VARCHAR(50)
                      ,Team_ID INT
                      ,Country_ID INT
                      ,Captain_ID INT
                      ,Matches_Played INT)

INSERT INTO #Players (Player_ID, Player_Name, Team_ID, Country_ID, Captain_ID, Matches_Played)
VALUES (1, 'Renaldo', 1, 1, 1, 250)
      ,(2, 'Messi', 2, 2, 2, 220)
      ,(3, 'Marcelo', 1, 1, 1, 185)
      ,(4, 'Suarez', 2, 2, 2, 193);

Then I used a basic Select statement and joined it to another select statement that uses ROW_NUMBER()

SELECT   p.Player_ID
        ,p.Player_Name
        ,p.Team_ID
        ,p.Country_ID
        ,p.Captain_ID
        ,p.Matches_Played
FROM #Players p
INNER JOIN (SELECT Player_ID
                  ,ROW_NUMBER() OVER (PARTITION BY Team_ID ORDER BY Matches_Played DESC) AS rnk
  FROM #Players) AS p1 ON p1.Player_ID = p.Player_ID AND rnk = 1

What this does is use ROW_NUMBER() to assign a position to each player within the team. If you had 5 players on the same team, it would number them 1 to 5 with 1 having the most games and 5 having the least. Then when you join it on rnk=1, you are only joining the player with the most games played for each team.

If that is confusing to you, to put the statement into the JOIN, you can also do this other ways.

With a CTE(Common Table Expression):

WITH CTE (Player_ID, Rnk) AS
    (SELECT Player_ID
           ,ROW_NUMBER() OVER (PARTITION BY Team_ID ORDER BY Matches_Played DESC)
     FROM #Players)
SELECT   p.Player_ID
        ,p.Player_Name
        ,p.Team_ID
        ,p.Country_ID
        ,p.Captain_ID
        ,p.Matches_Played
FROM #Players p
INNER JOIN CTE ON cte.Player_ID = p.Player_ID AND rnk = 1

With a Temp Table:

SELECT Player_ID
      ,ROW_NUMBER() OVER (PARTITION BY Team_ID ORDER BY Matches_Played DESC) AS rnk
INTO #RankTable
FROM #Players

SELECT   p.Player_ID
        ,p.Player_Name
        ,p.Team_ID
        ,p.Country_ID
        ,p.Captain_ID
        ,p.Matches_Played
FROM #Players p
INNER JOIN #RankTable r ON r.Player_ID = p.Player_ID AND rnk = 1

Upvotes: 0

Joel Coehoorn
Joel Coehoorn

Reputation: 415880

You don't really need to join at all to do this. As of Sql Server 2005, there is something called the APPLY operator that can work better for this query:

SELECT p.Player_Name, t.Team_Name, p.Matches_Played
FROM Teams t
CROSS APPLY (
     SELECT TOP 1 Player_Name, Matches_Played 
     FROM Players p 
     WHERE p.Team_ID = t.Team_ID 
     ORDER BY Matches_Played DESC
) p

But if this is an assignment where you are required to use a JOIN for some reason, you need to do it in two steps. First find the number of matches of the target player, and then get the full row for that record:

SELECT p.Player_Name, t.Team_Name, p.Matches_Played
FROM Teams t
INNER JOIN (
     SELECT Team_ID, MAX(Matches_Played) as Max_Played
     FROM Players
     GROUP BY Team_ID
) played ON played.Team_ID = t.Team_ID
INNER JOIN Players p ON p.Team_ID = played.Team_ID AND p.Matches_Played = played.Max_Played

Note this might show more than one row per team in the case of a tie, but the question doesn't really indicate what to do in that situation.

Also note that for both queries I start from the Teams table rather than the Players table. The query optimizer should be able to work it out either way, but I think for this query it makes for logical sense for a programmer to begin thinking in terms of finding the match for each Team record, especially when we see the APPLY option never uses the Players table at the root of the query at all.

Finally, I suspect there is yet a third solution that would use a windowing function (ordered row_number + parition by) that might be even better still.

Upvotes: 3

etsa
etsa

Reputation: 5060

If I understood your question, I think you can try:

SELECT p.Player_Name, t.Team_Name, src.Matches_Played AS Matches_Played
FROM Players p
INNER JOIN Teams t
ON p.Team_ID = t.Team_ID
INNER JOIN (
            SELECT Team_ID, MAX(Matches_Played) AS Matches_Played
            FROM Players
            GROUP BY Team_ID)src
ON p.Team_ID = src.Team_ID
AND p.Matches_Played = src.Matches_Played;

Upvotes: 3

Related Questions