Alex Olteanu
Alex Olteanu

Reputation: 4248

How to find the max of a value in a column that corresponds to another value in another column? (example in description)

This is a sample of data from a sqlite3 database:

playerID    nameFirst   nameLast    HR (homeruns)

bondsba01   Barry       Bonds       73
mcgwima01   Mark        McGwire     70
sosasa01    Sammy       Sosa        66
mcgwima01   Mark        McGwire     65
sosasa01    Sammy       Sosa        64
sosasa01    Sammy       Sosa        63
marisro01   Roger       Maris       61
ruthba01    Babe        Ruth        60
ruthba01    Babe        Ruth        59
foxxji01    Jimmie      Foxx        58
greenha01   Hank        Greenberg   58
...truncated...

I've used this command to select this data:

SELECT Master.playerID, Master.nameFirst, Master.nameLast, Batting.HR
FROM Master
INNER JOIN Batting on Master.playerID = Batting.playerID
ORDER BY Batting.HR DESC
LIMIT 100;

Baiscally I want to answer the following question: what's the top 20 baseball players by number of homeruns? But, as you can see, some players have multiple entries, because the tables have data for multiple years that player was active.

I already answered the question using pandas and some other Python commands. But I want to be able to answer the question with a single SQL query. I've tried without success a couple of things with DISTINCT, GROUP BY etc.

To be more clear, I want the maximum number of home runs each unique player had. Then I want to sort all those unique names by the number of home runs.

I want to transfrom the above table into this:

playerID    nameFirst   nameLast    HR (homeruns)

bondsba01   Barry       Bonds       73
mcgwima01   Mark        McGwire     70
sosasa01    Sammy       Sosa        66
marisro01   Roger       Maris       61
ruthba01    Babe        Ruth        60
foxxji01    Jimmie      Foxx        58
greenha01   Hank        Greenberg   58

I guess this shouldn't be very hard, but I don't have much experience with SQL.

Upvotes: 0

Views: 41

Answers (1)

mohan111
mohan111

Reputation: 8865

using MAX ()

SELECT M.playerID, M.nameFirst, M.nameLast, MAX(B.HR) Runs
FROM Master M
INNER JOIN Batting B
    on M.playerID = B.playerID
GROUP BY M.playerID, M.nameFirst, M.nameLast

Upvotes: 4

Related Questions