Reputation: 73
I have 2 tables , the first one is called Athletes and has columns that include :
athlete_id,first_name,last_name,country_id
The second table is called medals and
has the columns:
year,sport,event,athlete_id,medal
This is what a part of Athletes looks like:
athlete_id first_name last_name country_id
1 BURKETOM01 Tom Burke USA
2 HOFMAFRI01 Fritz Hofmann DEU
3 LANEFRA01 Francis Lane USA
4 SZOKOALA01 Alajos Szokolyi HUN
5 BURKETOM01 Tom Burke USA
6 JAMISHER01 Herbert Jamison USA
This is what a part of Medals looks like:
year sport event athlete_id medal
1 1896 Track & Field 100m Men BURKETOM01 GOLD
2 1896 Track & Field 100m Men HOFMAFRI01 SILVER
3 1896 Track & Field 100m Men LANEFRA01 BRONZE
4 1896 Track & Field 100m Men SZOKOALA01 BRONZE
5 1896 Track & Field 400m Men BURKETOM01 GOLD
6 1896 Track & Field 400m Men JAMISHER01 SILVER
I need to find the athlete with the highest number of medals and the number of medals he has won. The correct answer is
full_name top_no_medals
1 Larisa Latynina 18
I saw a few posts similar to this and tried using what was suggested there This is my code:
dbGetQuery(olympics.db,statement =
'SELECT Athletes.first_name ||" "|| Athletes.last_name AS full_name,COUNT(Medals.athlete_id)
AS top_no_medals
FROM Athletes
JOIN Medals ON Medals.athlete_id=Athletes.athlete_id
GROUP BY full_name
ORDER BY COUNT(*) DESC
LIMIT 1'
)
what I am trying to do is combine the 2 tables according to the variable athlete_id and the count this variable. For some reason the answer I get is with too many medals.
full_name top_no_medals
1 Larisa Latynina 324
It seems there is a mistake couting the medal. Now I am sure the data is correct, when I check Medals table to see What is the most common athlete_id I get the correct one. This is the code and answer:
dbGetQuery(olympics.db,statement = ' Select athlete_id, count(*) AS top_no_medals
From Medals
Group By athlete_id
ORDER BY COUNT(*) DESC
LIMIT 1')
athlete_id top_no_medals
1 LATYNLAR01 18
This athlete_id is the one associated with Larisa Latynina so its not a problem in the medals table.
Upvotes: 0
Views: 805
Reputation: 676
Given that the second query (get medal count) works, I would suggest computing it first then joining with Athletes to get name matching that ID:
SELECT a.first_name || ' ' || a.last_name AS full_name, top_no_medals
FROM (SELECT athlete_id, count(*) AS top_no_medals
FROM Medals
GROUP BY athlete_id
ORDER BY COUNT(*) DESC
LIMIT 1) m
JOIN Athletes a ON m.athlete_id = a.athlete_id
Upvotes: 0
Reputation: 1270713
Based on the fact that your second query works, the problem would appear to be the column athletes(athlete_id)
. This seems to be duplicated.
Try running this query:
select athlete_id, count(*) as cnt
from athletes
group by athlete_id
order by count(*) desc;
The maximum cnt
should be 1. If not, you have a problem with your data.
Oh, never mind with that. Your Atheletes
table does have duplicates. Assuming the name is always the same, you can do:
SELECT a.full_name, COUNT(*) AS top_no_medals
FROM (SELECT a.athlete_id,
MAX(a.first_name || ' ' || a.last_name) as full_name
FROM Athletes a
GROUP BY athlete_id
) a
Medals m
ON m.athlete_id = a.athlete_id
GROUP BY full_name
ORDER BY COUNT(*) DESC
LIMIT 1;
You could also revise your query to use COUNT(DISTINCT)
, assuming that each medal has a unique identifier:
SELECT a.first_name || ' ' || a.last_name AS full_name,
COUNT(DISTINCT m.medal_id) as top_no_medals
FROM Athletes a JOIN
Medals m
ON m.athlete_id = a.athlete_id
GROUP BY full_name
ORDER BY COUNT(DISTINCT m.medal_id) DESC
LIMIT 1;
Upvotes: 2