Reputation: 1
I've got 3 tables I'm trying to join with inner joins, but for some reason they are resulting in a bunch of duplicate results. I've been ripping my hair out trying to figure out whats going wrong here. Below is the tables I am using:
Table:gameday.atbats
GameName Inning num b s o Batter Pitcher Result
-----------------------------------------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1 1 1 2 3 1 457803 150116 Jay Bruce strikes out swinging.
gid_2008_09_24_cinmlb_houmlb_1 1 2 1 0 2 433898 150116 Jeff Keppinger lines out to right fielder Hunter Pence.
gid_2008_09_24_cinmlb_houmlb_1 1 3 3 1 2 458015 150116 Joey Votto singles on a line drive to right fielder Hunter Pence.
gid_2008_09_24_cinmlb_houmlb_1 1 4 2 3 3 429665 150116 Edwin Encarnacion called out on strikes.
gid_2008_09_24_cinmlb_houmlb_1 1 5 1 2 0 430565 459371 Kazuo Matsui singles on a line drive to right fielder Jay Bruce.
-----------------------------------------------------------------------------------------
Table: Gameday.pitches
GameName GameAtBatID Result
------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1 1 Called Strike
gid_2008_09_24_cinmlb_houmlb_1 1 Ball
gid_2008_09_24_cinmlb_houmlb_1 1 Swinging Strike
gid_2008_09_24_cinmlb_houmlb_1 1 Ball
gid_2008_09_24_cinmlb_houmlb_1 1 Foul
gid_2008_09_24_cinmlb_houmlb_1 1 Foul
gid_2008_09_24_cinmlb_houmlb_1 1 Swinging Strike
gid_2008_09_24_cinmlb_houmlb_1 2 Ball
gid_2008_09_24_cinmlb_houmlb_1 2 In play, out(s)
gid_2008_09_24_cinmlb_houmlb_1 3 Called Strike
gid_2008_09_24_cinmlb_houmlb_1 3 Ball
--------------------------------------------------------
Table:batters
GameName id name_display_first_last
----------------------------------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1 407783 Geoff Geary
gid_2008_09_24_cinmlb_houmlb_1 209315 David Newhan
gid_2008_09_24_cinmlb_houmlb_1 115629 LaTroy Hawkins
gid_2008_09_24_cinmlb_houmlb_1 113889 Darin Erstad
gid_2008_09_24_cinmlb_houmlb_1 457803 Jay Bruce
gid_2008_09_24_cinmlb_houmlb_1 433898 Jeff Keppinger
gid_2008_09_24_cinmlb_houmlb_1 458015 Joey Votto
gid_2008_09_24_cinmlb_houmlb_1 429665 Edwin Encarnacion
---------------------------------------------------------------------------
The code I'm using is as follows:
SELECT gameday.atbats.event
, gameday.atbats.inning
, gameday.batters.name_display_first_last
, gameday.pitchers.name_display_first_last
, gameday.pitches.result
FROM gameday.atbats
inner join gameday.pitches on gameday.atbats.num = gameday.pitches.gameAtBatID
and gameday.pitches.gamename=gameday.atbats.gamename
inner join gameday.batters on gameday.atbats.batter = gameday.batters.ID
AND gameday.atbats.gamename = gameday.batters.gameName
where gameday.atbats.gamename = "gid_2008_09_24_cinmlb_houmlb_1"
The current code results in each player doubling their results, for example, if the first batter in the game were to strike out on 3 pitches, then current code will output that batter striking out twice in the first inning on 6 pitches. Im joining on the game ID and the IDs from each table, but there seems to still be some sort of an issue
Upvotes: 0
Views: 117
Reputation: 11
In your query At the last line you have written the where condition with game name, So is game name unique if it is not then you will get multiple answers so better way to use primary key in where condition of your gameday.atbats table.
Upvotes: 0
Reputation: 271
You need to add this after Where condition like
AND gameday.atbats.Id = Gameday.pitches.GameAtBatID
AND batters.ID = gameday.atbats.batters
Upvotes: 0
Reputation: 692
No offense, but I don't like the "use distinct" answer because it would just mask the problem. What I do in cases like this is insert a SELECT COUNT(*)
line right above FROM
, then incrementally look at results adding one line at a time; ie, SELECT COUNT(*) FROM A
, then A JOIN B
, then A JOIN B JOIN C
, etc. to detect the offending JOIN
.
Upvotes: 1