Johnny
Johnny

Reputation: 1

Duplicate data resulting from inner join

I've got 3 tables I'm trying to join with inner joins, but for some reason they are resulting in a bunch of duplicate results. I've been ripping my hair out trying to figure out whats going wrong here. Below is the tables I am using:

 Table:gameday.atbats

   GameName                     Inning num  b   s   o   Batter   Pitcher      Result
-----------------------------------------------------------------------------------------
    gid_2008_09_24_cinmlb_houmlb_1  1   1   2   3   1   457803  150116  Jay Bruce strikes out swinging.  
    gid_2008_09_24_cinmlb_houmlb_1  1   2   1   0   2   433898  150116  Jeff Keppinger lines out to right fielder Hunter Pence.  
    gid_2008_09_24_cinmlb_houmlb_1  1   3   3   1   2   458015  150116  Joey Votto singles on a line drive to right fielder Hunter Pence.  
    gid_2008_09_24_cinmlb_houmlb_1  1   4   2   3   3   429665  150116  Edwin Encarnacion called out on strikes.  
    gid_2008_09_24_cinmlb_houmlb_1  1   5   1   2   0   430565  459371  Kazuo Matsui singles on a line drive to right fielder Jay Bruce.  
-----------------------------------------------------------------------------------------

Table: Gameday.pitches
 GameName                   GameAtBatID      Result
------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1  1       Called Strike
gid_2008_09_24_cinmlb_houmlb_1  1       Ball
gid_2008_09_24_cinmlb_houmlb_1  1       Swinging Strike
gid_2008_09_24_cinmlb_houmlb_1  1       Ball
gid_2008_09_24_cinmlb_houmlb_1  1       Foul
gid_2008_09_24_cinmlb_houmlb_1  1       Foul
gid_2008_09_24_cinmlb_houmlb_1  1       Swinging Strike
gid_2008_09_24_cinmlb_houmlb_1  2       Ball
gid_2008_09_24_cinmlb_houmlb_1  2       In play, out(s)
gid_2008_09_24_cinmlb_houmlb_1  3       Called Strike
gid_2008_09_24_cinmlb_houmlb_1  3       Ball
--------------------------------------------------------

Table:batters
   GameName                     id         name_display_first_last
----------------------------------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1  407783      Geoff Geary
gid_2008_09_24_cinmlb_houmlb_1  209315      David Newhan
gid_2008_09_24_cinmlb_houmlb_1  115629      LaTroy Hawkins
gid_2008_09_24_cinmlb_houmlb_1  113889      Darin Erstad
gid_2008_09_24_cinmlb_houmlb_1  457803      Jay Bruce
gid_2008_09_24_cinmlb_houmlb_1  433898      Jeff Keppinger
gid_2008_09_24_cinmlb_houmlb_1  458015      Joey Votto
gid_2008_09_24_cinmlb_houmlb_1  429665      Edwin Encarnacion
---------------------------------------------------------------------------

The code I'm using is as follows:

    SELECT gameday.atbats.event
         , gameday.atbats.inning
         , gameday.batters.name_display_first_last
         , gameday.pitchers.name_display_first_last
         , gameday.pitches.result
      FROM gameday.atbats
inner join gameday.pitches on gameday.atbats.num = gameday.pitches.gameAtBatID
                          and gameday.pitches.gamename=gameday.atbats.gamename
inner join gameday.batters on gameday.atbats.batter = gameday.batters.ID
                          AND gameday.atbats.gamename = gameday.batters.gameName
     where gameday.atbats.gamename = "gid_2008_09_24_cinmlb_houmlb_1"

The current code results in each player doubling their results, for example, if the first batter in the game were to strike out on 3 pitches, then current code will output that batter striking out twice in the first inning on 6 pitches. Im joining on the game ID and the IDs from each table, but there seems to still be some sort of an issue

Upvotes: 0

Views: 117

Answers (3)

Bhavi
Bhavi

Reputation: 11

In your query At the last line you have written the where condition with game name, So is game name unique if it is not then you will get multiple answers so better way to use primary key in where condition of your gameday.atbats table.

Upvotes: 0

Codec
Codec

Reputation: 271

You need to add this after Where condition like

AND gameday.atbats.Id = Gameday.pitches.GameAtBatID      
AND batters.ID = gameday.atbats.batters

Upvotes: 0

bwperrin
bwperrin

Reputation: 692

No offense, but I don't like the "use distinct" answer because it would just mask the problem. What I do in cases like this is insert a SELECT COUNT(*) line right above FROM, then incrementally look at results adding one line at a time; ie, SELECT COUNT(*) FROM A, then A JOIN B, then A JOIN B JOIN C, etc. to detect the offending JOIN.

Upvotes: 1

Related Questions