ajthyng
ajthyng

Reputation: 1275

SQL Query to select negatives

The below query works exactly as it should and gets me the results I want.

Is there a better way to write this though? The question reads "Print all teamIDs where the team played against the Phillies but not against the Braves" and the two tables are set up as follows:

Games: gameID, homeTeamID, guestTeamID, date

Teams: teamID, teamName, home, leagueName

homeTeamID and guestTeamID are foreign keys for the Teams table, gameID and teamID are both primary keys. Currently using sql server dialect.

I really feel like I wrote this in a ridiculous way and there must be something shorter I can do that I don't know about.

WITH philliesgames AS 
( 
       SELECT g.* 
       FROM   games g 
       JOIN   teams t 
       ON     g.hometeamid = t.teamid 
       OR     g.guestteamid = t.teamid 
       WHERE  t.teamname = 'Phillies'), bravesgames AS 
( 
       SELECT g.* 
       FROM   games g 
       JOIN   teams t 
       ON     g.hometeamid = t.teamid 
       OR     g.guestteamid = t.teamid 
       WHERE  t.teamname = 'Braves') 
( 
       SELECT guestteamid 
       FROM   philliesgames 
       WHERE  guestteamid <> 1 
       UNION 
       SELECT hometeamid 
       FROM   philliesgames 
       WHERE  hometeamid <> 1) 
EXCEPT 
       ( 
              SELECT guestteamid 
              FROM   bravesgames 
              UNION 
              SELECT hometeamid 
              FROM   bravesgames)

Upvotes: 1

Views: 155

Answers (2)

shawnt00
shawnt00

Reputation: 17935

Here's an alternate approach:

with teamGames(teamId, againstId) as (
    select homeTeamId, guestTeamId from games
    union
    select guestTeamId, homeTeamId from games
)
select teamId
from teamGames
group by teamId
having
        count(case when againstId =
                (select teamId from Teams where teamName = 'Phillies')
            then 1 end) >= 1
    and count(case when againstId =
                (select teamId from Teams where teamName = 'Braves')
            then 1 end) = 0;

EDIT: For completeness I wanted to come back and add some notes.

First, I kind of suspected that SQL Server would complain about my subqueries inside the having clause and indeed it doesn't allow that. I've fixed that below.

with teamGames(teamId, againstId) as (
    select homeTeamId, guestTeamId from games
    union
    select guestTeamId, homeTeamId from games
)
select g.teamId
from
    teamGames as g
    inner join teams as t1 on t1.teamId = g.teamId
    inner join teams as t2 on t2.teamId = g.againstId
where t1.teamName not in ('Phillies', 'Braves')
group by g.teamId
having
        count(case when t2.teamName = 'Phillies' then 1 end) >= 1
    and count(case when t2.teamName = 'Braves'   then 1 end)  = 0;

My version of the query has the huge advantage of being able to trivially add any number of similar conditions where yours would become much more complicated. It's also easier to add ancillary data to the output. But there's also another subtle point worth noting: your union and except logic essentially treats each team symmetrically in the relationship.

In my second query I've added a check to prevent the Braves from coming back as one of the results. The Braves can very well play the Phillies while it's impossible they play themselves. Depending on your definition they could be a match too. It's only by implicit common sense that we assume the Braves are not a team you expect to see in the results.

Your query had to handle the case where the Phillies never played the Braves and would otherwise not have been excluded from the list. That appears to explain the <> 1 checks. On the other hand there's nothing else that explicitly eliminates the Braves so you have to dig a little deeper. Where you'll find it buried is in the except query which effectively treats the Braves as having played themselves. That's how the Braves are removed from consideration and it's easy to miss.

I hope you don't find this to be pedantic. On the surface as a homework question about baseball, it's a little academic to go down this road. I do think that in other settings in the workplace you could easily find this particular point to be highly relevant and worth thinking about. This is one way undiscovered bugs readily creep into deployed code.

Play around with it here: http://rextester.com/MIHL65545

Upvotes: 1

Gurwinder Singh
Gurwinder Singh

Reputation: 39527

Try this:

select
    IIF(t.teamid = g.hometeamid, g.guestteamid, g.hometeamid) id
from games g inner join teams t
on t.teamid in (g.hometeamid, g.guestteamid)
where t.teamname = 'Phillies'
except
select
    IIF(t.teamid = g.hometeamid, g.guestteamid, g.hometeamid)
from games g inner join teams t
on t.teamid in (g.hometeamid, g.guestteamid)
where t.teamname = 'Braves';

If your version of SQL Server doesn't support IIF, use CASE instead:

IIF(t.teamid = g.hometeamid, g.guestteamid, g.hometeamid)

to

CASE t.teamid WHEN g.hometeamid THEN g.guestteamid ELSE g.hometeamid END

Upvotes: 1

Related Questions