jyablonski
jyablonski

Reputation: 929

How to create a game_id variable for team / opponent data in SQL

looking for some help on how to create a game_id variable in SQL. I'm working on an nba project and I can manipulate the data into team, date, opponent format where every team has a row for each game they play. A game_id variable would make my life easier for other work in the project but I don't know how to create it.

The variable itself can start from 1 or 100000, doesn't matter. I just need it to uniquely identify every game that is being played.

Below is an example table + data you can create to see my dilemma. Ideally the LAL and GSW rows would both have the same game_id, and the BKN and MIL rows would have the same game_id.

CREATE TABLE basketball_data (
    team text,
    dategame date,
    opponent text
);

INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('GSW', '2021-10-19', 'LAL');

INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('LAL', '2021-10-19', 'GSW');

INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('BKN', '2021-10-19', 'MIL');

INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('MIL', '2021-10-19', 'BKN');

Anyone have an idea of what would be a way of creating a variable like this? If it makes a difference, I'm working in PostgreSQL. Thanks!

Upvotes: 2

Views: 96

Answers (1)

ggordon
ggordon

Reputation: 10035

You may try the following using DENSE_RANK as a window function:

Retrieving a game id during queries

SELECT
     DENSE_RANK() OVER (
         ORDER BY 
              dategame,(
                CASE
                    WHEN team < opponent THEN CONCAT(team,opponent)
                    ELSE CONCAT(opponent,team)
                END
              )
     ) as game_id,     
     team,
     dategame,
     opponent
FROM 
     basketball_data;
game_id team dategame opponent
1 BKN 2021-10-19T00:00:00.000Z MIL
1 MIL 2021-10-19T00:00:00.000Z BKN
2 GSW 2021-10-19T00:00:00.000Z LAL
2 LAL 2021-10-19T00:00:00.000Z GSW

Creating a new table with the same data and game id

CREATE TABLE basketball_data_with_game_id AS 
SELECT
     DENSE_RANK() OVER (
         ORDER BY 
              dategame,(
                CASE
                    WHEN team < opponent THEN CONCAT(team,opponent)
                    ELSE CONCAT(opponent,team)
                END
              )
     ) as game_id,     
     team,
     dategame,
     opponent
FROM 
     basketball_data;

There are no results to be displayed.


SELECT * FROM basketball_data_with_game_id;
game_id team dategame opponent
1 BKN 2021-10-19T00:00:00.000Z MIL
1 MIL 2021-10-19T00:00:00.000Z BKN
2 GSW 2021-10-19T00:00:00.000Z LAL
2 LAL 2021-10-19T00:00:00.000Z GSW

Updating the existing table to have the game id

ALTER TABLE basketball_data
ADD game_id INT DEFAULT 0;

There are no results to be displayed.


UPDATE basketball_data
SET game_id = n.game_id
FROM (
SELECT
     DENSE_RANK() OVER (
         ORDER BY 
              dategame,(
                CASE
                    WHEN team < opponent THEN CONCAT(team,opponent)
                    ELSE CONCAT(opponent,team)
                END
              )
     ) as game_id,     
     team,
     dategame,
     opponent
FROM 
     basketball_data
) n
WHERE basketball_data.game_id=0 AND
      basketball_data.team=n.team AND
      basketball_data.dategame=n.dategame AND
      basketball_data.opponent=n.opponent;

There are no results to be displayed.


SELECT * FROM basketball_data;
team dategame opponent game_id
BKN 2021-10-19T00:00:00.000Z MIL 1
MIL 2021-10-19T00:00:00.000Z BKN 1
GSW 2021-10-19T00:00:00.000Z LAL 2
LAL 2021-10-19T00:00:00.000Z GSW 2

View on DB Fiddle

Let me know if this works for you.

Upvotes: 2

Related Questions