Reputation: 929
looking for some help on how to create a game_id variable in SQL. I'm working on an nba project and I can manipulate the data into team, date, opponent format where every team has a row for each game they play. A game_id variable would make my life easier for other work in the project but I don't know how to create it.
The variable itself can start from 1 or 100000, doesn't matter. I just need it to uniquely identify every game that is being played.
Below is an example table + data you can create to see my dilemma. Ideally the LAL and GSW rows would both have the same game_id, and the BKN and MIL rows would have the same game_id.
CREATE TABLE basketball_data (
team text,
dategame date,
opponent text
);
INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('GSW', '2021-10-19', 'LAL');
INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('LAL', '2021-10-19', 'GSW');
INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('BKN', '2021-10-19', 'MIL');
INSERT INTO basketball_data (team, dategame, opponent)
VALUES ('MIL', '2021-10-19', 'BKN');
Anyone have an idea of what would be a way of creating a variable like this? If it makes a difference, I'm working in PostgreSQL. Thanks!
Upvotes: 2
Views: 96
Reputation: 10035
You may try the following using DENSE_RANK as a window function:
Retrieving a game id during queries
SELECT
DENSE_RANK() OVER (
ORDER BY
dategame,(
CASE
WHEN team < opponent THEN CONCAT(team,opponent)
ELSE CONCAT(opponent,team)
END
)
) as game_id,
team,
dategame,
opponent
FROM
basketball_data;
game_id | team | dategame | opponent |
---|---|---|---|
1 | BKN | 2021-10-19T00:00:00.000Z | MIL |
1 | MIL | 2021-10-19T00:00:00.000Z | BKN |
2 | GSW | 2021-10-19T00:00:00.000Z | LAL |
2 | LAL | 2021-10-19T00:00:00.000Z | GSW |
Creating a new table with the same data and game id
CREATE TABLE basketball_data_with_game_id AS
SELECT
DENSE_RANK() OVER (
ORDER BY
dategame,(
CASE
WHEN team < opponent THEN CONCAT(team,opponent)
ELSE CONCAT(opponent,team)
END
)
) as game_id,
team,
dategame,
opponent
FROM
basketball_data;
There are no results to be displayed.
SELECT * FROM basketball_data_with_game_id;
game_id | team | dategame | opponent |
---|---|---|---|
1 | BKN | 2021-10-19T00:00:00.000Z | MIL |
1 | MIL | 2021-10-19T00:00:00.000Z | BKN |
2 | GSW | 2021-10-19T00:00:00.000Z | LAL |
2 | LAL | 2021-10-19T00:00:00.000Z | GSW |
Updating the existing table to have the game id
ALTER TABLE basketball_data
ADD game_id INT DEFAULT 0;
There are no results to be displayed.
UPDATE basketball_data
SET game_id = n.game_id
FROM (
SELECT
DENSE_RANK() OVER (
ORDER BY
dategame,(
CASE
WHEN team < opponent THEN CONCAT(team,opponent)
ELSE CONCAT(opponent,team)
END
)
) as game_id,
team,
dategame,
opponent
FROM
basketball_data
) n
WHERE basketball_data.game_id=0 AND
basketball_data.team=n.team AND
basketball_data.dategame=n.dategame AND
basketball_data.opponent=n.opponent;
There are no results to be displayed.
SELECT * FROM basketball_data;
team | dategame | opponent | game_id |
---|---|---|---|
BKN | 2021-10-19T00:00:00.000Z | MIL | 1 |
MIL | 2021-10-19T00:00:00.000Z | BKN | 1 |
GSW | 2021-10-19T00:00:00.000Z | LAL | 2 |
LAL | 2021-10-19T00:00:00.000Z | GSW | 2 |
Let me know if this works for you.
Upvotes: 2