Reputation: 31
For this table, i want to make addition of the points for each team in each stage of each season with bigquery
saison |stage |team_home |team_away|home_goal|away_goal| home_point| away_point|
---------------------------------------------------------------------------------
2002 | 1 |France |Bresil |2 |1 | 3 |0 |
2002 | 1 |Italie |Bresil |4 |3 | 3 |0 |
2002 | 1 |France |Italie |1 |1 | 1 |1 |
2002 | 2 |Italie |Bresil |3 |4 | 0 |3 |
2002 | 2 |France |Italie |1 |1 | 1 |1 |
2002 | 2 |Italie |Bresil |3 |4 | 0 |3 |
2002 | 3 |France |Italie |1 |1 | 1 |1 |
2003 | 1 |Italie |Bresil |3 |4 | 0 |3 |
2003 | 1 |France |Italie |1 |2 | 0 |3 |
2003 | 1 |Bresil |France |0 |1 | 0 |3 |
2003 | 2 |France |Italie |1 |2 | 0 |3 |
2003 | 2 |Bresil |France |0 |1 | 0 |3 |
2003 | 2 |France |Italie |1 |2 | 0 |3 |
2003 | 3 |Italie |France |0 |1 | 0 |3 |
I want this result :
saison |stage |team |team_point|
--------------------------------------
2002 | 1 |France |4 |
2002 | 1 |Italie |4 |
2002 | 1 |Bresil |0 |
2002 | 2 |France |5 |
2002 | 2 |Italie |5 |
2002 | 2 |Bresil |3 |
2002 | 3 |France |6 |
2002 | 3 |Italie |6 |
2002 | 3 |Bresil |3 |
2003 | 1 |France |3 |
2003 | 1 |Italie |3 |
2003 | 1 |Bresil |3 |
2003 | 2 |France |6 |
2003 | 2 |Italie |9 |
2003 | 2 |Bresil |3 |
2003 | 3 |France |9 |
2003 | 3 |Italie |9 |
2003 | 3 |Bresil |3 |
I think to make unpivot and use an aggregation but i can't do it
Upvotes: 0
Views: 263
Reputation: 173046
Consider below approach
with temp as (
select *, 0 goal, 0 point from
(select distinct saison from your_table),
(select distinct stage from your_table),
(select distinct team from your_table, unnest([team_home, team_away]) team)
)
select distinct saison, stage, team,
sum(goal) over prev_stages team_goals,
sum(point) over prev_stages team_points
from (
select * except(col) from your_table
unpivot ((team, goal, point) for col in
((team_home, home_goal, home_point), (team_away, away_goal, away_point))
)
union all select * from temp
)
window prev_stages as (
partition by saison, team order by stage
range between unbounded preceding and current row
)
if applied to sample data in your question - output is
Upvotes: 2
Reputation: 12254
Consider below query:
SELECT saison, stage, p.team, SUM(SUM(p.point)) OVER (PARTITION by saison, team ORDER BY stage) team_point,
FROM sample, UNNEST([STRUCT(team_home AS team, home_point AS point), (team_away, away_point)]) p
GROUP BY 1, 2, 3
ORDER BY 1, 2, 4 DESC;
output is slightly different from your expected one :
Thanks to @Saransh's kind explanation, output get more close to your expected output except one thing.
(update)
WITH team_points AS (
SELECT saison, stage, p.team, SUM(SUM(p.point)) OVER (PARTITION by saison, team ORDER BY stage) team_point,
FROM sample, UNNEST([STRUCT(team_home AS team, home_point AS point), (team_away, away_point)]) p
GROUP BY 1, 2, 3
)
SELECT saison, stage, team, IFNULL(team_point, LAST_VALUE(team_point IGNORE NULLS) OVER w) AS team_point
FROM UNNEST([2002, 2003]) saison, UNNEST([1, 2, 3]) stage, UNNEST(['France', 'Bresil', 'Italie']) team
LEFT JOIN team_points p ON p.saison = saison AND p.stage = stage AND p.team = team
WINDOW w AS (PARTITION BY saison, team ORDER BY stage)
ORDER BY 1, 2, 3;
Upvotes: 2
Reputation: 147
OP the requirements of your post are not very clear, please update it to reflect the actual requirements. What I could deduce from the above output is that you require the cumulative sum for each team for each saison to be carry forwarded to the stage 2 and 3 respectively.
WITH
unioned_expr AS (
SELECT
saison,
stage,
team_home AS team,
home_goal AS goals,
home_Point AS points
FROM
`project.dataset.table`
UNION ALL
SELECT
saison,
stage,
team_away AS team,
away_goal AS goals,
away_point AS points
FROM
`project.dataset.table` ),
summed_expr AS (
SELECT
saison,
stage,
team,
SUM(points) AS points
FROM
unioned_expr
GROUP BY
saison,
stage,
team
ORDER BY
saison,
team,
stage )
SELECT
saison,
stage,
team,
SUM(points) OVER(PARTITION BY team, saison ORDER BY stage ROWS UNBOUNDED PRECEDING ) as team_point
FROM
summed_expr
ORDER BY
saison,
team,
stage
Output:
saison,stage,team,team_points
2002,1,Bresil,0
2002,2,Bresil,6
2002,1,France,4
2002,2,France,5
2002,3,France,6
2002,1,Italie,4
2002,2,Italie,5
2002,3,Italie,6
2003,1,Bresil,3
2003,2,Bresil,3
2003,1,France,3
2003,2,France,6
2003,3,France,9
2003,1,Italie,3
2003,2,Italie,9
2003,3,Italie,9
Upvotes: 0
Reputation: 11
If all teams have played both home and away in all stages and saisons, then you can just create a column (team_point) that sums the addition of home_point and away_point grouped by saison, stage and team_home (using team away wouldn't make a difference):
SELECT
saison,
stage,
team_home AS team,
SUM(home_point + away_point) AS team_point
FROM
your_dataset.your_table
GROUP BY
saison,
stage,
team
ORDER BY
saison,
stage,
team_point DESC --this is how teams with most points will show up first for each saison and stage.
If there's a chance that any team have only played home (or away) in any stage or saison, then you could list all diferent teams first.
WITH
AllTeams AS (
SELECT DISTINCT
team_away AS team
FROM
your_dataset.your_table
UNION DISTINCT
SELECT DISTINCT
team_home AS team
FROM
your_dataset.your_table)
SELECT
saison,
stage,
team,
SUM(home_point + away_point) AS team_point
FROM
your_dataset.your_table
LEFT JOIN
AllTeams ON team = team_home OR team = team_away
GROUP BY
saison,
stage,
team
ORDER BY
saison,
stage,
team_point DESC
Upvotes: 0