Reputation: 9
I would like to calculate the average and STDDEV for the normailzed time of a race. I first want to calculate the pace of each racer by doing finish_time / distance. Next, I want to do an average normalized to the current rows distance.
So for example to calculate the average for row 5 not including row 5: AVG(Pace * current_distance) = (Pace_1 * distance_5 + Pace_2 * distance_5 + Pace_3 * distance_5 + Pace_4 * distance_5 ) / 4
Pace_1 = finish_time_1 / distance_1
This is how I have tried to do it two different ways but it is not working:
AVG(pace * distance) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS avg_normalized_finishing_time1, It is multiplying the pace not by "row 5's" distance
(AVG(finishing_time / distance) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING))*distance AS avg_normalized_finishing_time2, It is taking an average first and then multiplying it by the distance. I think this may be correct. I don't think this will work for STDDEV though
Upvotes: 0
Views: 131
Reputation: 3528
The task is to calculate the expected finish time for a certain sport racing event from the previous pace times.
Let's start with some random data in CTE sample
. Then in each tbl
addition columns are calculated. Please use in the window function partition by player
to differ between the player. win1
is for all races by a player, win2
is only for the previous ones. Since the task is to forecast the finish time, we are summing over all finish times and substract the one from the current race. Then division by the racing counts minus one for the current one, gives the average. For STDDEV use the formula
With sample as (
Select "racer1" as player,offset as date, finish_time, (offset+1)*1000 distance
from
unnest([100,200,330,400,500]) finish_time with offset
),
tbl1 as (
Select *,
finish_time / distance as Pace
from sample
window win1 as (partition by player)
),
tbl2 as (
Select *,
avg(pace * distance) over win1 as avg_normalized_finishing,
avg(pace * distance) over win2 as avg_normalized_finishing_previous_ones,
sum(pace * distance) over win1 - pace * distance as sum_finishing_times,
(sum(pace * distance) over win1 - pace * distance) / (count(pace * distance) over win1 -1) as avg_finishing_times,
count(pace * distance) over win1 as races_done
,
from tbl1
window win1 as (partition by player),
win2 as (partition by player ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
),
tbl3 as (
Select * ,
SQRT((sum( pow(pace * distance- avg_finishing_times,2) ) over win1 - pow(pace * distance- avg_finishing_times,2))/ (races_done-1)) as STDDEV
from tbl2
window win1 as (partition by player),
win2 as (partition by player ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
)
Select * from tbl3
Upvotes: 0