Timogavk
Timogavk

Reputation: 869

How to get average value of last 3 records for each records groups using BigQuery?

I have a BigQuery table with ids and values. I want to get average value of last 3 record for each id.

WITH t0 as (SELECT 1 as bigid, 1 as id, 10 as value
UNION ALL 
SELECT 2, 1, 14 
UNION ALL 
SELECT 3, 1, 18 
UNION ALL 
SELECT 4, 1, 11 
UNION ALL 
SELECT 5, 1, 13 
UNION ALL 
SELECT 6, 1, 14 
UNION ALL 
SELECT 7, 2, 19 
UNION ALL 
SELECT 8, 2, 19
UNION ALL 
SELECT 9, 2, 11 
UNION ALL 
SELECT 10, 2, 13 
UNION ALL 
SELECT 11, 2, 14 
UNION ALL 
SELECT 12, 2, 19 
UNION ALL 
SELECT 13, 3, 19)
SELECT id, AVG(value) OVER (PARTITION BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) FROM t0

This query returns results for each record inside window. How to get average value of last 3 records for each id?

Expected output

enter image description here

Upvotes: 1

Views: 578

Answers (3)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520918

Your current query is correct except that it is missing an ORDER BY clause in the call to AVG:

WITH cte AS (
    SELECT *, AVG(value) OVER (PARTITION BY id ORDER BY bigid
                  ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) avg_last_3
    FROM t0
)

SELECT DISTINCT id, avg_last_3
FROM cte
ORDER BY id;

Upvotes: 1

C.Georgiadis
C.Georgiadis

Reputation: 219

I think the solution you're looking for is this

WITH t0 as (SELECT 1 as bigid, 1 as id, 10 as value
UNION ALL 
SELECT 2, 1, 14 
UNION ALL 
SELECT 3, 1, 18 
UNION ALL 
SELECT 4, 1, 11 
UNION ALL 
SELECT 5, 1, 13 
UNION ALL 
SELECT 6, 1, 14 
UNION ALL 
SELECT 7, 2, 19 
UNION ALL 
SELECT 8, 2, 19
UNION ALL 
SELECT 9, 2, 11 
UNION ALL 
SELECT 10, 2, 13 
UNION ALL 
SELECT 11, 2, 14 
UNION ALL 
SELECT 12, 2, 19 
UNION ALL 
SELECT 13, 3, 19),

avgs as (
SELECT 
  id, 
  bigid,
  value,
  AVG(value) OVER (last_3_window ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) avg
FROM t0
WINDOW last_3_window AS (partition by id ORDER BY bigid)),

max_bigid as (
  select id, max(bigid) as max_bigid 
  from t0 group by 1)

select a.id, a.avg 
  from max_bigid m 
  left join avgs a on m.id = a.id and a.bigid = m.max_bigid

The step that you're not having is the join with the latest bigid for each of the ids

Upvotes: 0

Sergey Geron
Sergey Geron

Reputation: 10152

Maybe you wanted just 1 row per every id:

WITH t0 as (
  SELECT 1 as bigid, 1 as id, 10 as value UNION ALL 
  SELECT 2, 1, 14 UNION ALL 
  SELECT 3, 1, 18 UNION ALL 
  SELECT 4, 1, 11 UNION ALL 
  SELECT 5, 1, 13 UNION ALL 
  SELECT 6, 1, 14 UNION ALL 
  SELECT 7, 2, 19 UNION ALL 
  SELECT 8, 2, 19 UNION ALL 
  SELECT 9, 2, 11 UNION ALL 
  SELECT 10, 2, 13 UNION ALL 
  SELECT 11, 2, 14 UNION ALL 
  SELECT 12, 2, 19 UNION ALL 
  SELECT 13, 3, 19
),
last3 AS (
  SELECT 
    id, 
    ARRAY_AGG(value ORDER BY bigid DESC LIMIT 3) as values
  FROM t0
  GROUP BY id
)
SELECT 
  id,
  (SELECT AVG(x) FROM UNNEST(values) x) AS avg
FROM last3

enter image description here

Upvotes: 1

Related Questions