Alex
Alex

Reputation: 1537

ROW_NUMBER() fails when table is too big

I am working with Bigquery and I need to use ROW_NUMBER() in order to get only the first row which matches some condition.

Example:

select *except(rn)
from (
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY id order by timedate desc) AS rn
FROM
 table
)
where rn = 1

However, the query will fail because table is too big. How can I apply such logic without running out of resources?

Upvotes: 2

Views: 678

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below is for BigQuery Standard SQL

#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id

You can test, play above with dummy data as below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, 2 timedate, 3 z UNION ALL
  SELECT 1,4,5 UNION ALL
  SELECT 1,6,7 UNION ALL
  SELECT 2,8,9 UNION ALL
  SELECT 2, 10, 11
)
SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id

result is

Row id  timedate    z    
1   1   6           7    
2   2   10          11   

Upvotes: 5

Related Questions