Zachary Vance

Reputation: 862

How can I make this query run efficiently?

In BigQuery, we're trying to run:

SELECT day, AVG(value)/(1024*1024) FROM ( 
    SELECT value, UTC_USEC_TO_DAY(timestamp) as day, 
         PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [Datastore.PerformanceDatum]
    WHERE type = "MemoryPerf"
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc;

which returns a relatively small amount of data. But we're getting the message:

Error: Resources exceeded during query execution. The query contained a GROUP BY operator, consider using GROUP EACH BY instead. For more details, please see https://developers.google.com/bigquery/docs/query-reference#groupby

What is making this query fail? Is it the size of the subquery? Is there some equivalent query we can run that avoids the problem?


Edit in response to comments: If I add GROUP EACH BY (and drop the outer ORDER BY), the query fails, this time claiming that GROUP EACH BY is not parallelizable here.
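That is, this variant also fails:

SELECT day, AVG(value)/(1024*1024) FROM ( 
    SELECT value, UTC_USEC_TO_DAY(timestamp) as day, 
         PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [Datastore.PerformanceDatum]
    WHERE type = "MemoryPerf"
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP EACH BY day;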

Upvotes: 3

Views: 335

Answers (1)

Felipe Hoffa

Reputation: 59175

I wrote an equivalent query that works for me:

SELECT day, AVG(value)/(1024*1024) FROM (
    SELECT data value, UTC_USEC_TO_DAY(dtimestamp) as day, 
         PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [io_sensor_data.moscone_io13]
    WHERE sensortype = "humidity"
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc;

If I run only the inner query, I get 3,660,624 results. Is your dataset bigger than that?
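To check, you can count the rows that feed your inner query (a sketch, using the table and filter from your question):

SELECT COUNT(*)
FROM [Datastore.PerformanceDatum]
WHERE type = "MemoryPerf";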

The outer select gives me only 4 results when grouped by day. I'll try a different grouping to see if I can hit a limit there:

SELECT day, AVG(value)/(1024*1024) FROM (
    SELECT data value, dtimestamp / 1000 as day, 
         PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [io_sensor_data.moscone_io13]
    WHERE sensortype = "humidity"
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc;

This one runs too, now with 57,862 different groups.

I tried different combinations to reproduce the error, and I was able to trigger it by doubling the amount of initial data. An easy "hack" to double the data is changing:

    FROM [io_sensor_data.moscone_io13]

To:

    FROM [io_sensor_data.moscone_io13], [io_sensor_data.moscone_io13]

Then I get the same error. How much data do you have? Can you apply an additional filter? Since you are already partitioning the percentile_rank by day, can you add a filter to only analyze a fraction of the days (for example, only the last month, as sketched below)?
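For example, a sketch of that last suggestion against the table from your question, using PARSE_UTC_USEC with a hypothetical cutoff date (adjust it to your data):

SELECT day, AVG(value)/(1024*1024) FROM ( 
    SELECT value, UTC_USEC_TO_DAY(timestamp) as day, 
         PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [Datastore.PerformanceDatum]
    -- hypothetical cutoff: keep only roughly the last month of data
    WHERE type = "MemoryPerf"
      AND timestamp >= PARSE_UTC_USEC('2013-10-01 00:00:00')
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc;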

Upvotes: 1
