Nils Ziehn
Nils Ziehn

Reputation: 4331

bigquery resource limited exeeded due to order by

Whem I am running the following query, I get a 'resource limited exceeded'-error. If I remove the last line (the order by clause) it works:

SELECT
  id,
  INTEGER(-position / (CASE WHEN fallback = 0 THEN 2 ELSE 1 END)) AS major_sort
FROM (
  SELECT
    id,
    fallback,
    ROW_NUMBER() OVER(PARTITION BY fallback) AS position
  FROM
    [table] AS r
  ORDER BY
    r.score DESC ) AS r
ORDER BY major_sort DESC

Actually the entire last line would be:

ORDER BY major_sort DESC, r.score DESC

But neither that would probably make things even worse.

Any idea how I could change the query to circumvent this problem?

((If you wonder what this query does: the table contains a 'ranking' with multiple fallback strategies and I want to create an ordering like this: 'AABAABAABAAB' with 'A' and 'B' being the fallback strategies. If you have a better idea how to achieve this; please feel free to tell me :D))

Upvotes: 1

Views: 1111

Answers (1)

Jeremy Condit
Jeremy Condit

Reputation: 7046

A top-level ORDER BY will always serialize execution of your query: it will force all computation onto a single node for the purpose of sorting. That's the cause of the resources exceeded error.

I'm not sure I fully understand your goal with the query, so it's hard to suggest alternatives, but you might consider putting an ORDER BY clause within the OVER(PARTITION BY ...) clause. Sorting a single partition can be done in parallel and may be closer to what you want.

More general advice on ordering:

  • Order is not preserved during BQ queries, so if there's an ordering that you want to preserve on the input rows, make sure it's encoded in your data as an extra field.

  • The use cases for large amounts of globally-sorted data are somewhat limited. Often when users run into resource limitations with ORDER BY, we find that they're actually looking for something slightly different (locally ordered data, or "top N"), and that it's possible to get rid of the global ORDER BY completely.

Upvotes: 2

Related Questions