Mari
Mari

Reputation: 195

How to sort an array in BigQuery standard SQL?

I am wondering if it is possible to order (apply order by) for individual array values in Google BigQuery?

I am able to achieve this by applying order by on the whole transactonal base table first, then aggregating array; but when table is too large, resource errors appear for ordering by a large table.

So I am wondering if each individual array value can be ordered by using SQL or UDF.

This was asked once Order of data in bigquery repeated records but it was 4.5 years ago.

Upvotes: 14

Views: 28851

Answers (3)

Yassir Ennouhi
Yassir Ennouhi

Reputation: 1

For a good optimisation in your query you can use this structure

WITH deduped_data AS (
SELECT DISTINCT group_col, col
FROM table
)
SELECT 
    group_col,
    ARRAY_AGG(col ORDER BY col)
FROM deduped_data
GROUP BY group_col;

Upvotes: 0

Elliott Brossard
Elliott Brossard

Reputation: 33745

Sure, you can use the ARRAY function. It supports an optional ORDER BY clause. You haven't provided sample data, but supposing that you have a top level array column named arr, you can do something like this:

SELECT
  col1,
  col2,
  ARRAY(SELECT x FROM UNNEST(arr) AS x ORDER BY x) AS arr
FROM MyTable;

This sorts the elements of arr by their values.

If you actually have an array of a struct type, such as ARRAY<STRUCT<a INT64, b STRING>>, you can sort by one of the struct fields:

SELECT
  col1,
  col2,
  ARRAY(SELECT x FROM UNNEST(arr) AS x ORDER BY a) AS arr
FROM MyTable;

Upvotes: 31

paradocslover
paradocslover

Reputation: 3294

If the array is obtained after aggregation using a group by clause, the query can look something like this:

SELECT 
ARRAY_AGG(distinct col order by col)
FROM table
GROUP BY group_col

So, no SELECT is required.

Ref: The accepted answer didn't help. Took help from here - https://count.co/sql-resources/bigquery-standard-sql/array_agg

Upvotes: 4

Related Questions