Andy Carlson
Andy Carlson

Reputation: 3909

BigQuery arrays - SELECT DISTINCT ordering guarantees?

I want to filter out the duplicates from a BigQuery array. I also need the order of the elements to be preserved. The docs mention that this can be done by combining SELECT DISTINCT with UNNEST. However, it doesn't mention any ordering behavior. I ran this query and got the desired ordering of [5, 3, 1, 4, 10, 8].

WITH an_array AS (
  SELECT [5, 5, 3, 1, 4, 4, 10, 8, 5, 1] AS nums
)

SELECT
  ARRAY((
    SELECT DISTINCT num
    FROM UNNEST(nums) num
  ))
FROM an_array;

I don't know if that's coincidence or if that ordering is guaranteed. I also tried adding WITH OFFSET with an ORDER BY to specify the order explicitly, but in that case I get Query error: ORDER BY clause expression references table alias offset which is not visible after SELECT DISTINCT.

Upvotes: 0

Views: 1388

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270401

You should always be explicit about ordering if you care about it:WITH an_array AS (

WITH an_array as (
  SELECT [5, 5, 3, 1, 4, 4, 10, 8, 5, 1] AS nums
)
SELECT ARRAY((SELECT num
              FROM UNNEST(nums) num WITH OFFSET o
              GROUP BY num
              ORDER BY MIN(o)
             )
            )
FROM an_array;

Upvotes: 1

Related Questions