Son
Son

Reputation: 179

Array cannot have a null element; error in writing field result

I am trying to run the following query on Google Big Query.

It shows as valid in the console, when running it, I get this error message:

"Array cannot have a null element; error in writing field result"

How do resolve this?

standardSQL

SELECT
  ARRAY(
    SELECT regexp_extract(x, r'^(.*?)\:')
    FROM UNNEST(split(V2Themes,',')) AS x
  ) AS result
FROM `gdelt-bq.gdeltv2.gkg_partitioned`
WHERE _PARTITIONTIME BETWEEN TIMESTAMP('2019-02-02') AND TIMESTAMP('2019-02-02')

Upvotes: 3

Views: 7519

Answers (3)

Y.K.
Y.K.

Reputation: 692

maybe you want more than just first character before \:? here is some tests

select
    x,
    regexp_extract(x, r'^(.*?)\\:') as only_first_occurrence_nongreedy,
    regexp_extract(x, r'^(.*)\\:') as only_first_occurrence_greedy,
    regexp_extract_all(x, r'(.*?)\\:') as all_occurrences_nongreedy,
    regexp_extract_all(x, r'(.*?)\\:') as all_occurrences_greedy  -- same as regexp_extract, but returns array
from
    unnest(
        array[
            struct(r'abc\:' as x),
            struct(r'123\:abc\:'),
            struct(r'\:'),
            struct(r'\\')
        ]
    )

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below is for BigQuery Standard SQL

#standardSQL
SELECT * FROM (
  SELECT
    ARRAY(
      SELECT REGEXP_EXTRACT(x, r'^(.*?)\:')
      FROM UNNEST(SPLIT(V2Themes,',')) AS x
      WHERE REGEXP_CONTAINS(x, r'^(.*?)\:') -- << removes nulls from array
    ) AS result
  FROM `gdelt-bq.gdeltv2.gkg_partitioned`
  WHERE _PARTITIONTIME BETWEEN TIMESTAMP('2019-02-02') AND TIMESTAMP('2019-02-02')
)
WHERE ARRAY_LENGTH(result) > 0  -- << filters out output with empty array 

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269803

Remove the NULL value:

ARRAY(
    SELECT regexp_extract(x, r'^(.*?)\:')
    FROM UNNEST(split(V2Themes,',')) AS x
    WHERE regexp_extract(x, r'^(.*?)\:') IS NOT NULL AND 
  )

Upvotes: 2

Related Questions