Reputation: 3098
I'm writing some queries with Google BigQuery. I want to extract the JSON from an array. Somehow I can extract it. And @Mikhail Berlyant helped me here. But now the problem is in extracting JSON from the array without Duplicates.
WITH
cte AS (
SELECT
labels,
cost
FROM
BILLING.gcp_billing_export_v1)
SELECT
la,
cost
FROM
cte,
UNNEST(labels) AS la
See the cost box, the COST value is repeated twice, because we have 2 KEY, VALUE pairs in the array.
So while doing sum(cost)
with the group by la.key
then I'm getting the wrong value.
Can anyone help me with this?
Upvotes: 1
Views: 7454
Reputation: 172944
Below is for BigQuery Standard SQL
#standardSQL
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^\[{|}]$', ''), '}')]) kv
) labels,
cost
FROM `project.dataset.table`
You can test, play with above using excerpt of dummy data from your question as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'aaa' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
SELECT 'bbb' description, '[{"key":"application2","value":"scaled-server2"},{"key":"department2","value":"hrd2"}]' labels, 0.342825 cost
)
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^\[{|}]$', ''), '}')]) kv
) labels,
cost
FROM `project.dataset.table`
with result
Row description labels.key labels.value cost
1 aaa application scaled-server 0.323316
department hrd
2 bbb application2 scaled-server2 0.342825
department2 hrd2
Upvotes: 3