tototl
tototl

Reputation: 35

BigQuery How to get the sum of values in a JSON struct?

I have the below query

SELECT 
    JSON_EXTRACT(json, '$.Weights') as weight 
from 
(select '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' as json)

which returns :

{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}

I would like to see if there is a way to sum up all the values of the colors. Meaning to return:

3.0

I have been trying using split and unnest functions without any success, any suggestions? Thanks.

Upvotes: 3

Views: 2006

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173106

For the sake of exploring other options -

Below is for BigQuery Standard SQL

First example is to extract key:value pair for each row

#standardSQL
WITH `project.dataset.yourTbale` AS (
  SELECT 1 AS id, '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json 
  UNION ALL SELECT 2, '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}'
)
SELECT id,
  REPLACE(SPLIT(pair, ':')[OFFSET (0)], '"', '') color, 
  SAFE_CAST(SPLIT(pair, ':')[OFFSET (1)] AS FLOAT64) value
FROM `project.dataset.yourTbale`, 
UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(json, '$.Weights'), r'{|}', ''))) pair

this gives you below result

id  color   value    
1   blue    1.0  
1   purple  0.0  
1   yellow  1.0  
1   green   1.0  
2   blue    1.0  
2   red     2.0  
2   yellow  1.0  
2   orange  3.0  

so now it is simple to extend above to your original question of if there is a way to sum up all the values of the colors and even extend it more with potentially filtering on specific color(s) - see example below

#standardSQL
WITH `project.dataset.yourTbale` AS (
  SELECT 1 AS id, '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json 
  UNION ALL SELECT 2, '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}'
)
SELECT id,
  SUM(SAFE_CAST(SPLIT(pair, ':')[OFFSET (1)] AS FLOAT64)) AS total
FROM `project.dataset.yourTbale`, 
UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(json, '$.Weights'), r'{|}', ''))) pair
WHERE REPLACE(SPLIT(pair, ':')[OFFSET (0)], '"', '') != 'blue'
GROUP BY id

with result as below (which excludes color=blue from calc)

id  total    
1   2.0  
2   6.0   

Upvotes: 1

Elliott Brossard
Elliott Brossard

Reputation: 33765

Here is an example using REGEXP_EXTRACT_ALL:

WITH T AS (
  SELECT '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json
)
SELECT
  (
    SELECT SUM(CAST(val AS FLOAT64))
    FROM UNNEST(
      REGEXP_EXTRACT_ALL(
        JSON_EXTRACT(json, '$.Weights'),
        r':([^,}]+)')
    ) AS val
  )
FROM T;

Upvotes: 3

Related Questions