denim
denim

Reputation: 463

Extract information from a json string in BigQuery

I am storing a table in Bigquery with the results of a classification algorithm. The table schema is INT, STRING and looks something like this :

ID Output
1001 {'Apple Cider': 0.7, 'Coffee' : 0.2, 'Juice' : 0.1}
1002 {'Black Coffee':0.9, 'Tea':0.1}

The problem is how to fetch the first (or second or any order) element of each string together with its score. It doesn't seem likely that JSON_EXTRACT can work and most likely it can be done with Javascript. Was wondering what an elegant solution would look like here.

Upvotes: 0

Views: 385

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173191

Consider below

select ID, 
  trim(split(kv, ':')[offset(0)], " '") element, 
  cast(split(kv, ':')[offset(1)] as float64) score, 
  element_position
from `project.dataset.table` t,
unnest(regexp_extract_all(trim(Output, '{}'), r"'[^':']+'\s?:\s?[^,]+")) kv with offset as element_position           

If applied to sample data in your question - output is

enter image description here

Note: you can use less verbose unnest statement if you wish

unnest(split(trim(Output, '{}'))) kv with offset as element_position

Upvotes: 1

Related Questions