Reputation: 12264
I have the value category:ops,client:acompany,type:sometype
which as you can see is effectively a dictionary, I would like to extract the value for the dictionary key client, in other words I want to extract acompany.
Here's how I've done it:
select CASE WHEN INSTR(client_step1, ",") > 0 THEN SUBSTR(client_step1, 0, INSTR(client_step1, ",") - 1)
ELSE client_step1
END AS client
from (
select CASE WHEN INSTR(dict, "client") > 0 THEN SUBSTR(dict, INSTR(dict, "client") + 7)
ELSE CAST(NULL as STRING)
END as client_step1
from (
select "category:ops,client:acompany,type:sometype" as dict
)
)
but that seems rather verbose (and frankly, slicing up strings with a combination of INSTR()
, SUBSTR()
and derived tables feels a bit meh). I'm wondering if there's a better way to do it that I don't know about (I'm fairly new to bq).
thanks in advance
Upvotes: 1
Views: 3923
Reputation: 172993
Another option to parse dictionary like yours is as below (for BigQuery Standard SQL)
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, "category:ops,client:acompany,type:sometype" AS dict
)
SELECT id,
ARRAY(
SELECT AS STRUCT
SPLIT(x, ':')[OFFSET(0)] key,
SPLIT(x, ':')[OFFSET(1)] value
FROM UNNEST(SPLIT(dict)) x
) items
FROM `project.dataset.table`
with result as below
Row id items.key items.value
1 1 category ops
client acompany
type sometype
As you can see here - you parse out all dictionary items
If you still need to only specific element's value - you can use below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, "category:ops,client:acompany,type:sometype" AS dict
)
SELECT id,
( SELECT
SPLIT(x, ':')[OFFSET(1)]
FROM UNNEST(SPLIT(dict)) x
WHERE SPLIT(x, ':')[OFFSET(0)] = 'client'
LIMIT 1
) client
FROM `project.dataset.table`
Upvotes: 1
Reputation: 33745
It sounds like you want the REGEXP_EXTRACT
function. Here is an example:
SELECT REGEXP_EXTRACT(dict, r'client:([^,:]+)') AS client_step1
FROM (
SELECT "category:ops,client:acompany,type:sometype" AS dict
)
This returns the string acompany
as its result. The regexp looks for client:
inside the string, and matches everything after it up until the next ,
or :
or of the end of the string.
Upvotes: 2