Reputation: 253
I want to split the field in big query data table.
I give you sample case..
==case 1== Source Filed= "idx1-cnt1-name1,idx2-cnt2-name2... same pattern"
Result table
idx | cnt | name |
idx1,cnt1,name1
idx2,cnt2,name2,....
in query:
select
regexp_extract(split_col, r'([\d]*)-') as ItemIdx,
regexp_extract(split_col, r'-([\d]*)-') as Cnt,
regexp_extract(split_col, r'-([\d]*)$') as TitleIdx
From (
Select pid,now, split(source field, ',') split_col from (
SELECT * FROM table ))
but I can't make a code this case
this case string has multiple saparator.
==case 2==
Source String = "item1-name1-type1-value1,.... same pattern"
Result Table
name | type
name1, type1
name2,type2
field count is different.
but just I need second, third field value.
how can I make a query..
Upvotes: 2
Views: 9057
Reputation: 172993
I see you are using BigQuery legacy sql - so below example in it (note - it is quite recommended to use BigQuery Standard SQL whenever you can - so consider migrating) - below is simplified to make logic better readable for you so you can easily extend it to potentially more similar cases
Case 1 / example
#legacySQL
SELECT
REGEXP_EXTRACT(split_col, r'^(.*?)-.*?-.*?$') AS idx,
REGEXP_EXTRACT(split_col, r'^.*?-(.*?)-.*?$') AS cnt,
REGEXP_EXTRACT(split_col, r'^.*?-.*?-(.*?$)') AS name
FROM (
SELECT SPLIT(source_field, ',') split_col
FROM (SELECT "idx1-cnt1-name1,idx2-cnt2-name2" source_field)
)
result:
Row idx cnt name
1 idx1 cnt1 name1
2 idx2 cnt2 name2
Case 2 / example
#legacySQL
SELECT
REGEXP_EXTRACT(split_col, r'^.*?-(.*?)-.*?') AS name,
REGEXP_EXTRACT(split_col, r'^.*?-.*?-(.*?)-') AS type
FROM (
SELECT SPLIT(source_string, ',') split_col
FROM (SELECT "item1-name1-type1-value1, item2-name2-type2-value2" source_string)
)
result:
Row name type
1 name1 type1
2 name2 type2
Below is example of same but for BigQuery Standard SQL (just case 2 as they really similar)
#standardSQL
WITH `project.dataset.table` AS (
SELECT "item1-name1-type1-value1, item2-name2-type2-value2" source_string
)
SELECT
REGEXP_EXTRACT(split_col, r'^.*?-(.*?)-.*?') AS name,
REGEXP_EXTRACT(split_col, r'^.*?-.*?-(.*?)-') AS type
FROM `project.dataset.table`, UNNEST(SPLIT(source_string, ',')) split_col
obviously - same result
Row name type
1 name1 type1
2 name2 type2
Yet another option would be -
#standardSQL
WITH `project.dataset.table` AS (
SELECT "item1-name1-type1-value1, item2-name2-type2-value2" source_string
)
SELECT
SPLIT(split_col, '-')[SAFE_OFFSET(1)] AS name,
SPLIT(split_col, '-')[SAFE_OFFSET(2)] AS type
FROM `project.dataset.table`, UNNEST(SPLIT(source_string, ',')) split_col
and so on ...
Upvotes: 3