Prithwiraj Samanta
Prithwiraj Samanta

Reputation: 131

Bigquery SQL to convert single value string to multi value based on separator/prefix

I have a column value string with + or - orefix as below :

id       val
 1    +a+b+c-d-e-f+g

Now based on + or - separator I need to build the dataset as follows :

 id new_val prefix
 1   a       +
 1   b       +
 1   c       +
 1   d       -
 1   e       -
 1   f       -
 1   g       +

And to add the string is not fixed length ie it would continue with either separator (+ or -) for different rows. Any guide on big-query SQL to do this transformation would be helpful.

Update :

I am using this query but missing some value though :

with mytable as (
  select 1 as id, '+a+b+c-d-f+g' as val1, 
)

select * from (
select id, new_val1 , '+' symbol
from mytable, unnest(split(val1,  '+')) as new_val1 WITH OFFSET AS val1_offset
union all
select id, new_val1 , '-' symbol
from mytable, unnest(split(val1,  '-')) as new_val1 WITH OFFSET AS val1_offset
) where length(new_val1) = 1 and new_val1 is not null

Upvotes: 0

Views: 45

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Consider below approach

select id, substr(part, 2) new_val, substr(part, 1, 1) prefix
from `project.dataset.table`, 
unnest(regexp_extract_all(val, r'[+-][^+-]+')) part       

If applied to sample data in your question - output is

enter image description here

Upvotes: 1

Samuel
Samuel

Reputation: 3528

The split into substrings can be done, by adding a further separator, which does the string do not contains:

select id, substr(vals,2) as new_val, substr(vals,1,1) as prefix
from (
SELECT id, split(substr(replace(replace(val,'-',';-'),'+',';+'),2) ,';') as val_tmp

from (select 1 as id,  "+a+b+c-d-e-f+g" as val)

) as t, unnest(t.val_tmp) as vals

If you have more than + and -, regex would be a better option:

SELECT id, split(substr(REGEXP_REPLACE(val,r"([+-]+)", ";\\1"),2) ,';') as val_tmp

Upvotes: 0

Related Questions