Reputation: 47
I want to get the next value in an array delimited by "->"
For example, in this example below, i want to get the next value after "Res" in each
ROW 1- "Q -> Res -> tes -> Res -> twet"
ROW 2- "rw -> gewg -> tes -> Res -> twet"
ROW 3- "Y -> Res -> Res -> Res -> twet"
Output would be:
ROW 1- tes
ROW 2- tewt
ROW 3- tewt
Ive tried the following but it gets me the previous value,
Array_reverse(split(regexp_extract(COLUMN_NAME, '(.*?)Res'), '->'))[safe_offset(1)]
Upvotes: -1
Views: 237
Reputation: 7277
Consider the approach below using distinct to remove the duplicates and will retain the 1st occurence:
with sample_data as (
select 'ROW 1'as id, "Q -> Res -> tes -> Res -> twet" as test_str
union all select 'ROW 2'as id ,"rw -> gewg -> tes -> Res -> twet" as test_str
union all select 'ROW 3'as id, "Y -> Res -> Res -> Res -> twet" as test_str
),
remove_dups as (
select
id,
array_to_string(array(select distinct * from unnest(split(test_str,' -> '))),',') as clean_str
from sample_data
)
select
id,
regexp_extract(clean_str,r',?Res\,(\w+)\,?') as next_word
from remove_dups
Output:
Upvotes: 0
Reputation: 172974
Consider below approach
select id,
( select word from unnest(arr) word with offset
where offset > (select offset from unnest(arr) word with offset where trim(word) = 'Res' limit 1)
and trim(word) != 'Res' order by offset limit 1
) as next_word
from your_table, unnest([struct(split(path, '->') as arr)])
if applied to sample data in your question - output is
Another option is
select id,
( select split(pair, ' -> ')[offset(1)]
from unnest(arr) pair with offset
where trim(pair) != 'Res -> Res'
order by offset limit 1
) as next_word
from your_table, unnest([struct(regexp_extract_all(path, r' Res -> \w+') as arr)])
with same output
The benefit of later solution is that it can easily be adjusted to catch all instances (in one row) of words placed after 'Res' - like in below example
select id,
array( select split(pair, ' -> ')[offset(1)]
from unnest(arr) pair with offset
where trim(pair) != 'Res -> Res'
order by offset
) as next_words
from your_table, unnest([struct(regexp_extract_all(path, r' Res -> \w+') as arr)])
with output
Upvotes: 1