Reputation: 179
I have a table in BigQuery:
ab_col_jfsfhfd_ggg_sdf
arfd_am_fdsf_fddg_fg
d_fdf_fdddg_ffddd_f
I would like to extract those characters that go right after the first _ character and followed by the second _ character. I want to get the following:
col
am
fdf
I used the following regular expression to extract the characters but it does not work as intended:
^.*\_(\D+)\_.*$
regexp_replace(id,'^.*\\_(\\D+)\\_.*$' , '\\1')
Please help!
Upvotes: 0
Views: 958
Reputation: 18641
Use regexp_extract
:
regexp_extract(id,'^[^_]+_([^_]+)')
See proof
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
[^_]+ any character except: '_' (1 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
_ '_'
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
[^_]+ any character except: '_' (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \1
Upvotes: 0
Reputation: 1271231
split()
is a very simply way of solving this. But regular expressions are also quite simple:
with t as (
select 'ab_col_jfsfhfd_ggg_sdf' as id union all
select 'arfd_am_fdsf_fddg_fg' union all
select 'd_fdf_fdddg_ffddd_f'
)
select id, regexp_extract(id, '[^_]+', 1, 2)
from t;
The logic for the pattern is: "Look for any string of characters that is not an underscore. Then take the second one in the string."
Upvotes: 2
Reputation: 222722
If I follow you correctly, you can use split()
:
(split(col, '_'))[safe_ordinal(2)]
split()
turns the string column to an array of values, given a separator (here, we use _
). Then we can just grab second array element.
Upvotes: 2