Reputation: 1530
I have a table in BigQuery that I need to clean up. In column content_column
I have values like:
content_column
/a68786776
/g263647
/g47664
/galleries
/music
I want to replace all values that start with /g
followed by any number of integers to a string gallery
. The desired output is:
content_column
/a68786776
gallery
gallery
/galleries
/music
I have tried the following:
SELECT regexp_replace(content_column,r'\/(\w+\d+)\/', "gallery") as content_column
FROM `my_table`
This works but it also converts /a68786776
to gallery
. I tried to replace (\w+\d+)
to (\^g\d+)
but that didn't convert anything to gallery.
Thanks for help in advance.
Upvotes: 1
Views: 666
Reputation: 173161
Consider below approach
select if(
regexp_contains(content_column,r'^/g\d+'),
'gallery',
content_column
) as content_column
from `my_table`
if applied to sample data in your question - output is
Upvotes: 1
Reputation: 163577
If you want to replace the full match with gallery
you don't need a capture group in the pattern.
\w+\d+
matches 1+ word chars and 1+ digits, which is a minimum of 2 characters and does not specify a g
char\^g\d+
matches ^g
and 1+ digits.Instead you can match /
and 1 or more digits:
r'/g\d+
Or match from the start of the string using an anchor ^
(wihout the backslash) with a multiline flag:
(?m)^/g\d+
Upvotes: 2