Chique_Code
Chique_Code

Reputation: 1530

regex_replace value that starts with char, followed by integers BigQuery SQL

I have a table in BigQuery that I need to clean up. In column content_column I have values like:

content_column
/a68786776
/g263647
/g47664
/galleries
/music

I want to replace all values that start with /g followed by any number of integers to a string gallery. The desired output is:

content_column
/a68786776
gallery
gallery
/galleries
/music

I have tried the following:

SELECT regexp_replace(content_column,r'\/(\w+\d+)\/', "gallery") as content_column
FROM `my_table`

This works but it also converts /a68786776 to gallery. I tried to replace (\w+\d+) to (\^g\d+) but that didn't convert anything to gallery.

Thanks for help in advance.

Upvotes: 1

Views: 666

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173161

Consider below approach

select if(
    regexp_contains(content_column,r'^/g\d+'), 
    'gallery', 
    content_column
  ) as content_column
from `my_table`         

if applied to sample data in your question - output is

enter image description here

Upvotes: 1

The fourth bird
The fourth bird

Reputation: 163577

If you want to replace the full match with gallery you don't need a capture group in the pattern.

  • This pattern \w+\d+ matches 1+ word chars and 1+ digits, which is a minimum of 2 characters and does not specify a g char
  • This pattern \^g\d+ matches ^g and 1+ digits.

Instead you can match / and 1 or more digits:

r'/g\d+

Regex demo

Or match from the start of the string using an anchor ^ (wihout the backslash) with a multiline flag:

(?m)^/g\d+

Upvotes: 2

Related Questions