Jonathan Francis
Jonathan Francis

Reputation: 113

Conditional rules for results of a regexp_replace in bigquery

I'm writing a bigquery standard sql statement whereby I want to keep alphanumerics and hyphens whilst removing spaces and all other characters, which is easy:

REGEXP_REPLACE(field_1, '[^0-9a-zA-Z-]','')

However if, as a result of this regex, two numbers touch that weren't previously touching, I want them to be separated by an "@".

Examples:
"his weight is 2...... 1" --> hisweightis2@1
"his weight is 2 1 now" --> hisweightis2@1now
"his weight is 2-1 more or less" --> hisweightis2-1moreorless
"his weight is 21 and 3 now" --> hisweightis21and3now

Upvotes: 0

Views: 236

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Try below (BigQuery Standard SQL)

#standardSQL
select field_1, 
  regexp_replace(regexp_replace(regexp_replace(
    field_1, 
    r'(\d)[^0-9a-zA-Z-]+(\d)', r'\1@\2'), 
      r'[^0-9a-zA-Z-@]', ''),
        r'([\D])@([\D])', r'\1\2'
    ) result
from `project.dataset.table` 

if to apply to sample data from your question - output is

enter image description here

Upvotes: 1

Related Questions