Reputation: 113
I'm writing a bigquery standard sql statement whereby I want to keep alphanumerics and hyphens whilst removing spaces and all other characters, which is easy:
REGEXP_REPLACE(field_1, '[^0-9a-zA-Z-]','')
However if, as a result of this regex, two numbers touch that weren't previously touching, I want them to be separated by an "@".
Examples:
"his weight is 2...... 1" --> hisweightis2@1
"his weight is 2 1 now" --> hisweightis2@1now
"his weight is 2-1 more or less" --> hisweightis2-1moreorless
"his weight is 21 and 3 now" --> hisweightis21and3now
Upvotes: 0
Views: 236
Reputation: 172974
Try below (BigQuery Standard SQL)
#standardSQL
select field_1,
regexp_replace(regexp_replace(regexp_replace(
field_1,
r'(\d)[^0-9a-zA-Z-]+(\d)', r'\1@\2'),
r'[^0-9a-zA-Z-@]', ''),
r'([\D])@([\D])', r'\1\2'
) result
from `project.dataset.table`
if to apply to sample data from your question - output is
Upvotes: 1