jbfdataguru
jbfdataguru

Reputation: 23

Only keep numbers in a string field in Big Query

I have a filed that is a free text input for phone numbers and I only want to keep the numerical values from that field to remove any spaces, symbols etc.. Below are some examples of what the records look like:

+1 123-123-1234 or (123)123-1234 or (123) 123-1234 or 123-123-1234 and so on.

Below is my query I was trying but it's only giving me the first set of numbers back if the field has any other characters in it.

select noteattributes.value.name as name_type, REGEXP_EXTRACT(noteattributes.value.value, r'[\d]+') as name_value, order_number from test.orders, unnest(note_attributes) as noteattributes where noteattributes.value.name = 'Pickup-Collector-Phone'

My results look like this 1234567890 or 123 or 1 and etc..

Upvotes: 1

Views: 1738

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

You can use REGEXP_REPLACE function for this as in below example with use of sample data from your question

#standardSQL
WITH `project.dataset.table` AS (
  SELECT '+1 123-123-1234' phone_number UNION ALL
  SELECT '(123)123-1234' UNION ALL
  SELECT '(123) 123-1234' UNION ALL
  SELECT '123-123-1234' 
)
SELECT *, REGEXP_REPLACE(phone_number, r'[^\d]', '') cleaned_phone_number
FROM `project.dataset.table`   

with output

Row phone_number    cleaned_phone_number     
1   +1 123-123-1234 11231231234  
2   (123)123-1234   1231231234   
3   (123) 123-1234  1231231234   
4   123-123-1234    1231231234   

Upvotes: 1

Related Questions