Max Krog
Max Krog

Reputation: 181

BigQuery - Replace whitespace in string

My hair is going grey over here. I have a table with a column where i want to remove the whitespace. Expected result: "8 800 000 kr" -> "8800000kr"

I can't get this to work. Please see query output below:

Image removed by the author because it contained explicit language.

Why is the white space not removed?

When i perform the query against mock data it works: enter image description here

What could be wrong with the string in the first example? Everywhere i look i looks identical to the mocked string in example 2.

Upvotes: 4

Views: 18659

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Below example is for BigQuery Standard SQL and explains your problem

#standardSQL
with `project.dataset.table` as (
  select 'with spaces' space_type, '8 800 000 kr' slutpris union all
  select 'with non-breaking spaces', replace('8 800 000 kr', chr(32), chr(160)) slutpris
)
select space_type, slutpris,
  replace(slutpris, ' ', ''),
  regexp_replace(slutpris, r'\s', ''),
  regexp_replace(slutpris, r'\s|kr', '') 
from `project.dataset.table`    

with output

enter image description here

So, as you can see - non-breaking space is not recognized as a space character or any white space

Forgot to mention possible solution -

#standardSQL
with `project.dataset.table` as (
  select 'with spaces' space_type, '8 800 000 kr' slutpris union all
  select 'with non-breaking spaces', replace('8 800 000 kr', chr(32), chr(160)) slutpris
)
select space_type, slutpris,
  translate(slutpris, chr(32) || chr(160), ''),
  regexp_replace(slutpris, '[\u00A0\\s]', ''),
  regexp_replace(slutpris, '[\u00A0\\s]|kr', '')
from `project.dataset.table` 

with output

enter image description here

Upvotes: 7

Related Questions