Ilja
Ilja

Reputation: 1053

BigQuery string comparison and adjustment with replace/trim

I tried to accommodate in BigQuery for inputs with bad spacing.
However, I keep getting weird results, basically saying that abc <> abc.

Can anyone explain how this is possible? I tried to adjust this by using replace and rtrim but none of them served the purpose.

Thank you

SELECT o = t,o, t, length(o) as lon_o, length(t) as len_t
from
(select replace('abc                                                                ',' ','') o,
'abc' as t)

enter image description here

Upvotes: 0

Views: 749

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

I was able easily reproduce your case by having (for example) few tabs inside that string instead of spaces -

enter image description here

visually they look exactly alike - but first row has 64 spaces in it whereas second row has 5 tabs and 44 spaces. In Web UI each tab occupy 2 char-places so that is why total 5 tabs + 44 spaces look like 64 spaces (exactly as in first row)

To address this - you can use REGEXP_REPLACE instead of REPLACE
Using REGEXP_REPLACE allows you to 'remove' all white spaces, as in example below (the only difference here is in second row - using REGEXP_REPLACE instead of REPLACE and use of r'\s' instead ' '

enter image description here

And, finally - if to get back to first example - the good way of seeing what actually makes those 8 chars vs. 3 chars in result's lon_o is to switch from Table viewtoJSON` view as below

enter image description here

Upvotes: 3

Tamir Klein
Tamir Klein

Reputation: 3632

This is working for me enter image description here

Check if you don't have hidden ASCII chars in your text which might be causing this for example HEX 80

Upvotes: 1

Related Questions