Reputation: 612
Currently normalizing the address column in BQ using a lot of regex_replace functions that each have a unique match and replacement value. Example;
regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace
(regexp_replace(regexp_replace(regexp_replace(regexp_replace(UPPER(address),
"STREET", "ST"),
"ROAD", "RD"),
...
Is there a simpler way of doing this?
Upvotes: 1
Views: 480
Reputation: 3528
You can clean up your query by using a UDF:
Create temp function replace_me(s string,a1 string,a2 string,b1 string,b2 string,c1 string,c2 string) as
(replace(replace(replace(s,a1,a2),b1,b2),c1,c2));
select replace_me(replace_me(upper("Road 5, Street 7"),"ROAD","RD","STREET","ST","",""),"","","","","","")
In case you need to abbreviate only word, a regex can be used:
SELECT
REGEXP_REPLACE("ROAD 7, STREET 5, NEW YORK", CONCAT(r'(?i)(RO)AD|(ST)REET|(N)EW (Y)ORK'), r'\1\2\3\4')
Upvotes: 1