BMX_01
BMX_01

Reputation: 45

Split single row value to multiple rows dynamically in Snowflake

split unstructured address into multiple rows using snowflake.

consider the table

col_A
4402, 4420, 4330, 4502 hecson Blvd SW
2643-2714 Nargay Matle Ct, 2685-2733 Osase Ci
4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road
241 and 251 A Street, 260 B Street
7232, 7242, 7252, 7262, 7272, 7282 south hawk St.
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St
100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place

i need to parse the above col_a as multiple rows based on addresses present

for eg: 4402, 4420, 4330, 4502 hecson Blvd SW the above address has 4 different addresses(4 housenumbers) in comma seperated format with street name need to parse them in the below format. likewise for the other formats as well. i tried to use lateral flatten to convert them into multiple rows but i got only housenumbers as outcome. if they are having '2643-2714' ranges then they can be taken as a whole with street name in case of individual housenumbers they should be populated sepereately.

output expected

col_A col_a_cleansed
4402, 4420, 4330, 4502 hecson Blvd SW 4402 hecson Blvd SW
4402, 4420, 4330, 4502 hecson Blvd SW 4420 hecson Blvd SW
4402, 4420, 4330, 4502 hecson Blvd SW 4330 hecson Blvd SW
4402, 4420, 4330, 4502 hecson Blvd SW 4502 hecson Blvd SW
2643-2714 Nargay Matle Ct, 2685-2733 Osase Ci 2643-2714 Nargay Matle Ct
2643-2714 Nargay Matle Ct, 2685-2733 Osase Ci 2685-2733 Osase Ci
4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road 4-60 Brook Ave
4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road 2-55 Day Drive
4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road 6-90 Gale Dr
4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road 27-87 Moile Road
4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road 580 More Road
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 1200 mart Way
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 1550 mart Way
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 1750 mart Way
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 12231 Buck Road
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 12301 Buck Road
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 12335 Buck Road
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 12425 Buck Road
1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road 12427 Buck Road
241 and 251 A Street, 260 B Street 241 A Street
241 and 251 A Street, 260 B Street 251 A Street
241 and 251 A Street, 260 B Street 260 B Street
7232, 7242, 7252, 7262, 7272, 7282 south hawk St. 7232 south hawk St
7232, 7242, 7252, 7262, 7272, 7282 south hawk St. 7242 south hawk St
7232, 7242, 7252, 7262, 7272, 7282 south hawk St. 7252 south hawk St
7232, 7242, 7252, 7262, 7272, 7282 south hawk St. 7262 south hawk St
7232, 7242, 7252, 7262, 7272, 7282 south hawk St. 7272 south hawk St
7232, 7242, 7252, 7262, 7272, 7282 south hawk St. 7282 south hawk St
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 100 Jamal Pl
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 100-148 Oaklohoma Hill
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 11 Turn Pl
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 15 Turn Pl
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 160-167 Burrows St
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 170 Burrows St
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 172 Burrows St
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 230-238 Burrows St
100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St 242 Burrows St
100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place 100 goldman Place
100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place 111 goldman Place
100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place 228-290 Oaklohoma Hill
100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place 306-336 Oaklohoma Hill
100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place 340-400 azerban Place

have tried the lateral flaten but outcome is not as expected.

SELECT col_A,A.value AS ADDR ,REGEXP_SUBSTR(trim(left(col_a,15),' '), '^+[0-9]+') as start_val
FROM table, 
LATERAL SPLIT_TO_TABLE(col_a,',')A 

Upvotes: 3

Views: 584

Answers (1)

Adrian White
Adrian White

Reputation: 1804

enter image description here

This hopefully is a good starting point. I didn't finish the mart Way|Buck Road combo - but it's pretty straight forward if you follow the same approach.

Extract the wordy bits from the numbers, then stick them back together.

I'm sure there's a much smarter way to do this - hopefully one of the other answery people has a peak.

Functions used : REGEXP_SUBSTR() LEAD() - NOTE THE NULLS IGNORED STRTOK_SPLIT_TO_TABLE() REPLACE()

with cte as (select  '4402, 4420, 4330, 4502 hecson Blvd SW' col_A
union all select  '2643-2714 Nargay Matle Ct, 2685-2733 Osase Ci' col_A
union all select  '4-60 Brook Ave, 2-55 Day Drive, 6-90 Gale Dr, 27-87 Moile Road, 580 More Road' col_A
union all select  '1200 1550 1750 mart Way 12231 12301 12335 12425 12427 Buck Road' col_A
union all select  '241 and 251 A Street, 260 B Street' col_A
union all select  '7232, 7242, 7252, 7262, 7272, 7282 south hawk St.' col_A
union all select  '100 Jamal Pl,100-148 Oaklohoma Hill,11,15 Turn Pl,160-167,170,172,230-238,242 Burrows St,2200 Bentley St' col_A
union all select  '100,111 goldman Place, 228-290, 306-336 Oaklohoma Hill, 340-400 azerban Place' col_A)

 SELECT 
     COL_A
    ,TRIM(REGEXP_SUBSTR(TRIM(VALUE),'[A-Za-z]+\\s*[A-Za-z]+.*'))GRAB_ADDRESS
    ,TRIM(REPLACE(TRIM(VALUE),COALESCE(GRAB_ADDRESS,''))) GRAB_NUMBER
    ,GRAB_NUMBER
       ||' '||
 COALESCE(REGEXP_SUBSTR(TRIM(VALUE),'[A-Za-z]+\\s*[A-Za-z]+.*')
     ,LEAD(GRAB_ADDRESS)IGNORE NULLS OVER(PARTITION BY SEQ ORDER BY INDEX ASC))STICK_TOGETHER
FROM  
    CTE,
    TABLE(STRTOK_SPLIT_TO_TABLE( replace(col_A,'and',','),',')) 

enter image description here

Upvotes: 2

Related Questions