Stupid_Intern
Stupid_Intern

Reputation: 3450

Generate many addresses from one address line

So this is a problem I am stuck on and I would like to get some help/ideas on how can I solve this.

Problem is we are given a line of string which has data in this possible formats

Input sample #1:

100, 200, 300 Route 45

Output for #1

100, 200, 300 Route 45    
100 Route 45
200 Route 45
300 Route 45

Input sample #2

1000 Wildforest Drive; 2000 Wildridge Circle

Output for #2:

1000 Wildforest Drive; 2000 Wildridge Circle
1000 Wildforest Drive
2000 Wildridge Circle

Input sample #3

100-107 and 109 Grove Hill Drive, 400-418, 420, 422, 424, 426, 428, 430, 432, 434 and 436-441 Olive Branch Way

Output for #3

100-107 Grove Hill Drive and 109 Grove Hill Drive, 400-418, 420, 422, 424, 426, 428, 430, 432, 434 and 436-441 Olive Branch Way
100-107 Grove Hill Drive
109 Grove Hill Drive
400-418
420, 422, 424, 426, 428, 430, 432, 434 and 436-441 Olive Branch Way
422 Olive Branch Way
424 Olive Branch Way
426 Olive Branch Way
428 Olive Branch Way
430 Olive Branch Way
434 Olive Branch Way
436-411 Olive Branch Way

My attempt

Dim frowI As Long, i As Long, j As Long, frowO As Long, m As Long
Dim cet, fet, addR As String, stName As String

Sub Clean_Data()

frowI = INP.Range("A" & Rows.Count).End(xlUp).Row

For i = 2 To frowI

    frowO = frowO + 1
    addR = INP.Range("B" & i)
    OUT.Range("C" & frowO) = addR

    addR = Replace(addR, "and", ",")
    Debug.Print addR

    cet = Split(addR, ";")

    For j = LBound(cet) To UBound(cet)
        If InStr(cet(j), ",") > 0 Then
            fet = Split(cet(j), ",")

            For m = LBound(fet) To UBound(fet)
                fet(m) = Trim(fet(m))

                frowO = frowO + 1
                OUT.Range("C" & frowO) = fet(m) & " " & stName

            Next m

        End If
    Next j

Next i    

End Sub

Problem is I can't find a way to get the names of the street from the string.

Any help is appreciated.

Upvotes: 0

Views: 78

Answers (2)

Tom
Tom

Reputation: 9878

Have a look at the below, it uses Regular expressions to search the strings firstly for the street addresses and then secondly for the building number.

Someone might be able to come up with a better RegExp but this works in my tests.

Option Explicit
Sub Clean_Data()
    Dim RegExStreet As Object, RegExNo As Object, MatchesStreet As Object, MatchesNo As Object
    Dim rng As Range
    Dim nme As String, tmp As String
    Dim i As Long
    Dim c, no, street

    Set RegExStreet = CreateObject("vbscript.regexp")
    Set RegExNo = CreateObject("vbscript.regexp")

    With RegExStreet
        .IgnoreCase = True
        .Global = True
        .Pattern = "([a-z]+\s[a-z]+\s[a-z]+|[a-z]+\s[a-z]+|[a-z]+\s\d+)"
    End With
    With RegExNo
        .IgnoreCase = True
        .Global = True
        .Pattern = "(\d\-|\d|\w\d)+"
    End With

    With INP
        Set rng = .Range(.Cells(1, 2), .Cells(.Cells(.Rows.Count, 2).End(xlUp).Row, 2))
    End With

    i = 2
    For Each c In rng
        nme = vbNullString
        tmp = vbNullString

        Set MatchesStreet = RegExStreet.Execute(Replace(Replace(c.Value2, " and ", vbNullString, compare:=vbTextCompare), " et al", vbNullString, compare:=vbTextCompare))
        If MatchesStreet.Count > 0 Then
                tmp = c.Value2
                OUT.Cells(i, 3).Value2 = tmp
                i = i + 1
                For Each street In MatchesStreet
                    nme = street
                    Set MatchesNo = RegExNo.Execute(Left(tmp, InStr(1, tmp, street) - 1))
                    If MatchesNo.Count > 0 Then
                        For Each no In MatchesNo
                            OUT.Cells(i, 3).Value2 = no & " " & nme
                            i = i + 1
                        Next no
                    End If
                    tmp = Right(tmp, Len(tmp) - InStr(1, tmp, street))
                Next street
        End If
    Next c
End Sub

Also, as a side note don't declare variables outside of subs unless you really need to. This can lead to errors

Upvotes: 2

CLR
CLR

Reputation: 12279

I'm posting this, because it took a few minutes of my life to write. It's not my most elegant work. Tom's answer though is probably better and if I understood Regex better, would be the way I'd write it.

Dim fRowO As Long
Dim strNum As String, strAddr As String, wrdaddress As String
Dim cl, wrds, wrd, nums, n


fRowO = 1
For Each cl In Range("A1:A3")
    cl = Replace(cl, " and ", " , ")
    cl = Replace(cl, ";", ",")
    wrds = Split(cl, ",")
    strNum = ""
    strAddr = ""
    For Each wrd In wrds
        wrd = Trim(wrd)
        If LCase(wrd) Like "*[a-z]*" Then
            wrdaddress = Mid(wrd, InStr(wrd, " ") + 1, Len(wrd) - InStr(wrd, " ") + 1)
            strNum = strNum & Left(wrd, InStr(wrd, " ") - 1)
            strAddr = wrdaddress
            nums = Split(strNum, ";")
            For Each n In nums
                If n <> "" Then 
                    Cells(fRowO, 2) = n & " " & strAddr
                    fRowO = fRowO + 1
                End If
            Next
            strNum = ""
            strAddr = ""
        Else
            strNum = strNum & wrd & ";"
        End If
    Next
Next

Upvotes: 1

Related Questions