Reputation: 3450
So this is a problem I am stuck on and I would like to get some help/ideas on how can I solve this.
Problem is we are given a line of string which has data in this possible formats
Input sample #1:
100, 200, 300 Route 45
Output for #1
100, 200, 300 Route 45
100 Route 45
200 Route 45
300 Route 45
Input sample #2
1000 Wildforest Drive; 2000 Wildridge Circle
Output for #2:
1000 Wildforest Drive; 2000 Wildridge Circle
1000 Wildforest Drive
2000 Wildridge Circle
Input sample #3
100-107 and 109 Grove Hill Drive, 400-418, 420, 422, 424, 426, 428, 430, 432, 434 and 436-441 Olive Branch Way
Output for #3
100-107 Grove Hill Drive and 109 Grove Hill Drive, 400-418, 420, 422, 424, 426, 428, 430, 432, 434 and 436-441 Olive Branch Way
100-107 Grove Hill Drive
109 Grove Hill Drive
400-418
420, 422, 424, 426, 428, 430, 432, 434 and 436-441 Olive Branch Way
422 Olive Branch Way
424 Olive Branch Way
426 Olive Branch Way
428 Olive Branch Way
430 Olive Branch Way
434 Olive Branch Way
436-411 Olive Branch Way
My attempt
Dim frowI As Long, i As Long, j As Long, frowO As Long, m As Long
Dim cet, fet, addR As String, stName As String
Sub Clean_Data()
frowI = INP.Range("A" & Rows.Count).End(xlUp).Row
For i = 2 To frowI
frowO = frowO + 1
addR = INP.Range("B" & i)
OUT.Range("C" & frowO) = addR
addR = Replace(addR, "and", ",")
Debug.Print addR
cet = Split(addR, ";")
For j = LBound(cet) To UBound(cet)
If InStr(cet(j), ",") > 0 Then
fet = Split(cet(j), ",")
For m = LBound(fet) To UBound(fet)
fet(m) = Trim(fet(m))
frowO = frowO + 1
OUT.Range("C" & frowO) = fet(m) & " " & stName
Next m
End If
Next j
Next i
End Sub
Problem is I can't find a way to get the names of the street from the string.
Any help is appreciated.
Upvotes: 0
Views: 78
Reputation: 9878
Have a look at the below, it uses Regular expressions to search the strings firstly for the street addresses and then secondly for the building number.
Someone might be able to come up with a better RegExp but this works in my tests.
Option Explicit
Sub Clean_Data()
Dim RegExStreet As Object, RegExNo As Object, MatchesStreet As Object, MatchesNo As Object
Dim rng As Range
Dim nme As String, tmp As String
Dim i As Long
Dim c, no, street
Set RegExStreet = CreateObject("vbscript.regexp")
Set RegExNo = CreateObject("vbscript.regexp")
With RegExStreet
.IgnoreCase = True
.Global = True
.Pattern = "([a-z]+\s[a-z]+\s[a-z]+|[a-z]+\s[a-z]+|[a-z]+\s\d+)"
End With
With RegExNo
.IgnoreCase = True
.Global = True
.Pattern = "(\d\-|\d|\w\d)+"
End With
With INP
Set rng = .Range(.Cells(1, 2), .Cells(.Cells(.Rows.Count, 2).End(xlUp).Row, 2))
End With
i = 2
For Each c In rng
nme = vbNullString
tmp = vbNullString
Set MatchesStreet = RegExStreet.Execute(Replace(Replace(c.Value2, " and ", vbNullString, compare:=vbTextCompare), " et al", vbNullString, compare:=vbTextCompare))
If MatchesStreet.Count > 0 Then
tmp = c.Value2
OUT.Cells(i, 3).Value2 = tmp
i = i + 1
For Each street In MatchesStreet
nme = street
Set MatchesNo = RegExNo.Execute(Left(tmp, InStr(1, tmp, street) - 1))
If MatchesNo.Count > 0 Then
For Each no In MatchesNo
OUT.Cells(i, 3).Value2 = no & " " & nme
i = i + 1
Next no
End If
tmp = Right(tmp, Len(tmp) - InStr(1, tmp, street))
Next street
End If
Next c
End Sub
Also, as a side note don't declare variables outside of subs unless you really need to. This can lead to errors
Upvotes: 2
Reputation: 12279
I'm posting this, because it took a few minutes of my life to write. It's not my most elegant work. Tom's answer though is probably better and if I understood Regex better, would be the way I'd write it.
Dim fRowO As Long
Dim strNum As String, strAddr As String, wrdaddress As String
Dim cl, wrds, wrd, nums, n
fRowO = 1
For Each cl In Range("A1:A3")
cl = Replace(cl, " and ", " , ")
cl = Replace(cl, ";", ",")
wrds = Split(cl, ",")
strNum = ""
strAddr = ""
For Each wrd In wrds
wrd = Trim(wrd)
If LCase(wrd) Like "*[a-z]*" Then
wrdaddress = Mid(wrd, InStr(wrd, " ") + 1, Len(wrd) - InStr(wrd, " ") + 1)
strNum = strNum & Left(wrd, InStr(wrd, " ") - 1)
strAddr = wrdaddress
nums = Split(strNum, ";")
For Each n In nums
If n <> "" Then
Cells(fRowO, 2) = n & " " & strAddr
fRowO = fRowO + 1
End If
Next
strNum = ""
strAddr = ""
Else
strNum = strNum & wrd & ";"
End If
Next
Next
Upvotes: 1