rlo5029
rlo5029

Reputation: 41

VBA or formula to separate street names and suffixes in Excel?

Trying to separate millions of street names from their suffixes ("Ave", "Dr", "St", etc.) in Excel. Sometimes streets do not have a suffix, so an empty cell is needed for those.

Example result needed

I tried using a formula at first, but ran into some issues:

Listed the suffixes with the highlighted example formula below and then using CONCAT in the last column to get them all in one column , but sometimes it also grabs directionals ("S", "N", "E", "W") after the suffix, enter image description here

lists two suffixes with the same letters (e.g. "CI", "CIR"), two suffixes with same letters

or grabs the street name if it includes the same letters as a suffix (see "ST" column). street names with same letters as a listed suffix

I've also tried Flash Fill, but this yields worse results.

I came to the conclusion VBA would be most efficient because I will have to repeat this process for multiple datasets, but if there are other suggestions I am open.

I am very novice in VBA, but this is what I have so far and I'm not sure how to expand/continue it:

Sub getSuffix()

Dim i As Range



For Each i In Range("A1")
  Do
    If InStr(ActiveCell.Value, " AVE") > 0 Then
        ActiveCell.Offset(0, 1).Value = "AVE"
        ActiveCell.Offset(1, 0).Select
    Else
        
        If IsEmpty(ActiveCell.Value) Then
        Exit For
    
        
        End If
    
    End If

Loop

Next i



End Sub

Upvotes: 1

Views: 385

Answers (2)

Zev Spitz
Zev Spitz

Reputation: 15357

AFAICT the rules are as follows:

  • You are looking for one of a set of suffixes (e.g. Ave, Dr, St ,Cir)
  • that are preceded by a space, and
  • are succeeded by either a space or the end of the string

I would suggest using regular expressions for this.

Add a reference (Tools -> References...) to the Microsoft VBScript Regular Expressions 5.5 library.

Then you can write code like the following:

Option Explicit

Dim re As RegExp

Function GetSuffix(address As String) As String
If re Is Nothing Then
    Set re = New RegExp
    re.IgnoreCase = True
    re.Pattern = " (Ave|Dr|St|Cir)( |$)"
End If
Dim matches As MatchCollection
Set matches = re.Execute(address)
If matches.Count > 0 Then GetSuffix = matches(0).SubMatches(0)
End Function

and use the function in a cell formula:

=GetSuffix(A4)

It will return the first matched suffix, excluding any subsequent characters.

You probably will want to add more suffixes, e.g.:

    re.Pattern = " (Ave|Dr|St|Cir|Wy|Ci|Pl|Aisle)( |$)"

Links:

Upvotes: 2

FaneDuru
FaneDuru

Reputation: 42236

Try the next code, please. It should be fast enough, using arrays, working only in memory and dropping the processing result at once:

Sub SplitBySuffix()
 Dim sh As Worksheet, lastR As Long, i As Long
 Dim arr, arrStr, arrS, arrSuff, arrInt, boolFound As Boolean
 
 arrSuff = Split("AVE,PL,LN,ST,DR,WY", ",") 'put here all the necessary suffixes (Upper case)
                                            '"AVE" is different than ".AVE"...
 Set sh = ActiveSheet                       'use here your necessary sheet
 lastR = sh.Range("A" & sh.rows.count).End(xlUp).row 'last row of A:A

 arr = sh.Range("A2:A" & lastR).value   'put the range to be processed in an array
 ReDim arrStr(1 To UBound(arr), 1 To 1) 'redim the street array
 ReDim arrS(1 To UBound(arr), 1 To 1)   'redim the suffix array
 For i = 1 To UBound(arr) 'iterate between the array to be processed elements
    arrInt = Split(arr(i, 1), " ")      'split the array element by space (" ")
    'check if the last array element is a suffix:
    If Not IsError(Application.match(UCase(arrInt(UBound(arrInt))), arrSuff, 0)) Then
        arrS(i, 1) = arrInt(UBound(arrInt)): boolFound = True
    End If
    If boolFound Then 'if a suffix has been found:
        'eliminate the last array element_________________________
        arrInt(UBound(arrInt)) = arrInt(UBound(arrInt)) & "##$$@@"
        arrInt = Filter(arrInt, arrInt(UBound(arrInt)), False)
        '_________________________________________________________
        arrStr(i, 1) = Join(arrInt, " ") 'put in the street array the joined filter result
        boolFound = False          'reinitialize the boolean variable
    Else
        arrStr(i, 1) = arr(i, 1)   'put in the street array the element as it is
    End If
 Next i
 'drop the arrays content at once:
 sh.Range("B2").Resize(lastR - 1, 1) = arrStr
 sh.Range("C2").Resize(lastR - 1, 1) = arrS
End Sub

Upvotes: 1

Related Questions