H BG
H BG

Reputation: 81

Find / Replace to exclude if string is part of longer word

I am trying to search for three letter target words and replace them with corrected three letter words.

e.g.
CHI - as a single cell entry (with hyphen) to be replaced with "ORD -".

There will be instances where the target word is part of a word pair within a cell, e.g. CHI - SHA.

The code below captures these cases.

I realized that when the cell is e.g. XIANCHI - SHA it would also correct the part "CHI -" resulting in XIANORD - SHA.

How can I limit the fndlist to skip the target letters if they are part of a longer word?

Sample

If I use lookat:xlwhole the code would only catch the CHI - case but not the pair but if I use xlpart it will catch the pair CHI - PVG but also corrects any word it finds with that element.

Sub Adjust_Airport_Codes2()

    Dim sht As Worksheet
    Dim fndList As Variant
    Dim rplcList As Variant
    Dim x As Long

    fndList = Array("BUE -", "CHI -", "DCA -", "HOU -", "LGA -", "NYC -", "WAS -", "AEJ -", "BUS -", "CGH -", "CPS -", "DGM -", "EHA -", "EHB -", "EHF -", "FOQ -", "FQC -", "JBN -", "LCY -", "LGW -", "LIN -", "LON -", "MIL -", "MOW -", "NAY -", "ORY -", "OSA -", "PAR -", "PUS -", "QPG -", "RIO -", "SAO -", "SAW -", "SDU -", "SDV -", "SEL -", "PVG -", "TSF -", "TYO -", "UAQ -", "VIT -", "YMX -", "YTO -", "ZIS -", "CNF -", "HND -", "IZM -", "JKT -", "LTN -", "MMA -", "UXM -", "VCE -", "VSS -")
    rplcList = Array("EZE -", "ORD -", "IAD -", "IAH -", "JFK -", "JFK -", "IAD -", "AMS -", "ICN -", "GRU -", "VCP -", "HKG -", "AMS -", "BRU -", "HHN -", "HKG -", "FRA -", "PRG -", "LHR -", "LHR -", "MXP -", "LHR -", "MXP -", "SVO -", "PEK -", "CDG -", "KIX -", "CDG -", "ICN -", "SIN -", "GIG -", "GRU -", "IST -", "GIG -", "TLV -", "ICN -", "SHA -", "MXP -", "NRT -", "EZE -", "BIO -", "YUL -", "YYZ -", "HKG -", "BHZ -", "NRT -", "ADB -", "CGK -", "LHR -", "MMX -", "FRA -", "MXP -", "MHG -")

    'Loop through each item in Array lists
    For x = LBound(fndList) To UBound(fndList)
        'Loop through each worksheet in ActiveWorkbook
        For Each sht In ActiveWorkbook.Worksheets
            sht.Cells.Replace What:=fndList(x), Replacement:=rplcList(x), _
              LookAt:=xlpart, SearchOrder:=xlByRows, MatchCase:=True, _
              SearchFormat:=False, ReplaceFormat:=False
          Next sht  
      Next x

End Sub

Upvotes: 0

Views: 530

Answers (2)

VBasic2008
VBasic2008

Reputation: 54807

Replace Using Lists

  • Caution: This code replaces the whole range in each worksheet with values. If there are formulas, they will be 'lost'.
  • I could not figure out why you would need " -" so I removed them. Add them if you need them.

The Flow (not all steps and some inaccuracies)

  • Writes the values from the arrays to the Dictionary.
  • Loops through each worksheet.
  • Writes the values from its used range (Data Range) to Data Array.
  • Loops through each element in the Data Array.
  • Checks the element if not error or empty value.
  • Splits it by the space character into Current Values Array.
  • Checks each element in Current Values Array against the Keys of the Dictionary and replaces it with the Dictionary's Value if found.
  • Joins back the elements in Current Values Array. And writes possibly modified value back to current element in Data Array.
  • Writes possibly modified values from Data Array back to Data Range.

The Code

Option Explicit

Sub Adjust_Airport_Codes2()

    ' Define Find and Replace Arrays.

    ' Define Find Array.
    Dim fndList As Variant
    fndList = Array("BUE", "CHI", "DCA", "HOU", "LGA", "NYC", "WAS", "AEJ", _
                    "BUS", "CGH", "CPS", "DGM", "EHA", "EHB", "EHF", "FOQ", _
                    "FQC", "JBN", "LCY", "LGW", "LIN", "LON", "MIL", "MOW", _
                    "NAY", "ORY", "OSA", "PAR", "PUS", "QPG", "RIO", "SAO", _
                    "SAW", "SDU", "SDV", "SEL", "PVG", "TSF", "TYO", "UAQ", _
                    "VIT", "YMX", "YTO", "ZIS", "CNF", "HND", "IZM", "JKT", _
                    "LTN", "MMA", "UXM", "VCE", "VSS")
    ' Define Replace Array.
    Dim rplcList As Variant
    rplcList = Array("EZE", "ORD", "IAD", "IAH", "JFK", "JFK", "IAD", "AMS", _
                     "ICN", "GRU", "VCP", "HKG", "AMS", "BRU", "HHN", "HKG", _
                     "FRA", "PRG", "LHR", "LHR", "MXP", "LHR", "MXP", "SVO", _
                     "PEK", "CDG", "KIX", "CDG", "ICN", "SIN", "GIG", "GRU", _
                     "IST", "GIG", "TLV", "ICN", "SHA", "MXP", "NRT", "EZE", _
                     "BIO", "YUL", "YYZ", "HKG", "BHZ", "NRT", "ADB", "CGK", _
                     "LHR", "MMX", "FRA", "MXP", "MHG")
    
    ' Write values from Find and Replace Arrays to the Dictionary.
    
    Dim dict As Object         ' The Dictionary
    Set dict = CreateObject("Scripting.Dictionary")
    Dim n As Long              ' Find and Replace Arrays Element Counter
    For n = LBound(fndList) To UBound(fndList)
        dict(fndList(n)) = rplcList(n)
    Next n
    
    ' Find and replace values in each worksheet of the ActiveWorkbook.
    
    ' Declare variables to be used in loop.
    Dim sht As Worksheet       ' Current Worksheet
    Dim rng As Range           ' Current Data Range
    Dim Data As Variant        ' Current Data Array
    Dim CurVal As Variant      ' Current Value:
                               ' The value of the current element of Data Array
    Dim CurValues As Variant   ' Current Values Array:
                               ' The 'words' contained in current element
                               ' of Data Array
    Dim i As Long              ' Data Array Rows Counter
    Dim j As Long              ' Data Array Columns Counter
    Dim DataChanged As Boolean ' Data Changed Switch
    
    ' Iterate worksheets in ActiveWorkbook.
    For Each sht In ActiveWorkbook.Worksheets
        ' Define Data Range (there are other ways).
        Set rng = sht.UsedRange
        ' Write values from Data Range to Data Array.
        If rng.Rows.Count > 1 Or rng.Columns.Count > 1 Then
            Data = rng.Value
        Else
            ReDim Data(1 To 1, 1 To 1)
            Data(1, 1) = rng.Value
        End If
        ' Iterate rows in Data Array.
        For i = 1 To UBound(Data, 1)
            ' Iterate columns in Data Array.
            For j = 1 To UBound(Data, 2)
                ' Write value of current element to Current Value.
                CurVal = Data(i, j)
                ' Check if Current Value is not an error or empty value.
                If Not IsError(CurVal) And Not IsEmpty(CurVal) Then
                    ' Split Current Value by the space character into
                    ' Current Values Array.
                    CurValues = Split(CurVal)
                    ' Iterate elements of Current Values Array.
                    For n = LBound(CurValues) To UBound(CurValues)
                        ' Check if they exist as a Key in the Dictionary.
                        If dict.Exists(CurValues(n)) Then
                            ' Write value of Dictionary to current element
                            ' in Current Values Array.
                            CurValues(n) = dict(CurValues(n))
                            DataChanged = True
                            ' You can increase performance if you're expecting
                            ' only one possibly found value per cell:
                            'Exit For
                        End If
                    Next n
                    ' Write elements of Current Values Array, joined with
                    ' the space character, to current element in Data Array.
                    If DataChanged Then
                        Data(i, j) = Join(CurValues)
                        DataChanged = False
                    End If
                End If
            Next j
        Next i
        ' Write values from Data Array to Data Range.
        rng.Value = Data
    Next sht
    
End Sub

Upvotes: 1

basodre
basodre

Reputation: 5770

Edit: I wanted to give you something a bit more complete. In the below code, I used a separate function that creates a map between before and after values. This cleans up the code because now all of these values are stored in one place (also easier to maintain). I use this object to then create the search pattern, since a regular expression can search for multiple patterns at once. Finally, I use the dictionary to return the replacement value. Try this revised code, and see if it better meets your use case.

I ran quick performance test to see if it performed better/worse than built-in VBA replace function. In my test, I used only three of the possibilities in my regular expression search/replace, and I ran a test against 103k rows. It performed equally as well as a built-in search and replace using only one value. The search and replace would have had to be re-run for each of the search values.

Let me know if this helps.

Function GetMap() As Object
    Dim oMap As Object
    
    Set oMap = CreateObject("Scripting.Dictionary")
    
    oMap.Add "BUE -", "EZE -"
    oMap.Add "CHI -", "ORD -"
    oMap.Add "DCA -", "IAD -"
    ''' Add the rest of the mapped items
    '''
    '''
    
    Set GetMap = oMap
End Function

Sub TestIt()
    Dim oReg As Object
    Dim oMap As Object
    Dim m As Object
    Dim rng As Range
    Dim cel As Range
    Dim t As Double
    
    Set oReg = CreateObject("VbScript.Regexp")
    Set oMap = GetMap()
    
    With oReg
        .Global = False
        
        'Multiple patterns can be searched at once by
        'separating them with pipes. Since we have the
        'patterns to search for in the oMap dictionary,
        'we can simply join it here. The benefit is that if
        'you have to support new items, you only have to add
        'them in the GetMap() function
        
        .Pattern = "^" & Join(oMap.Keys, "|^")
    End With
    
    'Set your range appropriately
    Set rng = Range("A1:A103680")
    
    t = Timer
    
    Application.ScreenUpdating = False
    
    For Each cel In rng
        If oReg.Test(cel.Value) Then
            Set m = oReg.Execute(cel.Value)
            
            cel.Value = oReg.Replace(cel.Value, oMap(m(0).Value))
        End If
    Next cel
    
    Debug.Print "Process took " & Timer - t & " seconds."
    
    Application.ScreenUpdating = True
End Sub

You can consider using regular expression pattern matching. When using pattern matching, you can use the ^ symbol to indicate the start of the string. See the below code for a simple example, and try to insert it into your code. Let us know if you run into issues.

Sub Tester()
    Dim oReg As Object
    
    Set oReg = CreateObject("VbScript.Regexp")
    
    With oReg
        .Global = False
        .Pattern = "^CHI -"
    End With
    
    'Will return: ORD PVG
    If oReg.test("CHI - PVG") Then
        Debug.Print oReg.Replace("CHI - PVG", "ORD")
    End If
    
    'Won't trigger below
    If oReg.test("XIANCHI - PVG") Then
        Debug.Print oReg.Replace("XIANCHI - PVG", "ORD")
    End If

End Sub

Upvotes: 1

Related Questions