Annabanana
Annabanana

Reputation: 95

How to remove part of a string if it is separated by white space only in Excel using a VBA formula?

I have a string in an Excel cell. I need to remove certain words from the string, but only if they are separated by white space. Here is an example:

I have: hello include in the formula for attachment a dog

I want to delete "in", "for" and "a", but only as entire words, not when they are a part of another word (for example "include", "formula", "attachment".)

I should end up with: hello include the formula attachment dog

I tried:

Function REMOVETEXTS(strInput As String, rngFind As Range) As String

Dim strTemp As String
Dim strFind As String

strTemp = strInput

For Each cell In rngFind
    strFind = cell.Value
    strTemp = Replace(strTemp, strFind, "" "", , , 1)
Next cell

REMOVETEXTS = strTemp

But it removes, for example, "in" from "include". Any advice?

Upvotes: 0

Views: 1045

Answers (3)

Marcucciboy2
Marcucciboy2

Reputation: 3259

My function splits up your original string by spaces " ", and then compares each word of the original string against the words in the other specified range and does not include them in the result if they match.

A1 has the string you start with, and A2:A4 have all of the words that you want to exclude ("in", "for", "a").

If you want to have multiple delimiters beyond just spaces, you can just add them to the arguments as one big string. In the example, it would split words by any of the characters in the string " ,=", a space, a comma, or an equals.

Option Explicit

Sub Test()
    MsgBox RemoveWholeWords(Range("A1").Value2, Range("A2:A4"), " ,=")
End Sub

Function RemoveWholeWords(ByVal str As String, ByVal remove As Range, ByVal delimiters) As String

    Dim i As Long

    If Len(delimiters) > 1 Then
        For i = 1 To Len(delimiters)
            str = Replace(str, Mid(delimiters, i, 1), Right(delimiters, 1))
        Next i
    End If

    Dim words() As String
    words = Split(str, Right(delimiters, 1))

    Dim removed As Boolean

    For i = LBound(words) To UBound(words)
        removed = False

        Dim cel As Range
        For Each cel In remove
            If words(i) = cel.Value2 Then removed = True
        Next cel

        If Not removed And Len(words(i)) > 0 Then
            RemoveWholeWords = Trim$(RemoveWholeWords & " " & words(i))
        End If

    Next i

End Function

Upvotes: 1

Marco Vos
Marco Vos

Reputation: 2968

This should work:

Function REMOVETEXTS(strInput As String, rngFind As Range) As String

Dim strTemp As String
strTemp = strInput

Dim cell As Range
Dim strFind As String

For t = 1 To 10 'The For Each loop has to run multiple times in cases the same search-word occurs back to back.

    For Each cell In rngFind
        strFind = cell.Value
        strTemp = Trim(Replace(" " & strTemp & " ", " " & strFind & " ", " "))        
    Next cell

Next t

REMOVETEXTS = strTemp

End Function

enter image description here

Upvotes: 2

QHarr
QHarr

Reputation: 84465

The following uses a regex to do the removal. It has an additional regex pattern at the end to get rid of excessive white space left behind.

The list of words to remove is passed as a comma separated list without white space. Inside the regex function this is converted to a pattern of, in this case, \b(in|for|a)\b. This is essentially an OR list of individual words i.e. either "in", "for" or "a".

There is an optional 3rd parameter that allows you to choose whether to ignore the case of the matched words. It defaults to False.

You can use it in the sheet as an UDF.


VBA:

Option Explicit  
Public Sub test()
    Dim j As Long, arr()
    arr = Array("hello include in the formula for attachment a dog")

    For j = LBound(arr) To UBound(arr)
        Debug.Print RemoveUnWantedStrings(arr(j), "in,for,a", False)
    Next
End Sub

Public Function RemoveUnWantedStrings(ByVal inputString As String, ByVal sPattern As String, Optional ignoreCase As Boolean = False) As Variant
    Dim matches As Object, iMatch As Object

    With CreateObject("vbscript.regexp")
        .Global = True
        .MultiLine = True
        .ignoreCase = ignoreCase
        .Pattern = "\b(" & Replace$(sPattern, ",", "|") & ")\b"

        If .test(inputString) Then
           ' Set matches = .Execute(inputString)
           'For Each iMatch In matches
                inputString = .Replace(inputString, vbNullString)
           ' Next iMatch
        Else
            RemoveUnWantedStrings = inputString
            Exit Function
        End If
        .Pattern = "\s{2,}"
        RemoveUnWantedStrings = .Replace(inputString, Chr$(32))
    End With
End Function

In sheet UDF:

enter image description here


Regex: Try it here.

/
\b(in|for|a)\b
/
gm

\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)

1st Capturing Group (in|for|a)

1st Alternative in

in matches the characters in literally (case sensitive)

2nd Alternative for

for matches the characters for literally (case sensitive)

3rd Alternative a

a matches the character a literally (case sensitive)

\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)

Upvotes: 1

Related Questions