pokemon_Man
pokemon_Man

Reputation: 902

Remove unicode characters from string on excel sheet

I need some directions on how to use regex to remove special characters such as fractions,exponents,degree symbol and any other non normal letters in a string. I know the code below find the string base on those criteria but does it include all unicode characters?

Code for your attention:

Dim strPattern As String: strPattern = "[^\u0000-\u007F]"
Dim regEx As Object

Set regEx = CreateObject("VBScript.RegExp")
regEx.Global = True
regEx.IgnoreCase = True
regEx.Pattern = strPattern

For Each cell In ActiveSheet.Range("C:C") ' Define your own range here
    If strPattern <> "" Then              ' If the cell is not empty
        If regEx.Test(cell.Value) Then    ' Check if there is a match
            cell.Interior.ColorIndex = 6  ' If yes, change the background color
        End If
    End If
Next

Upvotes: 3

Views: 3824

Answers (1)

Gary&#39;s Student
Gary&#39;s Student

Reputation: 96753

This does not use regular expressions.

There are many potentially "bad" characters. Rather than trying to remove them, just keep the "good" ones.

Select some cell and run this short macro:

Sub UniKiller()
    Dim s As String, temp As String, i As Long
    Dim C As String

    s = ActiveCell.Value
    If s = "" Then Exit Sub
    temp = ""

    For i = 1 To Len(s)
        C = Mid(s, i, 1)
        If AscW(C) > 31 And AscW(C) < 127 Then
            temp = temp & C
        End If
    Next i
    ActiveCell.Value = temp
End Sub

If you need to "clean" more than one cell, put the logic in a loop.

Upvotes: 1

Related Questions