Jeevan
Jeevan

Reputation: 487

Remove Unicode characters in a String

How do I remove all special characters which don't fall under ASCII category in VBA?

These are some of the symbols which appear in my string.

Œ œ Š š Ÿ ƒ

There are many more such characters.

These don't belong to ASCII category as you can see here http://www.ascii.cl/htmlcodes.htm

I tried something like this

strName = Replace(strName, ChrW(376), " ")

Upvotes: 7

Views: 31968

Answers (6)

Sorin GFS
Sorin GFS

Reputation: 529

Don't need to loop each character

Maybe late, but maybe it helps someone:

Public Function StripNonAsciiChars(ByVal InputString As String) As String
    Dim i As Integer
    Dim RegEx As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = "[^\u0000-\u007F]"
        StripNonAsciiChars = Application.WorksheetFunction.Trim(RegEx.Replace(InputString, " "))
    End With
End Function

Upvotes: 4

Ambie
Ambie

Reputation: 4977

Would a RegEx solution be of interest to you?

There are plenty of examples for different languages on this site - here's a C# one: How can you strip non-ASCII characters from a string? (in C#).

Try this for VBA:

Private Function GetStrippedText(txt As String) As String
    Dim regEx As Object

    Set regEx = CreateObject("vbscript.regexp")
    regEx.Pattern = "[^\u0000-\u007F]"
    GetStrippedText = regEx.Replace(txt, "")

End Function

Upvotes: 8

Karthick Gunasekaran
Karthick Gunasekaran

Reputation: 2713

Try with below

Function ClearUnwantedString(fulltext As String) As String
    Dim output As String
    Dim character As String
    For i = 1 To Len(fulltext)
        character = Mid(fulltext, i, 1)
        If (character >= "a" And character <= "z") Or (character >= "0" And character <= "9") Or (character >= "A" And character <= "Z") Then
            output = output & character
        End If
    Next
    ClearUnwantedString = output
End Function

Sub test()
    a = ClearUnwantedString("dfjŒœŠdskl")
End Sub

Upvotes: 3

Axel Richter
Axel Richter

Reputation: 61995

Supposing you have:

enter image description here

Then the following code will get the String from A1 and let through only ANSI (code 0 to 255) in A2.

Sub test()
 Dim s1 As String, s2 As String, c As String, i As Long, iAsc As Integer

 s1 = Range("A1").Value

 s2 = ""

 For i = 1 To Len(s1)
  c = Mid(s1, i, 1)
  iAsc = AscW(c)
  If iAsc <= 255 Then
   s2 = s2 & c
  End If
 Next

 Range("A2").Value = s2

End Sub

Upvotes: 0

Vityata
Vityata

Reputation: 43595

What do you get when you write the following in the immediate window?

?Replace("ŸŸŸŸ", ChrW(376), "ale")

I get: alealealeale

Upvotes: 1

Rosetta
Rosetta

Reputation: 2725

try application.clean()

it will remove all unprintable characters

Upvotes: 0

Related Questions