user4951

Reputation: 33120

How to create a string containing special characters in VB.NET

In the old days, we could specify any character with Chr(56).

For example, say the character is unprintable and we want to put it in a string. We would just do

Dim a As String = Chr(56)

Now we have UTF-8 or Unicode (or whatever encoding).

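Say I want variable a to contain any of these:

    en space (U+2002, decimal 8194)
    em space (U+2003, decimal 8195)
    thin space (U+2009, decimal 8201)
    zero width non-joiner (U+200C, decimal 8204)
    zero width joiner (U+200D, decimal 8205)
    left-to-right mark (U+200E, decimal 8206)
    right-to-left mark (U+200F, decimal 8207)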

In fact, say I want to create a function that gets rid of all such characters in my string.

How would I do so?

I want the function to leave Chinese, Korean, and Japanese characters intact and remove only the invisible or ambiguous ones.

Upvotes: 1

Views: 6063

Answers (3)

SSS

Reputation: 5403

''' <summary>
''' Replaces "smart" punctuation (U+2018, U+2019, U+201C, U+201D, U+2013; codes 145, 146, 147, 148, 150 in Windows-1252) with ASCII equivalents (39, 34, 45), and replaces any other non-ASCII character with "?".
''' </summary>
''' <param name="expression">The text to convert.</param>
''' <returns>An ASCII-only copy of the text.</returns>
Public Function Unicode2ASCII(ByVal expression As String) As String
  Dim sb As New System.Text.StringBuilder
  For i As Integer = 1 To Len(expression)
    Dim s As String = Mid(expression, i, 1)
    ' AscW returns the Unicode code unit, so the result does not depend on the system code page
    Select Case AscW(s)
      Case &H2018, &H2019 ' curly apostrophes
        sb.Append("'"c)
      Case &H201C, &H201D ' curly double quotes
        sb.Append(""""c)
      Case &H2013 ' en dash
        sb.Append("-"c)
      Case Is > 127 ' everything else outside ASCII, including CJK text
        sb.Append("?"c)
      Case Else
        sb.Append(s)
    End Select
  Next i
  Return sb.ToString()
End Function

Or, to add such a character to a string:

Dim s As String = "a" & ChrW(8194) & "b"
MsgBox(s)

Upvotes: 1

Alexei Levenkov

Reputation: 100545

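Replace removes whatever you want. ChrW produces a Unicode character from its code. To produce characters outside Unicode Plane 0 you need to concatenate two Char values (a surrogate pair); Char.ConvertFromUtf32 builds that string for you.

Something like:

Dim cleaned As String = Replace("My text", ChrW(8194), "")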

Upvotes: 1

Steven Doggart

Reputation: 43743

It seems like there ought to be a better way, but the best I can come up with that would work in all situations would be something like this:

' Requires: Imports System.Xml
Private Function getString(ByVal xmlCharacterCode As String) As String
    ' Let the XML parser decode the character reference (for example "&#8194;")
    Dim doc As New XmlDocument()
    doc.LoadXml("<?xml version=""1.0"" encoding=""utf-8""?><test>" & xmlCharacterCode & "</test>")
    Return doc.InnerText
End Function

And then use it like this:

myString = myString.Replace(getString("&#8194;"), "")
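If the input is always a numeric character reference such as &#8194;, a lighter alternative may be System.Net.WebUtility.HtmlDecode, which decodes such references without building an XML document. This is a sketch that assumes .NET Framework 4 or later; the module name EntityDemo and the sample string are illustrative only.

```vb
Imports System
Imports System.Net

Module EntityDemo
    Sub Main()
        ' HtmlDecode turns the numeric character reference into the character itself.
        Dim enSpace As String = WebUtility.HtmlDecode("&#8194;")

        ' Sample input containing an en space between "a" and "b".
        Dim myString As String = "a" & ChrW(8194) & "b"
        myString = myString.Replace(enSpace, "")

        Console.WriteLine(myString) ' prints ab
    End Sub
End Module
```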

Also, you may want to take a look at this page I found:

Easy way to convert &#XXXX; from HTML to UTF-8 xml either programmaticaly in .Net or using tools

Upvotes: 0
