user38349

Reputation: 3025

How do you determine if a Char is a Letter from A-Z?

How do you determine whether a character is a letter in the range A-Z or a digit 0-9? We are getting some corrupted data, such as "I_999Š=ÄÖÆaðøñòòñ".

I thought I could use Char.IsLetterOrDigit("Š") to identify the corrupted data in "I_999Š", but unexpectedly it returns True. I need to trap this; any thoughts?

Upvotes: 3

Views: 25450

Answers (8)

Rob Williams

Reputation: 7921

I can't help but notice that everyone seems to be missing the real issue: your data "corruption" appears to be an obvious character encoding problem. Therefore, no matter what you do with the data, you will be (mis)treating the symptom and ignoring the root cause.

To be specific, you appear to be interpreting the received binary BYTES as ASCII text, when those BYTES were almost certainly intended to represent text in some encoding other than ASCII.

You should find out what character encoding applies to the string of text that you received. Then you should read that data while applying the appropriate character encoding transformations.
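As a rough illustration of how a mismatched decoder produces this kind of garbage, here is a minimal sketch. The choice of UTF-8 as the sender's encoding and ISO-8859-1 as the wrong guess is an assumption made for the example; the encodings actually involved in the question are unknown.

```vb
Imports System
Imports System.Text

Module EncodingDemo
    Sub Main()
        ' Hypothetical payload: the text "ÄÖ" as a sender might emit it in UTF-8.
        Dim original As String = ChrW(&HC4) & ChrW(&HD6)
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Decoding with the wrong encoding (ISO-8859-1 assumed here) turns each
        ' two-byte UTF-8 sequence into two unrelated characters.
        Dim decodedWrongly As String = Encoding.GetEncoding("iso-8859-1").GetString(bytes)

        ' Decoding with the encoding the sender actually used recovers the text.
        Dim decodedCorrectly As String = Encoding.UTF8.GetString(bytes)

        Console.WriteLine(decodedWrongly.Length)       ' 4
        Console.WriteLine(decodedCorrectly.Length)     ' 2
        Console.WriteLine(decodedCorrectly = original) ' True
    End Sub
End Module
```

The same bytes give different strings depending on the decoder, which is why the encoding has to be settled before any character-level validation is meaningful.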

You should read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets", especially the section with the heading "There Ain't No Such Thing As Plain Text", which emphasizes exactly that.

Upvotes: 13

jinzai

Reputation: 446

The only way to ensure that you are dealing with printable ASCII characters, regardless of the encoding used by the program or by the string in question, is to check that each character has a code between 32 and 126 (127 is Delete, which is not a printable character).

i.e.

Imports System.Runtime.CompilerServices

Public Module StringExtensions
    ' Returns True when every character is ASCII. With bPrintableOnly, only codes
    ' 32 (space) through 126 (~) pass; control characters (< 32) and 127 (Delete) fail.
    <Extension()>
    Public Function IsASCII(inString As String, Optional bPrintableOnly As Boolean = True) As Boolean
        Dim lowerLimit As Integer = If(bPrintableOnly, 32, 0)
        Dim upperLimit As Integer = If(bPrintableOnly, 127, 128)

        For Each ch As Char In inString
            ' AscW returns the Unicode code; Asc depends on the current ANSI code page.
            Dim code As Integer = AscW(ch)
            If code < lowerLimit OrElse code >= upperLimit Then
                Return False
            End If
        Next

        Return True
    End Function
End Module

Upvotes: 1

Ghassen Arfaoui

Reputation: 59

Try the following code:

NOT isNumeric(char)

Upvotes: 0

Minh Hoàng

Reputation: 121

Use the Asc(char) function. On single-byte code pages it returns an ANSI character code from 0 to 255. Check an ANSI character code chart for the ranges you want to allow.
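A minimal sketch of such a range check follows. Note that it swaps in AscW for Asc, because Asc depends on the system's ANSI code page while AscW returns the Unicode code of the character; the helper name IsUpperAsciiLetterOrDigit is hypothetical.

```vb
Imports System

Module AsciiCheck
    ' Hypothetical helper: True only for the ASCII letters A-Z and digits 0-9.
    Function IsUpperAsciiLetterOrDigit(c As Char) As Boolean
        Dim code As Integer = AscW(c) ' Unicode code of this character
        Return (code >= 65 AndAlso code <= 90) OrElse (code >= 48 AndAlso code <= 57)
    End Function

    Sub Main()
        Dim sCaron As Char = ChrW(&H160) ' the character Š from the question

        ' Char.IsLetterOrDigit accepts any Unicode letter, so Š passes.
        Console.WriteLine(Char.IsLetterOrDigit(sCaron))        ' True

        ' The explicit range check rejects it.
        Console.WriteLine(IsUpperAsciiLetterOrDigit(sCaron))   ' False
        Console.WriteLine(IsUpperAsciiLetterOrDigit("I"c))     ' True
        Console.WriteLine(IsUpperAsciiLetterOrDigit("_"c))     ' False
    End Sub
End Module
```

This also shows why the call in the question returned True: Š is a legitimate Unicode letter, just not one in the A-Z range.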

Upvotes: 0

P Daddy

Reputation: 29527

For Each m As Match In Regex.Matches("I_999Š=ÄÖÆaðøñòòñ", "[^A-Z0-9]")
    '' Found a bad character
Next

or

For Each c As Char In "I_999Š=ÄÖÆaðøñòòñ"
    If Not (c >= "A"c AndAlso c <= "Z"c OrElse c >= "0"c AndAlso c <= "9"c) Then
        '' Found a bad character
    End If
Next

EDIT:

Is there something wrong with this answer that warrants the two anonymous downvotes? Speak up, and I'll fix it. I notice that I left out a "Then" (fixed now), but I intended this as pseudocode.

Upvotes: 1

Yuliy

Reputation: 17718

You could use a regular expression to filter out the bad characters ... (use Regex.IsMatch instead if you only need to detect it)

str = Regex.Replace(str, "[^A-Za-z0-9]","", RegexOptions.None);
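For comparison, here is a VB.NET sketch of both variants, detecting and stripping, applied to the sample string from the question. The variable names are placeholders, and the character class allows lowercase letters as in the line above; drop a-z if only A-Z should pass.

```vb
Imports System
Imports System.Text.RegularExpressions

Module RegexDemo
    Sub Main()
        Dim raw As String = "I_999Š=ÄÖÆaðøñòòñ"

        ' Detect: True if at least one character falls outside the allowed set.
        Dim hasBadChars As Boolean = Regex.IsMatch(raw, "[^A-Za-z0-9]")

        ' Strip: remove every character outside the allowed set.
        Dim cleaned As String = Regex.Replace(raw, "[^A-Za-z0-9]", "")

        Console.WriteLine(hasBadChars) ' True
        Console.WriteLine(cleaned)     ' I999a
    End Sub
End Module
```

Stripping silently discards data, so detection is usually the safer first step when the input is suspected to be mis-decoded.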

Upvotes: 0

weiran

Reputation: 735

Should just be:

// Anchored so that every character must match, not just one of them.
if (Regex.IsMatch(input, "^[A-Za-z0-9]+$"))
{
    // do you thang
}

Upvotes: 1

EBGreen

Reputation: 37740

Well, there are two quick options. The first is to use a regular expression; the second is to use the Asc() function to determine whether the ASCII value is in the range of allowed characters. I would personally use Asc() for this.

Upvotes: 6
