VB6 vs .NET encoding issue (Arabic characters)

I am porting an existing VB6 application to .NET (C# 7). The VB6 function that is currently causing me trouble is simply:

    Private Function VB6Function(Name As String) As String
        Dim I As Integer
        Dim str_len As Integer
        Dim search_str As String
        Dim Search As String
        Dim search_asc As Integer

        For I = 1 To Len(Name)
            search_str = Mid$(Name, I, 1)
            search_asc = Asc(search_str)
            ' Keep only the characters whose Asc value is in the list below
            Select Case search_asc
                Case 200, 202, 203 To 214, 216, 217, 218, 219, 221 To 223, 225, 227, 228, 230
                    Search = Search & search_str
            End Select
        Next

        ' Return the filtered string through the function name
        VB6Function = Search
    End Function

I converted it into a quick C# version:

    public static string CSharpMethod(string str)
    {            
        if (string.IsNullOrWhiteSpace(str))
        {
            return str;
        }
        var validAsciiCharecters = new List<int> { 200, 202, 216, 217, 218, 219, 221, 222, 223, 225, 227, 228, 230 };
        for (int i = 203; i <= 214; i++)
        {
            validAsciiCharecters.Add(i);
        }
        var newStr = "";
        foreach (var ch in str)
        {
            if (validAsciiCharecters.Contains((int)ch))
            {
                newStr += ch.ToString();
            }
        }
        return newStr;
    }

In VB6, the input م سلطانة produces the output مسلطن. After digging into the VB6 side, I found:

enter image description here

When I copied and pasted these values into Notepad, I found:

enter image description here

In C#, (int)'Ê' equals 202, and in VB6, Asc("ت") is also 202. But when I call the C# function with the input م سلطانة, I get the wrong result.
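To make the mismatch concrete, here is a minimal sketch that compares the two numeric values of one character. The class and method names are hypothetical, and it assumes the VB6 machine uses Windows-1256 as its ANSI code page, which is consistent with Asc("ت") returning 202.

```csharp
using System;
using System.Text;

public static class CodeValueDemo
{
    // The UTF-16 code unit value, which is what (int)ch yields in C#.
    public static int Utf16Value(char ch) => ch;

    // The byte value of the same character in Windows-1256
    // (assumed to be the ANSI code page seen by VB6's Asc).
    public static int Win1256Value(char ch)
    {
        // Needed on .NET Core / .NET 5+ before code page 1256 can be used;
        // .NET Framework ships the code page and does not need this call.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1256).GetBytes(new[] { ch })[0];
    }
}
```

For ت (U+062A), Utf16Value returns 1578 while Win1256Value returns 202. Utf16Value('Ê') is 202, so a check against (int)ch matches Ê rather than ت.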

Upvotes: 2

Views: 220

Answers (1)

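After a lot of research I found "Decoding an UTF-8 string to Windows-1256", which helped me solve my problem with a little tweaking. VB6's Asc returns the character's value in the system ANSI code page (Windows-1256 here), whereas a C# char is a UTF-16 code unit, so the string has to be encoded to Windows-1256 before its values are compared against the list: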

    // Requires: using System.Collections.Generic; using System.Text;
    public static string CSharpMethod(string str)
    {
        if (string.IsNullOrWhiteSpace(str))
        {
            return str;
        }
        // Windows-1256 byte values accepted by the VB6 Select Case
        var validAsciiCharecters = new List<int> { 200, 202, 216, 217, 218, 219, 221, 222, 223, 225, 227, 228, 230 };
        for (int i = 203; i <= 214; i++)
        {
            validAsciiCharecters.Add(i);
        }
        // Encode to Windows-1256 so each byte is comparable to a VB6 Asc value
        var win1256 = Encoding.GetEncoding(1256);
        var win1256Bytes = win1256.GetBytes(str);
        var newBytes = new List<byte>();
        foreach (var b in win1256Bytes)
        {
            if (validAsciiCharecters.Contains((int)b))
            {
                newBytes.Add(b);
            }
        }
        // Decode the surviving bytes back into a .NET string
        return win1256.GetString(newBytes.ToArray());
    }
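As a possible variation on the method above (a sketch, not the original approach), the allowed byte values can be decoded to Unicode characters once, so the input string is filtered directly without a round trip through bytes. The names ArabicFilter and Filter are hypothetical. The result should match for characters that round-trip through Windows-1256, though any best-fit mapping the encoder applies to other characters is not reproduced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ArabicFilter
{
    // Unicode characters corresponding to the Windows-1256 byte values
    // listed in the VB6 Select Case.
    private static readonly HashSet<char> Allowed = BuildAllowedChars();

    private static HashSet<char> BuildAllowedChars()
    {
        // Needed on .NET Core / .NET 5+ before code page 1256 can be used;
        // .NET Framework ships the code page and does not need this call.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        var win1256 = Encoding.GetEncoding(1256);

        // 200, 202, 203..214, 216..219, 221..223, 225, 227, 228, 230
        byte[] codes = new byte[] { 200, 202, 216, 217, 218, 219, 221, 222, 223, 225, 227, 228, 230 }
            .Concat(Enumerable.Range(203, 12).Select(i => (byte)i))
            .ToArray();

        return new HashSet<char>(win1256.GetChars(codes));
    }

    // Keeps only the characters whose Windows-1256 value is in the list.
    public static string Filter(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }
        var sb = new StringBuilder(input.Length);
        foreach (var ch in input)
        {
            if (Allowed.Contains(ch))
            {
                sb.Append(ch);
            }
        }
        return sb.ToString();
    }
}
```

With the input from the question, ArabicFilter.Filter("م سلطانة") returns مسلطن, the same output as the VB6 function: the space, ا (199) and ة (201) are dropped because their values are not in the list.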

Upvotes: 2
