Carl Onager

Reputation: 4122

Converting byte array to string with correct encoding

I have this bit of C# code that I have translated to VB using http://www.developerfusion.com/tools/convert/csharp-to-vb/

private string DecodeToken (string token, string key)
{           
    byte [] buffer = new byte[0];
    string decoded = "";
    int i;
    if (Scramble(Convert.FromBase64String(token), key, ref buffer))
    {
        for (i=0;i<buffer.Length;i++)
        {
            decoded += Convert.ToString((char)buffer[i]);
        }
    }
    return(decoded);
}

Which, after a little modification, gives this:

Private Function DecodeToken(token As String, key As String) As String
    Dim buffer As Byte()
    Dim decoded As String = ""
    Dim index As Integer
    If Scramble(Convert.FromBase64String(token), key, buffer) Then
        For index = 0 To buffer.Length - 1
            decoded += Convert.ToString(ChrW(buffer(index)))
        Next
        'decoded = UTF8Encoding.ASCII.GetString(pbyBuffer)
        'decoded = UnicodeEncoding.ASCII.GetString(pbyBuffer)
        'decoded = ASCIIEncoding.ASCII.GetString(pbyBuffer)
    End If
    Return decoded
End Function

Scramble just rearranges the array in a specific way, and I've checked the VB and C# outputs against each other, so it can be ignored. Its inputs and outputs are byte arrays, so it shouldn't affect the encoding.

The problem is that the result of this function is fed into a hashing algorithm, and the hash is then compared against a signature. The result of the VB version, when hashed, does not match the signature.

You can see from the comments that I've attempted to use different encodings to get the byte buffer out as a string but none of these have worked.

The problem appears to lie in the translation of decoded += Convert.ToString((char)buffer[i]); to decoded += Convert.ToString(ChrW(buffer(index))).

Does ChrW produce the same result as casting as a char and which encoding will correctly duplicate the reading of the byte array?

Edit: I always have Option Strict On but it's possible that the original C# doesn't so it may be affected by implicit conversion. What does the compiler do in that situation?

Upvotes: 3

Views: 7178

Answers (4)

Ralph Esslinger

Reputation: 41

A small improvement

Private Function DecodeToken(encodedToken As String, key As String) As String
    Dim scrambled = Convert.FromBase64String(encodedToken)
    Dim buffer As Byte()
    Dim index As Integer

    If Not Scramble(scrambled, key, buffer) Then
        Return Nothing
    End If

    Dim descrambled = System.Text.Encoding.Unicode.GetString(buffer, 0, buffer.Length)

    Return descrambled
End Function

Upvotes: 0

Jodrell

Reputation: 35766

Quick answer

decoded += Convert.ToString((char)buffer[i]);

is equivalent to

decoded &= Convert.ToString(Chr(buffer(i)))

VB.NET stops you from taking the hacky approach used in the C# code: a Char is Unicode, so it consists of two bytes.


This looks like a better implementation of what you have.

Private Function DecodeToken(encodedToken As String, key As String) As String
    Dim scrambled = Convert.FromBase64String(encodedToken)
    Dim buffer As Byte()
    Dim index As Integer

    If Not Scramble(scrambled, key, buffer) Then
        Return Nothing
    End If

    Dim descrambled = New StringBuilder(buffer.Length)

    For index = 0 To buffer.Length - 1
        descrambled.Append(Chr(buffer(index)))
    Next

    Return descrambled.ToString()
End Function

Upvotes: 3

J. Tanner

Reputation: 575

Give the following a go:

decoded += Convert.ToChar(foo)

It will work (unlike my last attempt that made assumptions about implicit conversions being framework specific and not language specific) but I can't guarantee that it will be the same as the .NET.

Given you say in comments you expected to use Encoding.xxx.GetString then why don't you use that? Do you know what the encoding was in the original string to byte array? If so then just use that. It is the correct way to convert a byte array to a string anyway since doing it byte by byte will definitely break for any multi-byte characters (clearly).
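
As a minimal sketch of that approach: if the goal is only to reproduce what the C# loop does (each byte becoming the character with the same code point), ISO-8859-1 is the encoding with that one-to-one mapping. Whether that is the encoding the data was originally written in is an assumption, not something the question confirms. BytesToString and the sample bytes are hypothetical, for illustration only.

```vb
Imports System
Imports System.Text

Module DecodeSketch
    ' Hypothetical helper: decodes every byte as the Unicode code point
    ' with the same numeric value (0-255), like the C# cast (char)buffer[i].
    Function BytesToString(buffer As Byte()) As String
        Return Encoding.GetEncoding("ISO-8859-1").GetString(buffer)
    End Function

    Sub Main()
        ' Sample bytes: one ASCII value and two values above 127.
        Dim buffer As Byte() = {&H41, &HE9, &HFF}
        Dim decoded As String = BytesToString(buffer)

        ' Print the code point of each decoded character.
        For Each c As Char In decoded
            Console.WriteLine(AscW(c))
        Next
    End Sub
End Module
```

The sample bytes come back as code points 65, 233 and 255, which are the same values ChrW would give byte by byte.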

Upvotes: 0

Chris

Reputation: 27627

Have you tried the most direct code translation:

decoded += Convert.ToString(CType(buffer(i), Char))

When converting a byte array to a string you should really make sure you know the encoding first, though. If this is set in whatever is providing the byte array then you should use that to decode the string.

For more details on the ChrW (and Chr) functions, look at http://msdn.microsoft.com/en-us/library/613dxh46%28v=vs.80%29.aspx . In essence, ChrW assumes that the passed integer is a Unicode code point, which may not be a valid assumption (I believe this wouldn't matter from 0 to 127, but the upper half of the byte range might be different). If this is the problem, then it will likely be accented and other such "special" characters that are causing it.
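
A small sketch of that difference, assuming a console project on the .NET Framework; the byte value is only an illustration. ChrW maps the number straight to a code point, whereas Chr goes through the current ANSI code page for values above 127, so its result can vary from machine to machine.

```vb
Imports System
Imports Microsoft.VisualBasic

Module ChrComparison
    Sub Main()
        ' A value in the upper half of the byte range, chosen for illustration.
        Dim b As Byte = &H80

        ' ChrW treats the value as a Unicode code point, so this prints 128.
        Console.WriteLine(AscW(ChrW(b)))

        ' Chr interprets the value using the current ANSI code page,
        ' so the code point printed here depends on the system settings.
        Console.WriteLine(AscW(Chr(b)))
    End Sub
End Module
```

If the two lines print different numbers on your machine, any byte above 127 will produce a different string (and therefore a different hash) depending on which function is used.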

Upvotes: 0
