Jyothish

Reputation: 561

Convert UTF-8 to Chinese Simplified (GB2312)

Is there a way to convert a UTF-8 string to Chinese Simplified (GB2312) in C#? Any help is greatly appreciated.

Regards Jyothish George

Upvotes: 2

Views: 14585

Answers (2)

Fatih Mert Doğancan

Reputation: 1092

Try this:

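// Requires: using System.Text;
// On .NET Core / .NET 5+, register the code-page provider once at startup
// (Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)) before
// requesting "gb2312". On .NET Framework it is available out of the box.

// Re-encodes GB2312 bytes as UTF-8 bytes.
public byte[] GB2312ToUtf8(byte[] gb2312Bytes)
{
    Encoding fromEncoding = Encoding.GetEncoding("gb2312");
    Encoding toEncoding = Encoding.UTF8;
    return EncodingConvert(gb2312Bytes, fromEncoding, toEncoding);
}

// Re-encodes UTF-8 bytes as GB2312 bytes.
public byte[] Utf8ToGB2312(byte[] utf8Bytes)
{
    Encoding fromEncoding = Encoding.UTF8;
    Encoding toEncoding = Encoding.GetEncoding("gb2312");
    return EncodingConvert(utf8Bytes, fromEncoding, toEncoding);
}

// Works on bytes, not strings: a .NET string is always UTF-16, so encoding a
// string and decoding it again with the same encoding just returns the same text.
public byte[] EncodingConvert(byte[] fromBytes, Encoding fromEncoding, Encoding toEncoding)
{
    return Encoding.Convert(fromEncoding, toEncoding, fromBytes);
}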

source here

Upvotes: 0

Jon Skeet

Reputation: 1503984

The first thing to be aware of is that there's no such thing as a "UTF-8 string" in .NET. All strings in .NET are effectively UTF-16. However, .NET provides the Encoding class to allow you to decode binary data into strings, and re-encode it later.

Encoding.Convert can convert a byte array representing text encoded with one encoding into a byte array with the same text encoded with a different encoding. Is that what you want?
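As a rough sketch of that byte-to-byte conversion: the class name `Gb2312Bytes` and the helper `FromUtf8` below are made up for illustration, and the sample text is just a stand-in for bytes you would read from a file or socket. The `RegisterProvider` line is only needed on .NET Core / .NET 5+ (where GB2312 comes from the code-pages provider); on .NET Framework it can be dropped.

```csharp
using System;
using System.Text;

public static class Gb2312Bytes
{
    // Hypothetical helper: re-encodes UTF-8 bytes as GB2312 bytes.
    public static byte[] FromUtf8(byte[] utf8Bytes)
    {
        Encoding gb2312 = Encoding.GetEncoding("gb2312");
        return Encoding.Convert(Encoding.UTF8, gb2312, utf8Bytes);
    }

    public static void Main()
    {
        // Needed on .NET Core / .NET 5+ only; remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Stand-in for UTF-8 input: the two characters U+4E2D U+6587.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("\u4E2D\u6587");
        byte[] gbBytes = FromUtf8(utf8Bytes);

        Console.WriteLine(utf8Bytes.Length);               // 6
        Console.WriteLine(BitConverter.ToString(gbBytes)); // D6-D0-CE-C4
    }
}
```

No string is created at any point here; the text goes straight from one byte representation to the other.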

Alternatively, if you already have a string, you can use:

byte[] bytes = Encoding.GetEncoding("gb2312").GetBytes(text);
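One thing to watch with the line above: GB2312 cannot represent every character, and by default anything unmappable is typically replaced (usually with '?') rather than reported. The sketch below asks for an exception instead, using the `Encoding.GetEncoding` overload that takes fallbacks. `StrictGb2312` and `TryEncode` are hypothetical names, and the `RegisterProvider` call is again only for .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

public static class StrictGb2312
{
    // Hypothetical helper: returns false when the text cannot be
    // represented in GB2312, instead of silently substituting characters.
    public static bool TryEncode(string text, out byte[] bytes)
    {
        Encoding strict = Encoding.GetEncoding(
            "gb2312",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            bytes = strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = Array.Empty<byte>();
            return false;
        }
    }

    public static void Main()
    {
        // Needed on .NET Core / .NET 5+ only; remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] bytes;
        // Chinese text that GB2312 covers.
        Console.WriteLine(TryEncode("\u4E2D\u6587", out bytes)); // True
        // A Thai letter, which GB2312 is not expected to cover.
        Console.WriteLine(TryEncode("\u0E01", out bytes));
    }
}
```

The second call should come back false, since the Thai letter has no GB2312 mapping; whether you prefer an exception or a replacement character depends on where the bytes are going.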

If you can provide more information, that would be helpful.

Upvotes: 9
