Reputation: 690
I am converting a string to a byte array with ASCII encoding, using the code below.
string data = "<?xml version=\"1.0\" encoding=\"utf-8\"?><ns0:ReceivedPayment Amount=\"1.01\"/>";
byte[] buffer = Encoding.ASCII.GetBytes(data);
The problem I am facing is that a "?" gets added to my string.
If I convert the byte array back to a string
var str = System.Text.Encoding.Default.GetString(buffer);
my string becomes
string str = "?<?xml version=\"1.0\" encoding=\"utf-8\"?><ns0:ReceivedPayment Amount=\"1.01\"/>";
Does anyone know why a "?" is being added to my string, and how to remove it?
Upvotes: 1
Views: 3385
Reputation: 20772
There are several things wrong here. One is that the relevant code is not shown.
Nonetheless, if you use a proper method to read text from a UTF-8, UTF-32, etc. file, you won't have a BOM in your string, because the string holds the text and the BOM is not part of the text.
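To illustrate the difference, here is a minimal sketch that writes a string as UTF-8 with a BOM and reads it back in two ways. The class name `BomReadDemo` and the use of a temporary file are assumptions made for the example, not part of the original question.

```csharp
using System;
using System.IO;
using System.Text;

static class BomReadDemo
{
    // Writes text as UTF-8 with a BOM, then reads it back two different ways.
    public static (string viaReader, string viaRawDecode) RoundTrip(string text)
    {
        string path = Path.GetTempFileName();   // temporary file, illustration only
        try
        {
            // UTF8Encoding(true) emits the 3-byte preamble EF BB BF.
            File.WriteAllText(path, text, new UTF8Encoding(true));

            // File.ReadAllText detects the BOM and does not return it as text.
            string viaReader = File.ReadAllText(path);

            // Decoding the raw bytes keeps the BOM as the character U+FEFF.
            string viaRawDecode = Encoding.UTF8.GetString(File.ReadAllBytes(path));

            return (viaReader, viaRawDecode);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Only the second string starts with U+FEFF. That is the character that later turns into "?" when the string is pushed through `Encoding.ASCII`.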
On the other hand, if you are reading an XML file, it is not a plain "text" file. You should use an XML reader, which takes care to use the encoding that is (most likely) indicated in the file.
And, when you write an XML file (which I presume you'll be doing with the byte array), you should use an XML writer. That would take care to use the encoding you specify and write it into the file.
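A rough sketch of that reader/writer approach, assuming the XML arrives as a byte array and should leave as one. The names `XmlRoundTrip`, `Parse` and `Serialize` are made up for this example, and choosing `UTF8Encoding(false)` (UTF-8 without a BOM) is an assumption about what the consumer of the bytes expects.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

static class XmlRoundTrip
{
    // Parses XML bytes. The XML reader works out the encoding itself
    // and consumes a leading BOM, so it never ends up in the content.
    public static XDocument Parse(byte[] xmlBytes)
    {
        using (var stream = new MemoryStream(xmlBytes))
            return XDocument.Load(stream);
    }

    // Writes the document as UTF-8 without a BOM; the writer emits
    // an XML declaration that matches the chosen encoding.
    public static byte[] Serialize(XDocument doc)
    {
        var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
                doc.Save(writer);
            return stream.ToArray();
        }
    }
}
```

Note that the snippet in the question uses the prefix `ns0` without declaring it, so a real XML parser would reject that exact text; the full document presumably declares the namespace somewhere.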
Keep in mind, though, that converting from Unicode (of which UTF-8 is one encoding) to some other character set can silently corrupt your data by substituting a replacement character (typically '?') for characters that are not in the target character set.
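The silent replacement is easy to reproduce. The sketch below (class and method names are hypothetical) sends a string through ASCII twice: once with the default behaviour, and once with exception fallbacks so the loss is reported instead of hidden.

```csharp
using System;
using System.Text;

static class AsciiReplacementDemo
{
    // Encodes to ASCII and decodes again; characters outside ASCII come back as '?'.
    public static string ThroughAscii(string text)
    {
        byte[] buffer = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(buffer);
    }

    // Same round trip, but throws instead of silently substituting '?'.
    public static string ThroughStrictAscii(string text)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        return strict.GetString(strict.GetBytes(text));
    }
}
```

Calling `ThroughAscii("\uFEFF<a/>")` gives back `"?<a/>"`, which matches the symptom in the question, while `ThroughStrictAscii` throws an `EncoderFallbackException` for the same input.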
Upvotes: 0
Reputation: 66
It seems that you showed only simplified code. Am I right that you read the data from a file? If so, check for a BOM (byte order mark) at the beginning of the file. A BOM can appear in UTF-8, UTF-16 and UTF-32 files. Encoding.ASCII cannot represent it, so it is encoded as "?".
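One way to do that check on the raw bytes is sketched below. `BomCheck` and `BomLength` are names invented for this example; the byte patterns are the standard BOMs for the three encoding families.

```csharp
using System;

static class BomCheck
{
    // Returns the length of the BOM at the start of data, or 0 if none is found.
    // UTF-32 LE is tested before UTF-16 LE because both begin with FF FE
    // (so UTF-16 LE text that starts with a NUL character would be misread).
    public static int BomLength(byte[] data)
    {
        if (StartsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return 4; // UTF-32 LE
        if (StartsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return 4; // UTF-32 BE
        if (StartsWith(data, 0xEF, 0xBB, 0xBF)) return 3;       // UTF-8
        if (StartsWith(data, 0xFF, 0xFE)) return 2;             // UTF-16 LE
        if (StartsWith(data, 0xFE, 0xFF)) return 2;             // UTF-16 BE
        return 0;
    }

    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
            if (data[i] != prefix[i]) return false;
        return true;
    }
}
```

If the data is already in a string, the BOM shows up as the single character U+FEFF at index 0, and `data.TrimStart('\uFEFF')` before encoding should be enough to drop it.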
Upvotes: 5
Reputation: 12419
Here is my extension method:
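using System;
public static class StringExtensions
{
    // Copies the raw UTF-16 code units (2 bytes per char); no encoding conversion is applied.
    public static byte[] ToByteArray(this string str)
    {
        var bytes = new byte[str.Length * sizeof(char)];
        Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }
}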
Upvotes: -1