Reputation: 2751
I have an application that reads string data from a stream. The data is usually in English, but occasionally it contains something like 'Jalapeño', and the 'ñ' comes out as '?'. I'd prefer to read the stream contents into a byte array, but I could get by with reading them into a string. Any idea how I can make this work correctly?
Current code is as follows:
byte[] data = new byte[len]; // len is known a priori
byte[] temp = new byte[2];
StreamReader sr = new StreamReader(input_stream);
int position = 0;
while (!sr.EndOfStream)
{
    int c = sr.Read();
    temp = System.BitConverter.GetBytes(c);
    data[position] = temp[0];
    position++;
}
input_stream.Close();
sr.Close();
Upvotes: 3
Views: 4385
Reputation: 6002
You can pass the encoding to the StreamReader as in:
StreamReader sr = new StreamReader(input_stream, Encoding.UTF8);
However, according to the documentation, StreamReader already uses UTF-8 by default, so specifying it explicitly should not change the result.
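If the source is not actually UTF-8 (the question does not say which encoding the stream uses), the default decoder will not recognize the byte for 'ñ'. Below is a minimal sketch of that situation, assuming the data was written as ISO-8859-1. The `Decode` helper and the in-memory stream are stand-ins for illustration, not part of the original code:

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderEncodingDemo
{
    // Hypothetical helper: decodes raw bytes through a StreamReader
    // using the given encoding.
    public static string Decode(byte[] raw, Encoding encoding)
    {
        using (var stream = new MemoryStream(raw))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumption: the producer wrote the text as ISO-8859-1,
        // where 'ñ' is the single byte 0xF1.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] raw = latin1.GetBytes("Jalape\u00F1o");

        // Decoding with the matching encoding restores the text.
        Console.WriteLine(Decode(raw, latin1) == "Jalape\u00F1o");

        // Decoded as UTF-8, the lone 0xF1 byte is invalid and becomes the
        // replacement character U+FFFD, which consoles often show as '?'.
        Console.WriteLine(Decode(raw, Encoding.UTF8).Contains("\uFFFD"));
    }
}
```

If this is what is happening, passing the encoding the producer really used to the StreamReader constructor would be the fix; which encoding that is has to be confirmed with whoever writes the stream.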
Update
The following reads 'Jalapeño' fine:
byte[] bytes;
using (var stream = new FileStream("input.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    var index = 0;
    var count = (int) stream.Length;
    bytes = new byte[count];
    while (count > 0)
    {
        int n = stream.Read(bytes, index, count);
        if (n == 0)
            throw new EndOfStreamException();
        index += n;
        count -= n;
    }
}
// test
string s = Encoding.UTF8.GetString(bytes);
Console.WriteLine(s);
As does this:
byte[] bytes;
using (var stream = new FileStream("input.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    var reader = new StreamReader(stream);
    string text = reader.ReadToEnd();
    bytes = Encoding.UTF8.GetBytes(text);
}
// test
string s = Encoding.UTF8.GetString(bytes);
Console.WriteLine(s);
From what I understand, the 'ñ' character (U+00F1) is stored as the two bytes 0xC3 0xB1 when the text is UTF-8 encoded. If you keep only one byte per character, you'll lose data.
I'd suggest reading the whole stream into a byte array (the first example) and decoding it afterwards, or using StreamReader to do the work for you.
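To make the byte-level picture concrete, here is a small sketch. It checks how 'ñ' is encoded in UTF-8 and what happens when only the first byte of each character code is kept, as the loop in the question does. The class and method names are made up for illustration:

```csharp
using System;
using System.Text;

public static class Utf8BytesDemo
{
    // Returns the UTF-8 bytes of the text as hyphen-separated hex.
    public static string Utf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    // Mimics the question's loop: keeps only the first byte of each
    // character code (the low byte on little-endian machines).
    public static byte[] FirstBytes(string text)
    {
        var result = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            result[i] = BitConverter.GetBytes((int)text[i])[0];
        }
        return result;
    }

    public static void Main()
    {
        // 'ñ' (U+00F1) takes two bytes in UTF-8.
        Console.WriteLine(Utf8Hex("\u00F1")); // prints C3-B1

        // One byte per character: 8 bytes for the 8 characters.
        byte[] truncated = FirstBytes("Jalape\u00F1o");
        Console.WriteLine(truncated.Length);

        // Proper UTF-8 encoding needs 9 bytes and round-trips cleanly.
        byte[] proper = Encoding.UTF8.GetBytes("Jalape\u00F1o");
        Console.WriteLine(proper.Length);
        Console.WriteLine(Encoding.UTF8.GetString(proper) == "Jalape\u00F1o");
    }
}
```

The truncated array is one byte short of the real UTF-8 encoding, which is why decoding it as UTF-8 later cannot recover the 'ñ'.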
Upvotes: 4
Reputation: 1062502
Since you're trying to fill the contents into a byte-array, don't bother with the reader - it isn't helping you. Use just the stream:
byte[] data = new byte[len];
int read, offset = 0;
while(len > 0 &&
    (read = input_stream.Read(data, offset, len)) > 0)
{
    len -= read;
    offset += read;
}
if(len != 0) throw new EndOfStreamException();
Upvotes: 1