Luca

Reputation: 12001

FileStream and Encoding

I have a program that saves a text file using the stdio interface. It swaps the 4 most significant bits with the 4 least significant bits of every byte, except for the CR and LF characters.
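As a small illustration of the swap described above, here is a minimal sketch in C#. The class and method names (`NibbleSwap`, `Swap`) are hypothetical and only serve the example; the original encoder is a C program that is not shown here.

```csharp
using System;

static class NibbleSwap
{
    // Exchange the high and low 4-bit halves of a byte.
    // CR (0x0D) and LF (0x0A) are passed through unchanged, as described above.
    public static byte Swap(byte b)
    {
        if (b == 0x0D || b == 0x0A)
        {
            return b;
        }
        return (byte)(((b & 0x0F) << 4) | ((b & 0xF0) >> 4));
    }
}
```

For example, `NibbleSwap.Swap(0x41)` ('A') gives 0x14, and swapping 0x14 again gives 0x41 back. The round trip holds for plain ASCII text; the bytes 0xD0 and 0xA0 would turn into CR and LF and could not be told apart from real line endings afterwards.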

I'm trying to "decode" this stream using a C# program, but I'm unable to get the original bytes.

        StringBuilder sb = new StringBuilder();
        StreamReader sr = new StreamReader("XXX.dat", Encoding.ASCII);
        string sLine;

        while ((sLine = sr.ReadLine()) != null) {
            string s = "";
            byte[] bytes = Encoding.ASCII.GetBytes(sLine);

            for (int i = 0; i < sLine.Length; i++) {
                byte c = bytes[i];
                byte lb = (byte)((c & 0x0F) << 4), hb = (byte)((c & 0xF0) >> 4);
                byte ascii = (byte)((lb) | (hb));

                s += Encoding.ASCII.GetString(new byte[] { ascii });
            }
            sb.AppendLine(s);
        }
        sr.Close();

        return (sb);

I've tried changing the encoding to UTF8, but it didn't work. I've also used a BinaryReader created from the 'sr' StreamReader, but that didn't help either.

        StringBuilder sb = new StringBuilder();
        StreamReader sr = new StreamReader("XXX.shb", Encoding.ASCII);
        BinaryReader br = new BinaryReader(sr.BaseStream);
        string sLine;
        string s = "";

        while (sr.EndOfStream == false) {
            byte[] buffer = br.ReadBytes(1);
            byte c = buffer[0];
            byte lb = (byte)((c & 0x0F) << 4), hb = (byte)((c & 0xF0) >> 4);
            byte ascii = (byte)((lb) | (hb));

            s += Encoding.ASCII.GetString(new byte[] { ascii });
        }
        sr.Close();

        return (sb);

If the file starts with 0xF2 0xF2 ..., I read everything except the expected value. Where is the error? (i.e.: 0xF6 0xF6).

This C code actually does the job:

            ...
while (fgets(line, 2048, bfd) != NULL) {
    int cLen = strlen(xxx), lLen = strlen(line), i;

    // Decode line
    for (i = 0; i < lLen-1; i++) {
        unsigned char c = (unsigned char)line[i];
        line[i] = ((c & 0xF0) >> 4) | ((c & 0x0F) << 4);
    }

    xxx = realloc(xxx , cLen + lLen + 2);
    xxx = strcat(xxx , line);
    xxx = strcat(xxx , "\n");
}
fclose(bfd);

What is wrong in the C# code?

Upvotes: 3

Views: 19164

Answers (2)

Luca

Reputation: 12001

Got it.

The problem is the BinaryReader construction:

StreamReader sr = new StreamReader("XXX.shb", Encoding.ASCII);
BinaryReader br = new BinaryReader(sr.BaseStream);

I think this constructs a BinaryReader based on the StreamReader, which "translates" the characters coming from the file.

This code actually works well:

FileInfo fi = new FileInfo("XXX.shb");
BinaryReader br = new BinaryReader(fi.OpenRead());

I wonder if it is possible to read this kind of data line by line with a text stream reader, since line endings are preserved during the "encoding" phase.
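One way this might work is to open the StreamReader with ISO-8859-1 instead of ASCII, since that encoding maps every byte 0x00 to 0xFF to the character with the same code, so nothing is replaced on the way in. The following is only a sketch under that assumption; `LineDecoder` and `DecodeLines` are hypothetical names, and it assumes the original text was plain ASCII so that no encoded byte collides with CR or LF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineDecoder
{
    // ISO-8859-1 maps each byte to the char with the same numeric value.
    static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    public static List<string> DecodeLines(Stream input)
    {
        var lines = new List<string>();
        // 'false' disables byte-order-mark detection, so the reader never switches encoding.
        using (var reader = new StreamReader(input, Latin1, false))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                var sb = new StringBuilder(line.Length);
                foreach (char ch in line)
                {
                    int b = ch; // 0..255, identical to the byte stored in the file
                    sb.Append((char)(((b & 0x0F) << 4) | ((b & 0xF0) >> 4)));
                }
                lines.Add(sb.ToString());
            }
        }
        return lines;
    }
}
```

It could be called as `LineDecoder.DecodeLines(File.OpenRead("XXX.shb"))`. ReadLine already strips the CR/LF terminators, so they never reach the swap.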

Upvotes: 2

jishi

Reputation: 24634

I guess you should use a BinaryReader and ReadBytes(), then call Encoding.ASCII.GetString() on the byte sequence only after you have swapped the nibbles.

In your example, you seem to read the file as ASCII (meaning the bytes are converted to .NET's internal two-byte character representation on read, on the assumption that they are ASCII), then convert the text BACK to bytes again, as ASCII bytes.

That round trip is unnecessary for you, and it is also lossy: ASCII is a 7-bit encoding, so a byte above 0x7F (such as 0xF2) is replaced by '?' when it is decoded.
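A minimal sketch of that approach, working on the raw bytes and converting to a string only at the end. The names `ByteDecoder` and `Decode` are hypothetical, and the sketch assumes the original text was plain ASCII.

```csharp
using System;
using System.IO;
using System.Text;

static class ByteDecoder
{
    public static string Decode(byte[] data)
    {
        var decoded = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
        {
            byte b = data[i];
            // CR and LF were written unchanged by the encoder, so keep them as they are.
            bool isLineEnding = b == 0x0D || b == 0x0A;
            decoded[i] = isLineEnding ? b : (byte)(((b & 0x0F) << 4) | ((b & 0xF0) >> 4));
        }
        // Only now interpret the bytes as text.
        return Encoding.ASCII.GetString(decoded);
    }
}
```

For a file that fits in memory this could be called as `ByteDecoder.Decode(File.ReadAllBytes("XXX.shb"))`; reading the bytes through a BinaryReader opened directly on the file would work the same way.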

Upvotes: 0
