GrzesiekO

Reputation: 1199

ASCII.GetString() stops at null character

I have a big problem. My code:

string doc = System.Text.Encoding.ASCII.GetString(stream);

The variable doc ends at the first null (\0) character, so a lot of data is missing after that point. I want to get the whole string. What's more, when I copy this code and run it in the Immediate Window in Visual Studio, everything is fine.

What am I doing wrong?

Upvotes: 2

Views: 2190

Answers (1)

xanatos

Reputation: 111850

No, it doesn't:

string doc = System.Text.Encoding.ASCII.GetString(new byte[] { 65, 0, 65 }); // "A\0A"
int len = doc.Length; // 3

The string itself is complete. But WinForms (and the Windows API) truncate it at the first \0 when displaying it.

Example: https://dotnetfiddle.net/yjwO4Y

I'll add that in Visual Studio 2013 the \0 is shown correctly everywhere except one place: the Text Visualizer (the magnifying glass) doesn't support \0 and truncates the string there.

Why does this happen? Historically there were two "models" for strings: C strings, which are NUL (\0) terminated and so can't contain \0 as a character, and Pascal strings, which have the length prepended and so can contain \0 as a character. From Wikipedia:
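To check that the data really is all there, you can inspect the string without going through any control or visualizer that truncates. A minimal sketch: the class NulDisplay and the helper MakeNulVisible are names made up for this illustration, and the three-byte array stands in for your real stream.

```csharp
using System;
using System.Text;

static class NulDisplay
{
    // Hypothetical helper (not a framework API): swaps every NUL for the
    // visible U+2400 "SYMBOL FOR NULL" so that display code cannot cut the text short.
    public static string MakeNulVisible(string s) => s.Replace('\0', '\u2400');

    static void Main()
    {
        byte[] stream = { 65, 0, 65 }; // stand-in for the real data
        string doc = Encoding.ASCII.GetString(stream);

        Console.WriteLine(doc.Length);        // 3: all three characters are in the string
        Console.WriteLine(doc.IndexOf('\0')); // 1: position of the first NUL

        // Use this copy only for display; keep doc itself unchanged.
        string printable = MakeNulVisible(doc);
        Console.WriteLine(printable.Length);  // 3: same length, NUL replaced
    }
}
```

If the \0 characters carry no meaning in your data, doc.Replace("\0", "") before displaying is another option, but that changes the content, so it depends on what the stream actually holds.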

Null-terminated strings were produced by the .ASCIZ directive of the PDP-11 assembly languages and the ASCIZ directive of the MACRO-10 macro assembly language for the PDP-10. These predate the development of the C programming language, but other forms of strings were often used.

Now, Windows is written in C and uses null-terminated strings (Microsoft later changed its mind: COM strings are closer to Pascal strings and can contain the NUL character). So Windows APIs can't handle the \0 character unless they are COM-based, and even the COM-based ones are probably buggy here quite often, because they aren't fully tested with \0. For .NET, Microsoft decided to use something similar to Pascal strings and COM strings, so .NET strings can contain \0.

WinForms is built directly on top of the Windows API, so it can't show the \0. WPF is instead built "from the ground up" in .NET, so in general it can show the \0 character.
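The difference between the two models can be reproduced from .NET by copying a string to unmanaged memory and reading it back both ways. This is only a sketch of the idea, not what WinForms does internally: the class and method names are invented for the example, while the Marshal calls are the standard System.Runtime.InteropServices ones.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeView
{
    // Reads the string back the way a NUL-terminated (C-style) consumer would:
    // PtrToStringUni(IntPtr) stops at the first \0 it finds.
    public static string ReadAsCString(string s)
    {
        IntPtr p = Marshal.StringToHGlobalUni(s);
        try { return Marshal.PtrToStringUni(p); }
        finally { Marshal.FreeHGlobal(p); }
    }

    // Reads the string back with an explicit length (the Pascal/COM/.NET model):
    // embedded \0 characters are ordinary characters.
    public static string ReadWithLength(string s)
    {
        IntPtr p = Marshal.StringToHGlobalUni(s);
        try { return Marshal.PtrToStringUni(p, s.Length); }
        finally { Marshal.FreeHGlobal(p); }
    }

    static void Main()
    {
        string doc = "A\0A";
        Console.WriteLine(ReadAsCString(doc).Length);  // 1: truncated at the \0
        Console.WriteLine(ReadWithLength(doc).Length); // 3: whole string survives
    }
}
```

The same bytes sit in memory in both cases; only the rule for finding the end of the string differs, which is why the .NET string is intact while a native control shows it cut short.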

Upvotes: 11
