Reputation: 7636
I am using VS 2012 and programming in C++. I have a wide string
wchar_t *str = L"Hello world".
Technically I read the string from a file but I don't know if that makes a difference. When I look at str
in the memory window it looks like this:
00 48 00 65 00 6c 00 6c 00 6f 00 2c 00 20 00 77 00 6f 00 72 00 6c 00 64 00 21 00
As you can see the string is stored in memory as big-endian.
When I hover my mouse over the string I get:
L"䠀攀氀氀漀Ⰰ 眀漀爀氀搀℀"
And after I reverse the endianness of str
the memory looks like:
48 00 65 00 6c 00 6c 00 6f 00 2c 00 20 00 77 00 6f 00 72 00 6c 00 64 00 21 00 00
And the hover over looks like:
L"Hello, world!"
It seems that the debugger displays UTF-16 in little-endian by default. My program reads big-endian files so it is very tedious to keep reversing the endianness of all strings to debug them. Is there any way to change the endianness of the debugger's display?
Except for debug purposes I can do all my processing in big endian.
Upvotes: 3
Views: 2184
Reputation: 1
You can set the endianness of the Memory window using the context menu. Right-click in the Memory window and check "Big Endian".
Upvotes: 0
Reputation: 153929
As has been pointed out, VS uses the native endianness, which is
little endian on an Intel/AMD. The problem is that you're not
reading the strings correctly; you should imbue the
std::istream
with a locale which reads UTF-16BE (since this is
apparently the encoding form you're trying to read).
std::istream
(or rather the backing std::filebuf
) will
automatically do the code translation on the fly when reading
and writing.
Upvotes: 1
Reputation: 36896
Your best shot in getting this to work is to define your own type and create a debugger type visualizer for it (see Customizing the Visual Studio Debugger Display of Your Data, or here).
Or maybe you can quick-hack it by shifting the address by 1 byte in watch window.
You're working with a non-native string format that just happens to "feel" similar to the native format. So you are tempted to think there should be almost a way to do it. But to the debugger, it's just a foreign binary format. The debugger is not designed to handle foreign endianness just as it does not handle visualizing an OGG stream packet.
If you want to use available tools for manipulating native-endian Unicode strings, you'll need to convert to native-endian Unicode format.
Upvotes: 1
Reputation: 13690
It's not only the debugger. The wchar_t function of Visual Studio are little endian as the host is. When you want to process the data you need to reverse the string endianess to little endian anyway.
It's worth to have this change even if you output the strings to a file with a different endianess. Strings are defined as a byte sequence, your endianess applied to a string looks strange anyhow.
Upvotes: 3