Reputation: 4227
I'm using a piece of code (found else where on this site) that checks endianness at runtime.
static bool isLittleEndian()
{
short int number = 0x1;
char *numPtr = (char*)&number;
std::cout << numPtr << std::endl;
std::cout << *numPtr << std::endl;
return (numPtr[0] == 1);
}
When in debug mode, the value numPtr looks like this: 0x7fffffffe6ee "\001"
I assume the first hexadecimal part is the pointer's memory address, and the second part is the value it holds. I'm know that \0 is null termination in old-style C++, but why is it at the front? Is it to do with endianness?
On a little-endian machine: 01 the first byte and therefore least significant (byte place 0), and \0 the second byte/final byte (byte place 1)?
In addition, the cout statements do not print the pointer address or it's value. Reasons for this?
Upvotes: 3
Views: 1312
Reputation: 153899
For starters: this type of function is totally worthless: on a machine
where sizeof(int)
is 4, there are 24 possible byte orders. Most, of
course, don't make sense, but I've seen at least three. And endianness
isn't the only thing which affects integer representation. If you have
an int
, and you want to get the low order 8 bits, use intValue &
0xFF
, for the next 8 bits, (intValue >> 8) & 0xFF
.
With regards to your precise question: I presume what you are describing
as "looks like this" is what you see in the debugger, when you break at
the return. In this case, numPtr
is a char*
(a unsigned char
const*
would make more sense), so the debugger assumes a C style
string. The 0x7fffffffe6ee
is the address; what follows is what the
compiler sees as a C style string, which it displays as a string, i.e.
"..."
. Presumably, your platform is a traditional little-endian
(Intel); the pointer to the C style string sees the sequence (numeric
values) of 1, 0
. The 0
is of course the equivalent of '\0'
, so it
considers this a one character string, with that one character having
the encoding of 1. There is no printable character with an encoding of
one, and it doesn't correspond to any of the normal escape sequences
(e.g. '\n'
, '\t'
, etc.) either. So the debugger outputs it using
the octal escape sequence, a '\'
followed by 1 to 3 octal digits.
(The traditional '\0'
is just a special case of this; a '\'
followed
by a single octal digit.) And it outputs 3 digits, because (probably)
it doesn't want to look ahead to ensure that the next character isn't an
octal digit. (If the sequence were the two bytes 1, 49
, for example,
49 is '1'
in the usual encodings, and if it output only a single byte
for the octal encoding of 1, the results would be "\11", which is a
single character string—corresponding in the usual encodings to
'\t'
.) So you get "
this is a string, \001
with first character
having an encoding of 1 (and no displayable representation), and "
that's the end of the string.
Upvotes: 1
Reputation: 103693
In addition, the cout statements do not print the pointer address or it's value. Reasons for this?
Because chars and char pointers are treated differently than integers when it comes to printing.
When you print a char, it prints the character from whatever character set is being used. Usually, this is ASCII, or some superset of ASCII. The value 0x1 in ASCII is non-printing.
When you print a char pointer, it doesn't print the address, it prints it as a null-terminated string.
To get the results you desire, cast your char pointer to a void pointer, and cast your char to an int.
std::cout << (void*)numPtr << std::endl;
std::cout << (int)*numPtr << std::endl;
Upvotes: 0
Reputation: 26094
The others have given you a clear answer to what "\000"
means, so this is an answer to your question:
On a little-endian machine: 01 the first byte and therefore least significant (byte place 0), and \0 the second byte/final byte (byte place 1)?
Yes, this is correct. Of you look at value like 0x1234, it consists of two bytes, the high part 0x12 and the low part 0x34. The term "little endian" means that the low part is stored first in memory:
addr: 0x34
addr+1: 0x12
Did you known that the term "endian" predated the computer industry? It was originally used by Jonathan Swift in his book Gulliver's Travels, where it described if people were eating the egg from the pointy or the round end.
Upvotes: 2
Reputation: 5532
The \0 isn't a NUL, the debugger is showing you numPtr as a string, the first character of which is \001 or control-A in ASCII. The second character is \000, which isn't displayed because NULs aren't shown when displaying strings. The two character string version of 'number' would appear as "\000\001" on a big-endian machine, instead of "\001\000" as it appears on little-endian machines.
Upvotes: 0
Reputation: 34563
That's not a \0
followed by "01", it's the single character \001
, which represents the number 1 in octal. That's the only byte "in" your string. There's another byte after it with the value zero, but you don't see that since it's treated as the string terminator.
Upvotes: 1
Reputation: 409136
The "\001"
you are seeing is just one byte. It's probably octal notation, which needs three digits to properly express the (decimal) values 0 to 255.
Upvotes: 0
Reputation: 65254
the easiest way to check for endianness is to let the system do it for you:
if (htonl(0xFFFF0000)==0xFFFF0000) printf("Big endian");
else printf("Little endian");
Upvotes: 1