Reputation: 37
I'm pretty inexperienced in C++, and I wrote the following code to see how characters and strings work.
#include "stdio.h"
#include <iostream>
#include <string>
using namespace std;
int main()
{
char asdf[] = "hello";
char test[5] = {'h','e','l','l','o'};
cout << test;
}
I was expecting it to output "hello", but instead I got "hellohello", which is really puzzling to me. I did some experimenting:
If I change asdf to another string of a different length, it outputs "hello" normally. If I change the number of characters in test, it outputs "hello" normally.
I thought this only happened when the two were the same length, but when I change them both to "hell" it seems to output "hell" normally.
To make things more confusing, when I asked a friend to run this code on their computer, it outputted "hello" and then a random character.
I'm running a fresh install of Code::Blocks on Ubuntu. Does anyone have any idea what is going on here?
Upvotes: 3
Views: 725
Reputation: 10396
cout prints every character starting at the given address (test here, or equivalently &test[0]) until it finds a null terminator. Because you haven't put a null terminator into the test array, it keeps reading past the end of the array until it happens to find one in memory. Reading past the end of the array is undefined behavior, so what gets printed is unpredictable.
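As a rough illustration of the scan described above, here is a minimal sketch of what a C-string printer conceptually does. The names collect_until_null and demo_adjacent are made up for this example and are not part of the standard library. To stay within defined behaviour, the demo uses a single buffer that holds "hello" twice with one terminator at the very end, which only mimics two adjacent arrays.

```cpp
#include <string>

// Hypothetical helper: gathers characters the way a C-string printer
// conceptually does, stopping only at the first '\0'.
std::string collect_until_null(const char* p) {
    std::string out;
    while (*p != '\0') {   // the terminator is the only stop condition
        out += *p;
        ++p;
    }
    return out;
}

// One buffer with a single terminator at the end. This imitates
// "test immediately followed by asdf" without reading out of bounds.
std::string demo_adjacent() {
    const char buffer[] = {'h','e','l','l','o',
                           'h','e','l','l','o','\0'};
    return collect_until_null(buffer);
}
```

Nothing in the loop knows that the first five characters were meant to be a separate string, which is why the missing terminator matters.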
Upvotes: 2
Reputation: 10998
Character sequences need a null terminator (\0).
char asdf[] = "hello"; // OK: String literals have '\0' appended at the end
char test[5] = {'h','e','l','l','o'}; // Oops, not null terminated. UB
Corrected:
char test[6] = {'h','e','l','l','o','\0'}; // OK
//        ^                         ^^^^
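The size difference between the two initializers can be checked at compile time. This is a small sketch, assuming a C++11 or later compiler; the names literal_init, list_init and literal_is_terminated are invented for the example.

```cpp
#include <cstddef>

// Initialized from a string literal: 5 letters plus the implicit '\0'.
constexpr char literal_init[] = "hello";
// Initialized from a brace list: exactly the 5 characters listed.
constexpr char list_init[] = {'h','e','l','l','o'};

static_assert(sizeof(literal_init) == 6, "literal carries a terminator");
static_assert(sizeof(list_init) == 5, "brace list has no terminator");
static_assert(literal_init[5] == '\0', "last element is the terminator");

// Runtime check of the same fact, for use in ordinary code.
bool literal_is_terminated() {
    return literal_init[sizeof(literal_init) - 1] == '\0';
}
```

Leaving the array size out lets the compiler count the elements, so the size and the initializer cannot get out of sync.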
Upvotes: 0
Reputation: 20718
This is undefined behaviour.
Raw char* or char[] strings in C and C++ must be null-terminated. That is, the string needs to end with a '\0' character. Your test[5] does not do that, so the function printing the output continues past the last o, because it is still looking for the terminator.
Because of how the arrays happen to be laid out on the stack here (the stack usually grows towards lower addresses), the next bytes it encounters are those of asdf[], which you initialized with "hello". This is what the memory layout looks like; the arrow indicates the direction in which memory addresses (think pointers) increase:
  ---->
+-------------------
|hellohello\0 ...
+-------------------
      \_ asdf
 \_ test
Now in C++ and C, string literals like "hello" are null-terminated implicitly, so the compiler writes a hidden '\0' after the end of the string. The output function continues to print the contents of asdf char-by-char until it reaches that hidden '\0', and then it stops.
If you were to remove asdf, you might see a bit of garbage after the first hello, or even a segmentation fault. Either way this is undefined behaviour, because you are reading outside the bounds of the test array. This also explains why it behaves differently on different systems: for example, some compilers may decide to lay out the variables in a different order on the stack, so that on your friend's system, test is actually lower on the stack (remember, lower on the stack means at a higher address):
  ---->
+-------------------
|hello\0hello ...
+-------------------
        \_ test
 \_ asdf
Now when you print the contents of test, it will print hello char-by-char, then continue reading memory until a \0 is found. The contents of ... are highly specific to the architecture and runtime used, possibly even the phase of the moon and time of day (not entirely serious), so on your friend's machine it prints a "random" character and then stops.
You can fix this by adding a '\0' or 0 to your test array (you will need to change the size to 6). However, using const char test[] = "hello"; is the sanest way to solve this.
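If the array really has to stay at five characters with no terminator, another option (not mentioned in the answers here, so treat it as a sketch) is to pass the length explicitly instead of relying on a terminator. The function names below are invented for the example; std::ostream::write and the std::string(pointer, count) constructor are standard.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Writes exactly sizeof(test) characters, so no terminator is needed.
std::string print_with_length() {
    const char test[5] = {'h','e','l','l','o'};
    std::ostringstream os;
    os.write(test, static_cast<std::streamsize>(sizeof test));
    return os.str();
}

// Copies exactly 5 characters into a std::string, which tracks its
// own length and can be streamed safely afterwards.
std::string as_std_string() {
    const char test[5] = {'h','e','l','l','o'};
    return std::string(test, sizeof test);
}
```

The same write call works on std::cout. In both functions the length comes from sizeof, so nothing is ever read beyond the array.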
Upvotes: 8
Reputation: 59987
The reason for this is that C style strings need the null character to mark the end of the string.
As you have not put one into the array test, cout will just keep printing characters until it finds one. In your case the array asdf happens to follow test in memory, but this cannot be guaranteed.
Instead change the code to this:
char test[] = {'h','e','l','l','o', 0};
Upvotes: 4
Reputation: 16017
Unless there is an overload of operator<< for a reference to an array of 5 chars, the array will "decay" to a pointer to char and be treated as a C-style string by the operator. C-style strings are by convention terminated with a 0 char, which your array is lacking. Therefore the operator continues outputting the bytes in memory, interpreting them as printable chars. It just so happens that on the stack, the two arrays were adjacent, so the operator ran into asdf's memory area, outputting those chars and finally encountering the implicit 0 char at the end of "hello". If you omit the other declaration, your program may crash, namely if the next 0 byte comes later than the memory boundary of your program.
It is undefined behavior to access memory outside an object (here: test) through a pointer to that object.
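To illustrate the point about decay, here is a sketch of a function that takes the array by reference, so the element count stays part of the type instead of being lost. write_array and demo_no_decay are hypothetical names for this example, not an existing overload in the standard library.

```cpp
#include <cstddef>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Binding to const char (&)[N] prevents decay: N is deduced from the
// array type, so exactly N characters are written.
template <std::size_t N>
std::ostream& write_array(std::ostream& os, const char (&arr)[N]) {
    return os.write(arr, static_cast<std::streamsize>(N));
}

std::string demo_no_decay() {
    const char test[5] = {'h','e','l','l','o'};
    std::ostringstream os;
    write_array(os, test);   // N is deduced as 5
    return os.str();
}
```

One caveat of this sketch: when called with a string literal such as "hello", N is deduced as 6 and the terminating '\0' is written as well, so it is only a good fit for arrays that deliberately lack a terminator.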
Upvotes: 0
Reputation: 49150
The last character should be '\0' to indicate the end of the string.
char test[6] = {'h','e','l','l','o','\0'};
Upvotes: 0
Reputation: 6990
You have to terminate your test array with an ASCII 0 char. What happens now is that in memory it is adjacent to your asdf string, so since test isn't terminated, the << will just continue until it meets the ASCII 0 at the end of asdf.
In case you wonder: when asdf is initialized from the string literal, this ASCII 0 is added automatically.
Upvotes: 5