Kieron Dowie

Reputation: 37

Unexpected output of char arrays in c++

I'm pretty inexperienced in C++, and I wrote the following code to see how characters and strings work.

#include "stdio.h"
#include <iostream>
#include <string>

using namespace std;

int main()
{
    char asdf[] = "hello";
    char test[5] = {'h','e','l','l','o'};
    cout << test;
}

I expected it to output "hello", but instead I got "hellohello", which is really puzzling to me. I did some experimenting:

If I change the asdf to another string of a different length, it outputs "hello" normally. If I change the amount of characters in test it outputs "hello" normally.

I thought this only happened when the two were the same length, but when I change them both to "hell" it seems to output "hell" normally.

To make things more confusing, when I asked a friend to run this code on their computer, it outputted "hello" and then a random character.

I'm running a fresh install of Code::Blocks on Ubuntu. Does anyone have any idea what is going on here?

Upvotes: 3

Views: 725

Answers (7)

Matthias247

Reputation: 10396

cout prints characters starting at the given address (test here, or equivalently &test[0]) until it finds a null terminator. Since you haven't put a null terminator into the test array, it keeps printing until it accidentally finds one in memory. Reading past the end of the array is undefined behavior, so from that point on anything can happen.

Upvotes: 2

Andreas DM

Reputation: 10998

Character sequences used as C-style strings need a null terminator ('\0').

char asdf[] = "hello"; // OK: String literals have '\0' appended at the end
char test[5] = {'h','e','l','l','o'}; // Oops, not null-terminated: printing it with << is UB

Corrected:

char test[6] = {'h','e','l','l','o','\0'}; // OK
//        ^                         ^^^^

Upvotes: 0

Jonas Schäfer

Reputation: 20718

This is undefined behaviour.

Raw char* or char[] strings in C and C++ must be null-terminated. That is, the string needs to end with a '\0' character. Your test[5] does not, so the function printing the output continues after the last o, because it is still looking for the null terminator.

Due to how the strings are stored on the stack (the stack usually grows towards lower addresses), the next bytes it encounters are those of asdf[], which you initialized with "hello". This is what the memory layout looks like; the arrow indicates the direction in which memory addresses (think pointers) increase:

      ---->
  +-------------------
  |hellohello\0 ...
  +-------------------
        \_ asdf
   \_ test

Now in C++ and C, string literals like "hello" are null-terminated implicitly, so the compiler writes a hidden '\0' after the end of the string. The output function continues to print the contents of asdf char-by-char until it reaches that hidden '\0', and then it stops.

If you were to remove asdf, you would likely see a bit of garbage after the first hello and possibly a segmentation fault. But this is undefined behaviour, because you are reading out of the bounds of the test array. This also explains why it behaves differently on different systems: for example, some compilers may decide to lay out the variables in a different order on the stack, so that on your friend's system, test is actually lower on the stack (remember, lower on the stack means at a higher address):

      ---->
  +-------------------
  |hello\0hello ...
  +-------------------
          \_ test
   \_ asdf

Now when you print the contents of test, it will print hello char-by-char, then continue reading the memory until a \0 is found. The contents of ... are highly specific to the architecture and runtime used, possibly even the phase of the moon and time of day (not entirely serious), so that on your friend's machine it prints a "random" character and then stops.

You can fix this by adding a '\0' or 0 to your test array (you will need to change the size to 6). However, using const char test[] = "hello"; is the sanest way to solve this.

Upvotes: 8

Ed Heal

Reputation: 59987

The reason for this is that C-style strings need the null character to mark the end of the string.

As you have not put one into the array test, it will just keep printing characters until it finds one. In your case the array asdf happens to follow test in memory, but this cannot be guaranteed.

Instead change the code to this:

 char test[] = {'h','e','l','l','o', 0};

Upvotes: 4

Peter - Reinstate Monica

Reputation: 16017

Unless there is an overload of operator<< for a reference to an array of 5 chars, the array will "decay" to a pointer to char and be treated as a C-style string by the operator. C-style strings are by convention terminated with a 0 char, which your array is lacking. Therefore the operator continues outputting the bytes in memory, interpreting them as printable chars. It just so happens that on the stack, the two arrays were adjacent, so the operator ran into asdf's memory area, outputting those chars and finally encountering the implicit 0 char at the end of "hello". If you omit the other declaration, your program may crash, namely if the next 0 byte lies beyond the memory boundary of your program.

It is undefined behavior to access memory outside an object (here: test) through a pointer to that object.

Upvotes: 0

GorvGoyl

Reputation: 49150

The last character should be '\0' to indicate the end of the string.

 char test[6] = {'h','e','l','l','o','\0'};

Upvotes: 0

Jacques de Hooge

Reputation: 6990

You have to terminate your test array with an ASCII 0 char. What happens now is that in memory it is adjacent to your asdf string, so since test isn't terminated, the << will just continue until it meets the ASCII 0 at the end of asdf.

In case you wonder: when asdf is initialized from the string literal, this ASCII 0 is added automatically.

Upvotes: 5
