Reputation: 1881
I am trying to print some characters on screen, and I have tried a wide variety of wrong implementations. For example:
Example 1:
#include <stdio.h>
int main(int argc, char const *argv[]) {
char hello[6] = {'h', 'e', 'l', 'l', 'o', '\n'};
char bye[5] = {'b', 'y', 'e', '\n', '\0'};
char end[] = "end";
char dot = '.';
char oops[] = {'o', 'o', 'p', 's'};
printf("%s\n", hello);
printf("%s\n", bye);
printf("%s\n", end);
printf("%s\n", dot);
printf("%s\n", oops);
return 0;
}
Output 1:
Questions:
Why was the dot character printed before bye?
What is the rubbish after the dot character?
Example 2 (removed dot declaration/definition, print):
#include <stdio.h>
int main(int argc, char const *argv[]) {
char hello[6] = {'h', 'e', 'l', 'l', 'o', '\n'};
char bye[5] = {'b', 'y', 'e', '\n', '\0'};
char end[] = "end";
char oops[] = {'o', 'o', 'p', 's'};
printf("%s\n", hello);
printf("%s\n", bye);
printf("%s\n", end);
printf("%s\n", oops);
return 0;
}
Output 2:
Questions
The rubbish are still there! I am feeling good for them but why the end was printed twice?
I managed to do it correctly by adding the terminating character wherever it is needed, but why is the printing so inconsistent? I am coming from a Java background and this already feels weird!
Upvotes: 2
Views: 1332
Reputation: 16876
The reason you're getting weird output is that you have undefined behavior. Functions that handle strings (including printf with the %s format specifier) expect them to be null terminated; otherwise the behavior is undefined and no particular output is guaranteed. It can work as expected, give weird output, crash the program, or do something else entirely. In practice, printing an unterminated string usually keeps going until it hits an access violation and crashes, or until it encounters the next null byte (which will probably happen sooner rather than later; until then, it prints "rubbish").
This is also why changing seemingly unrelated things can make the UB manifest in different ways. Because of the changes in Example 2, the stack is laid out differently, so the out-of-bounds read lands on different data. It is possible that the dot printed in Example 1 when you printed hello is actually the dot from char dot = '.';.
char hello[6] = {'h', 'e', 'l', 'l', 'o', '\n'};
Printing this is undefined behavior because it's not null terminated.
char bye[5] = {'b', 'y', 'e', '\n', '\0'};
You can print this one as a string, as it is null terminated.
char end[] = "end";
Since you made this with a string literal, a null terminator is added automatically, so you can print it. char end[] is an array of four chars: 'e', 'n', 'd' and '\0'.
char dot = '.';
This is one char, not a string. Printing that with %s is undefined behavior and your compiler probably warns you. Print it with %c instead.
char oops[] = {'o', 'o', 'p', 's'};
Same as the first one: the null terminator is missing, so printing it with %s is undefined behavior.
The rubbish are still there! I am feeling good for them but why the end was printed twice?
Compilers often place such arrays next to each other in memory (here, most likely on the stack). So it can happen that your end array comes right after the not null-terminated oops. In memory it then looks like oopsend\0, which is why it can print "oopsend" (or do something entirely different, as it's still undefined behavior).
Generally there's not much to gain from wondering why UB does what it does as it is not consistent (it could do something different each time you run it, on a different compiler/machine, or work until you demonstrate it to someone else and then suddenly crash the program). Just look at it as something wrong that you should not do.
Upvotes: 1
Reputation: 1401
Your first string array is not null terminated.
char hello[6] = {'h', 'e', 'l', 'l', 'o', '\n'};
When you pass a string that is not null terminated to printf (or to other string functions such as strlen, strdup, strcat, etc.), the behavior is undefined. It may print more bytes after your array until it reaches a null (zero) byte (as in your case, where they show up as garbage), or it may crash the program.
To fix the behavior, all your string arrays need to end with a null terminator, i.e. have a '\0' character at the end.
If you initialize a string like this
char end[] = "end";
it is automatically null terminated, and the array is given the correct length, including room for the terminator.
Upvotes: 1