Reputation: 23
I need to understand this for my computer science project, but I am not sure what is going on. The code is:
#include <stdio.h>
void main () {
char x[6] = "12345\0";
char y[6] = "67890\0";
y[7]='A';
printf("X: %s\n",x);
printf("Y: %s\n",y);
}
And the output is:
X: 1A345
Y: 67890
Now I'm not sure why the A is in the 2nd element of the x array when I clearly specify y.
Upvotes: 2
Views: 110
Reputation: 134356
What is going on is that your program exhibits undefined behaviour. Remember, array indexing in C is 0-based. To clarify:
When you initialize an array of 6 char elements with a string literal like "12345\0", the literal already carries an implicit terminating null after the characters you wrote, so it is seven characters long. In C the array takes the six characters that fit and the implicit null is dropped (C++ rejects this as an error), so the initialization itself is not what invokes undefined behaviour, but it is easy to get wrong. In the case of char x[6] = "12345\0"; you don't need the \0 as part of the string literal. Also, it's always better to leave the allocation of elements (the size of the array, in other words) to the compiler when you're providing a string literal for initialization. You can use
char x[ ] = "12345";
Then, for an array with n elements, the valid indices run from [0] to [n-1]. Accessing outside the allocated memory is UB, again. For example, the array x above can (and should) be accessed safely in a range like
size_t len = strlen(x); // get the length of the string (strlen needs <string.h>)
for (size_t i = 0; i < len; i++)
{
    x[i] = (char)(i * i); // access the array, staying within [0, len-1]
}
That said, please note that the recommended signature of main() is int main(void).
Upvotes: 3
Reputation: 127
I'm not 100% sure on this, but from what I can see you have written data outside the bounds of the array, and it landed in an adjacent block of memory.
char x[6] = "12345\0";
char y[6] = "67890\0";
The confusion may come from the expectation that, since y is declared after x, the memory should be laid out as follows:
x[0] x[1] x[2] x[3] x[4] x[5] y[0] y[1] y[2] y[3] y[4] y[5]
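This does not come down to Big Endian v Little Endian, though it is easy to mix the two up.
In big endian storage, the most significant byte of a multi-byte value is stored at the smallest address; in little endian, it is stored at the biggest address. That only describes the bytes inside a single value, not where separate variables go.
The C standard leaves the placement of local variables to the compiler, and on a lot of common hardware (for example, a lot of Intel hardware) the stack grows toward lower addresses, which may mean that your arrays were actually stored like this in memory: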
y[0] y[1] y[2] y[3] y[4] y[5] x[0] x[1] x[2] x[3] x[4] x[5]
If this is the case, then writing to y[7] would actually correspond to setting x[1], i.e. the second element of the x array, resulting in an overwrite of data and the following result: X: 1A345 Y: 67890
Upvotes: 0
Reputation: 72256
There is a big problem here:
char y[6] = "67890\0";
y[7]='A';
y is an array that has 6 elements, indexed from 0 (i.e. 0, 1 ... 5). This means y[7] is an invalid expression and assigning a value to it is undefined behaviour.
You write outside the boundaries of the y array and, due to the way the x and y arrays are placed in memory, it happened that you overwrote the second element of x.
Using a different OS, compiler or compiler flags can produce a different placement of the x and y variables in memory and the code will write 'A' somewhere else. It's even possible to write in a read-only memory area and in that case the OS will terminate your program because of a page fault exception.
This is why it's called undefined behaviour.
Upvotes: 0
Reputation: 11
char x[6] = "12345\0";
char y[6] = "67890\0";
y[7]='A';
Array indices in C/C++ start at zero. That means you can access only indices 0 to 5 of x and y.
Accessing y[7] causes undefined behavior. In this case, most probably the stack is growing downwards and the write overwrites the second element of x (i.e. x[1]).
Related Read: http://en.wikipedia.org/wiki/Stack_buffer_overflow
Upvotes: 0
Reputation: 1292
You specified that there are two arrays, each of which is six bytes in size; that means that they'll have elements numbered 0 through 5 (since C uses zero-based array offsets, not one-based as in some other languages).
Since you try to access y[7], you're accessing an element that isn't part of your array. C doesn't do bounds checking, so you get into undefined behaviour. In the particular combination of compiler, compiler options, operating system, processor architecture, etc, that you're using, it just so happens that there's no space between x and y, and that x is behind y; so when you access an element two places behind the end of array y, you end up accessing the memory occupied by array x. Change one of those elements (operating system/compiler (options)/processor), and the result may be wildly different. It still won't be what you expect, though.
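Also note that the \0 is superfluous: a string literal already ends with an implicit terminator, so your compiler is effectively trying to assign "12345\0\0" to the array, which is seven bytes for a six-byte array. In C the implicit terminator is dropped when there is no room for it, so this is not an overflow, but it is an error in C++. A compiler may give a warning, but it isn't required to.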
Upvotes: 2
Reputation: 504
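When you declare an array of length n in C, you get exactly n elements; the compiler does not add extra room for a null terminator. It only sizes the array to fit the terminator when you leave the size out, as in char x[] = "12345";. A string literal is already null-terminated, so you do not need to add the null terminator yourself. The undefined behavior actually comes from writing to y[7].
Try removing the null terminator and the out-of-range write. Let me know what happens!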
Upvotes: -3