Reputation: 889
I have a simple struct containing a string defined as a char array. I thought that copying an instance of the struct to another instance using the assignment operator would simply copy the memory address stored in the char pointer. Instead it seems that the string content is copied. I put together a very simple example:
#include <stdio.h>
#include <string.h>
struct Test{
    char str[20];
};
int main(){
    struct Test t1, t2;
    strcpy(t1.str, "Hello");
    strcpy(t2.str, "world");
    /* %p expects a void* argument */
    printf("t1: %s %p\n", t1.str, (void*)(t1.str));
    printf("t2: %s %p\n", t2.str, (void*)(t2.str));
    t2 = t1;
    printf("t2: %s %p\n", t2.str, (void*)(t2.str));
    return 0;
}
Compiling this code with gcc 4.9.2 I get:
t1: Hello 0x7fffb8fc9df0
t2: world 0x7fffb8fc9dd0
t2: Hello 0x7fffb8fc9dd0
As I understand it, after t2 = t1, t2.str points to the same memory address it pointed to before the assignment, but that address now holds the same string found in t1.str. So it seems to me that the string content has been automatically copied from one memory location to another, something that I thought C would not do. I think that this behaviour is triggered by the fact that I declared str as a char[], not as a char*. Indeed, trying to assign one string directly to another with t2.str = t1.str gives this error:
Test.c: In function ‘main’:
Test.c:17:10: error: assignment to expression with array type
t2.str = t1.str;
^
which makes me think that arrays are effectively treated differently from pointers in some cases. Still, I can't figure out what the rules for array assignment are, or in other words, why arrays inside a struct are copied when I copy one struct into another, yet I can't directly copy one array into another.
Upvotes: 4
Views: 8276
Reputation: 17593
In C a struct is a way for the compiler to know how to structure an area of memory. A struct is a kind of template or stencil which the C compiler uses to figure out how to calculate offsets to the various members of the struct.
The first C compilers did not allow struct assignment, so people had to use memcpy() to copy structs; later compilers added support for it. A C compiler typically implements a struct assignment by copying the bytes of the struct's memory area, usually including any padding bytes added for address alignment, from one address to another. Whatever happens to be in the source memory area is copied to the destination area. Nothing smart is done about the copy: it just copies so many bytes of data from one memory location to another.
If you have a string array in the struct, or any kind of array, then the entire array will be copied, since it is part of the struct.
If the struct contains pointer variables, then those pointer variables will also be copied from one area to the other. The result is two structs with the same data. Because the two areas are copies of each other, a particular pointer in one struct holds the same address as the corresponding pointer in the other struct, and both point to the same location.
Remember that a struct assignment is just copying bytes of data from one area of memory to another. For instance, suppose we have a simple struct with a char array, with C source looking like:
typedef struct {
    char tt[50];
} tt_struct;
void test (tt_struct *p)
{
    tt_struct jj = *p;
    tt_struct kk;
    kk = jj;
}
The assembler listing output by the Visual Studio 2005 C++ compiler in debug mode for the assignment kk = jj; looks like:
; 10 : tt_struct kk;
; 11 :
; 12 : kk = jj;
00037 b9 0c 00 00 00 mov ecx, 12 ; 0000000cH
0003c 8d 75 c4 lea esi, DWORD PTR _jj$[ebp]
0003f 8d 7d 88 lea edi, DWORD PTR _kk$[ebp]
00042 f3 a5 rep movsd
00044 66 a5 movsw
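This bit of code copies the data one 4-byte word at a time from one memory location to another: rep movsd with ecx set to 12 copies 12 four-byte words (48 bytes), and the final movsw copies the remaining 2 bytes, for 50 bytes in total. With a smaller char array, the compiler may choose a different, more efficient series of instructions to copy the memory.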
In C arrays are not really handled in a smart way. An array is not seen as a data structure in the same way that Java sees an array. In Java an array is an object that knows its own length. In C an array is just a contiguous memory area, and in most expressions the array name is converted to a pointer to its first element, so it behaves much like a constant pointer, one that cannot be changed. The result is that in C you can have an array, say int myInts[5];, which Java would see as an array of five ints, whereas C mostly treats the name myInts as a label for the address of the first element. In Java, if you try to access an array element out of range, say myInts[i] where i is 8, you will get a runtime error. In C, the same access will not produce a runtime error unless you are working with a debug build from a compiler that does runtime checks. Experienced C programmers tend to treat arrays and pointers as similar constructs, but arrays are not exactly pointers: they share some characteristics with pointers and carry restrictions of their own, such as not being assignable.
This kind of buffer overflow error is very easy to make in C by accessing an array past its number of elements. The classic example is a string copy from one char array into another where the source has no terminating zero character, resulting in a copy of a few hundred bytes when you expected ten or fifteen.
Upvotes: 0
Reputation: 311038
If you run the following simple program
#include <stdio.h>
int main( void )
{
    {
        struct Test
        {
            char str[20];
        };
        /* in C the type must be spelled "struct Test" */
        printf( "%zu\n", sizeof( struct Test ) );
    }
    {
        struct Test
        {
            char *str;
        };
        printf( "%zu\n", sizeof( struct Test ) );
    }
    return 0;
}
you will get a result similar to the following
20
4
So the first structure contains a character array of 20 elements, while the second structure contains only a pointer of type char * (4 bytes on this platform; typically 8 on a 64-bit system).
When one structure is assigned to another, its data members are copied. So for the first structure the entire content of the array is copied into the other structure. For the second structure only the value of the pointer (the address it contains) is copied. The memory pointed to by the pointer is not copied because it is not contained in the structure itself.
Arrays are not pointers, although in expressions the names of arrays are usually (with rare exceptions) converted to pointers to their first elements.
Upvotes: 0
Reputation: 6756
There really are 20 characters in your struct; it is the same as if you had declared it as struct Test {char c1; char c2; ...}
If you want to copy only a pointer to the string, you can change the struct declaration as below and manually manage the memory for the string via the functions Test_init and Test_delete.
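#include <stdlib.h>  /* malloc, free */
struct Test{
    char* str;
};
/* Allocates len bytes for the string; str is NULL if allocation fails. */
void Test_init(struct Test* test, size_t len) {
    test->str = malloc(len);
}
/* Releases the memory owned by the struct. */
void Test_delete(struct Test* test) {
    free(test->str);
}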
Upvotes: 0
Reputation: 1826
The structure contains no pointer, but 20 chars.
After t2 = t1, the 20 chars of t1 are copied into t2.
Upvotes: 12