Reputation: 1326
I'm having trouble understanding what the difference between these two code snippets is:
// out is of type char* of size N*D
// N, D are of type int
for (int i=0; i!=N; i++){
    if (i % 1000 == 0){
        std::cout << "i=" << i << std::endl;
    }
    for (int j=0; j!=D; j++) {
        out[i*D + j] = 5;
    }
}
This code runs fine, even for very big data sets (N=100000, D=30000). From what I understand about pointer arithmetic, this should give the same result:
for (int i=0; i!=N; i++){
    if (i % 1000 == 0){
        std::cout << "i=" << i << std::endl;
    }
    char* out2 = &out[i*D];
    for (int j=0; j!=D; j++) {
        out2[j] = 5;
    }
}
However, the latter does not work for a very big data set: it freezes at index 143886 (I think it segfaults, but I'm not 100% sure, as I'm not used to developing on Windows). I'm afraid I'm missing something obvious about how pointer arithmetic works. Could it be related to advancing a char*?
EDIT: We have now established that the problem was an overflow of the index (i.e. (i*D + j) >= 2^32), so using uint64_t instead of int32_t fixed the problem. What's still unclear to me is why the first above case would run through, while the other one segfaults.
Upvotes: 2
Views: 1937
Reputation: 183
When N is the size of an array, why use int? Does a negative array size have any logical meaning?
What do you mean by "doesn't work"?
Just think of pointers as addresses in memory and not as 'objects'.
char*
void*
int*
are all pointers to memory addresses, so on typical platforms they are represented in exactly the same way when they are defined or passed into a function.
char buf[8];
char* a = buf;
int* b = (int*)a;
void* c = (void*)b;
// (void*)a == (void*)b and (void*)b == c: all three hold the same address
The difference is that when accessing a[i], the value retrieved is the sizeof(*a) bytes starting at the address a + i*sizeof(*a).
And when using ++ to advance a pointer, the address that the pointer holds is advanced by sizeof(*pointer) bytes, i.e. the size of the pointed-to type.
Example (assuming sizeof(int) == 4; the numeric addresses are for illustration only):
char* a = (char*)1;
a++;
a is now 2.
a = (char*)((int*)a + 1);
a is now 6.
Another thing:
char* a = out; // any valid char buffer with at least 11 elements
char* b = a + 10;
&(a[10]) == b
because in the end
a[10] == *(a + 10)
So there should not be a problem with array sizes in your example, because the two examples are the same.
EDIT
Now note that a negative index is not converted to a positive one: data[a] is simply *(data + a), so a negative a refers to memory before the start of the array.
int a = -5;
char* data;
data[a] == *(data - 5)
For that reason it might be that (when using signed values as array sizes and indices!) your two examples will actually not get the same result: once an int index overflows, the behavior is undefined, and in practice the index often wraps around to a negative value.
Upvotes: 1
Reputation: 15872
Version 1
for (int i=0; i!=N; i++) // i starts at 0 and increments until it equals N. Note: if i ever skips N (e.g. N is negative), the loop will not stop there. Use i < N instead
{
    if (i % 1000 == 0) // if i is a multiple of 1000
    {
        std::cout << "i=" << i << std::endl; // print i
    }
    for (int j=0; j!=D; j++) // same as with i, only j is going to D (same problem, should be j < D)
    {
        out[i*D + j] = 5; // this is a way of faking a 2D array by making a large 1D array and doing the math yourself to offset the placement
    }
}
Version 2
for (int i=0; i!=N; i++) // same as before
{
    if (i % 1000 == 0) // same as before
    {
        std::cout << "i=" << i << std::endl; // same as before
    }
    char* out2 = &out[i*D]; // store the location of out[i*D]
    for (int j=0; j!=D; j++)
    {
        out2[j] = 5; // set out[i*D+j] = 5;
    }
}
They are doing the same thing, but if out is not large enough, they will both behave in an undefined manner (and likely crash).
Upvotes: -1