Reputation: 131
I am programming a 512 bits integer in C++.
For the integer, I allocate memory from the heap using the new
keyword, but the compiler (g++ version 8.1 on MINGW) seems to wrongfully optimize that out.
i.e compiler commands are:
g++ -Wall -fexceptions -Og -g -fopenmp -std=c++14 -c main.cpp -o main.o
g++ -o bin\Debug\cs.exe obj\Debug\main.o -O0 -lgomp
Code:
#include <iostream>
#include <cstdint>
#include <omp.h>
constexpr unsigned char arr_size = 16;
constexpr unsigned char arr_size_half = 8;
void exit(int);
struct uint512_t{
uint32_t * bytes;
uint512_t(uint32_t num){
//The line below is either (wrongfully) ignored or (wrongfully) optimized out
bytes = new(std::nothrow) uint32_t[arr_size];
if(!bytes){
std::cerr << "Error - not enough memory available.";
exit(-1);
}
*bytes = num;
for(uint32_t * ptr = bytes+1; ptr < ptr+16; ++ptr){
//OS throws error 0xC0000005 (accessing unallocated memory) here
*ptr = 0;
}
}
uint512_t inline operator &(uint512_t &b){
uint32_t* itera = bytes;
uint32_t* iterb = b.bytes;
uint512_t ret(0);
uint32_t* iterret = ret.bytes;
for(char i = 0; i < arr_size; ++i){
*(iterret++) = *(itera++) & *(iterb++);
}
return ret;
}
uint512_t inline operator =(uint512_t &b){
uint32_t * itera=bytes, *iterb=b.bytes;
for(char i = 0; i < arr_size; ++i){
*(itera++) = *(iterb++);
}
return *this;
}
uint512_t inline operator + (uint512_t &b){
uint32_t * itera = bytes;
uint32_t * iterb = b.bytes;
uint64_t res = 0;
uint512_t ret(0);
uint32_t *p2ret = ret.bytes;
uint32_t *p2res = 1+(uint32_t*)&res;
//#pragma omp parallel for shared(p2ret, res, p2res, itera, iterb, ret) private(i, arr_size) schedule(auto)
for(char i = 0; i < arr_size;++i){
res = *p2res;
res += *(itera++);
res += *(iterb++);
*(p2ret++) = (i<15) ? res+*(p2res) : res;
}
return ret;
}
uint512_t inline operator += (uint512_t &b){
uint32_t * itera = bytes;
uint32_t * iterb = b.bytes;
uint64_t res = 0;
uint512_t ret(0);
uint32_t *p2ret = ret.bytes;
uint32_t *p2res = 1+(uint32_t*)&res;
//#pragma omp parallel for shared(p2ret, res, p2res, itera, iterb, ret) private(i, arr_size) schedule(auto)
for(char i = 0; i < arr_size;++i){
res = *p2res;
res += *(itera++);
res += *(iterb++);
*(p2ret++) = (i<15) ? res+(*p2res) : res;
}
(*this) = ret;
return *this;
}
//uint512_t inline operator * (uint512_t &b){
//}
~uint512_t(){
delete[] bytes;
}
};
int main(void){
uint512_t a(3);
}
Upvotes: 0
Views: 97
Reputation: 7482
The error is at this line and has nothing to do with new
being optimized away:
for(uint32_t * ptr = bytes+1; ptr < ptr+16; ++ptr){
*ptr = 0;
}
The condition for the for
is wrong. ptr < ptr+16
will never be false. The loop will go on forever and eventually you will dereference an invalid memory location because ptr
gets incremented ad-infinitum.
By the way, the compiler is allowed to perform optimizations but it is not allowed to change the apparent behavior of the program. If your code performs a new
, the compiler can optimize it away if it can ensure that the side effects of new
are there when you need them (in this case at the moment you access the array).
Upvotes: 5
Reputation: 131
p.s i tried your solution, and it worked fine -
bytes = new(std::nothrow) uint32_t[arr_size];
if(!bytes){
std::cerr << "Error - not enough memory available.";
exit(-1);
}
*bytes = num;
auto ptrp16 = bytes+16;
for(uint32_t * ptr = bytes+1;ptr < ptrp16 ; ++ptr){
*ptr = 0;
}
Upvotes: 0
Reputation: 238351
ptr < ptr+16
is always true. The loop is infinite, and eventually overflows the buffer that it writes to.
Simple solution: Value initialise the array so that you don't need the loop:
bytes = new(std::nothrow) uint32_t[arr_size]();
// ^^
PS. If you copy an instance, the behaviour will be undefined, since the copy would point to same allocation and both instances would attempt to delete it in the destructor.
Simple solution: Don't use bare owning pointers. Use a RAII container such as std::vector
if you need to allocate an array dynamically.
PPS. Carefully consider whether you need dynamic allocation (and the associated overhead) in the first place. 512 bits is in many cases a fairly safe size to have in-place.
Upvotes: 5
Reputation: 141145
You are accessing the array out of bound. The smallest reproducible example would be:
#include <cstdint>
int main() {
uint32_t bytes[16];
for(uint32_t * ptr = bytes + 1; ptr < ptr + 16; ++ptr){
//OS throws error 0xC0000005 (accessing unallocated memory) here
*ptr = 0;
}
}
The ptr < ptr + 16
is always true (maybe except for overflow).
Upvotes: 0