Claudius

Reputation: 580

memcpy not optimised out during attempt at ‘fast’ pimpl

I need to use a very large and complex header-only class (think boost::multiprecision::cpp_bin_float<76>, called BHP below) which I would like to hide behind a pimpl-like implementation, purely to reduce compilation time in a somewhat large project (replacing the Boost class with std::complex<double> reduced compilation times by approx. 50%).

However, I would like to avoid dynamic memory allocations. Hence, something like this seems natural (ignoring alignment issues for now which can be avoided using aligned_storage or alignas):

struct Hidden {
  char data[sz];

  Hidden& punned(Hidden const& other);
};

Hidden::punned can then be defined in a single translation unit to cast data to BHP*, act on it and not pollute all other translation units with 170k LOC of header files. A possible implementation might be

Hidden& Hidden::punned(Hidden const& other) {
  *(BHP*)(data) += *(BHP*)(other.data);
  return *this;
}

This, of course, is undefined behaviour, because we access an object of type BHP through a pointer of type char, thus violating strict aliasing rules. The proper way to do this is:

Hidden& Hidden::proper(Hidden const& other) {
  BHP tmp; std::memcpy(&tmp, data, sz);
  BHP tmp2; std::memcpy(&tmp2, other.data, sz);
  tmp += tmp2;
  std::memcpy(data, &tmp, sz);
  return *this;
}

Now it might appear ‘obvious’ that these memcpy calls could be optimised out. Unfortunately, this is not the case: they remain and make proper() much larger than punned().

I would like to know the correct way to a) store the data directly in the Hidden object, b) avoid unnecessary copies when re-interpreting it, c) avoid violating the strict aliasing rule and d) avoid carrying around an extra pointer to the storage area.

There is a godbolt link here; note that all compilers I tested (GCC 4.9 - trunk, Clang 3.9, 4.0 and 5.0 and Intel 18) did not ‘optimise out’ the memcpy. Some versions of GCC (e.g. 5.3) also outright complain about a violation of the strict aliasing rule, though not all of them do. I’ve also inserted a Direct class which knows about BHP and hence can call it directly, but I would like to avoid this.

Minimal working example:

#include <cstring>

constexpr std::size_t sz = 64;

struct Base {
  char foo[sz];
  Base& operator+=(Base const& other) { foo[0] += other.foo[0]; return *this; }
};
typedef Base BHP;

// or:
//#include <boost/multiprecision/cpp_bin_float.hpp>
//typedef boost::multiprecision::number<boost::multiprecision::cpp_bin_float<76> > BHP;

struct Hidden {
  char data[sz];

  Hidden& proper(Hidden const& other);
  Hidden& punned(Hidden const& other);
};

Hidden& Hidden::proper(Hidden const& other) {
  BHP tmp; std::memcpy(&tmp, data, sz);
  BHP tmp2; std::memcpy(&tmp2, other.data, sz);
  tmp += tmp2;
  std::memcpy(data, &tmp, sz);
  return *this;
}

Hidden& Hidden::punned(Hidden const& other) {
  *(BHP*)(data) += *(BHP*)(other.data);
  return *this;
}

struct Direct {
  BHP member;
  Direct& direct(Direct const& other);
};

Direct& Direct::direct(Direct const& other) {
  member += other.member;
  return *this;
}

struct Pointer {
  char storage[sz];
  BHP* data;

  Pointer& also_ok(Pointer const& other);
};

Pointer& Pointer::also_ok(Pointer const& other) {
  *data += *other.data;
  return *this;
}

Upvotes: 4

Views: 159

Answers (1)

Barry

Reputation: 303387

This, of course, is undefined behaviour, because we access an object of type BHP through a pointer of type char.

That's actually not the case. Going through the char buffer is fine provided that there is actually a BHP object living there. That is, as long as a BHP has been constructed in both buffers (data and other.data) with placement new (from <new>):

new (data) BHP(...);

then this is perfectly ok:

*(BHP*)(data) += *(BHP*)(other.data);

Just make sure that your char array is also alignas(BHP).
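To make the moving parts concrete, here is a minimal sketch of the whole pattern, using the Base stand-in from the question as BHP. The names hidden_size, hidden_align, as_bhp, add and first, and the value 8 for the alignment, are assumptions made for illustration; in a real project the first part would live in the header and the second in the single translation unit that includes Boost. The std::launder call is only used when compiling as C++17 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// "Header" part: only a size and an alignment are visible (assumed values).
constexpr std::size_t hidden_size = 64;
constexpr std::size_t hidden_align = 8;

struct Hidden {
  alignas(hidden_align) unsigned char data[hidden_size];

  explicit Hidden(char first = 0);   // hypothetical constructor
  Hidden(Hidden const& other);
  Hidden& operator=(Hidden const& other);
  ~Hidden();

  Hidden& add(Hidden const& other);
  char first() const;                // hypothetical accessor
};

// "Implementation" part: the only place that knows the real type.
struct Base {
  char foo[hidden_size];
  Base& operator+=(Base const& other) { foo[0] += other.foo[0]; return *this; }
};
typedef Base BHP;

// Fail the build if the constants in the header drift from the real type.
static_assert(sizeof(BHP) <= hidden_size, "storage too small for BHP");
static_assert(hidden_align % alignof(BHP) == 0, "storage under-aligned for BHP");

namespace {
// Recover a pointer to the BHP that placement new created in the buffer.
BHP* as_bhp(unsigned char* p) {
  assert(reinterpret_cast<std::uintptr_t>(p) % alignof(BHP) == 0);
#if __cplusplus >= 201703L
  return std::launder(reinterpret_cast<BHP*>(p));
#else
  return reinterpret_cast<BHP*>(p);
#endif
}
BHP const* as_bhp(unsigned char const* p) {
  return as_bhp(const_cast<unsigned char*>(p));
}
}  // namespace

Hidden::Hidden(char first) {
  BHP* p = ::new (static_cast<void*>(data)) BHP();  // starts the BHP's lifetime
  p->foo[0] = first;
}
Hidden::Hidden(Hidden const& other) {
  ::new (static_cast<void*>(data)) BHP(*as_bhp(other.data));
}
Hidden& Hidden::operator=(Hidden const& other) {
  *as_bhp(data) = *as_bhp(other.data);
  return *this;
}
Hidden::~Hidden() { as_bhp(data)->~BHP(); }  // ends the BHP's lifetime

Hidden& Hidden::add(Hidden const& other) {
  *as_bhp(data) += *as_bhp(other.data);  // no memcpy, no extra pointer member
  return *this;
}
char Hidden::first() const { return as_bhp(data)->foo[0]; }
```

With this layout, `Hidden a(2), b(3); a.add(b);` leaves `a.first()` equal to 5. Note that Hidden now needs its own copy operations and destructor, since the compiler-generated ones would only copy raw bytes and never run BHP's.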

Note that GCC sometimes complains when you reinterpret a char[], so you can instead choose to use something like std::aligned_storage_t as the storage type.
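As a rough sketch of that variant, the buffer can be declared as a std::aligned_storage_t (C++14, from <type_traits>) instead of a char array. Payload, Slot, payload_size and payload_align are hypothetical names standing in for BHP, Hidden and their size and alignment constants. Be aware that std::aligned_storage is deprecated as of C++23, so an alignas-qualified byte array may be the better long-term choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>

// Hypothetical stand-in for the heavy type.
struct Payload {
  long long value;
};

// Assumed constants that a header would expose instead of the type itself.
constexpr std::size_t payload_size = sizeof(long long);
constexpr std::size_t payload_align = alignof(long long);

struct Slot {
  // Raw, suitably aligned bytes; no Payload exists here until placement new runs.
  std::aligned_storage_t<payload_size, payload_align> storage;

  explicit Slot(long long v) {
    assert(reinterpret_cast<std::uintptr_t>(&storage) % alignof(Payload) == 0);
    ::new (static_cast<void*>(&storage)) Payload{v};
  }
  ~Slot() { get().~Payload(); }

  // Copying is left out to keep the sketch short.
  Slot(Slot const&) = delete;
  Slot& operator=(Slot const&) = delete;

  Payload& get() {
#if __cplusplus >= 201703L
    return *std::launder(reinterpret_cast<Payload*>(&storage));
#else
    return *reinterpret_cast<Payload*>(&storage);
#endif
  }
};

static_assert(sizeof(Payload) <= payload_size, "storage too small");
static_assert(payload_align % alignof(Payload) == 0, "storage under-aligned");
```

Usage is the same as with the char array: construct once, then operate on `get()` directly.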

Upvotes: 1
