Reputation: 90432
So, there are a few questions on SO about this subject, but I haven't quite found something that exactly answers the question I have in mind. First some background:
I would like to have a uint32_t field, which I can also access as an array of bytes.
So the first thing that comes to mind is:
union U {
    uint32_t u32;
    uint8_t bytes[sizeof(uint32_t)];
};
Which allows me to do this:
// "works", but is UB as far as I understand
U u;
u.u32 = 0x11223344;
u.bytes[0] = 0x55;
OK, so undefined behavior (UB) is bad, therefore we don't want to do that. Similarly, casts are UB and can sometimes be even worse due to alignment concerns (though not in this case, because I'm using a char-sized object for my array).
// "works", but is UB as far as I understand
uint32_t v = 0x11223344;
auto p = reinterpret_cast<uint8_t *>(&v);
p[0] = 0x55;
Once again, UB is bad, therefore we don't want to do that.
Some say that this is OK if we use a char* instead of a uint8_t*:
// "works", but maybe is UB?
uint32_t v = 0x11223344;
auto p = reinterpret_cast<char *>(&v);
p[0] = 0x55;
But I am honestly not sure about it... So getting creative.
So, I think I remember it being legal (as far as I know) to read the contents of a void* cast to a char* (this allows things like std::memcpy to not be UB). So maybe we can kinda play with this:
uint8_t get_byte(const void *p, size_t n) {
    auto ptr = static_cast<const char *>(p);
    return ptr[n];
}
void set_byte(void *p, size_t index, uint8_t v) {
    auto ptr = static_cast<char *>(p);
    ptr[index] = v;
}
// "works", is this UB?
uint32_t v = 0x11223344;
uint8_t v1 = get_byte(&v, 0); // read
set_byte(&v, 0, 0x55); // write
So my questions are:
Is the final example I came up with UB?
If it is, what is the "right" way to do this? I really hope the "correct" way isn't a memcpy to and from a byte array. That would be ridiculous.
(BONUS): suppose I want my get_byte to return a reference (like for implementing operator[]). Is it safe to use uint8_t instead of literal char when reading the contents of a void *?
NOTE: I understand the concerns regarding endian and portability. They are not a problem for my use case. I think that it is acceptable for the result to be an "unspecified value" (in that it is compiler specific which byte it will read). My question is really focused on the UB aspects ("nasal demons" and similar).
Upvotes: 1
Views: 396
Reputation: 217275
Why not create a class for that?
Something like:
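#include <array>
#include <cstdint>

class MyInt32 {
public:
    // Assemble the value from the stored bytes, least significant byte first.
    // Each byte is widened to uint32_t before shifting so the shift by 24
    // never happens on a (signed) int.
    std::uint32_t asInt32() const {
        return static_cast<std::uint32_t>(b[0])
            | (static_cast<std::uint32_t>(b[1]) << 8)
            | (static_cast<std::uint32_t>(b[2]) << 16)
            | (static_cast<std::uint32_t>(b[3]) << 24);
    }
    // Split the value into bytes, least significant byte first.
    void setInt32(std::uint32_t i) {
        b[0] = static_cast<std::uint8_t>(i & 0xFF);
        b[1] = static_cast<std::uint8_t>((i >> 8) & 0xFF);
        b[2] = static_cast<std::uint8_t>((i >> 16) & 0xFF);
        b[3] = static_cast<std::uint8_t>((i >> 24) & 0xFF);
    }
    // Direct access to the underlying bytes.
    const std::array<std::uint8_t, 4u>& asInt8() const { return b; }
    std::array<std::uint8_t, 4u>& asInt8() { return b; }
    void setInt8s(const std::array<std::uint8_t, 4u>& a) { b = a; }
private:
    std::array<std::uint8_t, 4u> b;
};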
This way you have no UB, you don't break the aliasing rules, and you control endianness however you want.
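For illustration, here is how the question's "overwrite byte 0" example might look with such a class. This is only a sketch: the class is restated so the snippet compiles on its own (with std::uint32_t as the parameter type and the bytes widened before shifting), and patch_low_byte is a hypothetical helper name that is not part of the answer.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Restated wrapper: four bytes stored least significant byte first.
class MyInt32 {
public:
    std::uint32_t asInt32() const {
        return static_cast<std::uint32_t>(b[0])
            | (static_cast<std::uint32_t>(b[1]) << 8)
            | (static_cast<std::uint32_t>(b[2]) << 16)
            | (static_cast<std::uint32_t>(b[3]) << 24);
    }
    void setInt32(std::uint32_t i) {
        b[0] = static_cast<std::uint8_t>(i & 0xFF);
        b[1] = static_cast<std::uint8_t>((i >> 8) & 0xFF);
        b[2] = static_cast<std::uint8_t>((i >> 16) & 0xFF);
        b[3] = static_cast<std::uint8_t>((i >> 24) & 0xFF);
    }
    const std::array<std::uint8_t, 4u>& asInt8() const { return b; }
    std::array<std::uint8_t, 4u>& asInt8() { return b; }
private:
    std::array<std::uint8_t, 4u> b{};  // zero-initialized
};

// Hypothetical helper: replace the least significant byte of value.
// Because the class fixes the byte order itself, the result is the
// same on little-endian and big-endian machines.
std::uint32_t patch_low_byte(std::uint32_t value, std::uint8_t byte) {
    MyInt32 x;
    x.setInt32(value);
    x.asInt8()[0] = byte;  // plain array write, no aliasing involved
    return x.asInt32();
}
```

Note that index 0 here always means the least significant byte, which differs from the union and cast versions in the question, where index 0 depends on the platform's byte order.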
Upvotes: 3
Reputation: 146930
It's perfectly legit (as long as the type is a POD). uint8_t, however, is not guaranteed to be legal, since it need not be an alias for a character type, so don't use it for the pointer.
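A minimal sketch of what that could look like, assuming the pointer type is switched to unsigned char (a type the aliasing rules explicitly allow for inspecting any object) and uint8_t is kept only for the values passed in and out. The set_byte_memcpy function is a hypothetical name added here to show the memcpy round trip the question mentions; compilers commonly optimize such small fixed-size copies away, though that is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read byte n of the object at p through unsigned char.
inline std::uint8_t get_byte(const void* p, std::size_t n) {
    return static_cast<const unsigned char*>(p)[n];
}

// Write byte n of the object at p through unsigned char.
inline void set_byte(void* p, std::size_t n, std::uint8_t v) {
    static_cast<unsigned char*>(p)[n] = v;
}

// Hypothetical alternative: copy the object representation out,
// modify one byte, and copy it back.
inline std::uint32_t set_byte_memcpy(std::uint32_t value,
                                     std::size_t n, std::uint8_t v) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    bytes[n] = v;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

Which byte index 0 refers to still depends on the platform's byte order, as the question already accepts. Both variants touch the same byte of the object representation, so they give the same result on a given machine.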
Upvotes: 0