Reputation: 582
Let me first explain the motivation of the problem I will outline. I have a vector v
of unsigned words with information stored in the bits as well as a word w
of some possibly different unsigned type. I would like to put len
bits from v
into w
, beginning at the i
th LSB of w
. For example, consider the case below where I have represented the words as bits just for the sake of explanation.
T1 w = 10110100;
vector<T2> v = [11, 00, 10, 01];
T1 len = 5;
T1 start = 2;
T1 dst_digits = 8;
T1 src_digits = 2;
10110100 -> 10111100 -> 10001100 -> 11001100
It is clear that we will need to iterate through v
and add bits to w
one word at a time. One way to do this is the following
template<class T>
T _blend(T a, T b, T start, T len) {
// Replaces len bits of a by the ones of b starting at start
}
auto it = std::begin(v);
while (len > 0) {
w = _blend(w,
(T1) ((T1) *first) << start),
start,
std::min(len, src_digits)
);
start += std::min(len, src_digits);
++first;
len -= std::min(len, src_digits);
}
I am fairly new to C++, and the code above is simplified code, but the main idea stands. I have this working currently as well. However, I find the (T1) ((T1) *first) << start)
ugly. If I don't include the first cast, though, the shift operation will be promoted to int
or long
, and then mismatch with the other types of _blend
. If I don't include the second cast, I may overshift *first
(in the case when dst_digits > src_digits
and start > src_digits
).
Now, my question is how costly are the two (T1)
conversions? My guess is not nearly as costly as other things, like the std::min
call or the other statements in the loop. I only ask simply because coming from Python, seeing something like (T1) ((T1) *first) << start)
just looks plain unnatural, and I'm wondering if that's just a consequence of C++ with little to no overhead, and whether or not there is simply a better way to do it.
Upvotes: 3
Views: 390
Reputation: 5331
On a little-endian system, an unsigned integer to unsigned integer cast is translated to zero or one instructions.
If there is a uint64
in rax
then casting to a uint32
is basically a no-op. The compiler will just assume that the value is in eax
(the least significant half of rax
).
If there is a uint32
in eax
then casting to a uint64
is just one very fast instruction. The compiler will zero extend (fill the most significant part with zeros) the value in eax
and place the result in rbx
with the movzx
instruction.
Unsigned integer casts are the least of your concern during this operation. You might want to consider using std::vector<bool>
or std::bitset
as they will do this bit twiddling for you.
Upvotes: 2