Cost of conversion

Question

Let me first explain the motivation of the problem I will outline. I have a vector v of unsigned words with information stored in the bits as well as a word w of some possibly different unsigned type. I would like to put len bits from v into w, beginning at the ith LSB of w. For example, consider the case below where I have represented the words as bits just for the sake of explanation.

T1 w = 10110100;
vector v = [11, 00, 10, 01];
T1 len = 5;
T1 start = 2;
T1 dst_digits = 8;
T1 src_digits = 2;

10110100 -> 10111100 -> 10001100 -> 11001100

It is clear that we will need to iterate through v and add bits to w one word at a time. One way to do this is the following

template
T _blend(T a, T b, T start, T len) {
    // Replaces len bits of a by the ones of b starting at start
}

auto it = std::begin(v);
while (len > 0) {
    w = _blend(w, 
               (T1) ((T1) *first) << start), 
               start,
               std::min(len, src_digits)
               );
    start += std::min(len, src_digits);
    ++first;
    len -= std::min(len, src_digits);
}

I am fairly new to C++, and the code above is simplified code, but the main idea stands. I have this working currently as well. However, I find the (T1) ((T1) *first) << start) ugly. If I don't include the first cast, though, the shift operation will be promoted to int or long, and then mismatch with the other types of _blend. If I don't include the second cast, I may overshift *first (in the case when dst_digits > src_digits and start > src_digits).

Now, my question is how costly are the two (T1) conversions? My guess is not nearly as costly as other things, like the std::min call or the other statements in the loop. I only ask simply because coming from Python, seeing something like (T1) ((T1) *first) << start) just looks plain unnatural, and I'm wondering if that's just a consequence of C++ with little to no overhead, and whether or not there is simply a better way to do it.

Indiana Kernick · Accepted Answer

On a little-endian system, an unsigned integer to unsigned integer cast is translated to zero or one instructions.

If there is a uint64 in rax then casting to a uint32 is basically a no-op. The compiler will just assume that the value is in eax (the least significant half of rax).

If there is a uint32 in eax then casting to a uint64 is just one very fast instruction. The compiler will zero extend (fill the most significant part with zeros) the value in eax and place the result in rbx with the movzx instruction.

Unsigned integer casts are the least of your concern during this operation. You might want to consider using std::vector or std::bitset as they will do this bit twiddling for you.

Cost of conversion

Answers (1)

Related Questions