retrodev

Reputation: 2393

Copy two elements from uint8_t array into uint16_t variable using pointers

I have a uint8_t array. I sometimes need to treat two sequential elements as uint16_t, and copy them into another uint16_t variable. The elements may not necessarily start on a uint16_t word boundary.

At present I'm using memcpy to do so:

uint8_t bytes[] = { 0x1A, 0x2B, 0x3C, 0x4D };
uint16_t word = 0;
memcpy(&word, &bytes[1], sizeof(word));

gdb shows this works as expected:

(gdb) x/2bx &word
0x7fffffffe2e2: 0x2b    0x3c

Casting a reference to an array element to uint16_t causes only the array element being directly referenced to be copied:

word = (uint16_t)bytes[1];

(gdb) x/2bx &word
0x7fffffffe2e2: 0x2b    0x00

Using a uint16_t pointer and pointing it at an array element address cast to uint16_t * results in two sequential elements being referenced, but not copied:

uint16_t *wordp = (uint16_t *)&bytes[1];

(gdb) x/2bx wordp
0x7fffffffe2e5: 0x2b    0x3c

bytes[1] = 0x5E;

(gdb) x/2bx wordp
0x7fffffffe2e5: 0x5e    0x3c

Assigning a regular uint16_t variable the value dereferenced from the uint16_t pointer copies both sequential array elements:

word = *wordp;
bytes[1] = 0x6F;

(gdb) x/2bx wordp
0x7fffffffe2e5: 0x6f    0x3c
(gdb) x/2bx &word
0x7fffffffe2e2: 0x5e    0x3c

gcc produces no warnings when using the -Wall option.

How can I achieve the same result without the intermediate pointer?

Are there any concerns with using a pointer as described above, or with attempting to do so without the intermediate pointer?

Would using memcpy be considered preferable under certain scenarios?

The ultimate use of this is processing a Big Endian byte stream, so I'm using byteorder(3) functions as appropriate.

Upvotes: 0

Views: 1823

Answers (2)

0___________

Reputation: 67476

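memcpy is the safest way. With modern compilers it is typically also the most efficient, because the call is optimized into plain load and store instructions and no memcpy function is actually called.

Example (volatile to prevent the compiler from optimizing the copy away):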

#include <stdint.h>
#include <string.h>

int main()
{
    volatile uint8_t bytes[4];
    volatile uint16_t word = 0;
    memcpy(&word, &bytes[1], sizeof(word));
}

and the memcpy is translated to

        movzx   eax, WORD PTR [rsp-3]
        mov     WORD PTR [rsp-6], ax

https://godbolt.org/z/57n879
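For the big-endian stream mentioned in the question, the memcpy can be wrapped in a small helper and combined with ntohs from byteorder(3). This is only a minimal sketch: the helper name load_be16 is hypothetical, and it assumes a POSIX system that provides <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* ntohs(), from POSIX byteorder(3) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a big-endian 16-bit value at any byte offset. */
static uint16_t load_be16(const uint8_t *p)
{
    uint16_t raw;
    memcpy(&raw, p, sizeof raw);  /* byte copy, so p needs no alignment */
    return ntohs(raw);            /* network (big-endian) order to host order */
}
```

With the array from the question, load_be16(&bytes[1]) would yield 0x2B3C regardless of the host byte order.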

Upvotes: 2

Christian Gibbons

Reputation: 4370

uint8_t bytes[] = { 0x1A, 0x2B, 0x3C, 0x4D };
uint16_t word = 0;
memcpy(&word, &bytes[1], sizeof(word));

The above is good: it copies the bytes directly. If you are happy to copy the bytes over in memory order and are not concerned about endianness, this is the way to do it.

See P__J supports women in Poland's answer for proof that compilers are smart enough to optimize this usage, so you don't have to worry about optimizing it yourself.
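If endianness does matter, as with the big-endian stream the question mentions, one option that needs neither memcpy nor an intermediate pointer is to assemble the value from the two bytes with shifts. A minimal sketch, with read_be16 as a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: build a big-endian 16-bit value from two bytes. */
static uint16_t read_be16(const uint8_t *p)
{
    /* p[0] is the most significant byte; the result does not depend on
       the byte order of the host. */
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}
```

Because this only ever reads individual uint8_t elements, it raises no alignment or aliasing concerns, and no separate ntohs call is needed afterwards.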

word = (uint16_t)bytes[1];

The above simply zero-extends your 8-bit value into a 16-bit value (0x002B instead of 0x2B); presumably not what you want.
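To make the difference concrete, here is a small sketch (the function name widen_first is made up for illustration) showing that the cast converts a single value and never looks at the following element:

```c
#include <stdint.h>

/* Hypothetical helper: widens exactly one element; p[1] is never read. */
static uint16_t widen_first(const uint8_t *p)
{
    return (uint16_t)p[0];  /* value conversion, not a two-byte copy */
}
```

Called as widen_first(&bytes[1]) on the array from the question, this gives 0x002B, not a value containing 0x3C.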

uint16_t *wordp = (uint16_t *)&bytes[1];
word = *wordp;

Do not do the above. That results in undefined behavior. A uint16_t may have stricter alignment requirements than a uint8_t, so the converted pointer can hold an address that is invalid for its type. That is, on platforms where uint16_t requires two-byte alignment, it must sit on multiple-of-two boundaries (0x2, 0x4, 0x6, etc.), while uint8_t has no such restriction and can exist at 0x1, 0x2, 0x3, etc. (See section 6.2.8 of the C11 Working Draft for more on alignment.)

Upon dereferencing it, you have also broken the strict aliasing rule by accessing an object through an lvalue of an incompatible type. (See section 6.2.7 of the C11 Working Draft for more on compatible types, and section 6.5 paragraph 7 for the aliasing rule itself.)
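As a rough illustration of the alignment point only, the sketch below checks whether an address is a multiple of the alignment of uint16_t. The helper name is hypothetical, and it assumes the common case where converting a pointer to uintptr_t yields its numeric address, which the standard leaves implementation-defined.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is suitably aligned for uint16_t.
   Assumes uintptr_t holds the plain numeric address of p. */
static int is_aligned_for_u16(const void *p)
{
    return ((uintptr_t)p % alignof(uint16_t)) == 0;
}
```

Even when this check passes, reading a uint8_t array through a uint16_t pointer still violates the aliasing rule, so memcpy remains the way to go.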

Upvotes: 2
