user2030431

Reputation: 831

Convert int64 to float32 in C

I read four 16-bit registers from an embedded device; together they represent a 64-bit integer. The read function returns each register as a uint16_t, and I want to convert the combined value to a 32-bit float. If I cast like this, I get the warning "left shift count >= width of type [enabled by default]".

uint16_t u1,u2,u3,u4;
u1=readregister();
u2=readregister();
u3=readregister();
u4=readregister();

float num11 = (float) (u1 << 48);       
float num22 = (float) (u2 << 32); 
float num33 = (float) (u3 << 16);   
float num44 = (float) u4;   
float numm= num11+num22+num33+num44;
printf("%f\n", numm);

What about accuracy?

Upvotes: 0

Views: 5744

Answers (2)

Pascal Cuoq

Reputation: 80276

One way to do it is:

#include <math.h>

float numm = (float) u4 + ldexpf(u3, 16) + ldexpf(u2, 32) + ldexpf(u1, 48);

This does not require your embedded compiler to provide any integer type wider than the uint16_t you already have; it only requires ldexpf().

This computes a float that is within one ULP of the mathematical sum of the shifted integers u1, …, u4.
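As a minimal sketch of the line above wrapped in a function: the helper name words_to_float_ldexp is hypothetical, and the word order (u1 most significant, u4 least significant) is assumed from the shift counts in the question.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper; assumes u1 is the most significant 16-bit word
   and u4 the least significant, as in the question. */
static float words_to_float_ldexp(uint16_t u1, uint16_t u2,
                                  uint16_t u3, uint16_t u4)
{
    /* A uint16_t always fits in float's 24-bit significand, so each
       conversion is exact, and ldexpf() only adjusts the exponent.
       Rounding can happen only in the additions. */
    return (float)u4
         + ldexpf((float)u3, 16)
         + ldexpf((float)u2, 32)
         + ldexpf((float)u1, 48);
}
```

A call such as words_to_float_ldexp(u1, u2, u3, u4) with the four register values would then replace the four separate casts in the question. On some toolchains the math library has to be linked explicitly (for example with -lm).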

Upvotes: 2

ouah

Reputation: 145829

Do it this way:

float num11 = (uint64_t) u1 << 48;
/* ... */

If the compiler warns (which C does not require) because of the uint64_t conversion to float, you can add an extra float cast:

float num11 = (float) ((uint64_t) u1 << 48);

This will get rid of the warning.

For efficiency and precision, it is best to first combine the four uint16_t values into a single uint64_t and then perform a single conversion from uint64_t to float.
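A minimal sketch of that combined approach, assuming the compiler provides uint64_t and that u1 is the most significant word (as the shift counts in the question suggest). The function names combine_words and words_to_float are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: assemble the 64-bit value from four 16-bit words.
   Each word is widened to uint64_t BEFORE shifting, so the shift count
   never reaches the width of the shifted type. */
static uint64_t combine_words(uint16_t u1, uint16_t u2,
                              uint16_t u3, uint16_t u4)
{
    return ((uint64_t)u1 << 48)
         | ((uint64_t)u2 << 32)
         | ((uint64_t)u3 << 16)
         |  (uint64_t)u4;
}

/* Hypothetical helper: one integer-to-float conversion, so the value
   is rounded only once. */
static float words_to_float(uint16_t u1, uint16_t u2,
                            uint16_t u3, uint16_t u4)
{
    return (float)combine_words(u1, u2, u3, u4);
}
```

On accuracy: a float has a 24-bit significand, so any combined value above 2^24 is in general rounded to the nearest representable float. If more of the low-order bits matter, converting to double instead keeps 53 bits.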

Upvotes: 2
