Reputation: 9269
I am trying to optimize the following operation, where I have a large number of unsigned short inputs that need to be scaled down by a certain factor. Is there a way to optimize it to not use floating point operations?
unsigned short val = 65523U;
val = val * 0.943;
Note
I will be running the above operation on a DSP where floating point operations are costly.
Upvotes: 5
Views: 2270
Reputation: 140168
You could multiply by 943, then divide by 1000. You'd trade the floating point multiplication for an integer multiplication plus a Euclidean division.
unsigned short val = 65523U;
val = (val*943UL)/1000;
I get: 61788
It works (even on systems where int is 16-bit wide) as long as val*943 is within unsigned long capacity (unsigned long long could be used to extend the limit even further).
EDIT: You could even avoid the division by computing the ratio times a power of 2; I chose 16. So .943*(1<<16) is 61800.448, and you can do one multiplication and one shift operation (very fast). It's better to use unsigned long long at this point because the intermediate result can get very large:
val = (val*61800UL)>>16;
This gives roughly the same result: 61787. Use 61801 instead and you get 61788.
Upvotes: 1
Reputation: 33618
You can multiply with an integer approximation of 0.943 * 2^16, then divide by 2^16 which your compiler should transform into a right shift. Assuming 16-bit shorts and at least 32-bit ints:
val = ((unsigned)val * 61800) / 65536;
Depending on your exact requirements, you might get more accurate results by rounding to the nearest integer:
val = ((unsigned)val * 61800 + 32768) / 65536;
Any other power of two will work. On a 64-bit platform, you could use 2^48 for more precision.
Upvotes: 8
Reputation: 234705
With a platform that uses a 32 bit int or higher, using
int val = 65523U;
val = val * 943 / 1000;
would be hard to beat. Convert the truncation to German rounding (round half up) by altering the coefficients. If your system has a 16 bit int then you could use a long (note that the multiplication by 943 and division by 1000 would then take place in long arithmetic), but the solution would require profiling.
Dividing by 1000 first would cause truncation issues, so you must multiply first; a larger type is then required to accommodate the larger intermediate value.
Upvotes: 1
Reputation: 1099
The multiply / divide approach is good. But even better, you can avoid the divide.
unsigned short has a range of 0 ... 65535.
In C, arithmetic on types narrower than int is promoted to int first (32 bits on most platforms), and the result is cast back to 16 bits when it is stored into a short, truncating the value. You want to avoid that truncation when multiplying a short by a large number, since the output is a short. The casts below make explicit what's going on and ensure the compiler performs no unintended conversions.
unsigned short val = 65523U;
const unsigned int mult = 65536 * 0.943; // expressed as a fraction of 2^16
unsigned short output = (unsigned short)(((unsigned int)val * mult) >> 16);
So this casts the value to a 32 bit unsigned int (to guarantee control of the types), multiplies it by the ratio scaled by 2^16, then right-shifts it by 16 to bring it back to the correct scale.
Upvotes: 4
Reputation: 213810
The simplest way is to just use a 32 bit type that can hold the result:
uint16_t val = 65523U;
val = (uint_fast32_t)val * 943 / 1000;
Or if you want more type correctness and portability, while at the same time allowing the compiler to use the best possible integer type for the task:
#include <stdint.h>
uint_fast16_t val = UINT16_C(65523);
val = (uint_fast16_t) ( (uint_fast32_t)val * (uint_fast32_t)943 / (uint_fast32_t)1000 );
Upvotes: 8