Reputation: 31

32-bit fractional multiplication with cross-multiplication method (no 64-bit intermediate result)

I am programming a fixed-point speech enhancement algorithm on a 16-bit processor. At some point I need to do 32-bit fractional multiplication. I have read other posts about doing 32-bit multiplication byte by byte and I see why this works for Q0.31 formats. But I use different Q formats with varying number of fractional bits.

So I have found out that for fractional bits less than 16, this works:

(low*low >> N) + low*high + high*low + (high*high << N)

where N is the number of fractional bits. I have read that the low*low result should be unsigned as well as the low bytes themselves. In general this gives exactly the result I want in any Q format with less than 16 fractional bits.

Now it gets tricky when there are more than 16 fractional bits. I have tried several shift amounts, including different shifts for low*low and high*high, and I have tried to work it out on paper, but I can't figure it out.

I know it may be very simple but the whole idea eludes me and I would be grateful for some comments or guidelines!

Upvotes: 3

Answers (2)

Alexey Frunze

Reputation: 62048

There are a few ideas at play.

First, multiplication of 2 shorter integers to produce a longer product. Consider unsigned multiplication of 2 32-bit integers via multiplications of their 16-bit "halves", each of which produces a 32-bit product and the total product is 64-bit:

a * b = (a_hi * 2¹⁶ + a_lo) * (b_hi * 2¹⁶ + b_lo) =

a_hi * b_hi * 2³² + (a_hi * b_lo + a_lo * b_hi) * 2¹⁶ + a_lo * b_lo.
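As a minimal sketch of the identity above, the following C function builds the 64-bit product from four 16x16 partial products and returns it as two 32-bit words, so no 64-bit type is needed. The name `umul32x32` and the `hi`/`lo` output convention are assumptions made for illustration only.

```c
#include <stdint.h>

/* Unsigned 32x32 -> 64-bit multiply from 16x16 -> 32-bit partial products.
   The result is returned as two 32-bit words: *hi (upper) and *lo (lower).
   Hypothetical helper name, chosen for this sketch. */
static void umul32x32(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;

    uint32_t ll = a_lo * b_lo;   /* weight 2^0  */
    uint32_t lh = a_lo * b_hi;   /* weight 2^16 */
    uint32_t hl = a_hi * b_lo;   /* weight 2^16 */
    uint32_t hh = a_hi * b_hi;   /* weight 2^32 */

    /* Sum of everything that lands in bits 16..31 of the product.
       At most 3 * 0xFFFF, so it cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lo = (mid << 16) | (ll & 0xFFFFu);
    /* Upper halves of the cross terms plus the carry out of the middle column. */
    *hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

Splitting the cross terms into 16-bit pieces before adding them keeps every intermediate sum inside 32 bits, which is the point on a target without a wider type.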

Now, if you need a signed multiplication, you can construct it from unsigned multiplication (e.g. from the above).

Supposing a < 0 and b >= 0, a *_signed b must equal

2⁶⁴ - ((-a) *_unsigned b), where

-a = 2³² - a (because this is 2's complement; on the right-hand side, a is the 32-bit pattern read as an unsigned number)

IOW,

a *_signed b =

2⁶⁴ - ((2³² - a) *_unsigned b) =

2⁶⁴ + (a *_unsigned b) - (b * 2³²), where 2⁶⁴ can be discarded since we're using 64 bits only.

In exactly the same way you can calculate a *_signed b for a >= 0 and b < 0 and must get a symmetric result:

(a *_unsigned b) - (a * 2³²)

You can similarly show that for a < 0 and b < 0 the signed multiplication can be built on top of the unsigned multiplication this way:

(a *_unsigned b) - ((a + b) * 2³²)

So, you multiply a and b as unsigned first, then if a < 0, you subtract b from the top 32 bits of the product and if b < 0, you subtract a from the top 32 bits of the product, done.
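The correction rule can be sketched in C as follows. To keep the example short, the unsigned product is computed here with `uint64_t` as a stand-in; on a target without a 64-bit type, any routine that yields the high and low 32-bit words of the unsigned product would take its place. The function name `smul32x32` is an assumption for illustration.

```c
#include <stdint.h>

/* Signed 32x32 -> 64-bit multiply built on an unsigned multiply.
   Result is returned as the two 32-bit words of the two's complement
   64-bit product. Hypothetical helper name. */
static void smul32x32(int32_t a, int32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t ua = (uint32_t)a;   /* bit patterns read as unsigned */
    uint32_t ub = (uint32_t)b;

    /* Stand-in for the unsigned 32x32 -> 64 multiply (assumption:
       replace with a 16-bit-halves routine where uint64_t is unavailable). */
    uint64_t p = (uint64_t)ua * ub;
    uint32_t h = (uint32_t)(p >> 32);

    if (a < 0) h -= ub;          /* subtract b * 2^32 */
    if (b < 0) h -= ua;          /* subtract a * 2^32 */

    *hi = h;                     /* unsigned wraparound discards 2^64 */
    *lo = (uint32_t)p;
}
```

Only the top word changes, because both corrections are multiples of 2³².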

Now that we can multiply 32-bit signed integers and get 64-bit signed products, we can finally turn to the fractional stuff.

Suppose now that out of those 32 bits in a and b, N bits are used for the fractional part. That means that if you look at a and b as plain integers, they are going to be 2^N times greater than what they really represent, e.g. 1.0 is going to look like 2^N (or 1 << N).

So, if you multiply two such integers, the product is going to be 2^N * 2^N = 2^(2*N) times greater than what it should represent, e.g. 1.0 * 1.0 is going to look like 2^(2*N) (or 1 << (2*N)). IOW, plain integer multiplication is going to double the number of fractional bits. If you want the product to have the same number of fractional bits as the multiplicands, what do you do? You divide the product by 2^N (or shift it arithmetically N positions right). Simple.
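When the 64-bit product is held as two 32-bit words, the shift by N can be done without a 64-bit type. The sketch below assumes the signed product is already available as `hi` and `lo` (however it was obtained) and that 0 < N < 32; the name `q_rescale` is made up for this example. It truncates the discarded bits and does not saturate, so the true result must fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Take the 64-bit product hi:lo of two fixed-point values with n
   fractional bits each and return bits n..n+31, i.e. the product
   rescaled back to n fractional bits. Hypothetical helper name. */
static int32_t q_rescale(uint32_t hi, uint32_t lo, unsigned n)
{
    assert(n > 0 && n < 32);     /* keeps both shift counts in 1..31 */

    /* The low n bits of hi become the top n bits of the result;
       the top 32-n bits of lo become the rest. */
    uint32_t r = (hi << (32 - n)) | (lo >> n);

    /* Converting a value above INT32_MAX is implementation-defined;
       this sketch assumes the usual two's complement wraparound. */
    return (int32_t)r;
}
```

For instance, 1.5 * 2.25 with 16 fractional bits gives the product words hi = 3, lo = 0x60000000, and `q_rescale(3, 0x60000000u, 16)` yields 0x36000, which is 3.375. The same function works unchanged for N above 16, e.g. N = 24.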

A few words of caution, just in case...

In C (and C++) you cannot legally shift a value left or right by a count that is equal to or greater than the number of bits in its (promoted) type; doing so is undefined behavior. The code will compile, but may not work as you expect. So, if you want to shift a 32-bit variable, you can shift it by 0 through 31 positions left or right (31 is the max, not 32).

If you shift signed integers left, you cannot overflow the result legally. All signed overflows result in undefined behavior. So, you may want to stick to unsigned.

Right shifts of negative signed integers are implementation-defined. They can do either an arithmetic shift or a logical shift; which one depends on the compiler. So, if you need one of the two, you need to either make sure that your compiler supports it directly or implement it in some other way.
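One way to get an arithmetic right shift without relying on the compiler's choice is to shift only non-negative values. This is a small sketch, not the only option; the name `asr32` is an assumption, and the shift count is expected to be in the range 0 through 31.

```c
#include <stdint.h>

/* Arithmetic (sign-propagating) right shift of a 32-bit value, n in 0..31.
   For negative x, ~x is non-negative, so shifting it is fully defined;
   complementing again restores the sign bits. Hypothetical helper name. */
static int32_t asr32(int32_t x, unsigned n)
{
    return (x < 0) ? ~(~x >> n) : (x >> n);
}
```

The result rounds toward negative infinity, e.g. `asr32(-7, 1)` is -4, which matches what a hardware arithmetic shift does.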

Upvotes: 0

stark

Reputation: 13189

It's the same formula. For N > 16, the shifts just mean you throw out a whole 16-bit word which would have over- or underflowed. low*low >> N means you take just the high word of the 32-bit result of the multiply, shift it right by N-16 bits, and add it to the low word of the result. high*high << N means you take just the low word of the multiply result, shift it left by N-16 bits, and add it to the high word of the result.

Upvotes: 0
