Reputation: 45
I have a question about how to handle some fixed-point calculations, and I can't figure out how to solve it. I know it is easy in floating point, but I want to work out how to do it in fixed point.
I have a fixed-point system where I apply the following equation to a signal (vSignal):
Signal_amplified = vSignal * 10^Exp
vSignal has a maximum amplitude of around 4e+05.
The system can represent signals up to 2.1475e+09 (32 bit), so there is some headroom for Signal_amplified.
For simplicity, let's just assume Exp can go from 0 to 10.
Let's say the first value is 2.8928. This works well in floating point, since the expression 10^2.8928 evaluates to about 781. Using the rounded value 781 I get signal amplitudes of 3.0085e+08, well within the signal range.
If I try to represent the value 2.8928 in a Q format, let's say Q12, the value becomes 11849. Now 10^11849 overflows.
How should one handle these large numbers? I could use another format such as Q4, but even then the numbers get very large and my precision becomes poor. I would very much like to be able to calculate with a precision of 0.001, but I just can't see how this should be done.
Minimal Working Example:
int vSignal = 400000;
// Floating point -> goes well
double dExp = 2.89285;
double dSignal_amplified = vSignal * std::pow(10, dExp);
// Fixed point -> overflow
int iExp = 11849; // 2.89285 in Q12 format (2.89285 * 4096, rounded)
int iSignal_amplified = vSignal * std::pow(10, iExp);
iSignal_amplified = iSignal_amplified >> 12;
Any ideas?
Upvotes: 1
Views: 1760
Reputation: 179907
"If i try to represent the value 2.8928 with a Q format of, lets say Q12. The value changes to 11849. Now 10^11849 results in overflow.".
Mixed-type math is pretty hard, and it looks like you should avoid it. What you want is pow(Q12(10.0), Q12(2.8928)) or possibly an optimized pow10(Q12(2.8928)). For the first, see my previous answer. The latter can be optimized with a hardcoded table of powers. pow10(2.8928) is of course pow10(2) * pow10(.5) * pow10(.25) * pow10(.125) * ... - each 1 in the binary representation of 2.8928 corresponds to a single table entry. You may want to calculate the intermediate results in Q19.44 and drop the lowest 32 bits when you return.
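To make the table idea concrete, here is a minimal sketch of pow10 for a Q12 exponent. The names pow10Q12 and makePow10Table are made up for illustration, and the sketch uses Q24 for the table and the accumulator (an assumption, not the Q19.44 mentioned above) so that everything fits in a 64-bit integer for integer parts up to 10. The table is filled with std::pow at startup only to keep the example short; on a target without floating point you would presumably generate the twelve constants offline and hardcode them.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int kExpFracBits = 12;   // the exponent is a Q12 value
constexpr int kTableFracBits = 24; // table entries and accumulator are Q24 (assumption)

// Entry n-1 holds 10^(2^-n) in Q24 for n = 1..12.
// Built with floating point here for brevity only; these would normally
// be precomputed constants.
inline std::array<std::int64_t, kExpFracBits> makePow10Table() {
    std::array<std::int64_t, kExpFracBits> table{};
    for (int n = 1; n <= kExpFracBits; ++n) {
        const double value = std::pow(10.0, std::ldexp(1.0, -n));
        table[n - 1] = std::llround(std::ldexp(value, kTableFracBits));
    }
    return table;
}

// Returns 10^exp as a Q12 value in a 64-bit integer.
// expQ12 is a non-negative Q12 exponent with an integer part of at most 10.
inline std::int64_t pow10Q12(std::int32_t expQ12) {
    static const auto table = makePow10Table();
    assert(expQ12 >= 0 && (expQ12 >> kExpFracBits) <= 10);

    // Fractional part: one table multiplication per set bit of the fraction.
    std::int64_t acc = std::int64_t{1} << kTableFracBits; // 1.0 in Q24
    for (int n = 1; n <= kExpFracBits; ++n) {
        if (expQ12 & (1 << (kExpFracBits - n))) {
            acc = (acc * table[n - 1]) >> kTableFracBits; // Q24 * Q24 -> Q24
        }
    }

    // Integer part: plain integer power of ten.
    std::int64_t intPow = 1;
    for (int i = 0; i < (expQ12 >> kExpFracBits); ++i) {
        intPow *= 10;
    }

    // acc is below 10 * 2^24 and intPow is at most 10^10, so the product
    // fits in 64 bits. Drop 12 fractional bits to go from Q24 to Q12.
    return (acc * intPow) >> (kTableFracBits - kExpFracBits);
}
```

With this, the amplified signal would be something like (static_cast<std::int64_t>(vSignal) * pow10Q12(11849)) >> 12, which for vSignal = 400000 comes out at roughly 3.1e+08 and so fits in 32 bits again.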
Storing all the values of pow10(2^-n) up to n=12 has the slight problem that the result is close to 1, namely 1.000562312. If you stored that as a Q12, you would lose precision in rounding. Instead, it may be wise to store the value of pow10(2^-12) as a Q24, the value of pow10(2^-11) as a Q23, etc. Now evaluate Q12 pow10(Q12 exp) starting at the LSB of exp, not the MSB. You need to repeatedly shift the intermediate results as you move up to pow10(0.5), but half of the time you can merge that with the >>12 that's inherent to Q12 multiplication.
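The rounding problem is easy to check numerically. The helper below (storedExponent is a made-up name, and it uses floating point because it is only meant as an offline check) quantizes pow10(2^-n) to a given number of fractional bits and reports which exponent the stored value really represents. For n=12 the ideal answer is 2^-12; with Q12 storage the table entry rounds to 4098/4096 and the represented exponent is off by more than 10%, while Q24 storage is accurate to well below 0.1%.

```cpp
#include <cmath>
#include <cstdint>

// Quantizes 10^(2^-n) to fixed point with fracBits fractional bits and
// returns the exponent the stored value actually represents, that is
// log10(stored / 2^fracBits). Ideally this equals 2^-n.
inline double storedExponent(int n, int fracBits) {
    const double exact = std::pow(10.0, std::ldexp(1.0, -n));
    const std::int64_t stored = std::llround(std::ldexp(exact, fracBits));
    return std::log10(std::ldexp(static_cast<double>(stored), -fracBits));
}
```

Comparing storedExponent(12, 12) and storedExponent(12, 24) against std::ldexp(1.0, -12) shows the difference directly.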
Upvotes: 1
Reputation: 2718
Here is a proposal. It's just a rough idea that needs to be adjusted and refined.
Say you need a precision of 0.01 (you can of course choose the precision you need). You can represent the exponent as Exp = N + M*10^-1 + P*10^-2, where N, M and P are integers and M and P are between 0 and 9.
Then you pre-compute and round all values of 10^(M*10^-1) * 100 and 10^(P*10^-2) * 100. They are all between 100 and 1000. Store them in lookup tables to avoid floating-point operations at runtime. Let's call these lookup tables A[M] and B[P].
Then you can compute 10^Exp = ( 10^N * A[M] * B[P] ) / 10000
The multiplication should not overflow since A[M] * B[P] is below 1,000,000 and N is no larger than 10 according to what you said.
I did a quick test with a few values and it seems to give an acceptable precision.
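A minimal sketch of this, assuming the exponent is passed as an integer number of hundredths (289 for 2.89) and that a 64-bit intermediate is available. The function name pow10Scaled and the input encoding are my own choices for illustration; the table values are 10^(M/10) * 100 and 10^(P/100) * 100 rounded to the nearest integer.

```cpp
#include <cassert>
#include <cstdint>

// A[M] = round(10^(M/10) * 100) for M = 0..9
constexpr std::int64_t kTableA[10] = {100, 126, 158, 200, 251,
                                      316, 398, 501, 631, 794};
// B[P] = round(10^(P/100) * 100) for P = 0..9
constexpr std::int64_t kTableB[10] = {100, 102, 105, 107, 110,
                                      112, 115, 117, 120, 123};

// Returns 10^Exp * 10000, where Exp = expHundredths / 100.
// The caller divides by 10000 as late as possible to keep precision.
inline std::int64_t pow10Scaled(int expHundredths) {
    assert(expHundredths >= 0 && expHundredths < 1100); // N is at most 10

    const int n = expHundredths / 100;       // integer part N
    const int m = (expHundredths / 10) % 10; // tenths digit M
    const int p = expHundredths % 10;        // hundredths digit P

    std::int64_t intPow = 1; // 10^N
    for (int i = 0; i < n; ++i) {
        intPow *= 10;
    }
    // At most 10^10 * 794 * 123, which fits easily in 64 bits.
    return intPow * kTableA[m] * kTableB[p];
}
```

For Exp = 2.89 this gives 100 * 631 * 123 = 7761300, so 10^Exp comes out as about 776.13 against a true value of about 776.25. To amplify a signal you would multiply vSignal by this value in 64 bits and divide by 10000 last, checking that the product still fits for the exponent range you actually use.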
Upvotes: 1