Beacon of Wierd
Beacon of Wierd

Reputation: 101

What's an efficient way to skip 0 coefficients when evaluating a polynomial?

I'm working on a microchip with a single core (the stm32f103c8t6) and I want to evaluate some polynomial up to a predetermined exponent, so f(x) = a0 + a1*x +a2*x^2 +...+an*x^n where n is known at compile time.

The coefficients a0...an will change during runtime and a few of them are very likely to become zero and I want to find a way to skip those when evaluating f(x). Note, most will be non-zero. Ideally I would want to rewrite f(x) during runtime so the coefficients which are zero are no longer in the function, but I don't want to go into self modifying code territory (unless there's some easy way to do it in C++). The microchip is capable of single instruction multiplication, so any solution equivalent to having if statements to check if the coefficient is zero would be the same or slower than just evaluating the expression as a whole.

The evaluations will happen many times so even saving a single cycle on evaluating the function is helpful. Currently I don't have a viable solution for this and I'm simply evaluating the polynomial as a whole.

I'm writing in C++ for the microchip, though I'm currently working on the algorithms in python since it's easier to plot results and such so I don't have any code for the problem.

Upvotes: 0

Views: 385

Answers (3)

paladin
paladin

Reputation: 790

Using a finite state machine. You need to write the code for every possible state. But this is probably the fastest possible way to calculate.

This is an example, it uses example math-functions and an example value of iterations. OP has to provide his own math-functions and own value of iteration for each state.

#include <iostream>

int main(int argc, char *argv[]){
    /* MAX_COEFFICIENTS <-- states == 2^coefficients */
    const int MAX_COEFFICIENTS = 3;
    int coefficientsv[] = {1, 0, 3}; /* {A, B, C} */
    int coefficientsc = sizeof coefficientsv / sizeof coefficientsv[0];
    if(coefficientsc > MAX_COEFFICIENTS) return -1; /* Out of bounds! */
    
    register int A = 0, B = 0, C = 0;
    for(int i = 0; i < coefficientsc; i++){
        if(i == 0) A = coefficientsv[i];
        else if(i == 1) B = coefficientsv[i];
        else if(i == 2) C = coefficientsv[i];
    }
    
    register long polycalc = 0; /* or use just int, if int is big enough */
    register int iteration = 1000; /* example value */
    
    /* state truth table    */
    /* A B C                */
    /* 0 0 0 goto STATE_0   */
    /* 1 0 0 goto STATE_A   */
    /* 0 1 0 goto STATE_B   */
    /* 1 1 0 goto STATE_AB  */
    /* 0 0 1 goto STATE_C   */
    /* 1 0 1 goto STATE_AC  */
    /* 0 1 1 goto STATE_BC  */
    /* 1 1 1 goto STATE_ABC */

    STATE_SELECT:
    if(!A && !B && !C) goto STATE_0;
    if( A && !B && !C) goto STATE_A;
    if(!A &&  B && !C) goto STATE_B;
    if( A &&  B && !C) goto STATE_AB;
    if(!A && !B &&  C) goto STATE_C;
    if( A && !B &&  C) goto STATE_AC;
    if(!A &&  B &&  C) goto STATE_BC;
    if( A &&  B &&  C) goto STATE_ABC;
    
    STATE_0:
    while(iteration){
        polycalc = 0;
        iteration--;
    }
    goto END;
    
    STATE_A:
    while(iteration){
        polycalc = A;
        iteration--;
    }
    goto END;
    
    STATE_B:
    while(iteration){
        polycalc = B * B;
        iteration--;
    }
    goto END;
    
    STATE_AB:
    while(iteration){
        polycalc = A + B * B;;
        iteration--;
    }
    goto END;
    
    STATE_C:
    while(iteration){
        polycalc = C * C * C;
        iteration--;
    }
    goto END;
    
    STATE_AC:
    while(iteration){
        polycalc = A + C * C * C;
        iteration--;
    }
    goto END;
    
    STATE_BC:
    while(iteration){
        polycalc = B * B + C * C * C;
        iteration--;
    }
    goto END;
    
    STATE_ABC:
    while(iteration){
        polycalc = A + B * B + C * C * C;
        iteration--;
    }
    /* Example: Pseudo-Code */
    /* Maybe your first calculation has shown that C becomes 0. */
    /* So simply use "goto STATE_AB", maybe set some variables. */
    /* In another example, you only know that some coefficient */
    /* has become 0, but you don't know which one, */
    /* so use "goto STATE_SELECT", maybe set some variables too. */
    goto END;
    
    END:
    std::cout << polycalc << std::endl;
    
    return 0;
}

Upvotes: -1

comingstorm
comingstorm

Reputation: 26087

As an initial step, recall that it is usually better to compute polynomials in some other way than the usual presentation:

polynomial = a0 + a1*x + a2*x^2 + a3*x^3 + a4*x^4 + a5*x^5 + ...
           = a0 + x*(a1 + x*(a2 + x*(a3 + x*(a4 + x*(a5 + ... )...)
           = (a0 + x*a1) + x^2*(a2 + x*a3) + x^4*((a4 + x*a5) + x^2*(a6 + x*a7)) + ...

The first line is the conventional presentation, and has the problem that you can end up abusing your dynamic range.

The second line tends to avoid the numerical problems with computing successive powers of x, and is a fairly conventional way to actually compute polynomials. Unfortunately, it still has a long chain of dependencies, which limits the instruction-level parallellism you can get from it.

The third line avoids the long chain of dependencies, and is also ideally suited for exploiting a vector unit if you have one.


From your comment discussion, it sounds like you're working with a low-powered embedded processor. So, let's assume no vector processor and not much in the way of ILP available anyway, and adapt the second line. (I will write this in terms of floating point, but it should be readily adaptable to fixed-point).

First, the straightforward method, without trying to skip zero-coefficients:

int max_idx = a.size() - 1;
float result = a[max_idx];
int i = max_idx;
while (i > 0) {
  -- i;
  result = a[i] + x * result;
}
return result;

Now, the clever method: assuming your coefficients are fixed, you can preprocess them to determine what powers of x you can skip, and therefore which incremental powers you will need to compute (once, ahead of time):

// compute powers between nonzero coefficients
int max_idx = a.size() - 1;
assert(a[max_idx] != 0); // you should not have leading zeroes in the first place

std::vector<int> inc_powers;
std::vector<int> inc_indices;
int inc_pow = 0;
int i = max_idx;
while (i > 0) {
  -- i;
  ++ inc_pow;
  if (a[i] != 0 || i == 0) {
    inc_powers.push_back(inc_pow);
    inc_indices.push_back(i);
    inc_pow = 0;
  }
}

To evaluate the polynomial:

int i = a.size() - 1;
float result = a[i];
for (int i = 0; i < inc_powers.size(); ++i) {
  result = a[inc_indices[i]] + x_powers[inc_powers[i]] * result;
}
return result;

You will need to compute x_powers -- an array of powers of x used in the above polynomial computation. There is probably some clever way to compute and specify a minimum computation to generate exactly the powers of x you will need, but it will probably be about as fast (for small polynomials on a slow processor without ILP) to generate them incrementally. You will need to determine (once, ahead of time) how many powers you need:

int max_powers = 0;
for (power: inc_powers) {
  if (power > max_powers) { max_powers = power; }
}
std::vector<float> x_powers(max_powers + 1);

Then, for each evaluation, compute them successively as prep for computing the polynomial value:

float value = x;
for (int i = 1; i < max_powers; ++i) {
  x_powers[i] = value;
  value *= x;
}
x_powers[max_powers] = value;

(I have set things up so that x_powers[i] == x^i and x_powers[0] is unused. Hopefully that will make things easier to follow; of course in practice you don't have to do it that way...)

(In a similar spirit, note that you can just copy the nonzero coefficients instead of using an index -- but it might be more instructive as is, so I'll leave it that way)


Finally, since you are concerned about performance, you need to actually benchmark both the straightforward and the clever version to make sure your "improvement" is actually an improvement.

Upvotes: 3

Related: Rice's theorem and some Robinson's theorem. IIRC your problem is undecidable or at least intractable, and related to P versus NP problem. Try using REDUCE.

Notice that Artificial Intelligence is probably a better place to ask

You could try abstract interpretation and partial evaluation techniques

You certainly can get a PhD by working on this topic.

Study the source code of GCC, of MILEPOST GCC and of the Clang static analyzer. Both are doing such compiler optimizations (with more or less success). Consider also using Frama-C.

I'm sure this problem already has a solution but I couldn't find it

As far as I know, your problem has no exact solution

and if you find one, you would get the Turing award.

Read J.Pitrat's Artificial Beings: the Conscience of a Conscious Machine (ISBN 978-1848211018) book. It is giving useful insights on your problem.

Upvotes: 0

Related Questions