tuk

Reputation: 6872

Calculate R-Square for PolynomialCurveFitter in Apache commons-math3

Apache commons-math3 (version 3.6.1) classes such as OLSMultipleLinearRegression and SimpleRegression provide methods that calculate R-squared (calculateRSquared() and getRSquare(), respectively). But I cannot find any such method for PolynomialCurveFitter.

Right now I am computing it myself, as shown below. Is there a method in commons-math that does this?

private PolynomialFunction getPolynomialFitter(List<List<Double>> pointlist) {
    final PolynomialCurveFitter fitter = PolynomialCurveFitter.create(2);
    final WeightedObservedPoints obs = new WeightedObservedPoints();
    for (List<Double> point : pointlist) {
        obs.add(point.get(0), point.get(1));
    }

    double[] fit = fitter.fit(obs.toList());
    System.out.printf("\nCoefficient %f, %f, %f", fit[0], fit[1], fit[2]);
    final PolynomialFunction fitted = new PolynomialFunction(fit);
    return fitted;
}
private double getRSquare(PolynomialFunction fitter, List<List<Double>> pointList) {
    final double[] coefficients = fitter.getCoefficients();
    double[] predictedValues = new double[pointList.size()];
    double residualSumOfSquares = 0;
    final DescriptiveStatistics descriptiveStatistics = new DescriptiveStatistics();
    for (int i=0; i< pointList.size(); i++) {
        predictedValues[i] = predict(coefficients, pointList.get(i).get(0));
        double actualVal = pointList.get(i).get(1);
        double t = Math.pow((predictedValues[i] - actualVal), 2);
        residualSumOfSquares  += t;
        descriptiveStatistics.addValue(actualVal);
    }
    final double avgActualValues = descriptiveStatistics.getMean();
    double totalSumOfSquares = 0;
    for (int i=0; i<pointList.size(); i++) {
        // Total sum of squares uses the actual values, not the predictions.
        totalSumOfSquares += Math.pow( (pointList.get(i).get(1) - avgActualValues),2);
    }
    return 1.0 - (residualSumOfSquares/totalSumOfSquares);
}
final PolynomialFunction polynomial = getPolynomialFitter(trainData);
System.out.printf("\nPolynomialCurveFitter R-Square %f", getRSquare(polynomial, trainData));

Upvotes: 2

Views: 1604

Answers (1)

tuk

Reputation: 6872

This was answered on the Apache Commons mailing list. Cross-posting the answer:

OLSMultipleLinearRegression and SimpleRegression provide methods that return R-squared (calculateRSquared() and getRSquare()). But I am not able to find any such method for PolynomialCurveFitter.

Right now I am doing it myself, as shown below.

Is there any such method in commons-math which does this?

"PolynomialCurveFitter" is one of the syntactic sugar/wrapper around the least-squares optimizers. No state is maintained in the (immutable) instance.

private PolynomialFunction getPolynomialFitter(List<List<Double>> pointlist) {
    final PolynomialCurveFitter fitter = PolynomialCurveFitter.create(2);
    final WeightedObservedPoints obs = new WeightedObservedPoints();
    for (List<Double> point : pointlist) {
        obs.add(point.get(0), point.get(1));
    }

    double[] fit = fitter.fit(obs.toList());
    System.out.printf("\nCoefficient %f, %f, %f", fit[0], fit[1], fit[2]);
    final PolynomialFunction fitted = new PolynomialFunction(fit);
    return fitted;
}

This is indeed one of the intended use-cases.

private double getRSquare(PolynomialFunction fitter, List<List<Double>> pointList) {
    final double[] coefficients = fitter.getCoefficients();
    double[] predictedValues = new double[pointList.size()];
    double residualSumOfSquares = 0;
    final DescriptiveStatistics descriptiveStatistics = new DescriptiveStatistics();
    for (int i=0; i< pointList.size(); i++) {
        predictedValues[i] = predict(coefficients, pointList.get(i).get(0));
        double actualVal = pointList.get(i).get(1);
        double t = Math.pow((predictedValues[i] - actualVal), 2);
        residualSumOfSquares  += t;
        descriptiveStatistics.addValue(actualVal);
    }
    final double avgActualValues = descriptiveStatistics.getMean();
    double totalSumOfSquares = 0;
    for (int i=0; i<pointList.size(); i++) {
        // Total sum of squares uses the actual values, not the predictions.
        totalSumOfSquares += Math.pow( (pointList.get(i).get(1) - avgActualValues),2);
    }
    return 1.0 - (residualSumOfSquares/totalSumOfSquares);
}

The "predict" method is not shown here, but note that the argument which you called "fitter" in the above, is actually a polynomial function:

http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/analysis/polynomials/PolynomialFunction.html

Hence: predictedValues[i] = fitter.value(pointList.get(i).get(0));

But otherwise, yes, the caller is responsible for choosing how to assess the quality of the model.
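For illustration, here is a minimal, dependency-free sketch of such a caller-side R-squared computation. The class name RSquareSketch and the helper evaluate are hypothetical and not part of commons-math; evaluate only stands in for PolynomialFunction.value(x), which you would call instead when the library is on the classpath. Coefficients are assumed to be in increasing order of degree, as returned by fitter.fit(...). The total sum of squares is taken over the deviations of the actual values from their mean.

```java
import java.util.Arrays;
import java.util.List;

public class RSquareSketch {

    // Hypothetical stand-in for PolynomialFunction.value(x): evaluates
    // c[0] + c[1]*x + c[2]*x^2 + ... with Horner's scheme.
    public static double evaluate(double[] coefficients, double x) {
        double result = 0.0;
        for (int i = coefficients.length - 1; i >= 0; i--) {
            result = result * x + coefficients[i];
        }
        return result;
    }

    // R-squared = 1 - RSS / TSS, where each point is a list {x, y}.
    public static double rSquare(double[] coefficients, List<List<Double>> points) {
        // Mean of the actual (observed) y values.
        double sum = 0.0;
        for (List<Double> p : points) {
            sum += p.get(1);
        }
        final double mean = sum / points.size();

        double residualSumOfSquares = 0.0; // sum of (actual - predicted)^2
        double totalSumOfSquares = 0.0;    // sum of (actual - mean)^2
        for (List<Double> p : points) {
            final double actual = p.get(1);
            final double predicted = evaluate(coefficients, p.get(0));
            residualSumOfSquares += (actual - predicted) * (actual - predicted);
            totalSumOfSquares += (actual - mean) * (actual - mean);
        }
        // Note: undefined (NaN or infinite) if all y values are identical.
        return 1.0 - residualSumOfSquares / totalSumOfSquares;
    }

    public static void main(String[] args) {
        // Made-up sample points lying exactly on y = 1 + x^2.
        List<List<Double>> points = Arrays.asList(
                Arrays.asList(0.0, 1.0),
                Arrays.asList(1.0, 2.0),
                Arrays.asList(2.0, 5.0));
        System.out.println(rSquare(new double[] {1.0, 0.0, 1.0}, points));
    }
}
```

With a PolynomialFunction in hand, the same loop applies with fitted.value(p.get(0)) in place of evaluate(coefficients, p.get(0)).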

You could use the least-squares suite of classes directly; the "Evaluation" object would then let you retrieve various measures of the fit:

http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/fitting/leastsquares/LeastSquaresProblem.Evaluation.html

However, they might still not be what you are looking for...

Upvotes: 2
