Shykin

Reputation: 168

Proper Translation of linear regression equation to C#

I am trying to replicate this equation in C#: Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²), but I'm running into the following issue:

If I take the average of X = 1 + 2 + 3 + 4 + 5 and the average of Y = 5 + 4 + 3 + 2 + 1, my code gives me a positive slope even though Y is clearly counting down. When I enter the same numbers into this calculator, it gives me a negative slope: http://www.easycalculation.com/statistics/regression.php

I'm trying to narrow down the cause. Is the following a proper translation of the equation into C# code?

Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)

to

Slope (m) = ((x * avgX * avgY) - (avgX * avgY)) / ((x * Math.Pow(avgX, 2)) - Math.Pow(avgX, 2));

Upvotes: 1

Views: 652

Answers (1)

YoryeNathan

Reputation: 14522

Avg has nothing to do with it. Σ means Sum(...). It should actually be:

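// pts: the collection of (X, Y) points; numberOfPoints: its count.
// If X and Y are integers, cast to double first to avoid integer division.
var sumX = pts.Sum(pt => pt.X);
var slope = (numberOfPoints * pts.Sum(pt => pt.X * pt.Y) -
             sumX * pts.Sum(pt => pt.Y)) /
            (numberOfPoints * pts.Sum(pt => pt.X * pt.X) -
             sumX * sumX);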

ΣXY doesn't mean Sum(x) * Sum(y); it means Sum(x * y), which is different.

ΣX² doesn't mean Sum(x)²; it means Sum(x²), which is different as well.

ΣXY = Σ(XY) != ΣX * ΣY
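
As a quick check, here is the arithmetic with the numbers from the question (X = 1, 2, 3, 4, 5 and Y = 5, 4, 3, 2, 1, so N = 5). Nothing here is new; it only plugs that data into the formula above:

```latex
\[
\sum X = 15, \qquad \sum Y = 15, \qquad
\sum XY = 1\cdot 5 + 2\cdot 4 + 3\cdot 3 + 4\cdot 2 + 5\cdot 1 = 35, \qquad
\sum X^2 = 1 + 4 + 9 + 16 + 25 = 55
\]
\[
b = \frac{N\sum XY - \sum X \sum Y}{N\sum X^2 - \left(\sum X\right)^2}
  = \frac{5\cdot 35 - 15\cdot 15}{5\cdot 55 - 15^2}
  = \frac{175 - 225}{275 - 225}
  = -1
\]
```

Note that ΣXY is 35 here, while ΣX * ΣY is 225, so swapping one for the other changes the result completely.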

And that is where your mistakes really came from.

Other than that and the terminology of average vs sum, you weren't far from the answer.
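
For reference, here is a minimal self-contained sketch of the same formula applied to the data from the question. The class name `RegressionDemo`, the method name `Slope`, and the use of two `double[]` arrays instead of a point collection are assumptions made for illustration; they are not from the original code.

```csharp
using System;
using System.Linq;

public static class RegressionDemo
{
    // Least-squares slope: (N*Σ(XY) - ΣX*ΣY) / (N*Σ(X²) - (ΣX)²).
    // Hypothetical helper; takes parallel arrays of X and Y values.
    public static double Slope(double[] xs, double[] ys)
    {
        if (xs.Length != ys.Length || xs.Length < 2)
            throw new ArgumentException("Need two arrays of equal length with at least two points.");

        double n = xs.Length;
        double sumX = xs.Sum();                            // ΣX
        double sumY = ys.Sum();                            // ΣY
        double sumXY = xs.Zip(ys, (x, y) => x * y).Sum();  // Σ(XY), not ΣX * ΣY
        double sumXX = xs.Sum(x => x * x);                 // Σ(X²), not (ΣX)²

        double denominator = n * sumXX - sumX * sumX;
        if (denominator == 0)
            throw new ArgumentException("All X values are identical; the slope is undefined.");

        return (n * sumXY - sumX * sumY) / denominator;
    }

    public static void Main()
    {
        double[] xs = { 1, 2, 3, 4, 5 };
        double[] ys = { 5, 4, 3, 2, 1 };
        Console.WriteLine(Slope(xs, ys)); // prints -1
    }
}
```

Using `double` throughout sidesteps integer division, which would otherwise truncate the result when the inputs are integers.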

Upvotes: 4
