Emma
Emma

Reputation: 1297

Ignore #N/As in Excel LINEST function with multiple independent variables (known_x's)

I am trying to find the equation of a plane of best fit to a set of x,y,z data using the LINEST function. Some of the z data is missing, meaning that there are #N/As in the z column. For example:

    A   B   C
    (x) (y) (z)
1   1   1   5.1
2   2   1   5.4
3   3   1   5.7
4   1   2   #N/A
5   2   2   5.2
6   3   2   5.5
7   1   3   4.7
8   2   3   5
9   3   3   5.3

I would like to do =LINEST(C1:C9,A1:B9), but the #N/A causes this to return a value error.

I found a solution for a single independent variable (one column of known_x's, i.e. fitting a line to x,y data), but I have not been able to extend it for two independent variables (two known_x's columns, i.e. fitting a plane to x,y,z data). The solution I found is here: http://www.excelforum.com/excel-general/647448-linest-question.html, and the formula (slightly modified for my application) is:

=LINEST(
  N(OFFSET(C1:C9,SMALL(IF(ISNUMBER(C1:C9),ROW(C1:C9)-ROW(C1)),
    ROW(INDIRECT("1:"&COUNT(C1:C9)))),0,1)),
  N(OFFSET(A1:A9,SMALL(IF(ISNUMBER(C1:C9),ROW(C1:C9)-ROW(C1)),
    ROW(INDIRECT("1:"&COUNT(C1:C9)))),0,1)),
  )

which is equivalent to =LINEST(C1:C9,A1:A9), ignoring the row containing the #N/A.

Upvotes: 1

Views: 13135

Answers (1)

lori_m
lori_m

Reputation: 5567

The formula from the posted link could probably be adapted but it is unwieldy. Least squares with missing data can be viewed as a regression with weight 1 for numeric values and weight 0 for non-numeric values. Based on this observation you could try this (with Ctrl+Shift+Enter in a 1x3 range):

=LINEST(IF(ISNUMBER(C1:C9),C1:C9,),IF(ISNUMBER(C1:C9),CHOOSE({1,2,3},1,A1:A9,B1:B9),),)

This gives the equation of the plane as z=-0.2x+0.3y+5 which can be checked against the results of using LINEST(C1:C8,A1:B8) with the error row removed.

Upvotes: 4

Related Questions