Reputation: 311
I have a data set formatted as table in Excel 2019. What I would like to achieve is a regression analysis, but only for those records in data set, where we have 'X' in E column which is named IncludeInRegression.
Known Y's are in column F (Price) and known X's are in columns B:D (L, W, Volume).
I have managed to make it work for one independent variable X (variable L in column B) and here is the array formula:
=LINEST(
INDEX(F:F;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[Price]))))));
INDEX(B:B;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[L]))))));
TRUE;FALSE)
However, I cannot make it work for 3 independent variables. I have tried the following array formula, but #VALUE! is returned:
=LINEST(
INDEX(F:F;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[Price]))))));
INDEX(B:D;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[[L]:[Volume]]))))));
TRUE;FALSE)
So it will be easier for you to visualize, I am attaching an image as well.
Upvotes: 0
Views: 1043
Reputation: 152505
You need to include another array of the column numbers in the second INDEX so it returns all three columns:
=LINEST(
INDEX(F:F,N(IF(1,MODE.MULT(IF(tblData[includeinRegression]={"X","X"},ROW(tblData[Price])))))),
INDEX(B:D,N(IF(1,MODE.MULT(IF(tblData[includeinRegression]={"X","X"},ROW(tblData[[L]:[Volume]]))))),N(IF(1,{1,2,3}))),
TRUE,FALSE)
Depending on ones version this may need to be array entered by selecting four horizontal cells, putting the formula in the formula bar and using Ctrl-Shift-Enter instead of Enter when leaving edit mode
Upvotes: 1