Miles
Miles

Reputation: 19

Excel array query

I'm struggling to understand the mechanics of a particular array formula. I have a row of data ranging from January 2015 to December 2016. Let's assume the data is populated up to October 2016 and the sum in October is £1,000. When data is entered into November 2016 say £1,250, the formula below automatically calculates the delta between the two months. How did the formula do that. Could someone help provide a simple explanation of the below, in particular how it knew to deduct the latest month from the prior month.

=(INDEX(60:60,MAX(IF(M60:AV60<>"",COLUMN(M60:AV60)))))-(INDEX(60:60,MAX(IF(M60:AV60<>"",COLUMN(M60:AV60)-1))))

Thanks for your help,

Miles

Upvotes: 1

Views: 738

Answers (1)

Werrf
Werrf

Reputation: 1148

It's a little complex, but let's break it down a piece at a time.

This looks to be an array formula, which means that rather than dealing with a single cell, it can deal with a whole set of cells at once.

M60:AV60<>"" This segment produces an array (list) of TRUE and FALSE values, looking at each cell between M60 and AV60. Wherever the cell contains a value - ie is not blank - it returns TRUE. Wherever the cell does not contain a value, it returns FALSE. This list exists only in the program's working memory, and it isn't recorded anywhere in the sheet. So we have something like this:

TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
FALSE
FALSE
FALSE
FALSE
FALSE

COLUMN(M60:AV60) This segment produces another array, the same size as the TRUE/FALSE array above, that simply contains the column numbers of every cell from M60 to AV60. We now have two lists - one containing TRUE/FALSE, and one containing numbers, both the same length.

TRUE  | 1
TRUE  | 2
TRUE  | 3
TRUE  | 4
TRUE  | 5
TRUE  | 6
FALSE | 7
FALSE | 8
FALSE | 9
FALSE | 10
FALSE | 11

IF(M60:AV60<>"",COLUMN(M60:AV60)) This IF statement combines the TRUE/FALSE array with the column numbers array to get something more useful. Wherever there is a TRUE in the first array, it is replaced with the corresponding number from the second array; wherever there is a FALSE in the first array, nothing is changed, and the value stays at FALSE. This way, we end up with a list of numbers, representing the columns of each non-blank cell. It's the equivalent of running the IF formula on all the members of the array.

IF | TRUE  |THEN| 1 = 1
IF | TRUE  |THEN| 2 = 2
IF | TRUE  |THEN| 3 = 3
IF | TRUE  |THEN| 4 = 4
IF | TRUE  |THEN| 5 = 5
IF | TRUE  |THEN| 6 = 6
IF | FALSE |THEN| 7 = 0
IF | FALSE |THEN| 8 = 0
IF | FALSE |THEN| 9 = 0
IF | FALSE |THEN| 10 = 0
IF | FALSE |THEN| 11 = 0

The last column, after the =, is what is passed to the MAX function.

MAX(IF(M60:AV60<>"",COLUMN(M60:AV60))) This segment cuts down the list of numbers to just one number, the Max or highest number in the list. Thus we end up with a single result, which represents the last column that contains a value.

INDEX(60:60,MAX(IF(M60:AV60<>"",COLUMN(M60:AV60))))) The INDEX function looks at all of row 60, and returns a value from a specified column in that row. That being the column returned by the previous segments - the last column that contains a value.

The second half of the formula with the second INDEX function does exactly the same thing, but it subtracts 1 from the column number returned - that is, it gets the second-to-last column that has a value.

The end result is subtracting the second-to-last value from the last value, to get the difference between them.

Upvotes: 2

Related Questions