Sam
Sam

Reputation: 2865

Why are Excel Array formulas slow?

I have read in many places on the web that one of the downsides of using array formulas is that they are slow (and that using intermediate helper columns can speed things up).

The general explanation given is that they are performing a lot of work. They are, but the helper formulas must surely also be reproducing the same amount of work spread across different cells.

I would like to understand in a little more detail:

  1. Why array formulas are slow?
  2. What situations might it not be so?

Upvotes: 4

Views: 15763

Answers (1)

Axel Richter
Axel Richter

Reputation: 61860

Let's have a concrete example:

enter image description here

Formulas:

in D2:E11:

=IF($A2=D$1,$B2)

in D12 and E12:

=SUM(D2:D11)

in D13 and E13:

{=SUM(IF($A$2:$A$11=D$1,$B$2:$B$11))}

as an array formula confirmed with [Ctrl]+[Shift]+[Enter]

in D14 and E14:

=SUMIF($A$2:$A$11,D$1,$B$2:$B$11)

So what happens if something changes in A2:B11?

In the helper columns only those formulas needs to recalculate which refers to the cells which have actually changed. And of course the sums in D12 and E12 must be recalculated. But this is exactly wherefore Excel was developed. So the program is probably optimized to do exactly this very performant.

The array formula must be fully recalculated, independent of which cells have actually changed. That are 10 IF functions and the SUM after each change in A2:B11. And there is nothing optimized. The array context causes only the IFfunction, which per default expect one criteria and one value_if_the_criteria_is_true, to perform multiple times for each criteria in the criteria array and provide a results array, which is then summed by SUM.

The SUMIF also must be fully recalculated. But this will be done with the precompiled function SUMIF which is optimized to do so. That function not simply calls the IF function 10 times, as the array formula does.

So if it comes to performance, the solution with helper columns will be the best, then comes SUMIF and the array formula is surely the solution with lowest performance.

But of course the performance difference will only noticeable if either there are much data cells which influence the formulas or if there are much array formulas in the sheet. So if one needs using array formulas, then those should not take whole columns or rows as arrays and one should not put thousands of array formulas in the sheet.

Upvotes: 5

Related Questions