Reputation: 1000
This question is about how Excel's COUNTIF function treats different data types when used as an array formula.
There are lots of good posts out there detailing how to use COUNTIF for tasks such as extracting unique values from a list, for example this post. I've managed to use examples from this and other posts to solve specific problems, but I'm trying to get a deeper understanding of array formulas in order to adapt my formulas to new needs.
I came across a peculiar behavior of COUNTIF. In general, Excel seems to treat strings as "larger than" numbers, so that the following examples are valid:
Cell Formula Returns
=1<2 TRUE
="a"<"b" TRUE
="a">"b" FALSE
=1<"b" TRUE
Now, suppose range A1:A6 contains the following data set:
1
2
3
A
B
C
For each cell in this set, I want to check how many of all the cells in the set that are smaller than or equal to that cell (a useful technique in more complex formulas). I enter the following array formula in range B1:B6:
{=COUNTIF($A$1:$A$6,"<="&$A$1:$A$6)}
(CTRL + SHIFT + ENTER)
Based on the examples above comparing numbers and strings (also illustrated in Column D below), I would expect the output shown below to look like Column C. However, the array formula returns the result shown in Column B, which suggests that strings and number elements are counted separately by arraywise COUNTIF.
Column A Column B Column C Column D
1 1 1 A1<"C" = TRUE
2 2 2 A2<"C" = TRUE
3 3 3 A3<"C" = TRUE
A 1 4 A4<"C" = TRUE
B 2 5 A5<"C" = TRUE
C 3 6 A6<"C" = FALSE
So the question is how to produce the output in Column C? (EDIT: Just to clarify, I'm specifically looking for solutions that make use of COUNTIF's array properties.)
Any insight into why arraywise COUNTIF apparently behaves differently than the single-cell examples would also be much appreciated.
NOTE: I've translated the examples from a non-English version of Excel, so I apologize in advance for any typos.
PS. For a background, I ran into this problem when I tried to build a formula that would both extract unique values from a list with possible duplicates, and sort the unique values in numerical/alphabetical order. My current solution is to do this in two steps. One solution for how to do it in one step is proposed here.
Upvotes: 6
Views: 2972
Reputation: 61975
The different behavior can easily be shown if you compare
=COUNTIF($A$1:$A$6,"<=A")
with
{=COUNT(IF($A$1:$A$6<="A",1))}
The first will only get text values from $A$1:$A$6
because it is clearly text to compare and it is faster ignoring other values then. =COUNTIF($A$1:$A$6,"<=3")
will only get numeric values from $A$1:$A$6
because of the same reasons. Even if the criterion would be a concatenation with a cell reference, the concatenation would be the first process and would lead either to "<=A" or "<=3". So, it is always clear what to compare, text or numbers.
The second will use an array of comparisons, then performs the IF
, gets so an array of 1 or FALSE
and counts then. But the "A" could also be a cell reference. So, it is not clear what to compare at the beginning and the first array has to compare all values in $A$1:$A$6
.
So COUNTIF(S)
and SUMIF(S)
can't be used comparing mixed text and numeric data.
The solution is shown already by XOR LX.
Btw.: with your PS. For a background
you should consider the following solution from a German Excel site: http://www.excelformeln.de/formeln.html?welcher=236.
In your linked example:
Formula in B2
downwards
{=INDEX($A$2:$A$99,MATCH(LARGE(COUNTIF(A$2:A$99,">="&A$2:A$99)+99*ISNUMBER(A$2:A$99),ROWS($1:1)),COUNTIF(A$2:A$99,">="&A$2:A$99)+99*ISNUMBER(A$2:A$99),0))&""}
In this solution the COUNTIF
compares with >=
so the biggest text or number will count lowest and so get the lowest position. All number positions are added with 99. So, they are always greater than all possible text positions. So, we have a descending sorted array. Then, using LARGE
, the list is created from the highest to the lowest position.
Upvotes: 2
Reputation: 6660
Try this FormulaArray
in B1
then copy till B6
:
=SUM(($A$1:$A$6<=$A1)*1)
Upvotes: 0
Reputation: 138
I doubt that countif is the right function for what you want to achieve here.
try this (ctrl+shift+enter):
={SUM(IF(A1>=$A$1:$A$6,1,0))}
You will get
1
2
3
4
5
6
PS: CountIf is an basically an array function internally. Using it in another array function results into multiple array functions and their behaviour becomes complex. Array functions are best used with clear logical path.
As tested in Excel 2013, you will only get 1 in all results instead of what was proposed in Column B.
Currently, in the function provided by you, countif cannot figure out which cell to compare to which cell. Array functions expand ranges and then perform the provided action. Therefore, it is comparing each cell to same cell and resulting into 1.
Upvotes: 0
Reputation: 7762
First of all, excellently laid-out question, and on an interesting topic to boot.
I also raised an eyebrow when I first came across this behaviour of the COUNTIF(S)
/SUMIF(S)
functions. In their defence, I suppose we could construct situations in which we actually want strings and numerics to be considered separately.
In order to construct your required in-formula array, you will need something like:
MMULT(0+(TRANSPOSE($A$1:$A$6)<=$A$1:$A$6),ROW($A$1:$A$6)^0)
though note that the necessary transposition will mean that any set-up which includes this construction will require committing with CSE.
Regards
Upvotes: 3