Reputation: 27
If I have a list of values in range C3:C15, how does this formula calculate the unique values?
=SUMPRODUCT((C3:C15<>"")/COUNTIF(C3:C15,C3:C15&""))
Customer |
---|
Customer1 |
Customer5 |
Customer2 |
Customer4 |
Customer4 |
Customer6 |
Customer1 |
Customer2 |
Customer1 |
Customer3 |
Customer3 |
Customer3 |
Customer4 |
Upvotes: 1
Views: 70
Reputation: 2877
a) C3:C15<>""
Evaluates to 1 for all cells that have something in them and 0 for blank cells.
b) COUNTIF(C3:C15, C3:C15&"")
returns an array showing how many times each number appears in the range. So if the first value in the range also appears once more elsewhere in the range, the first value of the returned array is 2 (it appears twice). As per @JodyHighroller's comment, the &""
ensures that blank cells are counted as a value to avoid divide-by-zero errors in the below step.
c) Dividing (a) by (b) returns, for non-blank values, 1 for values that appear once, 1/2 for values that appear twice, 1/3 for values that appear 3 times and so on. Each unique list item will be given values adding up to 1. Because step (a) evaluates to 0 for blank cells, all blank cells evaluate to 0 at this stage (0/1 = 0, 0/2 = 0, etc).
d) SUMPRODUCT(...)
of the above adds them all together, each unique value totals to 1 so the total of all of that is the number of unique values. (SUM
would do the same thing really)
An easier-to-understand formula that does the same thing would be =COUNT(UNIQUE(C3:C15) - IF(COUNTBLANK(C3:C15)>0, 1, 0)
(that second section is needed because the UNIQUE
function returns 0 to represent blank cells). However this wouldn't work in some older versions of Excel, the unique function is quite a new thing.
Upvotes: 0
Reputation: 1029
It is sometimes easiest to break the formula into its pieces and go from there.
=C3:C15<>""
Will return TRUE for any cell that is not blank. SUMPRODUCT will then treat that Boolean TRUE as a 1. If it returned FALSE it would be treated as a 0.
=COUNTIF(C3:C15,C3:C15&"")
Will return the number of times that value is found in the list. By adding the &""
at the end, you are guaranteeing that will you always return at least a 1, eliminating any divide by zero errors.
When combined, you get something that looks like this:
Source | =C3:C15<>"" | =COUNTIF(C3:C15,C3:C15&"") | COL B / COL C |
---|---|---|---|
Customer1 | 1 | 3 | 0.333333 |
Customer5 | 1 | 1 | 1 |
Customer2 | 1 | 2 | 0.5 |
Customer4 | 1 | 3 | 0.333333 |
Customer4 | 1 | 3 | 0.333333 |
Customer6 | 1 | 1 | 1 |
Customer1 | 1 | 3 | 0.333333 |
Customer2 | 1 | 2 | 0.5 |
Customer1 | 1 | 3 | 0.333333 |
Customer3 | 1 | 3 | 0.333333 |
Customer3 | 1 | 3 | 0.333333 |
Customer3 | 1 | 3 | 0.333333 |
Customer4 | 1 | 3 | 0.333333 |
When you SUM the division column, by using the SUMPRODUCT function, you get a total of six, for the six unique values.
If you are using Excel 365+ you would get the same result by using:
=COUNTA(UNIQUE(FILTER(C3:C14,C3:C14<>"")))
Upvotes: 2