kjd513
kjd513

Reputation: 11

Case sensitive distinct count from one column based on cell values in another column in EXCEL

Can you please help me with finding the count of distinct (case sensitive) products sold (Column B) per store (Column C) as per below data? Product_sold column has data from B2: B10. Store column has data from C2:C10. F2:F5 has the resulting expected values that the formula should calculate.

In Column F2, I want a formula which counts the distinct count (case sensitive) of product sold for the store "Costco". The answer should calculate to 2 in F2.

Image of Table with values attached in the link

Is there a way using SUMPRODUCT and EXACT to calculate distinct count(including case sensitive data) in Column B corresponding to the values in Column C?

I have the below formula which only looks up the range of B2:B10 and counts the occurrences of the value in B2. However, I am not able to expand this formula to make it look for the corresponding values in Column C and then calculate the distinct (case sensitive) count in Column B based on the value from Column C.

=SUMPRODUCT(--EXACT($B$2:$B$10,B2)) --> Is there a way to expand this formula maybe using IF statement and make the formula lookup the range in Column C?

Image of Table with data is in the link above.

Upvotes: 1

Views: 168

Answers (3)

Chris Strickland
Chris Strickland

Reputation: 3490

Take a look at this demo in sheets. This should work the same in Excel.

https://docs.google.com/spreadsheets/d/1MSmzVJbBbjP8HLufvEhpezDUZQ5ByV4sKa2uhKWq4Wk/edit?gid=0#gid=0

We can write a formula that will split a text into an array of letters by generating a SEQUENCE from 1 to LEN(productName) and then calling MAP on that array of indexes with MID to return each letter. We'll use LET to make writing this simpler:

=LET(
  prod, "Apples", 
  idx, SEQUENCE(1, LEN(prod)), 
  MAP(idx, LAMBDA(x, MID(prod, x, 1)))
)

Since Excel doesn't usually treat comparisons in a case sensitive fashion, let's convert this array of letters into an array of ASCII values by wrapping the lambda's calculation in CODE().

=LET(
  prod, "Apples", 
  idx, SEQUENCE(1, len(prod)), 
  MAP(idx, LAMBDA(x, CODE(MID(prod, x, 1))))
)

Now that we have a formula that will give us an array of code values, let's convert this into a lambda formula to make it easier to use. Lambda formulas just take a list of arguments, and then a calculation, then are followed by the values for the arguments.

=LAMBDA(
  prod, 
  let(
    idx, sequence(1, len(prod)), 
    map(idx, lambda(x, code(mid(prod, x, 1))))
  )
)("Apples")

Now that we have the LAMBDA formula, we can use LET to assign that formula a name:

=let(
  chars, LAMBDA(
    prod, let(
      idx, sequence(1, len(prod)), 
      map(idx, lambda(x, code(mid(prod, x, 1))))
    )
  ),
  chars("Apples")
)

And then inside the let we can call our formula using chars(productName).

We can use FILTER to get a list of products by store, like so:

=filter(productCol, storeCol="store")

And then we can use MAP on that array of products, and then we can call our function chars on that array to give us a list of arrays of product names converted to code arrays.

=let(
  chars, LAMBDA(prod, let(
    idx, sequence(1, len(prod)), 
    map(idx, lambda(x, code(mid(prod, x, 1))))
  )),
  prods, filter($B:$B, $C:$C="Costco"),
  map(prods, lambda(x, chars(x)))
)

We can use UNIQUE on this to get just the unique character arrays:

=let(
  chars, LAMBDA(prod, let(
    idx, sequence(1, len(prod)), 
    map(idx, lambda(x, code(mid(prod, x, 1))))
  )),
  prods, filter($B:$B, $C:$C="Costco"),
  UNIQUE(map(prods, lambda(x, chars(x))))
)

and we could call COUNTA on that except that it will return the unique number of values in the arrays, not the number of unique arrays. So what we will do is use TEXTJOIN on the output of chars so that we get a string of the character codes:

=let(
  chars, LAMBDA(prod, let(
    idx, sequence(1, len(prod)), 
    map(idx, lambda(x, code(mid(prod, x, 1))))
  )),
  prods, filter($B:$B, $C:$C="Costco"),
  UNIQUE(map(prods, lambda(x, TEXTJOIN("|", TRUE, chars(x)))))
)

and then we can apply COUNTA to that:

=let(
  chars, LAMBDA(prod, let(
    idx, sequence(1, len(prod)), 
    map(idx, lambda(x, code(mid(prod, x, 1))))
  )),
  prods, filter($B:$B, $C:$C="Costco"),
  COUNTA(UNIQUE(map(prods, lambda(x, TEXTJOIN("|", TRUE, chars(x))))))
)

In the demo spreadsheet, the store isn't hard coded, of course.

=let(
  chars, LAMBDA(prod, let(
    idx, sequence(1, len(prod)), 
    map(idx, lambda(x, code(mid(prod, x, 1))))
  )),
  prods, filter($B:$B, $C:$C=$E2),
  COUNTA(UNIQUE(map(prods, lambda(x, TEXTJOIN("|", TRUE, chars(x))))))
)

Copy it down and that should return the right answers. It is possible to do it as a dynamic array that doesn't require copying down, of course.

Note that Google Sheets has a COUNTUNIQUE function that can be used in place of COUNTA(UNIQUE())

Upvotes: 0

VBasic2008
VBasic2008

Reputation: 54948

Count Exact Distinct Values

Only Count (F2)

=LET(row_fields,C2:C11,values,B2:B11,result_row_fields,E2:E6,if_not_found,"",
    BYROW(result_row_fields,LAMBDA(r,LET(
        f,FILTER(values,row_fields=r),
        IFERROR(ROWS(UNIQUE(EXACT(f,TOROW(f)))),if_not_found)))))

All (H2)

  • No error handling since all stores exist.
=LET(row_fields,C2:C11,values,B2:B11,
    rrf,UNIQUE(row_fields),
    rv,BYROW(rrf,LAMBDA(r,LET(
        f,FILTER(values,row_fields=r),
        ROWS(UNIQUE(EXACT(f,TOROW(f))))))),
    HSTACK(rrf,rv))

Lambda in Name Manager, e.g. ExactDistinctCount:

=LAMBDA(row_fields,values,LET(rrf,UNIQUE(row_fields),rv,BYROW(rrf,LAMBDA(r,LET(f,FILTER(values,row_fields=r),ROWS(UNIQUE(EXACT(f,TOROW(f))))))),HSTACK(rrf,rv)))
=ExactDistinctCount(C2:C11,B2:B11)

enter image description here

How ROWS(UNIQUE(EXACT(f;TOROW(f)))) Works (K2)

=LET(row_fields,C2:C11,values,B2:B11,result_row_field,E2,
    f,FILTER(values,row_fields=result_row_field),
    ex,EXACT(f,TOROW(f)),
    un,UNIQUE(ex),
    ro,ROWS(un),
    IFNA(VSTACK("Field",result_row_field,"Filter",f,"Exact",ex,"Unique",un,"Rows",ro),""))

Upvotes: 1

Mayukh Bhattacharya
Mayukh Bhattacharya

Reputation: 27438

So, here is one way of doing this using MAP() + REDUCE() functions:

enter image description here


• Formula used in cell E2

=LET(
     _Data, B2:C10,
     _Store, TAKE(_Data,,-1),
     _UniqSt, UNIQUE(_Store),
     _Count, MAP(_UniqSt,LAMBDA(Σ, ROWS(REDUCE(,FILTER(TAKE(_Data,,1),Σ=_Store),
            LAMBDA(α,δ, IF(SUM(N(EXACT(α,δ))),α,VSTACK(α,δ))))))),
     HSTACK(_UniqSt, _Count))

One another alternative which can be used by older version, by tweaking and removing the Dynamic array functions:

=LET(
     _Store, C2:C10,
     _ProdSold, B2:B10,
     _Uniq, UNIQUE(_Store),
     _Exact, EXACT(_ProdSold,TOROW(_ProdSold)),
     _Count, MAP(E2:E5, LAMBDA(α, SUM(IF(α=_Store,1/
                 MMULT((_Exact)*(α=TOROW(_Store)),SEQUENCE(ROWS(_Store))^0),0)))),
     HSTACK(_Uniq, _Count))

And if applicable using GROUPBY()

=GROUPBY(C2:C10,B2:B10,LAMBDA(α, ROWS(UNIQUE(REPT(α, EXACT(α,TOROW(α)))))),,0)

Upvotes: 2

Related Questions