whatdahil
whatdahil

Reputation: 57

easiest way to aggregate values over multiple columns?

I have a table that looks like this:

classification animal1 animal2 animal3
mammal dog cat
bird parrot owl
mammal cat zebra lion
fish goldfish tuna
bird ostrich parrot chicken

My goal is to make a summary of each animal based on their classification, i.e. basically just to get a count of each animal and their corresponding classification such that:

| classification | animal | count | | -------- | -------- | -------- | | mammal | dog | 1 | | mammal | cat | 2 | | mammal | zebra | 1 | | mammal | lion | 1 | | bird | parrot | 2 | | bird | owl | 1 | ...

The way I did this (and this might be a very beginner solution, I'm afraid I'm making it more complicated than it should be) is by doing the following:

classification animal count
mammal dog 1
mammal cat 2
mammal zebra 1
mammal lion 1
bird parrot 2
bird owl 1

First, I work in the second column. I get the unique list of all animals using array formula, unique, and flatten at the same time like:

=ARRAYFORMULA(UNIQUE(FLATTEN(B2:D)))

This gives me a list of all the animals from columns B to D.

Then, to get the value in the first column, I add a column in the original raw file that concatenates the animals in the 3 columns:

classification animal1 animal2 animal3 concat
mammal dog cat dog, cat
bird parrot owl parrot, owl
mammal cat zebra lion cat, zebra, lion
fish goldfish tuna goldfish, tuna
bird ostrich parrot chicken ostrich, parrot, chicken

Then I use index match to get the classification

From there, I just use a COUNTIF on the third column to count the instances of each animal in the columns B:D

Upvotes: 0

Views: 925

Answers (2)

z..
z..

Reputation: 13003

There's nothing wrong with how you solved the problem. I personally try to avoid helper columns, especially for simple problems so here's how I would do it.

=SORT(LET(cc,TOCOL(A2:A&"ζ"&B2:D),QUERY(UNIQUE({SPLIT(cc,"ζ"),COUNTIF(cc,cc)}),"WHERE Col2 IS NOT NULL")),1,0)

First we unpivot the table, then we run a COUNTIF and finally we remove the duplicate and blank rows and sort the result.

Btw, UNIQUE(FLATTEN(B2:D)) is not an array formula so you don't need an ARRAYFORMULA wrapper around it. Also, we have TOCOL now, which is a better version of FLATTEN, I recommend checking it out.

Upvotes: 2

rockinfreakshow
rockinfreakshow

Reputation: 30120

You may try:

=index(let(Σ,tocol(if(len(B2:D),A2:A&"|"&B2:D,),1),
           sort({split(unique(Σ),"|"),countif(Σ,unique(Σ))})))

enter image description here

Upvotes: 1

Related Questions