Bopemagno

Reputation: 15

KeyError while executing pandas count

I am practicing with the pandas module using a pokedex list, and I have a problem using groupby with two columns. It works with one column and gives the number of pokemon of each type in the list. To make counting easier, I added a helper column like this:

df["Count"] = 1
df.groupby(df["Type 1"]).count()["Count"]

This gives:

|Type 1   | Count |
|:--------|------:|
|Bug      |    69 |
|Dark     |    31 |
|Dragon   |    32 |
|Electric |    44 |
|Fairy    |    17 |
|Fighting |    27 |
|Fire     |    52 |
|Flying   |     4 |
|Ghost    |    32 |
|...      |   ... |

But if I add Type 2 to the grouping, it raises an error:

df.groupby(df["Type 1", "Type 2"]).count()["Count"]

giving:

KeyError: ('Type 1', 'Type 2')

What am I doing wrong?

Upvotes: 0

Views: 804

Answers (2)

SeaBean

Reputation: 23217

You can also use:

df.groupby([df["Type 1"], df["Type 2"]]).count()["Count"]

although for this case, you can use the simpler form:

df.groupby(["Type 1", "Type 2"]).count()["Count"]

groupby() does not accept a multi-column DataFrame such as df[["Type 1", "Type 2"]] as the grouping key, but you can pass a list of Series, such as [df["Type 1"], df["Type 2"]], or simply a list of column labels, ["Type 1", "Type 2"].
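As a minimal sketch of the point above, the snippet below builds a tiny made-up table (the rows are hypothetical, not the asker's real pokedex) and compares the failing expression with the two working forms. Note that df["Type 1", "Type 2"] asks for a single column labelled by the tuple ("Type 1", "Type 2"), so the KeyError is raised by the column lookup before groupby() is ever called.

```python
import pandas as pd

# Hypothetical miniature pokedex; names and values are made up for illustration.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type 1": ["Bug", "Bug", "Fire", "Fire"],
    "Type 2": ["Flying", "Flying", "Flying", "Dragon"],
})
df["Count"] = 1

# df["Type 1", "Type 2"] looks up ONE column keyed by the tuple
# ("Type 1", "Type 2"); no such column exists, hence the KeyError.
try:
    df["Type 1", "Type 2"]
    raised = False
except KeyError:
    raised = True

# Two working forms: a list of column labels, or a list of Series.
by_labels = df.groupby(["Type 1", "Type 2"]).count()["Count"]
by_series = df.groupby([df["Type 1"], df["Type 2"]]).count()["Count"]

print(raised)
print(by_labels)
```

Both working forms return a Series indexed by the (Type 1, Type 2) pairs, with one count per combination.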

Referencing df explicitly inside groupby() is necessary in the following case:

df["Count"].groupby([df["Type 1"], df["Type 2"]]).count()

Here, the simple form is invalid:

df["Count"].groupby(["Type 1", "Type 2"]).count()      # invalid

This is because groupby() is called on the pandas Series df["Count"] rather than on the whole DataFrame df. Since df["Count"] is the object being processed, groupby() cannot resolve the column labels Type 1 and Type 2.
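The difference can be checked with a short sketch on a made-up three-row table (hypothetical data, chosen only for illustration). Grouping the Series by other Series works, while grouping it by bare labels raises a KeyError, because a Series has no columns to look the labels up in.

```python
import pandas as pd

# Hypothetical three-row table, for illustration only.
df = pd.DataFrame({
    "Type 1": ["Bug", "Bug", "Fire"],
    "Type 2": ["Flying", "Flying", "Dragon"],
})
df["Count"] = 1

# Works: the grouping keys are Series aligned with df["Count"].
counts = df["Count"].groupby([df["Type 1"], df["Type 2"]]).count()

# Fails: the Series df["Count"] has no columns named "Type 1" or "Type 2".
try:
    df["Count"].groupby(["Type 1", "Type 2"]).count()
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

print(counts)
print(label_lookup_failed)
```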

Upvotes: 1

LCMa

Reputation: 445

I think you used the wrong syntax for the groupby function. Try:

df.groupby(by=["Type 1", "Type 2"]).count()

Upvotes: 2
