Reputation: 15
I am practicing with the pandas module using a Pokedex list, and I am having trouble using groupby with two columns. It works with one column and gives the number of Pokemon of each type in the list. I found it easier to add a helper column to count, like this:
df["Count"] = 1
df.groupby(df["Type 1"]).count()["Count"]
This gives:
|Type 1 |Count|
|:------|----:|
|Bug |69|
|Dark |31|
|Dragon |32|
|Electric |44|
|Fairy |17|
|Fighting |27|
|Fire |52|
|Flying |4|
|Ghost |32|
|... |... |
But if I add Type 2 to the grouping, it raises an error:
df.groupby(df["Type 1", "Type 2"]).count()["Count"]
giving:
KeyError: ('Type 1', 'Type 2')
What am I doing wrong?
Upvotes: 0
Views: 804
Reputation: 23217
You can also use:
df.groupby([df["Type 1"], df["Type 2"]]).count()["Count"]
although for this case, you can use the simpler form:
df.groupby(["Type 1", "Type 2"]).count()["Count"]
groupby() does not support passing a multi-column DataFrame like df[["Type 1", "Type 2"]], but you can pass a list of Series, like [df["Type 1"], df["Type 2"]], or simply a list of column names, ["Type 1", "Type 2"].
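As a rough sketch of the two accepted forms, the snippet below builds a tiny DataFrame (the rows are made up for illustration and are not the asker's data) and checks that both spellings give the same counts. It also reproduces the original error: df["Type 1", "Type 2"] with single brackets looks up one column labelled by the tuple ('Type 1', 'Type 2'), which does not exist, so the KeyError comes from the indexing step before groupby() is even called.

```python
import pandas as pd

# Hypothetical miniature Pokedex; these rows are illustrative only.
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Charizard", "Butterfree", "Pidgey"],
    "Type 1": ["Grass", "Grass", "Fire", "Bug", "Normal"],
    "Type 2": ["Poison", "Poison", "Flying", "Flying", "Flying"],
})
df["Count"] = 1

# Single brackets with two labels ask for ONE column keyed by a tuple,
# so this fails during indexing, before groupby() runs.
key_error = None
try:
    df["Type 1", "Type 2"]
except KeyError as err:
    key_error = err

# Form 1: a list of column names.
by_names = df.groupby(["Type 1", "Type 2"]).count()["Count"]

# Form 2: a list of Series taken from the same DataFrame.
by_series = df.groupby([df["Type 1"], df["Type 2"]]).count()["Count"]

print(by_names)
```

Both results are indexed by the (Type 1, Type 2) pairs, so either form can be used when grouping the whole DataFrame.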
Referencing df explicitly inside groupby() is necessary in the following case:
df["Count"].groupby([df["Type 1"], df["Type 2"]]).count()
In this case, the simple form is invalid:
df["Count"].groupby(["Type 1", "Type 2"]).count() # invalid
This is because groupby() is called here on a pandas Series, df["Count"], rather than on the whole DataFrame df. Since df["Count"] is the object being processed, groupby() cannot resolve the column names Type 1 and Type 2.
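To illustrate the Series case, here is a minimal sketch on a made-up DataFrame (the rows are hypothetical). Grouping the Series by a list of Series works, while the list of column names fails because the Series has no such columns to look up.

```python
import pandas as pd

# Hypothetical sample rows, for illustration only.
df = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Fire", "Bug", "Normal"],
    "Type 2": ["Poison", "Poison", "Flying", "Flying", "Flying"],
})
df["Count"] = 1

# Valid: the grouping keys are Series aligned with df["Count"] by index.
counts = df["Count"].groupby([df["Type 1"], df["Type 2"]]).count()

# Invalid: a Series knows nothing about the columns of its parent DataFrame.
name_lookup_failed = False
try:
    df["Count"].groupby(["Type 1", "Type 2"]).count()
except KeyError:
    name_lookup_failed = True

print(counts)
print("Grouping by column names failed:", name_lookup_failed)
```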
Upvotes: 1
Reputation: 445
I think you have used the wrong syntax for the groupby function. Try:
df.groupby(by=["Type 1", "Type 2"]).count()
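One possible variation, offered as a sketch rather than as part of the answer above: size() counts rows per group without needing a helper Count column. The sample rows below are invented, and the snippet assumes that Pokemon with a single type have a missing Type 2 value. By default groupby() drops groups whose key is missing, so those rows disappear from the result; passing dropna=False (available in pandas 1.1 and later) keeps them.

```python
import pandas as pd

# Hypothetical rows; None marks an assumed missing second type.
df = pd.DataFrame({
    "Type 1": ["Grass", "Fire", "Fire", "Fire", "Bug"],
    "Type 2": ["Poison", None, "Flying", None, "Flying"],
})

# Default behaviour: rows with a missing grouping key are left out.
dual_type_only = df.groupby(by=["Type 1", "Type 2"]).size()

# dropna=False keeps the groups whose Type 2 is missing.
all_rows = df.groupby(by=["Type 1", "Type 2"], dropna=False).size()

print(dual_type_only)
print(all_rows)
```

Comparing the totals of the two results against len(df) is a quick way to see whether any rows were silently dropped.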
Upvotes: 2