Reputation: 306
I just started working with Python (pandas) and now I have my first question.
I have a dataframe with the following columns:
ID A Class
1 True [0,5]
2 False [0,5]
3 True [5,10]
4 False [10,20]
5 True [0,5]
6 False [10,20]
Now I'm looking for a neat solution that produces something like this:
Class True False
[0,5] 2 1
[5,10] 1 0
[10,20] 0 2
I want to count how many True and False values I have for each Class.
Is there a fast solution? My dataframe could have more than 2 million entries.
Upvotes: 1
Views: 89
Reputation: 33793
You can use pivot_table to do the aggregation. After that, it's just a matter of formatting the column names and index to match your desired output.
# Perform the pivot and aggregation.
df = pd.pivot_table(df, index='Class', columns='A', aggfunc='count', fill_value=0)
# Format column names and index to match desired output.
df.columns = [c[1] for c in df.columns]
df.reset_index(inplace=True)
The resulting output:
Class False True
0 [0,5] 1 2
1 [10,20] 2 0
2 [5,10] 0 1
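For anyone who wants to try this end to end, here is a minimal sketch on the sample data from the question. It assumes 'Class' holds strings, and it passes values="ID" explicitly (my addition, not in the snippet above) so the pivoted columns are plain False/True labels rather than (column, value) pairs. The name counts is just an illustrative variable.

```python
import pandas as pd

# Sample data from the question; 'Class' is assumed to hold strings.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "A": [True, False, True, False, True, False],
    "Class": ["[0,5]", "[0,5]", "[5,10]", "[10,20]", "[0,5]", "[10,20]"],
})

# Count IDs per (Class, A) combination; missing combinations become 0.
counts = pd.pivot_table(
    df, index="Class", columns="A", values="ID",
    aggfunc="count", fill_value=0,
)

# Turn the boolean column labels into strings and make 'Class' a regular column.
counts.columns = [str(c) for c in counts.columns]
counts = counts.reset_index()

print(counts)
```

The printed counts should agree with the table above. Passing values="ID" only works because every row has an ID; any other column without missing values would do as well.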
Edit:
The above solution assumes that the elements of the 'Class' column are strings. If they are lists, you could do the following:
df['Class'] = df['Class'].map(tuple)
# ... original solution code here ...
df['Class'] = df['Class'].map(list)
Upvotes: 4
Reputation: 21552
Let df be your dataframe. I would first use:
# Name the count column explicitly; its default name differs between pandas versions.
g = df.groupby('Class')['A'].value_counts().reset_index(name='count')
that returns:
Class A count
0 [0,5] True 2
1 [0,5] False 1
2 [10,20] False 2
3 [5,10] True 1
then I would pivot the above table to get your desired shape:
a = pd.pivot_table(g, index='Class', columns='A', values='count').fillna(0)
This returns:
A False True
Class
[0,5] 1.0 2.0
[10,20] 2.0 0.0
[5,10] 0.0 1.0
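A self-contained sketch of the two steps above on the question's sample data, in case it helps. Two things here are my assumptions rather than part of the answer: 'Class' is stored as strings, and the count column is named 'count' explicitly via reset_index(name=...), because its default name depends on the pandas version. It also uses fill_value=0 in place of .fillna(0), which fills the empty cells in the same way.

```python
import pandas as pd

# Sample data from the question; 'Class' is assumed to hold strings.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "A": [True, False, True, False, True, False],
    "Class": ["[0,5]", "[0,5]", "[5,10]", "[10,20]", "[0,5]", "[10,20]"],
})

# Step 1: count True/False per Class, in long format (one row per pair).
g = df.groupby("Class")["A"].value_counts().reset_index(name="count")

# Step 2: pivot to wide format, one column per value of 'A'.
a = pd.pivot_table(g, index="Class", columns="A", values="count", fill_value=0)

print(a)
```

Each (Class, A) pair appears only once in g, so the default aggregation in pivot_table (the mean) simply passes the counts through.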
Upvotes: 1