Brainfail

Reputation: 306

Dataframe use grouped row values as a row name

I just started working with Python (pandas) and have my first question. I have a DataFrame with the following columns:

ID  A    Class
1  True  [0,5]
2  False [0,5]
3  True  [5,10]
4  False [10,20]
5  True  [0,5]
6  False [10,20]

Now I'm looking for a neat solution that produces something like this:

Class  True   False
[0,5]   2      1
[5,10]  1      0
[10,20] 0      2

I want to count how many True and False values I have for each Class. Is there a fast solution? My DataFrame could have more than 2 million entries.

Upvotes: 1

Views: 89

Answers (2)

root

Reputation: 33793

You can use pivot_table to do the aggregation. After that, it's just a matter of formatting the column names and index to match your desired output.

# Perform the pivot and aggregation.
df = pd.pivot_table(df, index='Class', columns='A', aggfunc='count', fill_value=0)

# Format column names and index to match desired output.
df.columns = [c[1] for c in df.columns]
df.reset_index(inplace=True)

The resulting output:

     Class  False  True
0    [0,5]      1     2
1  [10,20]      2     0
2   [5,10]      0     1
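To try this end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question. It assumes the 'Class' entries are strings, and it passes values='ID' (an addition not in the code above) so the result comes back with flat column labels and no renaming step is needed.

```python
import pandas as pd

# Sample data from the question; 'Class' is assumed to hold strings.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "A": [True, False, True, False, True, False],
    "Class": ["[0,5]", "[0,5]", "[5,10]", "[10,20]", "[0,5]", "[10,20]"],
})

# Count the rows for every (Class, A) pair; missing pairs become 0.
# values="ID" is an assumption made here to keep the columns single-level.
counts = pd.pivot_table(
    df, index="Class", columns="A", values="ID",
    aggfunc="count", fill_value=0,
)

# Turn 'Class' back into a regular column.
result = counts.reset_index()
print(result)
```

The counts should agree with the table above. Note that the 'Class' labels are sorted as strings, which is why [10,20] comes before [5,10].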

Edit:

The above solution assumes that the elements of the 'Class' column are strings. If they are lists, you could do the following:

df['Class'] = df['Class'].map(tuple)
# ... original solution code here ...
df['Class'] = df['Class'].map(list)
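The reason for the round trip is that lists are unhashable, so they cannot be used as group keys directly, while tuples can. Here is a rough sketch of the whole sequence, assuming the sample data from the question with list-valued 'Class' entries. As an assumption added here, it passes values='ID' and calls reset_index() so that 'Class' is an ordinary column before it is mapped back to lists.

```python
import pandas as pd

# Sample data from the question, with 'Class' stored as Python lists.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "A": [True, False, True, False, True, False],
    "Class": [[0, 5], [0, 5], [5, 10], [10, 20], [0, 5], [10, 20]],
})

# Lists are unhashable, so convert them to tuples before grouping.
df["Class"] = df["Class"].map(tuple)

# Same pivot as before; values="ID" (an assumption) keeps the columns flat.
counts = pd.pivot_table(
    df, index="Class", columns="A", values="ID",
    aggfunc="count", fill_value=0,
).reset_index()

# Convert the tuples back to lists in the result.
counts["Class"] = counts["Class"].map(list)
print(counts)
```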

Upvotes: 4

Fabio Lamanna

Reputation: 21552

Let df be your DataFrame. I would first use:

g = df.groupby('Class')['A'].value_counts().reset_index(name='count')

that returns:

     Class      A  count
0    [0,5]   True      2
1    [0,5]  False      1
2  [10,20]  False      2
3   [5,10]   True      1

then I would pivot the above table to get your desired shape:

a = pd.pivot_table(g, index='Class', columns='A', values='count').fillna(0)

This returns:

A        False  True 
Class                
[0,5]      1.0    2.0
[10,20]    2.0    0.0
[5,10]     0.0    1.0
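The counts above come back as floats because the missing combinations are NaN before they are filled. If integer counts are preferred, one possible alternative (a different technique from the pivot above, sketched here on the question's sample data with string 'Class' values assumed) is to group on both columns and unstack with a fill value:

```python
import pandas as pd

# Sample data from the question; 'Class' is assumed to hold strings.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "A": [True, False, True, False, True, False],
    "Class": ["[0,5]", "[0,5]", "[5,10]", "[10,20]", "[0,5]", "[10,20]"],
})

# size() counts rows per (Class, A) pair; unstack() moves 'A' into the
# columns, and fill_value=0 fills missing pairs without switching to floats.
counts = df.groupby(["Class", "A"]).size().unstack(fill_value=0)
print(counts)
```

This does the counting in a single groupby pass, which may matter for a frame with a couple of million rows, though it is worth timing on the real data.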

Upvotes: 1
