Reputation: 321
I have a DataFrame that is something similar to this
id name value
a Adam 5
b Eve 6
c Adam 4
a Eve 3
d Seth 2
b Adam 4
a Adam 2
I am trying to see how many id
s are associated with how many names and the overlap between them. I did a groupby on the id column and then I could see how many id's have how many names associated with them.
df.groupby('id')['name'].nunique().value_counts()
What I would now like is a way to get a table where the names are the column names, and index is the id, and the value is the sum for each id and name. I could do it for a for loop, by initializing a DataFrame where the columns are the values in the name column but I am wondering if there is a pandas way of accomplishing something like this?
Upvotes: 1
Views: 656
Reputation: 210852
is that what you want?
In [54]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum')
Out[54]:
name Adam Eve Seth
id
a 7.0 3.0 NaN
b 4.0 6.0 NaN
c 4.0 NaN NaN
d NaN NaN 2.0
or without NaN's:
In [56]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum', fill_value=0)
Out[56]:
name Adam Eve Seth
id
a 7 3 0
b 4 6 0
c 4 0 0
d 0 0 2
Upvotes: 2