Reputation: 365
I have a DataFrame in the format below. Let us call it df.
flag1 | flag2 | type | count1 | count2 |
---|---|---|---|---|
a | x | new | 10 | 2 |
a | y | old | 40 | 5 |
a | x | old | 50 | 6 |
a | y | new | 15 | 1 |
I am trying to get the following format. (I could not merge the adjacent cells of count1 and count2)
flag1 | flag2 | count1 | | count2 | |
---|---|---|---|---|---|
| | new | old | new | old |
a | x | 10 | 50 | 2 | 6 |
a | y | 15 | 40 | 1 | 5 |
When I only had to aggregate one column (count1), the following worked:
pd.crosstab([df.flag1,df.flag2], df.type, values=df.count1, aggfunc='sum')
But since I want two columns of data, both count1 and count2, I tried the following, which did not work:
pd.crosstab([df.flag1,df.flag2], df.type, values=[df.count1,df.count2], aggfunc=['sum','sum']) #trial1
pd.crosstab([df.flag1,df.flag2], df.type, values=[df.count1,df.count2], aggfunc='sum') #trial2
Neither of them worked.
Extension: I should be able to use different functions on the different columns, say sum on count1 and nunique on count2, or sum on count1 and mean on count2.
Upvotes: 0
Views: 1183
Reputation: 862741
I don't think crosstab can do this directly; an alternative is DataFrame.pivot_table:
# store the result under a new name so the original df stays available
df1 = df.pivot_table(index=['flag1', 'flag2'],
                     columns='type',
                     aggfunc={'count1': 'sum', 'count2': 'nunique'})
print(df1)
            count1     count2
type           new old    new old
flag1 flag2
a     x         10  50      1   1
      y         15  40      1   1
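For reference, here is a self-contained sketch of the pivot_table approach. It rebuilds the sample frame from the question and assumes `sum` on both columns to reproduce the requested table, then swaps in `mean` for count2 as in the "Extension" part. The names `summed` and `mixed` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "flag1": ["a", "a", "a", "a"],
    "flag2": ["x", "y", "x", "y"],
    "type": ["new", "old", "old", "new"],
    "count1": [10, 40, 50, 15],
    "count2": [2, 5, 6, 1],
})

# One aggregation function per value column, passed as a dict
summed = df.pivot_table(index=["flag1", "flag2"], columns="type",
                        aggfunc={"count1": "sum", "count2": "sum"})

# Same call with a different function for count2
mixed = df.pivot_table(index=["flag1", "flag2"], columns="type",
                       aggfunc={"count1": "sum", "count2": "mean"})

print(summed)
print(mixed)
```

The dict keys decide which columns are aggregated, so no separate `values` argument is needed here.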
Another alternative with aggregation:
# store the result under a new name so the original df stays available
df2 = (df.groupby(['flag1', 'flag2', 'type'])
         .agg({'count1': 'sum', 'count2': 'nunique'})
         .unstack())
print(df2)
            count1     count2
type           new old    new old
flag1 flag2
a     x         10  50      1   1
      y         15  40      1   1
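If you would rather stay with crosstab, one possible workaround (not part of the approaches above, so treat it as a sketch) is to run one crosstab per value column and join the results with pd.concat, using the dict keys as the outer column level. The names `parts` and `combined` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "flag1": ["a", "a", "a", "a"],
    "flag2": ["x", "y", "x", "y"],
    "type": ["new", "old", "old", "new"],
    "count1": [10, 40, 50, 15],
    "count2": [2, 5, 6, 1],
})

row_keys = [df["flag1"], df["flag2"]]

# One crosstab per value column, each with its own aggregation function
parts = {
    "count1": pd.crosstab(row_keys, df["type"], values=df["count1"], aggfunc="sum"),
    "count2": pd.crosstab(row_keys, df["type"], values=df["count2"], aggfunc="nunique"),
}

# The dict keys become the top level of the column MultiIndex
combined = pd.concat(parts, axis=1)
print(combined)
```

This costs one pass per column, so pivot_table or groupby is probably the simpler choice when many columns are involved.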
Upvotes: 1