Reputation: 2615

How can I perform a value dependent pivot table/Groupby in Pandas?

I have the following dataframe:

      Tran ID    Category    Quantity 
   0   001          A           5
   1   001          B           2 
   2   001          C           3
   3   002          A           4
   4   002          C           2
   5   003          D           6

I want to transform it into:

      Tran ID       A      B      C      D       Quantity 
   0   001        True    True   True   False       10
   1   002        True    False  True   False        6
   2   003        False   False  False  True         6

I know I can use groupby to get the sum of quantity, but I can't figure out how to perform the pivot that I described.

Upvotes: 1

Answers (2)

U13-Forward

Reputation: 71610

Or you can use:

print(df.drop('Category',1).join(df['Category'].str.get_dummies().astype(bool)).groupby('Tran ID',as_index=False).sum())

Or little easier to read:

df1 = df.drop('Category',1).join(df['Category'].str.get_dummies().astype(bool))
print(df1.groupby('Tran ID',as_index=False).sum())

Both output:

   Tran ID  Quantity      A      B      C      D
0        1        10   True   True   True  False
1        2         6   True  False   True  False
2        3         6  False  False  False   True

pandas.DataFrame.groupby with pandas.Series.str.get_dummies is the way to do it.

Upvotes: 2

jezrael

Reputation: 863166

Use get_dummies for indicators with max and add new column with aggregating sum:

#pandas 0.23+
df1 = pd.get_dummies(df.set_index('Tran ID')['Category'], dtype=bool).max(level=0)
#oldier pandas versions
#df1 = pd.get_dummies(df.set_index('Tran ID')['Category']).astype(bool).max(level=0)
s = df.groupby('Tran ID')['Quantity'].sum()

df2 = df1.assign(Quantity = s).reset_index()
print (df2)
  Tran ID      A      B      C      D  Quantity
0     001   True   True   True  False        10
1     002   True  False   True  False         6
2     003  False  False  False   True         6

Upvotes: 2

How can I perform a value dependent pivot table/Groupby in Pandas?

Answers (2)

Related Questions