William Goodwin
William Goodwin

Reputation: 464

Pandas grouping and express as proportion

d = [{'name': 'tv', 'value': 10, 'amount': 35},
     {'name': 'tv', 'value': 10, 'amount': 14},
     {'name': 'tv', 'value': 15, 'amount': 23},
     {'name': 'tv', 'value': 34, 'amount': 56},
     {'name': 'radio', 'value': 90, 'amount': 35},
     {'name': 'radio', 'value': 90, 'amount': 65},
     {'name': 'radio', 'value': 100, 'amount': 50},
     {'name': 'dvd', 'value': 0.5, 'amount': 35},
     {'name': 'dvd', 'value': 0.2, 'amount': 40},
     {'name': 'dvd', 'value': 0.5, 'amount': 15}
     ]

df = pd.DataFrame(d)
dff = df.groupby(['name', 'value']).agg('sum').reset_index()
dfff = dff.groupby(['name']).apply(lambda x: round((x['amount']/x['amount'].sum())*100))
print(dff)
print(dfff)

    name  value  amount
0    dvd    0.2      40
1    dvd    0.5      50
2  radio   90.0     100
3  radio  100.0      50
4     tv   10.0      49
5     tv   15.0      23
6     tv   34.0      56


name    
dvd    0    44.0
       1    56.0
radio  2    67.0
       3    33.0
tv     4    38.0
       5    18.0
       6    44.0

I now want to take this dataset and concatenate the rows grouped on the name variable. The amount variable should be expressed as a proportion.

The final dataset should look like below, where the value is the first term and amount expressed as a proportion is the second term.

    name  concatenated_values
0    dvd  0.2, 44%, 0.5, 56%
1  radio  90, 67%, 100, 33%
.
.
.

Upvotes: 1

Views: 72

Answers (1)

jezrael
jezrael

Reputation: 863146

Use custom lambda function with flatten nested lists in GroupBy.apply:

dff = df.groupby(['name', 'value']).agg('sum').reset_index()
dff['amount'] = ((dff['amount'] / dff.groupby(['name'])['amount'].transform('sum')*100)
                    .round().astype(int).astype(str) + '%')

f = lambda x:  ', '.join(str(z) for y in x.to_numpy() for z in y)
d = dff.groupby('name')[['value','amount']].apply(f).reset_index(name='concatenated_values')
print(d)
    name              concatenated_values
0    dvd               0.2, 44%, 0.5, 56%
1  radio            90.0, 67%, 100.0, 33%
2     tv  10.0, 38%, 15.0, 18%, 34.0, 44%

Upvotes: 2

Related Questions