Reputation: 464
d = [{'name': 'tv', 'value': 10, 'amount': 35},
{'name': 'tv', 'value': 10, 'amount': 14},
{'name': 'tv', 'value': 15, 'amount': 23},
{'name': 'tv', 'value': 34, 'amount': 56},
{'name': 'radio', 'value': 90, 'amount': 35},
{'name': 'radio', 'value': 90, 'amount': 65},
{'name': 'radio', 'value': 100, 'amount': 50},
{'name': 'dvd', 'value': 0.5, 'amount': 35},
{'name': 'dvd', 'value': 0.2, 'amount': 40},
{'name': 'dvd', 'value': 0.5, 'amount': 15}
]
df = pd.DataFrame(d)
dff = df.groupby(['name', 'value']).agg('sum').reset_index()
dfff = dff.groupby(['name']).apply(lambda x: round((x['amount']/x['amount'].sum())*100))
print(dff)
print(dfff)
name value amount
0 dvd 0.2 40
1 dvd 0.5 50
2 radio 90.0 100
3 radio 100.0 50
4 tv 10.0 49
5 tv 15.0 23
6 tv 34.0 56
name
dvd 0 44.0
1 56.0
radio 2 67.0
3 33.0
tv 4 38.0
5 18.0
6 44.0
I now want to take this dataset and concatenate the rows grouped on the name
variable. The amount
variable should be expressed as a proportion.
The final dataset should look like below, where the value
is the first term and amount
expressed as a proportion is the second term.
name concatenated_values
0 dvd 0.2, 44%, 0.5, 56%
1 radio 90, 67%, 100, 33%
.
.
.
Upvotes: 1
Views: 72
Reputation: 863146
Use custom lambda function with flatten nested lists in GroupBy.apply
:
dff = df.groupby(['name', 'value']).agg('sum').reset_index()
dff['amount'] = ((dff['amount'] / dff.groupby(['name'])['amount'].transform('sum')*100)
.round().astype(int).astype(str) + '%')
f = lambda x: ', '.join(str(z) for y in x.to_numpy() for z in y)
d = dff.groupby('name')[['value','amount']].apply(f).reset_index(name='concatenated_values')
print(d)
name concatenated_values
0 dvd 0.2, 44%, 0.5, 56%
1 radio 90.0, 67%, 100.0, 33%
2 tv 10.0, 38%, 15.0, 18%, 34.0, 44%
Upvotes: 2