Reputation: 941
I have a given df:
ID Name Alias Ttl Value
1 abc A 10 2
1 abc B 10 3
1 abc C 10 1
1 ijk A 5 2
1 ijk B 5 2
2 ijk A 5 2
2 ijk B 5 2
I want insert a "leftover" row for each group by ID and Name. What I mean is this:
Output:
ID Name Alias Ttl Value
1 abc A 10 2
1 abc B 10 3
1 abc C 10 1
1 abc Z 10 4 <---10 - (2+3+1)
1 ijk A 5 2
1 ijk B 5 2
1 ijk Z 5 1 <---5 - (2+2)
2 ijk A 5 2
2 ijk B 5 2
2 ijk Z 5 1 <---5 - (2+2)
I'm not sure how to approach this problem. Is it possible in python?
Many thanks!
Upvotes: 0
Views: 27
Reputation: 153460
You can use pd.concat
, groupby
, agg
, and eval
:
(pd.concat([df,
df.groupby(['ID','Name'])['Ttl','Value']
.agg({'Ttl':'first','Value':'sum'})
.eval('Value = Ttl - Value').reset_index()])
.sort_values(['ID','Name'])
.fillna('Z'))
Output:
Alias ID Name Ttl Value
0 A 1 abc 10 2
1 B 1 abc 10 3
2 C 1 abc 10 1
0 Z 1 abc 10 4
3 A 1 ijk 5 2
4 B 1 ijk 5 2
1 Z 1 ijk 5 1
5 A 2 ijk 5 2
6 B 2 ijk 5 2
2 Z 2 ijk 5 1
Upvotes: 2
Reputation: 323226
Is this what you need ?
df.append((df.groupby(['ID','Name']).\
apply(lambda x : x['Ttl'].head(1)-
sum(x['Value']))).\
reset_index().\
drop('level_2',1).rename(columns={'Ttl':'Value'})).\
fillna({'Alias':'Z'}).\
sort_values(['ID','Name']).\
ffill()
Out[446]:
Alias ID Name Ttl Value
0 A 1 abc 10.0 2
1 B 1 abc 10.0 3
2 C 1 abc 10.0 1
0 Z 1 abc 10.0 4
3 A 1 ijk 5.0 2
4 B 1 ijk 5.0 2
1 Z 1 ijk 5.0 1
5 A 2 ijk 5.0 2
6 B 2 ijk 5.0 2
2 Z 2 ijk 5.0 1
Upvotes: 2