Reputation: 595
Suppose I have a dataframe, df, consisting of a class of two objects, S, a set of co-ordinates associated with them, X and Y, and a value, V, that was measured there.
The dataframe looks like this:
S X Y V
0 1 1 1
1 2 2 1
1 9 9 2
0 9 9 8
I would like to know the commands that take me from this table to one where each S is converted to a series of columns, where:
V_s represents the sum of V over all rows that share the same X-Y coordinates;
F0 and F1 represent the fractions of V_s belonging to each possible class, S.
For example:
X Y V_s F0 F1
1 1 1 1.0 0.0
2 2 1 0.0 1.0
9 9 10 0.8 0.2
I can calculate the sum and the fraction using:
df['V_s'] = df.groupby(['X', 'Y'])['V'].transform('sum')
df['F'] = df['V']/df['V_s']
What are the next steps?
Reputation: 150785
You could try this:
(df.groupby(['X','Y','S']).sum()
   .unstack('S', fill_value=0)['V']
   .rename(columns=lambda x: f"F{x}")
   .assign(V_s=lambda x: x.sum(1),
           F0=lambda x: x['F0']/x['V_s'],
           F1=lambda x: x['F1']/x['V_s'])
   .reset_index()
)
Output:
S X Y F0 F1 V_s
0 1 1 1.0 0.0 1
1 2 2 0.0 1.0 1
2 9 9 0.8 0.2 10
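To see what the chain is doing, it may help to look at the intermediate table before the fractions are computed. A minimal sketch, assuming the question's four rows are built by hand (the name wide is only for illustration and is not part of the code above):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "S": [0, 1, 1, 0],
    "X": [1, 2, 9, 9],
    "Y": [1, 2, 9, 9],
    "V": [1, 1, 2, 8],
})

# Sum V per (X, Y, S), then move S from the index into the columns;
# (X, Y) pairs that lack a class get 0 instead of NaN
wide = df.groupby(["X", "Y", "S"])["V"].sum().unstack("S", fill_value=0)
print(wide)
```

Each row of wide holds the per-class sums for one (X, Y) pair, so the row total is V_s, and dividing each column by that total gives the fractions.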
Update for an unknown or large number of classes in S:
new_df = (df.groupby(['X','Y','S']).sum()
            .unstack('S', fill_value=0)['V']
            .rename(columns=lambda x: f"F{x}")
         )
vs = new_df.sum(1)
new_df = (new_df.div(vs, axis='rows')
                .assign(V_s=vs)
                .reset_index()
         )
This gives the same output.
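If you prefer a reusable helper, the same idea could also be written with pivot_table. This is only a sketch: class_fractions is a hypothetical name, and it assumes the columns are called S, X, Y and V as in the question.

```python
import pandas as pd

def class_fractions(df):
    """Return one row per (X, Y) with the total V_s and one fraction column per class in S."""
    # Per-class sums of V for every (X, Y); absent classes are filled with 0
    wide = df.pivot_table(index=["X", "Y"], columns="S", values="V",
                          aggfunc="sum", fill_value=0)
    wide = wide.add_prefix("F")        # columns 0, 1, ... become F0, F1, ...
    wide.columns.name = None           # drop the leftover "S" label
    totals = wide.sum(axis=1)          # V_s for each (X, Y)
    return (wide.div(totals, axis=0)   # per-class fractions of the total
                .assign(V_s=totals)
                .reset_index())

# Sample data copied from the question
df = pd.DataFrame({"S": [0, 1, 1, 0], "X": [1, 2, 9, 9],
                   "Y": [1, 2, 9, 9], "V": [1, 1, 2, 8]})
result = class_fractions(df)
print(result)
```

Because nothing refers to F0 or F1 by name, this should work for any number of classes in S.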