Reputation: 125
Given df
:
Value Machine
0 10 A
1 20 A
2 30 A
5 100 B
6 200 B
7 300 B
How can I calculate a new column df['norm']
where values for Machine A are divided (element-wise) by the average of values of Machine A, and the values for Machine B are divided by the average of values of Machine B, yielding:
Value Machine norm
0 10 A 0.5
1 20 A 1.0
2 30 A 1.5
5 100 B 0.5
6 200 B 1.0
7 300 B 1.5
Note that the number of unique machines is variable.
I can calculate the average of each machine using groupby
like this:
machine_mean = df.groupby("Machine").mean().reset_index()
I've tried looping through these means, but I can't figure out how to do the element-wise referencing, while at the same time only applying each mean to the right rows.
I thought list comprehension would work, but I had the same problem as above.
Upvotes: 0
Views: 35
Reputation: 304
You can use groupby().apply()
:
df['norm'] = df.groupby('Machine')['Value'].apply(lambda x: x / x.mean())
Upvotes: 0
Reputation: 323276
Let us do transform
with mean
df['norm'] = df['Value']/df.groupby('Machine')['Value'].transform('mean')
df
Out[209]:
Value Machine norm
0 10 A 0.5
1 20 A 1.0
2 30 A 1.5
5 100 B 0.5
6 200 B 1.0
7 300 B 1.5
Upvotes: 2
Reputation: 150745
Use groupby().transform
:
df['norm'] = df['Value']/df.groupby('Machine')['Value'].transform('sum')
Upvotes: 0