Reputation: 13
I am new to python. I want to calculate the row-wise average based on unique IDs.
My DataFrame is:
ID Time[h] concentration[g/L]
15127 V527 23.425 59.9
20361 V527 27.570 73.4
21880 V527 29.281 75.4
33133 V560 27.677 75.9
35077 V560 30.183 75.7
37117 V560 31.847 74.6
I want to calculate the row wise average based on each ID. so that my output looks something like this
ID Time[h] concentration[g/L] avg [g/L]
15127 V527 23.425 59.9 NaN
20361 V527 27.570 73.4 66.5
21880 V527 29.281 75.4 74.4
33133 V560 27.677 75.9 NaN
35077 V560 30.183 75.7 66.5
37117 V560 31.847 74.6 75.8
I tried:
df.groupby(['ID'])['concentration[g/L]'].mean()
But this returned mean for each ID, as a whole.
So I tried this:
df.groupby(['ID'])['concentration[g/L]'].transform('mean')
This returns again the mean of each group, but fills to the same length of my df.
Can you please help me, if something is not clear I can rephrase my question.
Thanks in Advance!
Upvotes: 1
Views: 287
Reputation: 21709
You can use shift
:
df['avg'] = df.groupby('ID')['concentration[g/L]'].apply(lambda x: (x + x.shift())/2)
print(df)
ID Time[h] concentration[g/L] avg
15127 V527 23.425 59.9 NaN
20361 V527 27.570 73.4 66.65
21880 V527 29.281 75.4 74.40
33133 V560 27.677 75.9 NaN
35077 V560 30.183 75.7 75.80
37117 V560 31.847 74.6 75.15
Upvotes: 0
Reputation: 15872
Try using pd.rolling.mean
with a window of 2:
>>> df['avg [g/L]'] = df.groupby('ID')['concentration[g/L]'].rolling(2).mean().values
>>> df
ID Time[h] concentration[g/L] avg [g/L]
15127 V527 23.425 59.9 NaN
20361 V527 27.570 73.4 66.65
21880 V527 29.281 75.4 74.40
33133 V560 27.677 75.9 NaN
35077 V560 30.183 75.7 75.80
37117 V560 31.847 74.6 75.15
Upvotes: 1