Reputation: 87
Here's what the dataframe, df, looks like. I want to group by Position and, within each group, compute a value from the Level rows using a function I defined.
data
ID Position Level Value
001 N L 15
001 N H 21
001 N L 17
001 S L 12
001 S H 20
001 S L 16
Here's what the function looks like:
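def rule(df):
    # Value of the 'H' row(s) in this group (a Series, not a scalar)
    h_value = df.loc[df['Level'] == 'H', 'Value']
    # Mean Value of the 'L' rows in this group (a scalar)
    l_value = df.loc[df['Level'] == 'L', 'Value'].mean()
    df['dtm'] = h_value - l_value
    return df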
Here's the groupby code.
new_df = df.groupby(['ID','Position']).apply(lambda x: rule(x))
But the result (new_df) shows:
ID   Position  Level  Value  dtm  expected result of dtm
001  N         L      15     NaN  5
001  N         H      21     5    5
001  N         L      17     NaN  5
001  S         L      12     NaN  6
001  S         H      20     6    6
001  S         L      16     NaN  6
How can I fill the missing values so that every row in a Position group gets that group's dtm value? Thanks.
Upvotes: 1
Views: 391
Reputation: 22493
h_value = df.loc[df['Level'] == 'H', "Value"] returns a pd.Series. The subtraction also returns a Series, so the assignment to df['dtm'] is aligned on the index: only the row whose index matches the 'H' row gets a value, and every other row gets NaN.
To fix this, you can add .iat[0], i.e. h_value = df.loc[df['Level'] == 'H', "Value"].iat[0], which extracts a scalar, and the result will propagate to every row in the group.
ID Position Level Value dtm
0 1 N L 15 5.0
1 1 N H 21 5.0
2 1 N L 17 5.0
3 1 S L 12 6.0
4 1 S H 20 6.0
5 1 S L 16 6.0
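For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the question's data, shows the index alignment that produces the NaN values, and then applies the .iat[0] fix. It assumes each group contains exactly one 'H' row (.iat[0] takes the first one and raises if there is none), and the group.copy() call and group_keys=False are my own additions rather than part of the original code.

```python
import pandas as pd

# Sample data from the question (ID kept as a string here).
df = pd.DataFrame({
    "ID": ["001"] * 6,
    "Position": ["N", "N", "N", "S", "S", "S"],
    "Level": ["L", "H", "L", "L", "H", "L"],
    "Value": [15, 21, 17, 12, 20, 16],
})

# Why the original version gives NaN: the 'H' selection is a one-row Series,
# so the difference is a one-row Series and assignment aligns on its index.
n_group = df[df["Position"] == "N"]
h_series = n_group.loc[n_group["Level"] == "H", "Value"]
l_mean = n_group.loc[n_group["Level"] == "L", "Value"].mean()
misaligned = n_group.assign(dtm=h_series - l_mean)  # NaN except on the 'H' row


def rule(group):
    # Work on a copy so the caller's frame is not modified (an added precaution).
    group = group.copy()
    # .iat[0] turns the one-row Series into a scalar; assumes one 'H' row per group.
    h_value = group.loc[group["Level"] == "H", "Value"].iat[0]
    l_value = group.loc[group["Level"] == "L", "Value"].mean()
    # Scalar minus scalar is a scalar, so it is broadcast to every row.
    group["dtm"] = h_value - l_value
    return group


new_df = df.groupby(["ID", "Position"], group_keys=False).apply(rule)
print(new_df)
```

Depending on the pandas version, apply may warn about, or leave out, the grouping columns in the frame passed to rule; the dtm column is computed the same way in either case.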
Upvotes: 2