Reputation: 45
I have two data frames, each grouped by the same four keys. I would like to assign the mean of a column in one group to every row of a column in the corresponding group of the other data frame. As I understand it, this is how it should be done:
g_test.get_group((1, 5, 13, 8)).monthly_sales = \
g_train.get_group((1, 5, 13, 8)).monthly_sales.mean()
Except this does nothing. The values in monthly_sales of the group identified in g_test are unchanged. Can someone please explain what I am doing wrong and suggest alternatives?
These are the first few rows of g_train.get_group((1, 5, 13, 8))
year month day store item units monthly_sales
1 5 5 13 8 4 466
1 5 6 13 8 12 475
1 5 0 13 8 22 469
1 5 5 13 8 26 469
1 5 6 13 8 39 480
and these are the first few rows of g_test.get_group((1, 5, 13, 8))
year month day store item monthly_sales
1 5 1 13 8 0
1 5 2 13 8 0
1 5 3 13 8 0
1 5 4 13 8 0
1 5 5 13 8 0
Only the first few rows are shown, but the mean of g_train.get_group((1, 5, 13, 8)).monthly_sales over the whole group is 450, which I want copied into the monthly_sales column of the matching group in g_test.
Edit: I now understand that the code snippet below will work:
`df1.loc[(df1.year == 1)
& (df1.month == 5)
& (df1.store == 13)
& (df1.item == 8), 'monthly_sales'] = \
gb2.get_group((1, 5, 13, 8)).monthly_sales.mean()`
This works for copying the mean once. However, the whole reason I split the data frame into groups was to avoid these boolean checks and to repeat the operation for many different store and item numbers. Is there something else I can do?
Upvotes: 1
Views: 173
Reputation: 45
Actually, I just discovered a better way. g_test was created from the data frame 'test', so when I tried the line below it worked perfectly:
test.loc[g_test.get_group((1, 5, 13, 8)).index, 'monthly_sales'] = \
g_train.get_group((1, 5, 13, 8)).monthly_sales.mean()
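The same index-based assignment can probably be repeated for every group with a loop. What follows is only a minimal sketch: the small train and test frames are made-up stand-ins for the real data, and the names keys and train_means are introduced here for illustration. It relies on g_test.groups, which maps each key tuple to the row labels of that group.

```python
import pandas as pd

keys = ["year", "month", "store", "item"]

# Hypothetical miniature stand-ins for the real train and test frames.
train = pd.DataFrame({
    "year": [1, 1, 1, 1],
    "month": [5, 5, 5, 5],
    "store": [13, 13, 14, 14],
    "item": [8, 8, 8, 8],
    "monthly_sales": [466.0, 475.0, 100.0, 200.0],
})
test = pd.DataFrame({
    "year": [1, 1, 1],
    "month": [5, 5, 5],
    "store": [13, 13, 14],
    "item": [8, 8, 8],
    "monthly_sales": [0.0, 0.0, 0.0],
})

g_train = train.groupby(keys)
g_test = test.groupby(keys)

# One mean per key tuple, indexed by (year, month, store, item).
train_means = g_train["monthly_sales"].mean()

# Write each training mean into the rows of the matching test group.
for key, row_labels in g_test.groups.items():
    if key in train_means.index:  # skip test groups with no training data
        test.loc[row_labels, "monthly_sales"] = train_means.loc[key]

print(test)
```

The assignment goes through test.loc, so it modifies the original frame rather than a temporary group.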
Upvotes: 0
Reputation: 109520
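You need to assign the result back to the DataFrame, not the groupby object. get_group returns a new DataFrame, so assigning to one of its columns never reaches the original data. This should work: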
df1.loc[(df1.year == 1)
& (df1.month == 5)
& (df1.store == 13)
& (df1.item == 8), 'monthly_sales'] = \
gb2.get_group((1, 5, 13, 8)).monthly_sales.mean()
>>> gb1.get_group((1, 5, 13, 8))
   year  month  day  store  item  units  monthly_sales
0     1      5    5     13     8      4          471.8
1     1      5    6     13     8     12          471.8
2     1      5    0     13     8     22          471.8
3     1      5    5     13     8     26          471.8
4     1      5    6     13     8     39          471.8
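To do this for every (year, month, store, item) combination at once, one option is to compute all the group means in a single pass and join them onto the test frame by the key columns. This is a sketch under assumptions, not the only way to do it: the train and test frames below are invented toy data, and the column name train_mean is a hypothetical helper name.

```python
import pandas as pd

keys = ["year", "month", "store", "item"]

# Hypothetical toy data; store 15 appears only in the test frame.
train = pd.DataFrame({
    "year": [1, 1, 1, 1],
    "month": [5, 5, 5, 5],
    "store": [13, 13, 14, 14],
    "item": [8, 8, 8, 8],
    "monthly_sales": [466.0, 475.0, 100.0, 200.0],
})
test = pd.DataFrame({
    "year": [1, 1, 1, 1],
    "month": [5, 5, 5, 5],
    "store": [13, 13, 14, 15],
    "item": [8, 8, 8, 8],
    "monthly_sales": [0.0, 0.0, 0.0, 0.0],
})

# Mean of monthly_sales per key combination, as a named Series
# whose MultiIndex levels are the four key columns.
means = train.groupby(keys)["monthly_sales"].mean().rename("train_mean")

# Left join on the key columns: every test row is kept in its original
# order, and rows without a matching training group get NaN.
joined = test.join(means, on=keys)

# Use the training mean where one exists, otherwise keep the old value.
test["monthly_sales"] = joined["train_mean"].fillna(test["monthly_sales"])

print(test)
```

This avoids both the per-group boolean masks and an explicit Python loop, which may matter when there are many store and item combinations.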
Upvotes: 1