Reputation: 61
I have some data that I want to analyze. I group my data by the relevant group variables (here, 'test_condition' and 'region') and analyze the measure variable ('rt') with a function I wrote:
grouped = data.groupby(['test_condition', 'region'])['rt'].apply(summarize)
That works fine. The output looks like this (fake data):
                                        ci1         ci2        mean
test_condition       region
Test Condition Name  And     0   295.055978  338.857066  316.956522
                     Spill1  0   296.210167  357.036210  326.623188
                     Spill2  0   292.955327  329.435977  311.195652
The problem is that 'test_condition' and 'region' are not actual columns, so I can't index into them. I just want columns with the names of the group variables! This seems so simple (and is done automatically by R's ddply), but after lots of googling I have come up with nothing. Does anyone have a simple solution?
Upvotes: 1
Views: 151
Reputation: 139152
By default, the grouping variables are turned into an index. You can change the index to columns with grouped.reset_index().
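A minimal sketch of the reset_index() route. The data frame and the summarize function here are hypothetical stand-ins, since the question does not show either; summarize is assumed to return a one-row DataFrame, which would explain the extra unnamed 0 level in the output above.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
data = pd.DataFrame({
    "test_condition": ["A", "A", "A", "B", "B"],
    "region": ["And", "And", "Spill1", "And", "Spill1"],
    "rt": [300.0, 310.0, 320.0, 330.0, 340.0],
})

def summarize(rt):
    # Hypothetical summary: one row per group, one column per statistic.
    return pd.DataFrame({"mean": [rt.mean()], "n": [len(rt)]})

grouped = data.groupby(["test_condition", "region"])["rt"].apply(summarize)

# The group keys are index levels at this point. Drop the unnamed inner
# level (the 0 contributed by each one-row frame), then move the named
# levels into ordinary columns.
flat = grouped.droplevel(-1).reset_index()
print(flat)
```

After this, flat["test_condition"] and flat["region"] can be indexed like any other column.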
My second suggestion, specifying as_index=False in the groupby call, does not seem to work as desired in this case with apply (but it does work when using aggregate).
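For the aggregate case, a rough sketch with as_index=False might look like the following. The data and the rt_mean / rt_count column names are made up for illustration, and this only covers summaries that can be expressed as aggregations, which may not match what the asker's summarize function does.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
data = pd.DataFrame({
    "test_condition": ["A", "A", "A", "B", "B"],
    "region": ["And", "And", "Spill1", "And", "Spill1"],
    "rt": [300.0, 310.0, 320.0, 330.0, 340.0],
})

# as_index=False keeps the group keys as regular columns when aggregating.
out = (
    data.groupby(["test_condition", "region"], as_index=False)
        .agg(rt_mean=("rt", "mean"), rt_count=("rt", "count"))
)
print(out)
```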
Upvotes: 2