isilya

Reputation: 61

How to output groupby variables when using .groupby() in pandas?

I have some data that I want to analyze. I group my data by the relevant group variables (here, 'test_condition' and 'region') and analyze the measure variable ('rt') with a function I wrote:

grouped = data.groupby(['test_condition', 'region'])['rt'].apply(summarize)

That works fine. The output looks like this (fake data):

                                           ci1         ci2        mean  
test_condition      region                                               
Test Condition Name And          0  295.055978  338.857066  316.956522   
                    Spill1       0  296.210167  357.036210  326.623188   
                    Spill2       0  292.955327  329.435977  311.195652   

The problem is that 'test_condition' and 'region' are not actual columns, so I can't index into them. I just want columns with the names of the group variables! This seems so simple (and is done automatically in R's ddply), but after lots of googling I have come up with nothing. Does anyone have a simple solution?

Upvotes: 1

Views: 151

Answers (1)

joris

Reputation: 139152

By default, the grouping variables are turned into an index. You can change the index to columns with grouped.reset_index().
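A minimal sketch of this, assuming a stand-in summarize() that returns a one-row DataFrame (which would explain the extra 0 level in the output shown in the question). The function body and the sample data are hypothetical, not the asker's actual code:

```python
import pandas as pd

# Hypothetical stand-in for the asker's summarize(): returns a one-row
# DataFrame with a mean and a normal-approximation interval (an assumption).
def summarize(rt):
    mean = rt.mean()
    half_width = 1.96 * rt.sem()
    return pd.DataFrame(
        {"ci1": [mean - half_width], "ci2": [mean + half_width], "mean": [mean]}
    )

# Made-up data with the same column names as in the question.
data = pd.DataFrame({
    "test_condition": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "region": ["And", "And", "Spill1", "Spill1", "And", "And", "Spill1", "Spill1"],
    "rt": [300.0, 320.0, 310.0, 340.0, 290.0, 310.0, 305.0, 315.0],
})

grouped = data.groupby(["test_condition", "region"])["rt"].apply(summarize)

# Move only the named group levels out of the index and into columns.
flat = grouped.reset_index(level=["test_condition", "region"])
# Discard the leftover inner index (the 0 from each one-row result).
flat = flat.reset_index(drop=True)
print(flat)
```

Calling grouped.reset_index() with no arguments should also work, but it would turn the unnamed inner level into an extra column as well, which is why the sketch names the two levels explicitly.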

My second suggestion, specifying this in the groupby call with as_index=False, does not seem to work as desired in this case with apply (though it does work when using aggregate).
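For the aggregate case, here is a rough sketch of what as_index=False gives you. The data is made up, and the mean and standard deviation are only placeholders for whatever summarize() really computes:

```python
import pandas as pd

# Made-up data with the same column names as in the question.
data = pd.DataFrame({
    "test_condition": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "region": ["And", "And", "Spill1", "Spill1", "And", "And", "Spill1", "Spill1"],
    "rt": [300.0, 320.0, 310.0, 340.0, 290.0, 310.0, 305.0, 315.0],
})

# With as_index=False the group keys stay as ordinary columns.
# Named aggregation: output_column=(input_column, aggregation).
summary = data.groupby(["test_condition", "region"], as_index=False).agg(
    mean=("rt", "mean"),
    sd=("rt", "std"),
)
print(summary)
```

This only fits if each statistic can be expressed as a separate aggregation; a function that returns several values at once still needs apply followed by reset_index().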

Upvotes: 2
