Reputation: 400
I have a dataframe rounds (which was the result of deleting a column from another dataframe) with the following structure (can't post pics, sorry):
----------------------------
|type|N|D|NATC|K|iters|time|
----------------------------
rows of data
----------------------------
I use groupby so I can then get the mean of the groups, like so:
rounds = results.groupby(['type','N','D','NATC','K','iters'])
results_mean = rounds.mean()
I get the means that I wanted, but I have a problem with the keys. The results_mean dataframe has the following structure:
----------------------------
| | | | | | |time|
|type|N|D|NATC|K|iters| |
----------------------------
rows of data
----------------------------
The only key recognized is time (I executed results_mean.keys()).
What did I do wrong? How can I fix it?
Upvotes: 2
Views: 2540
Reputation: 1
I had the same problem of losing the dataframe's keys after using the groupby() function. The workaround I found was to write the DataFrame to a CSV file and then read that file back in.
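For what it's worth, here is a minimal sketch of why that round trip seems to work. The sample data is made up (only the column names come from the question), and an in-memory buffer stands in for the CSV file. As far as I can tell, to_csv writes the index levels as leading fields by default, and read_csv then reads them back as ordinary columns.

```python
import io

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
results = pd.DataFrame({
    "type": ["a", "a", "b", "b"],
    "N": [10, 10, 20, 20],
    "D": [2, 2, 3, 3],
    "NATC": [1, 1, 1, 1],
    "K": [5, 5, 5, 5],
    "iters": [100, 100, 200, 200],
    "time": [1.0, 3.0, 2.0, 4.0],
})
group_cols = ["type", "N", "D", "NATC", "K", "iters"]

# Default groupby: the group labels end up in the index, not in the columns.
results_mean = results.groupby(group_cols).mean()

# Write to an in-memory buffer instead of a file on disk.
# to_csv includes the index levels as the first fields (index=True is the default).
buffer = io.StringIO()
results_mean.to_csv(buffer)
buffer.seek(0)

# read_csv is not told about any index column, so every field becomes a column.
round_tripped = pd.read_csv(buffer)
print(list(round_tripped.columns))

# The same column layout without going through CSV at all.
direct = results_mean.reset_index()
print(list(direct.columns))
```

So the CSV detour gets the keys back, but reset_index() gives the same layout without writing anything out.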
Upvotes: 0
Reputation: 18446
In your aggregated data, time is the only column. The other ones are indices.
groupby has a parameter as_index. From the documentation:
as_index : boolean, default True
For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output
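To see the difference that parameter makes, here is a small sketch. The data is invented for illustration; only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
results = pd.DataFrame({
    "type": ["a", "a", "b", "b"],
    "N": [10, 10, 20, 20],
    "D": [2, 2, 3, 3],
    "NATC": [1, 1, 1, 1],
    "K": [5, 5, 5, 5],
    "iters": [100, 100, 200, 200],
    "time": [1.0, 3.0, 2.0, 4.0],
})
group_cols = ["type", "N", "D", "NATC", "K", "iters"]

# Default (as_index=True): the group labels become levels of a MultiIndex.
indexed = results.groupby(group_cols).mean()
print(list(indexed.columns))      # the aggregated column(s) only
print(list(indexed.index.names))  # the group labels live here

# as_index=False: the group labels stay as ordinary columns.
flat = results.groupby(group_cols, as_index=False).mean()
print(list(flat.columns))
```

With the default, keys() only reports time because keys() lists columns, and the group labels are stored in the index instead.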
So you can get the desired output by calling
rounds = results.groupby(['type','N','D','NATC','K','iters'], as_index = False)
results_mean = rounds.mean()
Or, if you want, you can always convert indices to keys by using reset_index. Using
rounds = results.groupby(['type','N','D','NATC','K','iters'])
results_mean = rounds.mean().reset_index()
should have the desired effect as well.
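As a quick check that the two approaches end up in the same place, here is a sketch on made-up data (the values are hypothetical, only the column names follow the question):

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
results = pd.DataFrame({
    "type": ["a", "a", "b", "b"],
    "N": [10, 10, 20, 20],
    "D": [2, 2, 3, 3],
    "NATC": [1, 1, 1, 1],
    "K": [5, 5, 5, 5],
    "iters": [100, 100, 200, 200],
    "time": [1.0, 3.0, 2.0, 4.0],
})
group_cols = ["type", "N", "D", "NATC", "K", "iters"]

# Option 1: keep the group labels as columns while grouping.
via_flag = results.groupby(group_cols, as_index=False).mean()

# Option 2: group with the default index, then move the index levels to columns.
via_reset = results.groupby(group_cols).mean().reset_index()

# Both frames expose the group labels as regular keys.
print(list(via_flag.keys()))
print(list(via_reset.keys()))

# The group labels can now be selected like any other column.
print(via_reset[["type", "N", "time"]])
```

Which one to use is mostly a matter of taste. reset_index is handy when you already have the aggregated frame and don't want to redo the groupby.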
Upvotes: 7