Reputation: 71
I have a dataframe consisting of 3 columns and n rows.
My dataframe before grouping looks like
Index Max_Mass (kg/m) Max_Diameter (m)
1 10 1
2 20 2
3 30 3
200 5 4
201 60 3
202 20 2
300 90 1
301 3 1
302 10 1
400 100 1
401 10 1
402 10 1
I group the dataframe by cutting it every 100 rows so that I can find the maximum value of a specific column every 100th row using:
groups = output_df.groupby(pd.cut(output_df.index, range(0,len(output_df), 100)))
I am using the below to find the maximums of column 'Max Mass (kg/m)':
groups.max()['Max Mass (kg/m)']
I now want to make another df which will contain the max value found and the index of that value. How do I retrieve the index? I have tried using the below, but from my understanding, it only works for a single value, while the line above returns me a column of all maximums.
(groups.max()['Max Mass (kg/m)']).getidx()
My expected output (for the DataFrame above) would be
My new dataframe that I want to create should look like;
Index Max_Mass (kg/m) Max_Diameter (m)
3 30 3
201 60 3
300 90 1
400 100 1
Upvotes: 3
Views: 2417
Reputation: 402814
Comments inlined.
# Initialise the grouper.
grouper = df.Index // 100
# Get list of indices corresponding to the max using `apply`.
idx = df.groupby(grouper).apply(
lambda x: x.set_index('Index')['Max_Mass (kg/m)'].idxmax())
# Compute the max and update the other columns based on `idx` computed previously.
v = df.groupby(grouper, as_index=False)['Max_Mass (kg/m)'].max()
v['Index'] = idx.values
v['Max_Diameter (m)'] = df.loc[df.Index.isin(v.Index), 'Max_Diameter (m)'].values
print(v)
Max_Mass (kg/m) Index Max_Diameter (m)
0 30 3 3
1 60 201 3
2 90 300 1
3 100 400 1
Upvotes: 1
Reputation: 652
Instead of using groups.max() you can use groups.idxmax()
. Then use the index to get the max values. Now you have everything you need.
Upvotes: 0