Reputation: 169
I have a pandas dataframe like:
I have the data frame as like below one,
Input DataFrame
id ratio
0 1 5.00%
1 2 9.00%
2 3 6.00%
3 2 13.00%
4 1 19.00%
5 4 30.00%
6 3 5.5%
7 2 22.00%
How can I then group this like
id ratio
0 1 5.00%
4 1 19.00%
6 3 5.5%
2 3 6.00%
1 2 9.00%
3 2 13.00%
7 2 22.00%
5 4 30.00%
So essentially first looks at the ratio, takes the lowest for that value and groups the rest of the rows for which it has the same id. Then looks for the second lowest ratio and groups the rest of the ids again etc.
Upvotes: 1
Views: 73
Reputation: 42916
First convert your ratio
column to numeric.
Then we get the lowest rank
per group by using Groupby
Finally we sort based on rank
and numeric ratio
.
df['ratio_num'] = df['ratio'].str[:-1].astype(float).rank()
df['rank'] = df.groupby('id')['ratio_num'].transform('min')
df = df.sort_values(['rank', 'ratio_num']).drop(columns=['rank', 'ratio_num'])
id ratio
0 1 5.00%
1 1 19.00%
2 3 5.5%
3 3 6.00%
4 2 9.00%
5 2 13.00%
6 2 22.00%
7 4 30.00%
Upvotes: 2
Reputation: 195438
With help of pd.Categorical
:
d = {'id':[1, 2, 3, 2, 1, 4, 3, 2],
'ratio': ['5.00%', '9.00%', '6.00%', '13.00%', '19.00%', '30.00%', '5.5%', '22.00%']}
df = pd.DataFrame(d)
df['ratio_'] = df['ratio'].map(lambda x: float(x[:-1]))
df['id'] = pd.Categorical(df['id'], categories=df.sort_values(['id', 'ratio_']).groupby('id').head(1).sort_values(['ratio_', 'id'])['id'], ordered=True)
print(df.sort_values(['id', 'ratio_']).drop('ratio_', axis=1))
Prints:
id ratio
0 1 5.00%
4 1 19.00%
6 3 5.5%
2 3 6.00%
1 2 9.00%
3 2 13.00%
7 2 22.00%
5 4 30.00%
Upvotes: 0