Reputation: 4197
class col2 col3 col4 col5
1 4 5 5 5
4 4 4.5 5.5 6
1 3.5 5 6 4.5
3 3 4 4 4
2 3 3.5 3.8 6.1
I have used hypothetical data in the example; the real DataFrame has shape 6680x1900. I have clustered the data into 50
labeled classes (1 to 50). How can I sort the rows in ascending order of class
label?
I have tried:
df.groupby([column_name_lst])["class"]
But it fails with this error:
TypeError: You have to supply one of 'by' and 'level'
How can I solve this problem? The expected output is:
class col2 col3 col4 col5
1 4 5 5 5
1 3.5 5 6 4.5
2 3 3.5 3.8 6.1
3 3 4 4 4
4 4 4.5 5.5 6
Upvotes: 4
Views: 14700
Reputation: 863291
I think you can use DataFrame.sort_values if class is a column (a Series):
print (type(df['class']))
<class 'pandas.core.series.Series'>
print (df.sort_values(by='class'))
class col2 col3 col4 col5
0 1 4.0 5.0 5.0 5.0
2 1 3.5 5.0 6.0 4.5
4 2 3.0 3.5 3.8 6.1
3 3 3.0 4.0 4.0 4.0
1 4 4.0 4.5 5.5 6.0
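For reference, here is a minimal self-contained sketch of the column case. It rebuilds the hypothetical data from the question; the names sorted_df and sorted_reset are only illustrative. The kind="mergesort" argument is an optional addition (not part of the answer above) that keeps rows with the same class in their original order, as in the expected output.

```python
import pandas as pd

# Hypothetical data copied from the question; "class" is an ordinary column.
df = pd.DataFrame({
    "class": [1, 4, 1, 3, 2],
    "col2": [4, 4, 3.5, 3, 3],
    "col3": [5, 4.5, 5, 4, 3.5],
    "col4": [5, 5.5, 6, 4, 3.8],
    "col5": [5, 6, 4.5, 4, 6.1],
})

# mergesort is a stable algorithm, so tied class labels keep their input order.
sorted_df = df.sort_values(by="class", kind="mergesort")

# Optional: renumber the rows 0..n-1 after sorting.
sorted_reset = sorted_df.reset_index(drop=True)
print(sorted_reset)
```

sort_values returns a new DataFrame, so the result has to be assigned (or printed) for the sort to have any visible effect.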
If you also need groupby, use the by parameter:
print (df.groupby(by='class').sum())
col2 col3 col4 col5
class
1 7.5 10.0 11.0 9.5
2 3.0 3.5 3.8 6.1
3 3.0 4.0 4.0 4.0
4 4.0 4.5 5.5 6.0
And if class is the index, use Kartik's solution:
print (df.index)
Int64Index([1, 4, 1, 3, 2], dtype='int64', name='class')
print (df.sort_index())
col2 col3 col4 col5
class
1 4.0 5.0 5.0 5.0
1 3.5 5.0 6.0 4.5
2 3.0 3.5 3.8 6.1
3 3.0 4.0 4.0 4.0
4 4.0 4.5 5.5 6.0
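The index case can be reproduced in the same way. This sketch assumes the question's data with class moved into the index through set_index; the variable names are illustrative, and kind="mergesort" is again an optional choice to keep ties in their original order.

```python
import pandas as pd

# Hypothetical data from the question, with "class" moved into the index.
df = pd.DataFrame({
    "class": [1, 4, 1, 3, 2],
    "col2": [4, 4, 3.5, 3, 3],
    "col3": [5, 4.5, 5, 4, 3.5],
    "col4": [5, 5.5, 6, 4, 3.8],
    "col5": [5, 6, 4.5, 4, 6.1],
}).set_index("class")

# Sort the rows by their index labels; mergesort keeps ties in input order.
sorted_df = df.sort_index(kind="mergesort")
print(sorted_df)

# reset_index() turns the "class" index back into a regular column if needed.
as_column = sorted_df.reset_index()
```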
If you also need groupby, use the level parameter:
print (df.groupby(level='class').sum())
col2 col3 col4 col5
class
1 7.5 10.0 11.0 9.5
2 3.0 3.5 3.8 6.1
3 3.0 4.0 4.0 4.0
4 4.0 4.5 5.5 6.0
or group by df.index directly, but the first solution is better because it is more general:
print (df.groupby(df.index).sum())
col2 col3 col4 col5
class
1 7.5 10.0 11.0 9.5
2 3.0 3.5 3.8 6.1
3 3.0 4.0 4.0 4.0
4 4.0 4.5 5.5 6.0
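The three groupby spellings shown above should give the same sums. A small sketch to check this on the question's hypothetical data (the variable names are illustrative):

```python
import pandas as pd

# Hypothetical data from the question, with "class" as a column.
df = pd.DataFrame({
    "class": [1, 4, 1, 3, 2],
    "col2": [4, 4, 3.5, 3, 3],
    "col3": [5, 4.5, 5, 4, 3.5],
    "col4": [5, 5.5, 6, 4, 3.8],
    "col5": [5, 6, 4.5, 4, 6.1],
})

# Case 1: "class" is a column, so group with by=.
by_column = df.groupby(by="class").sum()

# Case 2: "class" is the index, so group with level= or with the index itself.
indexed = df.set_index("class")
by_level = indexed.groupby(level="class").sum()
by_index = indexed.groupby(indexed.index).sum()

print(by_column)
print(by_column.equals(by_level))
```

Note that groupby sorts the group keys by default (sort=True), which is why the aggregated result already comes out in ascending class order.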
Upvotes: 4
Reputation: 8703
If you are starting with the data in your question:
class col2 col3 col4 col5
1 4 5 5 5
4 4 4.5 5.5 6
1 3.5 5 6 4.5
3 3 4 4 4
2 3 3.5 3.8 6.1
and want to sort it, then it depends on whether 'class'
is the index or a column. If it is the index:
df.sort_index()
should give you the answer. If it is a column, follow the answer by @jezarael.
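If it is not known in advance which of the two cases applies, both can be wrapped in one small function. This is only a sketch: sort_by_label is a hypothetical helper name, not a pandas function, and it assumes a single-level index.

```python
import pandas as pd

def sort_by_label(df, name="class"):
    """Sort df by `name`, whether it is a column or the (single-level) index."""
    if name in df.columns:
        # `name` is an ordinary column.
        return df.sort_values(by=name, kind="mergesort")
    if df.index.name == name:
        # `name` is the index.
        return df.sort_index(kind="mergesort")
    raise KeyError(f"{name!r} is neither a column nor the index name")

# Usage with the hypothetical data from the question.
df = pd.DataFrame({
    "class": [1, 4, 1, 3, 2],
    "col2": [4, 4, 3.5, 3, 3],
    "col3": [5, 4.5, 5, 4, 3.5],
})
print(sort_by_label(df))                     # "class" as a column
print(sort_by_label(df.set_index("class")))  # "class" as the index
```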
Upvotes: 1