JPC
JPC

Reputation: 5173

How to sort pandas dataframe from list category?

so I have this data set below that I want to sort base on mylist from column 'name' as well as acsending by 'A' and descending by 'B'

import pandas as pd
import numpy as np

    df1 = pd.DataFrame.from_items([('A', [1, 2, 3]), ('B', [4, 5, 6]) , ('name', ['x','x','x'])])
    df2 = pd.DataFrame.from_items([('B', [5, 6, 7]), ('A', [8, 9, 10]) , ('name', ['y','y','y'])])
    df3 = pd.DataFrame.from_items([('C', [5, 6, 7]), ('D', [8, 9, 10]), ('A',[1,2,3]), ('B',[4,5,7] ), ('name', ['z','z','z'])])
    df_list = [df1,df2,df3[['A','B','name']]]
    df = pd.concat(df_list, ignore_index=True)

so my list is

  mylist = ['z','x','y']

I want the dataset to start with sort by my list , then sort asc column A then desc column B

is there a way to do this in python ?

enter image description here

======== Edit ========== I want my final result to be something like

enter image description here

Upvotes: 3

Views: 10213

Answers (2)

Rajesh Doolla
Rajesh Doolla

Reputation: 21

Hi We can do the above issue by using the following:

t = pd.CategoricalDtype(categories=['z','x','y'], ordered=True)
df['sort'] = pd.Series(df.name, dtype=t)
df.sort_values(by=['sort','A','B'], inplace=True)

Upvotes: 2

EdChum
EdChum

Reputation: 394239

OK, a way to sort by a custom order is to create a dict that defines how 'name' column should be order, call map to add a new column that defines this new order, then call sort and pass in the new column and the others, plus the param ascending where you selectively decide whether each column is sorted ascending or not, and then finally drop that column:

In [20]:

name_sort = {'z':0,'x':1,'y':2}
df['name_sort'] = df.name.map(name_sort)
df
Out[20]:
    A  B name  name_sort
0   1  4    x          1
1   2  5    x          1
2   3  6    x          1
3   8  5    y          2
4   9  6    y          2
5  10  7    y          2
6   1  4    z          0
7   2  5    z          0
8   3  7    z          0
In [23]:

df = df.sort(['name_sort','A','B'], ascending=[1,1,0])
df
Out[23]:
    A  B name  name_sort
6   1  4    z          0
7   2  5    z          0
8   3  7    z          0
0   1  4    x          1
1   2  5    x          1
2   3  6    x          1
3   8  5    y          2
4   9  6    y          2
5  10  7    y          2
In [25]:

df = df.drop('name_sort', axis=1)
df
Out[25]:
    A  B name
6   1  4    z
7   2  5    z
8   3  7    z
0   1  4    x
1   2  5    x
2   3  6    x
3   8  5    y
4   9  6    y
5  10  7    y

Upvotes: 4

Related Questions