Xin

Reputation: 674

Sort column values and append index column into it

For the data frame below, how would I sort each column's values from highest to lowest and then append the matching value from the index column to each one, while ignoring the NaN fields?

Data frame:

tmp2 = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'], 'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'], 'name': [0.026578, 0.003322, 0.006645, 0.006645] })

    index     Unnamed: 0    name
0   CARDINAL  0.166667   0.026578
1   DATE      0.833333   0.003322
2   EVENT     NaN        0.006645
3   FAC       NaN        0.006645

Desired results

Unnamed: 0           name
0.833333 (DATE)      0.026578 (CARDINAL)
0.166667 (CARDINAL)  0.006645 (EVENT)
NaN                  0.006645 (FAC)
NaN                  0.003322 (DATE)

Upvotes: 1

Views: 47

Answers (3)

Gustav Rasmussen

Reputation: 3961

Use string conversion and a boolean mask to build the labelled values, set the labelled column as the index, and finally drop the old columns (note that this does not sort the values):

import pandas as pd

df = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'],
                   'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'],
                   'name': [0.026578, 0.003322, 0.006645, 0.006645]
                   }
                  )

df['name'] = df['name'].astype(str) + ' (' + df['index'] + ')'
filt = df['Unnamed: 0'] != 'NaN'
df['Unnamed: 0'] = df['Unnamed: 0'][filt].astype(str) + ' (' + df['index'] + ')'

df.set_index(df['Unnamed: 0'], inplace=True)
df = df.drop(columns=["index", "Unnamed: 0"])
print(df)

Returning:

                                    name
Unnamed: 0                              
0.166667 (CARDINAL)  0.026578 (CARDINAL)
0.833333 (DATE)          0.003322 (DATE)
NaN                     0.006645 (EVENT)
NaN                       0.006645 (FAC)
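If the missing entries are real NaN values rather than the string 'NaN', the `!= 'NaN'` comparison above would not filter them out. A minimal sketch of the same labelling step under that assumption, using `Series.where` with `notna()` (the `labelled` name is only for illustration):

```python
import pandas as pd

# Assumption: missing entries are real NaN, not the string 'NaN'.
df = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'],
                   'Unnamed: 0': [0.166667, 0.833333, float('nan'), float('nan')],
                   'name': [0.026578, 0.003322, 0.006645, 0.006645]})

# Build "value (label)" strings for every row first.
labelled = df['Unnamed: 0'].astype(str) + ' (' + df['index'] + ')'

# Keep the string only where the original value was present;
# rows that were missing stay NaN.
df['Unnamed: 0'] = labelled.where(df['Unnamed: 0'].notna())
print(df)
```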

Upvotes: 1

Rishabh Deep Singh

Reputation: 914

A similar but easier-to-remember solution (it labels only the name column and does not sort):

df["name"] = df["name"].astype(str) + " (" + df["index"] + ")"
df.drop(columns='index')

   Unnamed: 0                 name
0    0.166667  0.026578 (CARDINAL)
1    0.833333      0.003322 (DATE)
2         NaN     0.006645 (EVENT)
3         NaN       0.006645 (FAC)

Upvotes: 1

BENY

Reputation: 323226

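We need to loop over the columns here:

tmp2.set_index('index', inplace=True)
# Assumes the missing values are real NaN, not the string 'NaN';
# sorting a column that mixes floats and strings raises a TypeError.
newdf = pd.concat([tmp2[x].sort_values(ascending=False).dropna().reset_index().astype(str).agg('('.join, axis=1) + ')' for x in tmp2.columns],
                  keys=tmp2.columns,
                  axis=1)
newdf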
Out[30]: 
           Unnamed: 0                name
0      DATE(0.833333)  CARDINAL(0.026578)
1  CARDINAL(0.166667)       FAC(0.006645)
2                 NaN     EVENT(0.006645)
3                 NaN      DATE(0.003322)
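The output above puts the label first. To get the `value (LABEL)` layout from the question, a variation along these lines may work. It is only a sketch: `rank_with_labels` is a hypothetical helper name, and it assumes the 'NaN' strings should be coerced to real missing values with `pd.to_numeric`.

```python
import pandas as pd

tmp2 = pd.DataFrame({
    'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'],
    'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'],
    'name': [0.026578, 0.003322, 0.006645, 0.006645],
}).set_index('index')


def rank_with_labels(col):
    """Hypothetical helper: sort one column descending and tag each value with its index label."""
    # Coerce to float so the string 'NaN' becomes a real missing value, then drop it.
    s = pd.to_numeric(col, errors='coerce').dropna()
    # Request a stable sort so tied values are not reordered arbitrarily.
    s = s.sort_values(ascending=False, kind='stable')
    # Build "value (label)" strings with a fresh 0..n-1 index.
    return pd.Series([f"{value} ({label})" for label, value in s.items()])


# Shorter columns are padded with NaN when the pieces are aligned side by side.
newdf = pd.concat({c: rank_with_labels(tmp2[c]) for c in tmp2.columns}, axis=1)
print(newdf)
```

Each column is ranked independently, so a row of `newdf` no longer corresponds to a single row of the original frame.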

Upvotes: 2
