Reputation: 674
Given the data frame below, how can I sort each column's values from highest to lowest and append the matching value from the index column to each entry, while ignoring the NaN fields?
Data frame:
tmp2 = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'], 'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'], 'name': [0.026578, 0.003322, 0.006645, 0.006645] })
index Unnamed: 0 name
0 CARDINAL 0.166667 0.026578
1 DATE 0.833333 0.003322
2 EVENT NaN 0.006645
3 FAC NaN 0.006645
Desired result:
Unnamed: 0 name
0.833333 (DATE) 0.026578 (CARDINAL)
0.166667 (CARDINAL) 0.006645 (EVENT)
NaN 0.006645 (FAC)
NaN 0.003322 (DATE)
Upvotes: 1
Views: 47
Reputation: 3961
Convert the values to strings, filter out the 'NaN' placeholders with a boolean mask, set the new column as the index, and finally drop the old columns:
import pandas as pd
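df = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'],
                   'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'],
                   'name': [0.026578, 0.003322, 0.006645, 0.006645]})
# Append the label in parentheses to every value in 'name'
df['name'] = df['name'].astype(str) + ' (' + df['index'] + ')'
# Mask the rows whose 'Unnamed: 0' entry is not the 'NaN' placeholder
filt = df['Unnamed: 0'] != 'NaN'
# Masked-out rows become real NaN through index alignment on assignment
df['Unnamed: 0'] = df['Unnamed: 0'][filt].astype(str) + ' (' + df['index'] + ')'
# Use the labelled column as the index, then drop the leftover label column
df = df.set_index('Unnamed: 0')
df = df.drop(columns=['index'])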
print(df)
Returning:
name
Unnamed: 0
0.166667 (CARDINAL) 0.026578 (CARDINAL)
0.833333 (DATE) 0.003322 (DATE)
NaN 0.006645 (EVENT)
NaN 0.006645 (FAC)
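The mask above compares against the literal string 'NaN', and the result is not sorted. A minimal sketch, assuming you would rather work with real missing values: `pd.to_numeric(..., errors='coerce')` turns the placeholders into float NaN, after which `notna()` and `sort_values` behave as usual. The name `sorted_df` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'],
                   'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'],
                   'name': [0.026578, 0.003322, 0.006645, 0.006645]})

# Turn the 'NaN' strings into real missing values (float NaN).
df['Unnamed: 0'] = pd.to_numeric(df['Unnamed: 0'], errors='coerce')

# Boolean mask of the rows that actually hold a number.
filt = df['Unnamed: 0'].notna()

# Sort from highest to lowest; rows with NaN are placed last by default.
sorted_df = df.sort_values('Unnamed: 0', ascending=False)
print(sorted_df)
```

With real NaN in place, the string concatenation from the answer can be applied to `sorted_df` instead of `df`.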
Upvotes: 1
Reputation: 914
A similar but easier-to-remember solution:
df["name"] = df["name"].astype(str) + " (" + df["index"] + ")"
df.drop(columns="index")
Unnamed: 0 name
0 0.166667 0.026578 (CARDINAL)
1 0.833333 0.003322 (DATE)
2 NaN 0.006645 (EVENT)
3 NaN 0.006645 (FAC)
Upvotes: 1
Reputation: 323226
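We need a loop over the columns here, sorting each one separately:
# Assumes the 'NaN' strings are converted to real NaN first; otherwise sort_values would compare strings with floats
tmp2['Unnamed: 0'] = pd.to_numeric(tmp2['Unnamed: 0'], errors='coerce')
tmp2.set_index('index', inplace=True)
newdf = pd.concat([tmp2[x].sort_values(ascending=False).dropna().reset_index().astype(str).agg('('.join, axis=1) + ')'
                   for x in tmp2.columns],
                  keys=tmp2.columns,
                  axis=1)
newdf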
Out[30]:
Unnamed: 0 name
0 DATE(0.833333) CARDINAL(0.026578)
1 CARDINAL(0.166667) FAC(0.006645)
2 NaN EVENT(0.006645)
3 NaN DATE(0.003322)
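The output above puts the label first. If you want the layout from the question (value first, label in parentheses), here is a minimal sketch of the same per-column idea. It assumes the 'NaN' entries are strings, as in the question's frame, and `rank_column` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

tmp2 = pd.DataFrame({'index': ['CARDINAL', 'DATE', 'EVENT', 'FAC'],
                     'Unnamed: 0': [0.166667, 0.833333, 'NaN', 'NaN'],
                     'name': [0.026578, 0.003322, 0.006645, 0.006645]})

# Use the labels as the index so they travel with the values when sorting.
labels = tmp2.set_index('index')

def rank_column(col):
    """Sort one column from highest to lowest and render 'value (label)'."""
    # Coerce 'NaN' strings to real NaN, then drop the missing entries.
    values = pd.to_numeric(col, errors='coerce').dropna()
    # A stable sort keeps the handling of equal values deterministic.
    values = values.sort_values(ascending=False, kind='stable')
    text = [f"{v} ({label})" for label, v in values.items()]
    return pd.Series(text, name=col.name)

# Shorter columns are padded with NaN when the pieces are aligned side by side.
result = pd.concat([rank_column(labels[c]) for c in labels.columns], axis=1)
print(result)
```

Each column is ranked independently, so a row of `result` no longer corresponds to a single label.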
Upvotes: 2