Reputation: 2158
Each row of my pandas DataFrame holds a list of tuples. I want to replace the first element of every tuple with the matching value from a dictionary. I got it working with chained string replacements, but that doesn't seem like a good way to write it. Is there a better way to handle this than what I am doing in Try #1?
from operator import itemgetter
import pandas as pd
# sample data
l1 = ['1','2','3']
l2 = ['test1','test2','test3']
l3 = [[(1,0.95)],[(2,0.10),(3,0.20)],[(3,0.30)]]
df = pd.DataFrame({'id':l1,'text':l2,'score':l3})
print(df)
#Preview:
id text score
1 test1 [(1, 0.95)]
2 test2 [(2, 0.1), (3, 0.2)]
3 test3 [(3, 0.3)]
di = {1:'Math',2:'Science',3:'History',4:'Physics'}
Try #1: Does the trick, but it is laborious and manual, and it turns every cell into a string.
df['score'].astype(str).str.replace('1,','Math,').str.replace('2,','Science,').str.replace('3,','History,').str.replace('4,','Physics,')
Try #2: Returns all NaNs, even if I convert the column to string first.
df["score"].astype(str).map(di)
I am looking for output like this:
#Preview:
id text score
1 test1 [(Math, 0.95)]
2 test2 [(Science, 0.1), (History, 0.2)]
3 test3 [(History, 0.3)]
Upvotes: 0
Views: 86
Reputation: 28709
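A nested list comprehension can do this: loop over each row's list, and within it over each tuple, looking up the first element in the dictionary. Note that pandas is less efficient when other data structures, such as lists of tuples, are embedded in a column.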
df["score"] = [[(di[left], right)
                for left, right in entry]
               for entry in df.score]
df
id text score
0 1 test1 [(Math, 0.95)]
1 2 test2 [(Science, 0.1), (History, 0.2)]
2 3 test3 [(History, 0.3)]
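As a side note on why Try #2 returns NaN: `map` looks up each whole cell (the stringified list) as a single dictionary key, and no such key exists in `di`. The sketch below is one possible variation on the comprehension above, not a definitive implementation. It assumes that keys missing from `di` should be left unchanged rather than raise a `KeyError`, and it also shows a flat, one-row-per-tuple layout that avoids embedding lists in a column. The names `relabel`, `relabeled`, `rows`, `long_df`, `subject`, and `value` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "text": ["test1", "test2", "test3"],
    "score": [[(1, 0.95)], [(2, 0.10), (3, 0.20)], [(3, 0.30)]],
})
di = {1: "Math", 2: "Science", 3: "History", 4: "Physics"}


def relabel(pairs, mapping):
    """Swap each tuple's first element for its label.

    Assumption: keys that are not in `mapping` are kept as they are.
    """
    return [(mapping.get(key, key), value) for key, value in pairs]


# Variant 1: same shape as the original column, tolerant of missing keys
relabeled = [relabel(pairs, di) for pairs in df["score"]]

# Variant 2: flatten to one row per (subject, value) pair
rows = [
    (row_id, text, di.get(key, key), value)
    for row_id, text, pairs in zip(df["id"], df["text"], df["score"])
    for key, value in pairs
]
long_df = pd.DataFrame(rows, columns=["id", "text", "subject", "value"])
```

Assigning `df["score"] = relabeled` gives the same column as the comprehension above. The flat `long_df` form is usually easier to filter, group, or merge than a column of lists.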
Upvotes: 1