sharp

Reputation: 2158

pandas map dictionary to column that has tuples in list

Each row of my pandas DataFrame holds a list of tuples. I am trying to replace the first element of each tuple with the corresponding value from a dictionary. I found a way to do it with chained str.replace calls, but it does not seem like a good way to write the code. Is there a better way to handle this than what I am doing in Try #1?

from operator import itemgetter
import pandas as pd

# sample data
l1 = ['1','2','3']
l2 = ['test1','test2','test3']
l3 = [[(1,0.95)],[(2,0.10),(3,0.20)],[(3,0.30)]]

df = pd.DataFrame({'id':l1,'text':l2,'score':l3})
print(df)

#Preview: 
id   text                 score
1  test1           [(1, 0.95)]
2  test2  [(2, 0.1), (3, 0.2)]
3  test3            [(3, 0.3)]

di = {1:'Math',2:'Science',3:'History',4:'Physics'}

Try #1: Does the trick, but it is laborious, manual, and not an ideal approach.
 
df['score'].astype(str).str.replace('1,','Math,').str.replace('2,','Science,').str.replace('3,','History,').str.replace('4,','Physics,')

Try #2: Returns all NaNs, even if I convert to string.
df["score"].astype(str).map(di)


Looking for output like this:
#Preview:
id   text                 score
1  test1           [(Math, 0.95)]
2  test2           [(Science, 0.1), (History, 0.2)]
3  test3           [(History, 0.3)]

Upvotes: 0

Views: 86

Answers (1)

sammywemmy

Reputation: 28709

A list comprehension could help here. Note that pandas is less efficient when other data structures, such as lists of tuples, are embedded in its columns.

df["score"] = [[(di[left], right) 
                 for left, right in entry] 
                 for entry in df.score]
df


    id  text    score
0   1   test1   [(Math, 0.95)]
1   2   test2   [(Science, 0.1), (History, 0.2)]
2   3   test3   [(History, 0.3)]
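
One caveat with the comprehension above: di[left] raises a KeyError if a tuple's first element is not a key in the dictionary. A minimal sketch of a more forgiving variant follows, assuming you would rather keep the original key when there is no match. The helper name relabel is hypothetical and not part of pandas.

```python
import pandas as pd

# Sample data from the question
di = {1: "Math", 2: "Science", 3: "History", 4: "Physics"}
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "text": ["test1", "test2", "test3"],
    "score": [[(1, 0.95)], [(2, 0.10), (3, 0.20)], [(3, 0.30)]],
})


def relabel(pairs, mapping):
    """Hypothetical helper: swap the first element of each tuple for its
    mapped label, leaving the key unchanged when it is not in the mapping."""
    return [(mapping.get(key, key), value) for key, value in pairs]


# Same list-comprehension approach, applied row by row through the helper
df["score"] = [relabel(pairs, di) for pairs in df["score"]]
print(df)

# A key that is missing from the dictionary is passed through as-is
print(relabel([(9, 0.5)], di))
```

This also suggests why Try #2 in the question returns NaN: map looks up each whole cell (the string form of the entire list) as a dictionary key, and no such key exists in di.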

Upvotes: 1
