justanewb

Reputation: 133

Create new columns in dataframe using a dictionary mapping

I created a map where the key is a string and the value is a list of tuples. I also have a dataframe that looks like this

import pandas as pd

d = {'comments' : ['This is a bad website', 'The website is slow']}
df = pd.DataFrame(data = d)

The map's value for 'This is a bad website' contains something like this

[("There isn't enough support to our site",
  'Staff Not Onsite',
  0.7323943661971831),
 ('I would like to have them on site more frequently',
  'Staff Not Onsite',
  0.6875)]

What I want to do now is create 6 new columns inside the data frame using the first two tuple entries in the map.

So what I would want is something like this

d = {'comments' : ['This is a bad website', 'The website is slow'],
     'comment_match_1' : ["There isn't enough support to our site", ''],
     'Negative_category_1' : ['Staff Not Onsite', ''],
     'Score_1' : [0.7323, 0],
     'comment_match_2' : ['I would like to have them on site more frequently', ''],
     'Negative_category_2' : ['Staff Not Onsite', ''],
     'Score_2' : [0.6875, 0]}
df = pd.DataFrame(data = d)

Any suggestions on how to achieve this are greatly appreciated.

Here is how I generated the map for reference

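import difflib

d = {}
for x, y in zip(df['comments'], df['negative_category']):
    for z in unlabeled_df['comments']:
        # Keep a separate list per unlabeled comment; a single shared list
        # would make every key point at the same matches.
        score = difflib.SequenceMatcher(None, x, z).ratio()
        d.setdefault(z, []).append((x, y, score))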

Thus when I execute this line of code

d['This is a bad website']

I get

[("There isn't enough support to our site",
  'Staff Not Onsite',
  0.7323943661971831),
 ('I would like to have them on site more frequently',
  'Staff Not Onsite',
  0.6875), ...]

Upvotes: 1

Views: 91

Answers (1)

Shubham Sharma

Reputation: 71689

You can build a new mapping dictionary by flattening the list of tuples stored under each key. Then use Series.map to look up each value of the comments column in that mapping, create a new dataframe from the looked-up values, and join it back onto the original dataframe:

import numpy as np
import pandas as pd

mapping = {k: np.hstack(v) for k, v in d.items()}
df.join(pd.DataFrame(df['comments'].map(mapping).dropna().tolist()))

                comments                                       0                 1                   2                                                  3                 4       5
0  This is a bad website  There isn't enough support to our site  Staff Not Onsite  0.7323943661971831  I would like to have them on site more frequently  Staff Not Onsite  0.6875
1    The website is slow                                     NaN               NaN                 NaN                                                NaN               NaN     NaN
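
The output above has integer column names and NaN for comments without a match, while the question asks for named columns with '' and 0 as fillers. A minimal sketch of one way to get that layout follows, assuming the map holds a list of tuples per comment as shown in the question. The names N_MATCHES, EMPTY and expand are hypothetical helpers introduced only for this illustration. It also builds the new columns on the original index, so rows stay aligned even when an unmatched comment comes first, and it keeps the scores as floats.

```python
import pandas as pd

# Sample data copied from the question; `d` holds a list of tuples per comment.
df = pd.DataFrame({'comments': ['This is a bad website', 'The website is slow']})
d = {
    'This is a bad website': [
        ("There isn't enough support to our site", 'Staff Not Onsite', 0.7323943661971831),
        ('I would like to have them on site more frequently', 'Staff Not Onsite', 0.6875),
    ],
}

N_MATCHES = 2          # hypothetical: how many tuples per comment to expand
EMPTY = ('', '', 0.0)  # hypothetical filler for comments with too few matches


def expand(comment):
    """Return one flat dict of named columns for a single comment."""
    matches = list(d.get(comment, []))[:N_MATCHES]
    matches += [EMPTY] * (N_MATCHES - len(matches))
    row = {}
    for i, (text, category, score) in enumerate(matches, start=1):
        row[f'comment_match_{i}'] = text
        row[f'Negative_category_{i}'] = category
        row[f'Score_{i}'] = score
    return row


# Reuse the original index so the join lines rows up correctly.
extra = pd.DataFrame([expand(c) for c in df['comments']], index=df.index)
result = df.join(extra)
print(result)
```

Changing N_MATCHES expands more or fewer tuples per comment; the column names follow the pattern from the question.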

Upvotes: 1
