KDIC92

Reputation: 31

Pandas: new column using an existing column and a dictionary

I have a data frame that looks like:

import pandas as pd

df = pd.DataFrame({"user_id": ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
                   "score": [0, 100, 50, 0, 25, 50, 100, 0, 7, 20],
                   "valval": ["va2.3", "va1.1", "va2.1", "va2.2", "va1.2",
                              "va1.1", "va2.1", "va1.2", "va1.2", "va1.3"]})

print(df)


     | user_id | score | valval 
-----+---------+-------+--------
 0   |     a   |    0  | va2.3  
 1   |     b   |  100  | va1.1  
 2   |     c   |   50  | va2.1  
 3   |     d   |    0  | va2.2  
 4   |     e   |   25  | va1.2  
 5   |     f   |   50  | va1.1  
 6   |     g   |  100  | va2.1  
 7   |     h   |    0  | va1.2  
 8   |     i   |    7  | va1.2  
 9   |     j   |   20  | va1.3  

I also have a dictionary that looks like:

dic_t = { "key1" : ["va1.1", "va1.2", "va1.3"], "key2" : ["va2.1", "va2.2", "va2.3"]}

I want a new column "keykey".

Each value in this column should be the dictionary key whose list contains the corresponding "valval" value.

The result would look something like this:

     | user_id | score | valval | keykey 
-----+---------+-------+--------+--------
 0   |     a   |    0  | va2.3  | key2
 1   |     b   |  100  | va1.1  | key1
 2   |     c   |   50  | va2.1  | key2
 3   |     d   |    0  | va2.2  | key2
 4   |     e   |   25  | va1.2  | key1
 5   |     f   |   50  | va1.1  | key1
 6   |     g   |  100  | va2.1  | key2
 7   |     h   |    0  | va1.2  | key1
 8   |     i   |    7  | va1.2  | key1
 9   |     j   |   20  | va1.3  | key1

Upvotes: 3

Views: 65

Answers (3)

Chris Wilson

Reputation: 37

Not the most efficient solution, but it gets the job done and is easy to follow:


def get_keykey(search_val, ref_dict):
    """Return the first key whose list contains search_val, or None if no key matches."""
    for key in ref_dict:                       # loop over all keys
        if search_val in ref_dict[key]:        # value is in the list stored under this key
            return key
    return None                                # no key matched

# apply to the "valval" column of df, passing the question's dictionary dic_t
df["keykey"] = df["valval"].apply(get_keykey, args=(dic_t,))
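To make the fallback behaviour concrete, here is a minimal sketch of the same lookup applied to a small Series. The value "va9.9" is made up for illustration and does not appear in the question's data; it only shows that an unmatched value comes back as a missing value rather than raising an error.

```python
import pandas as pd

dic_t = {"key1": ["va1.1", "va1.2", "va1.3"], "key2": ["va2.1", "va2.2", "va2.3"]}

def get_keykey(search_val, ref_dict):
    # Return the first key whose list contains search_val, else None
    for key in ref_dict:
        if search_val in ref_dict[key]:
            return key
    return None

# "va9.9" is a hypothetical value that is not listed under any key
values = pd.Series(["va1.1", "va2.3", "va9.9"])
result = values.apply(get_keykey, args=(dic_t,))

# The first two entries resolve to a key; the last one is missing
print(result)
```

Because the function scans every list for every row, the cost grows with both the number of rows and the size of the dictionary, which is why a precomputed value-to-key mapping tends to scale better.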

Upvotes: 0

Aditta Das

Reputation: 36

Start with an empty dictionary, fill it so that each value maps to its key, then use the map function:

import pandas as pd
df = pd.DataFrame({"user_id" : ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
                   "score" : [0, 100, 50, 0, 25, 50, 100, 0, 7, 20],
                   "valval" : ["va2.3", "va1.1", "va2.1", "va2.2", "va1.2", "va1.1", "va2.1", "va1.2", "va1.2", "va1.3"]})

dic_t = { "key1" : ["va1.1", "va1.2", "va1.3"], "key2" : ["va2.1", "va2.2", "va2.3"]}

d_keykey = {}
for k, v in dic_t.items():
    for val in v:
        d_keykey[val] = k
df["keykey"] = df["valval"].map(d_keykey)
print(df)


  user_id  score valval keykey
0       a      0  va2.3   key2
1       b    100  va1.1   key1
2       c     50  va2.1   key2
3       d      0  va2.2   key2
4       e     25  va1.2   key1
5       f     50  va1.1   key1
6       g    100  va2.1   key2
7       h      0  va1.2   key1
8       i      7  va1.2   key1
9       j     20  va1.3   key1
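One thing to keep in mind with an inverted dictionary: if the same value is listed under more than one key, the later key silently overwrites the earlier one. The question's data has no such overlap, so the sketch below uses a hypothetical dictionary named `overlap` with a made-up value "shared" to show the effect and one way to detect it.

```python
from collections import Counter

# Hypothetical dictionary in which "shared" appears under both keys
overlap = {"key1": ["va1.1", "shared"], "key2": ["va2.1", "shared"]}

# Invert it the same way as above: the last key seen for a value wins
d = {}
for k, v in overlap.items():
    for val in v:
        d[val] = k

# Count how often each value occurs across all lists to find collisions
counts = Counter(val for v in overlap.values() for val in v)
duplicates = [val for val, n in counts.items() if n > 1]

print(d["shared"])   # key2
print(duplicates)    # ['shared']
```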

Upvotes: 0

anky

Reputation: 75140

You can use Series.map after flattening the dictionary into a value-to-key mapping:

d = {val:k for k,v in dic_t.items() for val in v}
df['keykey'] = df['valval'].map(d)

print(df)

  user_id  score valval keykey
0       a      0  va2.3   key2
1       b    100  va1.1   key1
2       c     50  va2.1   key2
3       d      0  va2.2   key2
4       e     25  va1.2   key1
5       f     50  va1.1   key1
6       g    100  va2.1   key2
7       h      0  va1.2   key1
8       i      7  va1.2   key1
9       j     20  va1.3   key1
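If "valval" can contain values that are missing from the dictionary, Series.map leaves NaN in those rows. A minimal sketch of handling that case follows; the value "va9.9", the column name "keykey_filled" and the label "unknown" are assumptions chosen for illustration, not part of the question.

```python
import pandas as pd

dic_t = {"key1": ["va1.1", "va1.2", "va1.3"], "key2": ["va2.1", "va2.2", "va2.3"]}

# "va9.9" is a hypothetical value that is not listed under any key
df = pd.DataFrame({"valval": ["va2.3", "va1.1", "va9.9"]})

# Flatten the dictionary into a value -> key mapping
d = {val: k for k, v in dic_t.items() for val in v}
df["keykey"] = df["valval"].map(d)

# Rows whose value was not found in the mapping
unmatched = df["keykey"].isna()

# Replace the missing entries with a placeholder label (an assumed choice)
df["keykey_filled"] = df["keykey"].fillna("unknown")

print(df)
```

Keeping the unfilled "keykey" column alongside the filled one makes it easy to inspect which rows had no match before deciding how to treat them.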

Upvotes: 3
