rra
rra

Reputation: 809

How to create a new column "NewId" base on dictionary value in pandas dataframe using Lambda function

I'm new to programming and I wanted to create a newId base on my dictionary values. I so several Lambda functions and it use to create new fields and it efficient to use. I tried several codes but couldn't implement it into my problem here. Any help is appreciated.

My panda's dataframe look like this

    ID1     ID2
76  21.0    21.0
77  21.0    11562.0
78  21.0    131241.0
79  22.0    22.0
80  22.0    11562.0
81  23.0    23.0
82  23.0    33680.0
83  23.0    131031.0
84  23.0    523376.0

I have a dictionary like this

for x, y in jdictionary.items():
  print(x, y)

17.0 [17.0, 11561.0]
18.0 [18.0, 98640.0, 98641.0]
19.0 [19.0, 98642.0, 131135.0]
20.0 [20.0, 33791.0, 131054.0, 555884.0]
21.0 [21.0, 11562.0, 131241.0]
22.0 [22.0]
23.0 [23.0, 33680.0, 131031.0, 523376.0, 555825.0, 593026.0]

Base on dictionary value I want to create a "NewID" field in the pandas dataframe. If any values in the dictionary match to ID1 and ID2 NewID field should be dictionary key. It should look like below.

    ID1     ID2         NewID
76  21.0    21.0        21.0
77  21.0    11562.0     21.0
78  21.0    131241.0    21.0
79  22.0    22.0        22.0
80  22.0    11562.0     21.0
81  23.0    23.0        23.0
82  23.0    33680.0     23.0
83  23.0    131031.0    23.0
84  23.0    523376.0    23.0

Is this possible to do it in a Lambda function or any other method?

Thanks in advance

Upvotes: 0

Views: 815

Answers (2)

Bill Huang
Bill Huang

Reputation: 4648

If there is no duplicate item within all the items of jdictionary, a reverse-lookup dict (ID2 -> key of jdictionary) can be constructed and applied on column ID2.

Edit: A potential advantage of building up an reverse lookup dictionary is that the dictionary can be passed into Series.map(), which uses internally optimized code as indicated by this answer. But unfortunately, there seems to be no way of avoiding the double for loops as indicated in this post.

# create the reverse lookup dictionary
dic_rev = {}
for k, ls_v in jdictionary.items():
    for v in ls_v:
        dic_rev[v] = k

# optimized execution
df["newID"] = df["ID2"].map(dic_rev)

# output
df
Out[24]: 
     ID1       ID2  newID
76  21.0      21.0   21.0
77  21.0   11562.0   21.0
78  21.0  131241.0   21.0
79  22.0      22.0   22.0
80  22.0   11562.0   21.0
81  23.0      23.0   23.0
82  23.0   33680.0   23.0
83  23.0  131031.0   23.0
84  23.0  523376.0   23.0

Upvotes: 1

Mehdi Golzadeh
Mehdi Golzadeh

Reputation: 2583

the solution could be:

(
    df
    .assign(NewID = lambda x: x.apply(lambda s: 
            next((k for k,v in jdictionary.items() if ((s['ID1'] in v) & (s['ID2'] in v))),np.nan),axis=1))
)

Upvotes: 1

Related Questions