Reputation: 809
I'm new to programming and I wanted to create a newId base on my dictionary values. I so several Lambda functions and it use to create new fields and it efficient to use. I tried several codes but couldn't implement it into my problem here. Any help is appreciated.
My panda's dataframe look like this
ID1 ID2
76 21.0 21.0
77 21.0 11562.0
78 21.0 131241.0
79 22.0 22.0
80 22.0 11562.0
81 23.0 23.0
82 23.0 33680.0
83 23.0 131031.0
84 23.0 523376.0
I have a dictionary like this
for x, y in jdictionary.items():
print(x, y)
17.0 [17.0, 11561.0]
18.0 [18.0, 98640.0, 98641.0]
19.0 [19.0, 98642.0, 131135.0]
20.0 [20.0, 33791.0, 131054.0, 555884.0]
21.0 [21.0, 11562.0, 131241.0]
22.0 [22.0]
23.0 [23.0, 33680.0, 131031.0, 523376.0, 555825.0, 593026.0]
Base on dictionary value I want to create a "NewID" field in the pandas dataframe. If any values in the dictionary match to ID1 and ID2 NewID field should be dictionary key. It should look like below.
ID1 ID2 NewID
76 21.0 21.0 21.0
77 21.0 11562.0 21.0
78 21.0 131241.0 21.0
79 22.0 22.0 22.0
80 22.0 11562.0 21.0
81 23.0 23.0 23.0
82 23.0 33680.0 23.0
83 23.0 131031.0 23.0
84 23.0 523376.0 23.0
Is this possible to do it in a Lambda function or any other method?
Thanks in advance
Upvotes: 0
Views: 815
Reputation: 4648
If there is no duplicate item within all the items of jdictionary
, a reverse-lookup dict (ID2
-> key of jdictionary
) can be constructed and applied on column ID2
.
Edit: A potential advantage of building up an reverse lookup dictionary is that the dictionary can be passed into Series.map()
, which uses internally optimized code as indicated by this answer. But unfortunately, there seems to be no way of avoiding the double for
loops as indicated in this post.
# create the reverse lookup dictionary
dic_rev = {}
for k, ls_v in jdictionary.items():
for v in ls_v:
dic_rev[v] = k
# optimized execution
df["newID"] = df["ID2"].map(dic_rev)
# output
df
Out[24]:
ID1 ID2 newID
76 21.0 21.0 21.0
77 21.0 11562.0 21.0
78 21.0 131241.0 21.0
79 22.0 22.0 22.0
80 22.0 11562.0 21.0
81 23.0 23.0 23.0
82 23.0 33680.0 23.0
83 23.0 131031.0 23.0
84 23.0 523376.0 23.0
Upvotes: 1
Reputation: 2583
the solution could be:
(
df
.assign(NewID = lambda x: x.apply(lambda s:
next((k for k,v in jdictionary.items() if ((s['ID1'] in v) & (s['ID2'] in v))),np.nan),axis=1))
)
Upvotes: 1