Reputation: 320
I'm sure this question has been asked in another way, but I want to be sure word this one is worded exactly as I understand it. I'm still learning pandas, so bear with me.
I have 3 dataframes that share a common key. I need to assign the values in the columns of dataframe A into the Values of Dataframe B based on the key values of the dataframe C. To better illustrate here are my examples.
options= [["Option1", "Description1"], ["Option2", "Description2"], ["Option3", "Description3"]]
dfA = pd.DataFrame(options, columns = ["Option", "Description"])
links= [["Link1", "LinkDescription1"], ["Link2", "LinkDescription2"], ["Link3", "LinkDescription3"]]
dfB = pd.DataFrame(links, columns = ["Link", "LinkDescription"])
master_ledger = [["Option1", "Link1"], ["Option2", "Link2"], ["Option3", "Link3"]]
dfC = pd.DataFrame(master_ledger, columns = ["Option", "Link"])
Looking at each of these DF's we see:
dfA
Out[14]:
Option Description
0 Option1 Description1
1 Option2 Description2
2 Option3 Description3
dfB
Out[15]:
Link LinkDescription
0 Link1 LinkDescription1
1 Link2 LinkDescription2
2 Link3 LinkDescription3
dfC
Out[16]:
Option Link
0 Option1 Link1
1 Option2 Link2
2 Option3 Link3
What I'm trying do is assign LinkDescription's in DFB the values from the option descriptions in dfA, but based on the option / link relationship inside dfC. So psuedo terms, this could be written as:
dfB["LinkDiscripiton"] = dfA["Description"] ->where dfB["LINK_ID"] == dfC["LINK_ID"] and dfA["OPT_ID"] == dfC["OPT_ID"]
The expected change would be applied to dfB and it would look like:
dfB
Out[15]:
Link LinkDescription
0 Link1 Description1
1 Link2 Description2
2 Link3 Description3
I'm not really sure how to approach this, so I don't have examples of my previous approaches. Thanks for the help.
Update#
Thanks to Yatu's Merge suggestion, I found the following to do the trick and assign the option descriptions into dfB's link Descriptions:
df_temp = dfA.merge(dfC, on='Option')
df_temp.merge(dfB, on = 'Link')[['Link','Description']]
dfB[["Link", "LinkDescription"]] = df_temp[["Link", "Description"]]
dfB:
Out[63]:
Link LinkDescription
0 Link1 Description1
1 Link3 Description2
2 Link2 Description3
Also just to better explain the desired output. If I had defined dfA to be: options= [["Option1", "Bronze"], ["Option2", "Gold"], ["Option3", "Silver"]] dfA = pd.DataFrame(options, columns = ["Option", "Description"])
dfA
Out[68]:
Option Description
0 Option1 Bronze
1 Option2 Gold
2 Option3 Silver
Then dfB would be changed to:
Out[63]:
Link LinkDescription
0 Link1 Bronze
1 Link3 Gold
2 Link2 Silver
Upvotes: 1
Views: 126
Reputation: 88236
You can use pd.merge
in order to merge the three dataframes on their corresponding keys, and then keep the columns that belong to dfB
:
df_temp = dfA.merge(dfC, on='Option').merge(dfB, on = 'Link')[['Link','Description']]
Link Description
0 Link1 Description1
1 Link2 Description2
2 Link3 Description3
Upvotes: 1