Reputation: 115
I would like to add a column to dfA based on whether or not the job title (and its matching State) exists in dfB.
dfA=
Title State Income
Cashier WY 15000
Cashier WY 20000
Cashier WY 15000
Manager WY 25000
Cashier CO 15000
dfB=
Title State MostFreqIncome
Cashier WY 15000
In English: if a Title/State pair in dfA match any Title/State pair in dfB, create a new column in dfA which gives the MostFreqIncome attached to that Title/State pair.
Desired dfA:
Title State Income MostFreqIncome
Cashier WY 15000 15000
Cashier WY 20000 15000
Cashier WY 15000 15000
Manager WY 25000 NA
Cashier CO 15000 NA
Here's what I have so far:
is_in = dfA.Title.isin(dfB.Title) & dfA.State.isin(dfB.State)
This gives me False/True, but if it's True I want dfA.MostFreqIncome = dfB.MostFreqIncome. If it's False I want dfA.MostFreqIncome = 'NA'
Upvotes: 1
Views: 127
Reputation: 176830
You can merge
the two DataFrames A and B to create the new DataFrame:
>>> dfA.merge(dfB, on=['Title', 'State'], how='left')
Title State Income MostFreqIncome
0 Cashier WY 15000 15000.0
1 Cashier WY 20000 15000.0
2 Cashier WY 15000 15000.0
3 Manager WY 25000 NaN
4 Cashier CO 15000 NaN
Specifying how='left'
here means that we're just only dfA
's Title/State keys in the merged DataFrame.
Upvotes: 2