Reputation: 626
I have file1
that have 6 columns as follow:
proNum proName color path comNum comName
1 apple red ewtt 43 namex
1 apple red lfor 27 namey
1 apple red bcms 276 namez
4 orange yellow owif 7 namea
4 orange yellow zxaq 19 nameb
8 burry bink lpoq 99 namec
And file2
with 2 columns as follow:
comName color
nameb red
namex bink
nameb black
namez blue
namec yellow
namey red
namez red
nameb bink
What I'm looking for is adding a column in file1
resulted by/from: check/go through each value of comName
in file1
, compare it one-by-one with comName
in file2
if it found the same value then takes the corresponding value of color
in file2
and compare it with color
in file1
if it's the same put the value of color
in that new column and exit then check the next value of comName
.
If the color
values not equal, keep going in file2
because comName
appear one or more than one for some of them, if checking finish in file2
and nothing equal in term of color
then put No/Nan in that new column.
Here is the output that should be:
proNum proName color path comNum comName New_color
1 apple red ewtt 43 namex No
1 apple red lfor 27 namey red
1 apple red bcms 276 namez red
4 orange yellow owif 7 namea No
4 orange yellow zxaq 19 nameb No
8 burry bink lpoq 99 namec No
It seems to be easy but those mixes comparisons really confused me of how to do it, so any help/ideas will be appreciated.
Upvotes: 0
Views: 39
Reputation: 136
Applying a left join on the tables on the comName and color columns should leed to the desired output
file2['color_y']=['red','bink','black','blue','yellow','red','red','bink']
file1.merge(file2, how='left', on=['comName','color'].fillna('No')
Upvotes: 1
Reputation: 59549
You can use a merge here, that way you only assign a New_color
when it matches exactly on comName
and color
.
df1.merge(df2.assign(New_color=df2.color), how='left').fillna('No')
proNum proName color path comNum comName New_color
0 1 apple red ewtt 43 namex No
1 1 apple red lfor 27 namey red
2 1 apple red bcms 276 namez red
3 4 orange yellow owif 7 namea No
4 4 orange yellow zxaq 19 nameb No
5 8 burry bink lpoq 99 namec No
Upvotes: 2
Reputation: 1433
Try this code:
df2 = df2.drop_duplicates(['comName'], keep='last')
result = df.merge(df2, on='comName', how='left', suffixes=('', '_y'))
result['New_color'] = result.apply(lambda x: x.color if x.color == x.color_y else 'No', 1)
result = result.drop(['color_y'], 1)
df
is file1
and df2
is file2
.
Output:
proNum proName color path comNum comName New_color
0 1 apple red ewtt 43 namex No
1 1 apple red lfor 27 namey red
2 1 apple red bcms 276 namez red
3 4 orange yellow owif 7 namea No
4 4 orange yellow zxaq 19 nameb No
5 8 burry bink lpoq 99 namec No
Upvotes: 1