Reputation: 359
I have 2 dataframes: df1 and df2
df1
School Conference
0 Air Force Mt. West
1 Akron MAC
2 Alabama at Birmingham C-USA
3 Auburn Sun Belt
df2
SCHOOL_NAME RATE
0 Auburn University 93.0
1 Air Force Academy 53.0
2 Birmingham 75.0
3 University of Akron 77.0
I would like to get the output below, basically binding the `RATE` column from df2 into df1 based on substring from School column
School Conference RATE
0 Air Force Mt. West 53.0
1 Akron MAC 77.0
2 Alabama at Birmingham C-USA 75.0
3 Auburn Sun Belt 93.0
I tried the code below but it's not working. When I run it, it seems to execute successfully, but nothing happens
for i in range(1, len(df1)):
if df1['School'][i] in df2['SCHOOL_NAME']:
pd.merge(df1, df2, how = 'left', left_on = 'School', right_on = 'SCHOOL_NAME')
Upvotes: 4
Views: 91
Reputation: 16683
You can use list comprehension to check if the columns from each dataframe are in
each other (you also compare case-insensitively) and then merge:
df1['SCHOOL_NAME'] = df1['School'].apply(lambda x: [y for y in df2['SCHOOL_NAME']
if x in y or y in x]).str[0]
df1 = df1.merge(df2, how='left').drop('SCHOOL_NAME', axis=1) #can pass on='SCHOOL_NAME' to merge.
df1
Out[1]:
School Conference RATE
0 Air Force Mt. West 53.0
1 Akron MAC 77.0
2 Alabama at Birmingham C-USA 75.0
3 Auburn Sun Belt 93.0
You could also search case-insensitively by adding .lower()
to x
and y
:
df1['SCHOOL_NAME'] = df1['School'].apply(lambda x: [y for y in df2['SCHOOL_NAME']
if x.lower() in y.lower()
or y.lower() in x.lower()]).str[0]
df1 = df1.merge(df2, how='left').drop('SCHOOL_NAME', axis=1) #can pass on='SCHOOL_NAME' to merge.
df1
Out[2]:
School Conference RATE
0 Air Force Mt. West 53.0
1 Akron MAC 77.0
2 Alabama at Birmingham C-USA 75.0
3 Auburn Sun Belt 93.0
Single line of code per comment:
df1 = (df1.assign(SCHOOL_NAME = df1['School'].apply(lambda x: [y for y in df2['SCHOOL_NAME']
if x.lower() in y.lower()
or y.lower() in x.lower()]).str[0])
.merge(df2, how='left').drop('SCHOOL_NAME', axis=1))
df1
Out[3]:
School Conference RATE
0 Air Force Mt. West 53.0
1 Akron MAC 77.0
2 Alabama at Birmingham C-USA 75.0
3 Auburn Sun Belt 93.0
Upvotes: 4