Reputation: 547
I have dataframe of stocks and another dataframe of stocks with company name and ticker:
#df1
Company_name Industry
Buckle Inc. (The) Retail
TILLY'S INC Apparel
Teck Resources Ltd Mining
#df2
Company_name Ticker
Buckle Inc BKE
TILLY'S, INC. TLYS
Teck Resources Ltd TECK
I want to add tickers (from df2) to each company name in df1.
The problem is company names is sometime different, for example:
Buckle Inc. (The)
and Buckle Inc
but they always share some substring.
I try merge but only work when 2 company names are 100% the same, like Teck Resources Ltd
in my example.
# I tried this:
df.merge(df1, on='Company_name', how='left')
#return :
Company_name Industry Ticker
Teck Resources Ltd Mining TECK
This is the output I need:
Company_name Industry Ticker
Buckle Inc. (The) Retail BKE
Tilly's Inc. Apparel TLYS
Teck Resources Ltd Mining TECK
Upvotes: 1
Views: 19
Reputation: 26676
I make one assumption, there are no duplicates in the Company_name
df2['temp']=df2['Company_name'].str.lower().str.extract('(^\w+)')#Extract first word in string in df2.Company_name
df1['temp']=df1['Company_name'].str.lower().str.extract('(^\w+)')#Extract first word in string in df1.Company_name
df1=df1.assign(Ticker=df1['temp'].map(dict(zip(df2.temp,df2.Ticker)))).drop('temp',1)#Create dict and map
Company_name Ticker
0 Buckle Inc. (The) BKE
1 TILLY'S INC TLYS
2 Teck Resources Ltd TECK
Upvotes: 1