Reputation: 1896
I've seen a large number of similar questions but nothing quite answers what I am looking to do.
I have two dataframes
Conn_df
that contains names and company details manually entered (e.g. Conn_df["Name", "Company_name", "Company_Address"]
Cleanse_df
that contains cleaned up company names (e.g. Cleanse_df["Original_Company_Name", "Cleanse_Company_Name"]
The data for both is held in csv files that are imported into the script.
I want to change the company details in Conn_df.Company_Name
using the values in Cleanse_df
, where the Conn_df.Company_Name
equals the Cleanse_df.Original_Company_Name
and is replaced by Cleanse_df.Cleanse_Company_Name
.
I have tried:
Conn_df["Company"] = Conn_df["Company"].replace(Conn_df["Company"], Cleanse_df["Cleansed"])
but got
replace() takes no keyword arguments
I also tried:
Conn_df["Company"] = Conn_df["Company"].map(Cleanse_df.set_index("Original")["Cleansed"])
but got
Reindexing only valid with uniquely valued Index objects
Any suggestions on how to get the values to be replaced. I would note that both dataframes run to many tens of thousands of rows, so creating a manual list is not possible.
Upvotes: 0
Views: 53
Reputation: 102
I think you want something along the lines of this:
conn_df = pd.DataFrame({'Name':['Mac','K','Hutt'],
'Company_name':['McD','KFC','PH'],
'Company_adress':['street1','street2','street4']})
cleanse_df = pd.DataFrame({'Original_Company_Name':['McD'],'Cleanse_Company_Name':
['MacDonalds']})
cleanse_df = cleanse_df.rename(columns={'Original_Company_Name':'Company_name'})
merged_df = conn_df.merge(cleanse_df,on='Company_name',how='left')
merged_df['Cleanse_Company_Name'].fillna(merged_df['Company_name'],inplace=True)
final_df = merged_df[['Name','Company_adress','Cleanse_Company_Name']]\
.rename(columns={'Cleanse_Company_Name':'Company_name'})
This would return:
Name Company_adress Company_name
0 Mac street1 MacDonalds
1 K street2 KFC
2 Hutt street4 PH
You merge the two dataframes and then keep the replaced new value, if there is no value to replace the name then the name will just stay the same, this is done by the fillna command.
Upvotes: 1