Reputation: 85
I'm trying to fill missing data with data found in other fields, for example, I have a table:
Brand Model Make
Toyota Corolla Japan
Toyota Crescida Japan
Toyota Land Cruiser
Ford Escape America
Ford Explorer America
Ford Edge
Ford Focus
I know from other rows that Toyota is from Japan and Ford from America. How can I do this in Pandas to give me the below?
Brand Model Make
Toyota Corolla Japan
Toyota Crescida Japan
Toyota Land Cruiser Japan
Ford Escape America
Ford Explorer America
Ford Edge America
Ford Focus America
Upvotes: 0
Views: 45
Reputation: 85
I was able to solve it in two steps:
created another df: df2 = pd.DataFrame(df.groupby('Brand')['Make'].first())
lookup up Make from df2 to fill original df: df['Make']=df.Brand.apply(lambda x: df2[df2['Brand']==x]['Make'])
Brand Model Make 0 Toyota Corolla Japan 1 Toyota Crescida Japan 2 Toyota Prado Japan
Upvotes: 0
Reputation: 394
You can use df=df1[['Brand','Make']].groupby(['Brand']).agg(lambda x:x.value_counts().index[0]).reset_index()
to get the common occurences for the make column on the basis of Brand.
After that you can use the following code
for index,value in enumerate(df1['Make']):
if value==None:
brand=df1['Brand'][index]
df1['Make'][index]=df['Make'][df['Brand']==brand].to_list()[0]
Upvotes: 1