Adnan Hadi
Adnan Hadi

Reputation: 85

pandas fill missing cells with similar group data

I'm trying to fill missing data with data found in other fields, for example, I have a table:

Brand   Model   Make
Toyota  Corolla Japan
Toyota  Crescida    Japan
Toyota  Land Cruiser    
Ford    Escape  America
Ford    Explorer    America
Ford    Edge    
Ford    Focus   

I know from other rows that Toyota is from Japan and Ford from America. How can I do this in Pandas to give me the below?

Brand   Model   Make
Toyota  Corolla Japan
Toyota  Crescida    Japan
Toyota  Land Cruiser    Japan
Ford    Escape  America
Ford    Explorer    America
Ford    Edge    America
Ford    Focus   America

Upvotes: 0

Views: 45

Answers (2)

Adnan Hadi
Adnan Hadi

Reputation: 85

I was able to solve it in two steps:

  1. created another df: df2 = pd.DataFrame(df.groupby('Brand')['Make'].first())

  2. lookup up Make from df2 to fill original df: df['Make']=df.Brand.apply(lambda x: df2[df2['Brand']==x]['Make'])

    Brand Model Make 0 Toyota Corolla Japan 1 Toyota Crescida Japan 2 Toyota Prado Japan

Upvotes: 0

Arpit
Arpit

Reputation: 394

You can use df=df1[['Brand','Make']].groupby(['Brand']).agg(lambda x:x.value_counts().index[0]).reset_index() to get the common occurences for the make column on the basis of Brand. After that you can use the following code

for index,value in enumerate(df1['Make']):
    if value==None:
        brand=df1['Brand'][index]
        df1['Make'][index]=df['Make'][df['Brand']==brand].to_list()[0]

Upvotes: 1

Related Questions