Richard
Richard

Reputation: 5

INDEX and MATCH with multiple criteria in Pandas Python

I am trying to do an index match in 2 data set but having trouble. Here is an example of what I am trying to do. I want to fill in column "a", "b", "c" that are empty in df with the df2 data where "Machine", "Year", and "Order Type".

The first dataframe lets call this one "df"

    Machine Year    Cost    a   b   c   
0   abc     2014    5500    nan nan nan                                     
1   abc     2015    89      nan nan nan                                 
2   abc     2016    600     nan nan nan                                 
3   abc     2017    250     nan nan nan                                 
4   abc     2018    2100    nan nan nan                                         
5   abc     2019    590     nan nan nan                                         
6   dcb     2020    3000    nan nan nan                                         
7   dcb     2021    100     nan nan nan                                     

The second data set is called "df2"

    Order Type  Machine Year    Total Count
0   a           abc      2014   1
1   b           abc      2014   1
2   c           abc      2014   2
4   c           dcb      2015   4
3   a           abc      2016   3

Final Output is:

    Machine Year    Cost    a   b   c   
0   abc     2014    5500    1   1   2                                       
1   abc     2015    89      nan nan nan                                 
2   abc     2016    600     3 nan nan                                   
3   abc     2017    250     nan nan nan                                 
4   abc     2018    2100    nan nan nan                                         
5   abc     2019    590     1   nan nan                                         
6   dcb     2014    3000    nan nan 4                                           
7   dcb     2015    100     nan nan nan     

Thanks for help in advance

Upvotes: 0

Views: 307

Answers (1)

Parfait
Parfait

Reputation: 107687

Consider DataFrame.pivot to reshape df2 to merge with df1.

final_df = (
    df1.reindex(["Machine", "Type", "Cost"], axis=True)
       .merge(
            df.pivot(
                index=["Machine", "Year"], 
                columns="Order Type", 
                values="Total Count"
            ).reset_index(),
            on = ["Machine", "Year"]
       )
)

Upvotes: 1

Related Questions