Ivan HH
Ivan HH

Reputation: 29

Modify column in according another column dataframe python

I have two dataframes. One is the master dataframe and the other df is used to fil my master dataframe.

what I want is fil one column in according another column without alter the others columns.

This is example of master df

| id | Purch. order | cost | size | code |
| 1  |    G918282   | 8283 | large| hchs |
| 2  |    EE18282   | 1283 | small| ueus |
| 3  |    DD08282   | 5583 | large| kdks |
| 4  |    GU88912   | 8232 | large| jdhd |
| 5  |     NaN      | 1283 | large| jdjd |
| 6  |     Nan      | 5583 | large| qqas |
| 7  |     Nan      | 8232 | large| djjs |

This is example of the another df

| id | Purch. order | cost |
| 1  |    G918282   | 7728 |
| 2  |    EE18282   | 2211 |
| 3  |    DD08282   | 5321 |
| 4  |    GU88912   | 4778 |
| 5  |     NaN      | 4283 |
| 6  |     Nan      | 9993 |
| 7  |     Nan      | 3442 |

This is the result I'd like

| id | Purch. order | cost | size | code |
| 1  |    G918282   | 7728 | large| hchs |
| 2  |    EE18282   | 2211 | small| ueus |
| 3  |    DD08282   | 5321 | large| kdks |
| 4  |    GU88912   | 4778 | large| jdhd |
| 5  |     NaN      | 1283 | large| jdjd |
| 6  |     Nan      | 5583 | large| qqas |
| 7  |     Nan      | 8232 | large| djjs |

Where only the cost column is modified only if the secondary df coincides with the purch. order and if it's not NaN.

I hope you can help me... and I'm sorry if my english is so basic, not is my mother language. Thanks a lot.

Upvotes: 1

Views: 99

Answers (2)

DavideBrex
DavideBrex

Reputation: 2414

You can do it with merge followed by updating the cost column based on where the Nan are:

final_df = df1.merge(df2[~df2["Purch. order"].isna()], on = 'Purch. order', how="left")

final_df.loc[~final_df['Purch. order'].isnull(), "cost"] = final_df['cost_y'] # not nan
final_df.loc[final_df['Purch. order'].isnull(), "cost"] = final_df['cost_x'] # nan
final_df = final_df.drop(['id_y','cost_x','cost_y'],axis=1)

Output:

id _x  Purch. order size    code    cost
    0   1   G918282 large   hchs    7728.0
    1   2   EE18282 small   ueus    2211.0
    2   3   DD08282 large   kdks    5321.0
    3   4   GU88912 large   jdhd    4778.0
    4   5   NaN     large   jdjd    1283.0
    5   6   NaN     large   qqas    5583.0
    6   7   NaN     large   djjs    8232.0

Upvotes: 1

Umar.H
Umar.H

Reputation: 23099

lets try Update which works along indexes, by default overwrite is set to True which will overwrite overlapping values in your target dataframe. use overwrite=False if you only want to change NA values.

master_df = master_df.set_index(['id','Purch. order'])
another_df = another_df.dropna(subset=['Purch. order']).set_index(['id','Purch. order'])

master_df.update(another_df)
print(master_df)
                   cost    size   code
id Purch. order                       
1  G918282       7728.0   large   hchs
2  EE18282       2211.0   small   ueus
3  DD08282       5321.0   large   kdks
4  GU88912       4778.0   large   jdhd
5  NaN           1283.0   large   jdjd
6  Nan           5583.0   large   qqas
7  Nan           8232.0   large   djjs

Upvotes: 1

Related Questions