Reputation: 201
I have a pandas dataframe (df1) that looks like this:
No car pl. Value Expected
1 Toyota HK 0.1 0.12
1 Toyota NY 0.2 NaN
2 Saab LOS 0.3 NaN
2 Saab UK 0.4 0.6
2 Saab HK 0.5 0.51
3 Audi NYU 0.6 NaN
3 Audi LOS 0.7 NaN
4 VW UK 0.8 NaN
5 Audi HK 0.9 NaN
And I have another dataframe (df2) that looks like this:
No pl. Expected
2 LOS 0.35
3 NYU 0.62
3 LOS 0.76
5 HK 0.91
I would like my final dataframe to look like this:
No car pl. Value Expected
1 Toyota HK 0.1 0.12
1 Toyota NY 0.2 NaN
2 Saab LOS 0.3 0.35
2 Saab UK 0.4 0.6
2 Saab HK 0.5 0.51
3 Audi NYU 0.6 0.62
3 Audi LOS 0.7 0.76
4 VW UK 0.8 NaN
5 Audi HK 0.9 0.91
I tried this:
df = df1.fillna(df1.merge(df2, on=['No','pl.']))
But df1 remains unchanged in the output
The questions that I have seen here have been of dataframes with the same shape. Is there a way to do this when the shapes are different?
Thanks in advance!
Upvotes: 3
Views: 1635
Reputation: 42886
Since we have two key columns where we want to match on and update our df1
dataframe, we can use set_index
with fillna
, since fillna
aligns in the indices:
keys = ['No', 'pl.']
df1 = df1.set_index(keys).fillna(df2.set_index(keys)).reset_index()
No pl. car Value Expected
0 1 HK Toyota 0.1 0.12
1 1 NY Toyota 0.2 NaN
2 2 LOS Saab 0.3 0.35
3 2 UK Saab 0.4 0.60
4 2 HK Saab 0.5 0.51
5 3 NYU Audi 0.6 0.62
6 3 LOS Audi 0.7 0.76
7 4 UK VW 0.8 NaN
8 5 HK Audi 0.9 0.91
Or we can use the dedicated method Series.update
for this:
df1 = df1.set_index(keys)
df1['Expected'].update(df2.set_index(keys)['Expected'])
df1 = df1.reset_index()
No pl. car Value Expected
0 1 HK Toyota 0.1 0.12
1 1 NY Toyota 0.2 NaN
2 2 LOS Saab 0.3 0.35
3 2 UK Saab 0.4 0.60
4 2 HK Saab 0.5 0.51
5 3 NYU Audi 0.6 0.62
6 3 LOS Audi 0.7 0.76
7 4 UK VW 0.8 NaN
8 5 HK Audi 0.9 0.91
Upvotes: 2
Reputation: 862541
Use left join with suffixes
parameter and then replace missing values by Series.fillna
with DataFrame.pop
for use and drop column Expected_
:
df = df1.merge(df2, on=['No','pl.'], how='left', suffixes=('_',''))
df['Expected'] = df.pop('Expected_').fillna(df['Expected'])
print (df)
No car pl. Value Expected
0 1 Toyota HK 0.1 0.12
1 1 Toyota NY 0.2 NaN
2 2 Saab LOS 0.3 0.35
3 2 Saab UK 0.4 0.60
4 2 Saab HK 0.5 0.51
5 3 Audi NYU 0.6 0.62
6 3 Audi LOS 0.7 0.76
7 4 VW UK 0.8 NaN
8 5 Audi HK 0.9 0.91
Upvotes: 1