user14185615
user14185615

Reputation:

Combining rows of two Dataframes

I've been looking through the documentation and Pandas Merging 101, but help was nowhere to be found.

Suppose I have two Dataframes as follows

>>> df

    Values  Count  Percentage
0    Apple      0           0
1   Banana      0           0
2  Samsung      0           0
3   Orange      0           0
>>> df2

    Values  Count  Percentage
0    Apple     14        0.74
1  Samsung      5        0.26

And I want to merge the two Dataframes to produce the following result

>>> result

    Values  Count  Percentage
0    Apple     14        0.74
1   Banana      0           0
2  Samsung      5        0.26
3   Orange      0           0

Keep in mind that df2 is always a subset of df.
In other words, every Values in df2 will be guaranteed to present in the Values of df.

Here is what I have so far

result = df.merge(df2, on='Values', how='left')
>>> result

    Values  Count_x  Percentage_x  Count_y  Percentage_y
0    Apple        0             0     14.0          0.74
1   Banana        0             0      NaN           NaN
2  Samsung        0             0      5.0          0.26
3   Orange        0             0      NaN           NaN

But the result is kind of disappointing.
Any help would be greatly appreciated.

Upvotes: 3

Views: 85

Answers (3)

Nk03
Nk03

Reputation: 14949

Or you can use update. After setting the index, just update the value in df1 with new values:

df1 = df1.set_index('Values')
df1 = df1.update(df2.set_index('Values')).reset_index()

Upvotes: 2

Valdi_Bo
Valdi_Bo

Reputation: 30971

You should:

  • set the index in df to Values and save the result in a temporary DataFrame,
  • update it with df2 with the index set also to Values (update is performed in place),
  • reset the index to convert Values to a regular column,
  • maybe save this result back under df.

The code to do it is:

wrk = df.set_index('Values')
wrk.update(df2.set_index('Values'))
df = wrk.reset_index()

The result is:

    Values  Count  Percentage
0    Apple   14.0        0.74
1   Banana    0.0        0.00
2  Samsung    5.0        0.26
3   Orange    0.0        0.00

Upvotes: 1

Gusti Adli
Gusti Adli

Reputation: 1213

I think what you want is a pd.concat followed by a .drop_duplicates()

>>> df1
    Values  Count  Percentage
0    Apple      0           0
1   Banana      0           0
2  Samsung      0           0
3   Orange      0           0
>>> df2
    Values  Count  Percentage
0    Apple     14        0.74
1  Samsung      5        0.26
>>> pd.concat([df1, df2]).drop_duplicates(subset='Values', keep='last')
    Values  Count  Percentage
1   Banana      0        0.00
3   Orange      0        0.00
0    Apple     14        0.74
1  Samsung      5        0.26

Upvotes: 1

Related Questions