royalewithcheese
royalewithcheese

Reputation: 502

How to join two tables with same column names but with different data using pandas?

Supposing

df1,

col1 | col2 | col3 | col4 |
A    |  131 | 666  | 777  |
B    |  123 | 345  | 435  |
C    | 1424 | 3214 | 2314 |

df2,

col1 | col2 | col3 | col4 |
A    |  10  | 1    | 0    |
B    |  20  | 14   | 68   |
C    |  23  | 43   | 4    |

final df that I want to achieve,

col1 | col2           | col3         | col4      |
A    |  131 (10%)     | 666 (1%)     | 777       |
B    |  123 (20%)     | 345 (14%)    | 435 (68%) |
C    |  1424 (23%)    | 3214 (43%)   | 2314 (4%) | 

P.S. The numbers are just random

Upvotes: 1

Views: 448

Answers (2)

jezrael
jezrael

Reputation: 862641

You can convert DataFrames to strings, replace 0 to missing values, add ( %), so not added for missing values and last is added first DataFrame:

df = ((df1.set_index('col1').astype(str) + 
      (' (' + df2.set_index('col1').astype(str).replace('0', np.nan) + '%)').fillna(''))
      .reset_index())
print (df)
  col1        col2        col3       col4
0    A   131 (10%)    666 (1%)        777
1    B   123 (20%)   345 (14%)  435 (68%)
2    C  1424 (23%)  3214 (43%)  2314 (4%)

Another idea is test values by DataFrame.mask:

df11 = df1.set_index('col1').astype(str)
df22 = df2.set_index('col1').astype(str)

df = (df11 + (' (' + df22 + '%)').mask(df22.eq('0'), '')).reset_index()
      
print (df)
  col1        col2        col3       col4
0    A   131 (10%)    666 (1%)        777
1    B   123 (20%)   345 (14%)  435 (68%)
2    C  1424 (23%)  3214 (43%)  2314 (4%)

Upvotes: 3

U13-Forward
U13-Forward

Reputation: 71580

Or applymap:

>>> (df1.set_index('col1').astype(str).add(df2.set_index('col1')
                      .applymap(lambda x: f' ({x}%)' if x else ''))
                      .reset_index())
  col1        col2        col3       col4
0    A   131 (10%)    666 (1%)        777
1    B   123 (20%)   345 (14%)  435 (68%)
2    C  1424 (23%)  3214 (43%)  2314 (4%)
>>> 

This code adds the strings from df2 with a percentage sign if it isn't 0. It uses set_index to merge on the same col1, and uses applymap to format it.

Upvotes: 2

Related Questions