Ethan Li
Ethan Li

Reputation: 95

How do I combine two dataframes by both overlapping columns and indices?

Suppose I have 2 dataframes with overlapping column and index names that look as such:

  A B C D
A 0 1 0 1
B 0 1 1 0
C 1 0 1 0
D 0 0 0 1

  A C D E
A 1 0 0 0
B 0 1 0 0
D 0 0 0 0
E 1 0 0 1

I want to combine these two dataframes into one such that cells with the same column and index names are combined. The end result should look like this:

  A B C D E
A 1 1 0 1 0
B 0 1 1 0 0
C 1 0 1 0 0
D 0 0 0 1 0
E 1 0 0 0 1

I've tried using the Pandas.concat method but it only concatenates along one of the axes.

Upvotes: 7

Views: 2166

Answers (2)

piRSquared
piRSquared

Reputation: 294218

align and np.maximum

  • pandas.DataFrame.align will produce a copy of the calling DataFrame and the argument DataFrame with their index and column attributes aligned and return them as a tuple of two DataFrame
  • Pass both to numpy.maximum which will conveniently respect that these are pandas.DataFrame objects and return a new DataFrame with the appropriate maximal values.

np.maximum(*df1.align(df2, fill_value=0))

   A  B  C  D  E
A  1  1  0  1  0
B  0  1  1  0  0
C  1  0  1  0  0
D  0  0  0  1  0
E  1  0  0  0  1

Upvotes: 22

Quang Hoang
Quang Hoang

Reputation: 150735

How about:

(df1.add(df2, fill_value=0)
    .fillna(0)
    .gt(0)
    .astype(int))

output:

    A   B   C   D   E
A   1   1   0   1   0
B   0   1   1   0   0
C   1   0   1   0   0
D   0   0   0   1   0
E   1   0   0   0   1

Upvotes: 7

Related Questions