Row-wise union of dataframe in pandas

Question

Say that I have two DataFrames's:

df1 = pd.DataFrame([('A', 0.3), ('B', 0.4)], columns = ('ID', 'Buy'))
df2 = pd.DataFrame([('B', 3), ('A', 4)], columns = ('ID', 'Sell'))

That yield:

    ID  Buy
0   A   0.3
1   B   0.4

and

    ID  Sell
0   B   3
1   A   4

respectively.

Now, I want to obtain a single DataFrame that collects the data, namely:

    ID  Buy  Sell
0   A   0.3  4
1   B   0.4  3

Note that the order of the lines in df1 and df2 may not be the same. Furthermore, there might ID's that appear only in one frame and not in the other --- in this case the missing value should be filled with NaN I guess.

How can I do it?

I tried something like

pd.concat([df1, df2], join = 'outer', axis = 1)

but doesn't return the desired result.

EdChum · Accepted Answer

I think you want to merge on ID column:

In [12]:

df1 = pd.DataFrame([('A', 0.3), ('B', 0.4)], columns = ('ID', 'Buy'))
df2 = pd.DataFrame([('B', 3), ('A', 4)], columns = ('ID', 'Sell'))
df1.merge(df2, on='ID', how='outer')
Out[12]:
  ID  Buy  Sell
0  A  0.3     4
1  B  0.4     3

Row-wise union of dataframe in pandas

Answers (1)

Related Questions