Reputation: 3885
I want to join two pandas dataframes on "ColA", but the thing is that values in "ColA" in these two dataframes are not in order and dataframes are not the same lenght. I want to join them so that missing values are changed with 0 value, and that values in "ColA" are matching.
df1 = pd.DataFrame({"ColA":["num 1", "num 2", "num 3"],
"ColB":[5,6,7]})
print(df1)
df2 = pd.DataFrame({"ColA":["num 2", "num 3","num 1", "num 4"],
"ColC":[3,2,1,5]})
print(df2)
ColA ColB
0 num 1 5
1 num 2 6
2 num 3 7
ColA ColC
0 num 2 3
1 num 3 2
2 num 1 1
3 num 4 5
Result should look like this:
# num1 is matched with appropriate values and num4 has the value 0 for "ColB"
ColA ColB ColC
0 num 1 5 1
1 num 2 6 3
2 num 3 7 2
3 num 4 0 5
Upvotes: 1
Views: 138
Reputation: 862471
Use DataFrame.merge
with outer join, convert NaN
s to 0
and last if necessary convert dtype
s to original by dictionary:
d = df1.dtypes.append(df2.dtypes).to_dict()
df = df1.merge(df2, how='outer', on='ColA').fillna(0).astype(d)
print (df)
ColA ColB ColC
0 num 1 5 1
1 num 2 6 3
2 num 3 7 2
3 num 4 0 5
Or use concat
with convert all columns to integers (if possible):
df = (pd.concat([df1.set_index('ColA'),
df2.set_index('ColA')], axis=1, sort=True)
.fillna(0)
.astype(int)
.rename_axis('ColA')
.reset_index())
print (df)
ColA ColB ColC
0 num 1 5 1
1 num 2 6 3
2 num 3 7 2
3 num 4 0 5
Upvotes: 3