Reputation: 103
I have df1
:
x y no.
0 -17.7 -0.785430 y1
1 -15.0 -3820.085000 y4
2 -12.5 2.138833 y3
.. .... ........ ..
40 15.6 5.486901 y2
41 19.2 1.980686 y3
42 19.6 9.364718 y2
and df2
:
delta y x
0 0.053884 -17.7
1 0.085000 -15.0
2 0.143237 -12.5
.. ........ ....
40 0.113099 15.6
41 0.102245 19.2
42 0.235282 19.6
They both have 43 rows, and x
column is exactly the same on both.
Somehow when I merge them on x
I get a df with 123 rows:
x y no. delta y
0 -17.7 -0.785430 y1 0.053884
1 -15.0 -3820.085000 y4 0.085000
2 -12.5 2.138833 y3 0.143237
3 -12.4 1.721205 y3 0.251180
4 -12.1 2.227343 y2 0.127343
.. ... ... .. ...
118 12.1 1.642526 y3 0.143886
119 14.4 2576.435000 y4 0.171000
120 15.6 5.486901 y2 0.113099
121 19.2 1.980686 y3 0.102245
122 19.6 9.364718 y2 0.235282
My input: final = df1.merge(df2, on="x")
x float64
y float64
no. object
dtype: object
delta y float64 x float64 dtype: object
x float64 y float64 no. object dtype: object
delta y float64 x float64 dtype: object
x float64 y float64 no. object dtype: object
delta y float64 x float64 dtype: object
df1 = pd.DataFrame({'x': {0: -17.7, 1: -15.0, 2: -12.5, 3: -12.4, 4: -12.1, 5: -11.2, 6: -8.9, 7: -7.5, 8: -7.5, 9: -6.0, 10: -6.0, 11: -4.7, 12: -4.1, 13: -3.8, 14: -3.4, 15: -3.4, 16: -1.9, 17: -1.5, 18: -1.1, 19: -0.4, 20: -0.1, 21: 3.5, 22: 3.8, 23: 5.3, 24: 5.3, 25: 5.3, 26: 5.3, 27: 5.3, 28: 5.3, 29: 5.3, 30: 5.3, 31: 5.3, 32: 6.4, 33: 6.8, 34: 6.8, 35: 10.2, 36: 10.3, 37: 11.9, 38: 12.1, 39: 14.4, 40: 15.6, 41: 19.2, 42: 19.6}, 'y': {0: -0.7854295, 1: -3820.085, 2: 2.1388333, 3: 1.7212046, 4: 2.227343, 5: 0.04315967, 6: -0.9616607, 7: -1.9878536, 8: -0.52237016, 9: -283.27216, 10: -282.5332, 11: -0.4335017, 12: -1.1585577, 13: -0.008831219, 14: 848.92303, 15: -57.407845, 16: -9.010686, 17: -3.2473037, 18: 0.5536767, 19: 1.8351307, 20: 4.8347697, 21: -6.45842, 22: -1.5683812, 23: 0.9338831, 24: 0.9338831, 25: 97.65833, 26: 1.6500127, 27: 1.6500127, 28: 97.65833, 29: 97.65833, 30: 1.6500127, 31: 97.65833, 32: -3.655422, 33: 1.9058462, 34: 227.5592, 35: 857.7455, 36: -0.68584794, 37: 1.6785516, 38: 1.6425261, 39: 2576.435, 40: 5.4869013, 41: 1.9806856, 42: 9.364718}, 'no.': {0: 'y1', 1: 'y4', 2: 'y3', 3: 'y3', 4: 'y2', 5: 'y3', 6: 'y2', 7: 'y2', 8: 'y2', 9: 'y4', 10: 'y4', 11: 'y1', 12: 'y3', 13: 'y1', 14: 'y4', 15: 'y4', 16: 'y4', 17: 'y4', 18: 'y1', 19: 'y3', 20: 'y4', 21: 'y2', 22: 'y3', 23: 'y3', 24: 'y3', 25: 'y4', 26: 'y3', 27: 'y3', 28: 'y4', 29: 'y3', 30: 'y4', 31: 'y4', 32: 'y2', 33: 'y3', 34: 'y3', 35: 'y4', 36: 'y3', 37: 'y3', 38: 'y3', 39: 'y4', 40: 'y2', 41: 'y3', 42: 'y2'}})
df2 = pd.DataFrame({'delta y': {0: 0.05388353000000001, 1: 0.08500000000003638, 2: 0.14323679999999994, 3: 0.25117999999999996, 4: 0.12734299999999976, 5: 0.36285006000000003, 6: 0.13833930000000005, 7: 0.5121464, 8: 1.97762984, 9: 0.2721599999999853, 10: 0.4667999999999779, 11: 0.2692114, 12: 0.00890970000000002, 13: 0.314458351, 14: 906.34703, 15: 0.0161549999999977, 16: 0.06831400000000087, 17: 0.3723036999999998, 18: 0.2988478, 19: 0.006991300000000145, 20: 0.14423030000000026, 21: 0.04157999999999973, 22: 0.013554200000000183, 23: 0.17486560000000007, 24: 0.17486560000000007, 25: 0.03866999999999621, 26: 0.541264, 27: 0.541264, 28: 0.03866999999999621, 29: 96.5495813, 30: 96.0469873, 31: 0.03866999999999621, 32: 0.05542200000000008, 33: 0.1670513, 34: 225.82040510000002, 35: 0.38250000000005, 36: 0.59580486, 37: 0.10641100000000003, 38: 0.14388610000000002, 39: 0.17099999999982174, 40: 0.11309869999999922, 41: 0.10224489999999986, 42: 0.23528199999999977}, 'x': {0: -17.7, 1: -15.0, 2: -12.5, 3: -12.4, 4: -12.1, 5: -11.2, 6: -8.9, 7: -7.5, 8: -7.5, 9: -6.0, 10: -6.0, 11: -4.7, 12: -4.1, 13: -3.8, 14: -3.4, 15: -3.4, 16: -1.9, 17: -1.5, 18: -1.1, 19: -0.4, 20: -0.1, 21: 3.5, 22: 3.8, 23: 5.3, 24: 5.3, 25: 5.3, 26: 5.3, 27: 5.3, 28: 5.3, 29: 5.3, 30: 5.3, 31: 5.3, 32: 6.4, 33: 6.8, 34: 6.8, 35: 10.2, 36: 10.3, 37: 11.9, 38: 12.1, 39: 14.4, 40: 15.6, 41: 19.2, 42: 19.6}})
final = df1.merge(df2, on="x")
Upvotes: 0
Views: 65
Reputation: 77367
The problem is that x
values are not unique, so the merge duplicates rows to get all of the combinations. In a simple example
>>> import pandas as pd
>>> df1=pd.DataFrame({"a":[1,2,3,2], "b":['a', 'b', 'c', 'd']})
>>> df2=pd.DataFrame({"a":[1,2,3,2], "c":['aa', 'bb', 'cc', 'dd']})
>>> df1.merge(df2, on='a')
a b c
0 1 a aa
1 2 b bb
2 2 b dd
3 2 d bb
4 2 d dd
5 3 c cc
2
is not unique in the column and gets all of the combinations (notice b --> dd and d --> dd).
In your case, the x
column is identical in the two dataframes. This would also mean that indexes haven't changed and you could assign the columns you want to df1
.
df1["delta y"] = df2["delta y"]
Upvotes: 1
Reputation: 15
Try the following syntax and I encourage you to thoroughly read the official documentation of python, the link is given at the bottom. I think you might have different x values in df1 and df2 and they are not 100% identical. This could be perhaps because of the decimals.
import pandas as pd
left = pd.DataFrame(
{
"key": ["K0", "K1", "K2", "K3"],
"A": ["A0", "A1", "A2", "A3"],
"B": ["B0", "B1", "B2", "B3"],
}
)
right = pd.DataFrame(
{
"key": ["K0", "K1", "K2", "K3"],
"C": ["C0", "C1", "C2", "C3"],
"D": ["D0", "D1", "D2", "D3"],
}
)
result = pd.merge(left, right, on="key")
Python Merge,Join, Concatenate Official Guide
Upvotes: 1
Reputation: 443
try the following: df1.join(df2)
join is a column-wise left join
pd.merge is a column-wise inner join
pd.concat is a row-wise outer join
pd.concat: takes Iterable arguments. Thus, it cannot take DataFrames directly (use [df,df2]) Dimensions of DataFrame should match along axis
Join and pd.merge: can take DataFrame arguments
ref: Merge two dataframes by index
Upvotes: 1