Reputation: 4575
I have two dataframes like the ones sampled below. I'm trying to append the records from one of the dataframes to the bottom of the first. So the final data frame should only have two columns. Instead I seem to be appending the columns from one dataframe on to the right side of the first. Does anyone see what I'm doing wrong?
Code:
appendDf=df1.append(df2)
df1
28343 \
0 42267
1 157180
2 186320
https://s.m.com/is/ime/M/ts/mized/5_fpx.tif
0 https://sl.com/is/i/M/...
1 https://sl.com/is/i/M/…
2 https://sl.com/is/im/M/...
df2
454 \
0 223
1 155
2 334
https://s.m.com/is/ime/M/ts/mized/5.tif
0 https://slret.com/is/i/M/...
1 https://slfdsd.com/is/i/M/…
2 https://slfd.com/is/im/M/...
appendDf.head()
28343 https://s.m.com/is/ime/M/ts/mized/5_fpx.tif 454 https://s.m.com/is/ime/M/ts/mized/5.tif
Upvotes: 0
Views: 5126
Reputation: 9019
Your DataFrames
do not seem to have column headers (I imagine the first row of your data is being used as the column headers), which is likely the root of your issue. When you append the second DataFrame
, the program doesn't know which columns the data correspond to, so it adds them as new columns. See the following example:
import pandas as pd
df1 = pd.DataFrame([[28343, 'http://link1'], [42267, 'http://link2'],
[157180, 'http://link3'], [186320, 'http://link4']], columns=['ID','Link'])
df2 = pd.DataFrame([[454, 'http://link5'], [223, 'http://link6'],
[155, 'http://link7'], [334, 'http://link8']])
appendedDF = df1.append(df2)
Yields:
ID Link 0 1
0 28343.0 http://link1 NaN NaN
1 42267.0 http://link2 NaN NaN
2 157180.0 http://link3 NaN NaN
3 186320.0 http://link4 NaN NaN
0 NaN NaN 454.0 http://link5
1 NaN NaN 223.0 http://link6
2 NaN NaN 155.0 http://link7
3 NaN NaN 334.0 http://link8
Correct implementation:
import pandas as pd
df1 = pd.DataFrame([[28343, 'http://link1'], [42267, 'http://link2'],
[157180, 'http://link3'], [186320, 'http://link4']], columns=['ID','Link'])
df2 = pd.DataFrame([[454, 'http://link5'], [223, 'http://link6'],
[155, 'http://link7'], [334, 'http://link8']], columns=['ID','Link'])
appendedDF = df1.append(df2).reset_index(drop=True)
Yields:
ID Link
0 28343 http://link1
1 42267 http://link2
2 157180 http://link3
3 186320 http://link4
4 454 http://link5
5 223 http://link6
6 155 http://link7
7 334 http://link8
Upvotes: 1