Reputation: 3677
I am trying to merge two datasets whose column names overlap.
For example:
df1:
col1 col2
aa aa
bb bb
df2:
col2 col3
cc dd
new_df = pd.concat([df1,df2],axis=1)
new_df:
col1 col2 col3
aa   aa
bb   bb
     cc   dd
When I run the above line in my code I get something like this:
col1 col2 col2.1 col3
aa aa nan
bb bb nan
cc nan dd
How do I prevent the .1 from appearing and force pd.concat to match the column names and insert data?
Upvotes: 1
Views: 357
Reputation: 477308
You concatenated along the wrong axis. Here you used the column axis (axis=1), whereas you want to concatenate along the index axis:
>>> pd.concat([df1, df2], axis='rows')
col1 col2 col3
0 aa aa NaN
1 bb bb NaN
0 NaN cc dd
So by specifying axis=0, axis='rows', or axis='index', or by omitting the axis argument entirely, the columns are matched by name and the frames are concatenated "vertically".
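As a minimal sketch, assuming the two frames look like the ones in the question, the row-wise concatenation could be reproduced as follows. The `ignore_index=True` argument is an optional addition that is not part of the answer above; it only renumbers the rows so the index label 0 does not appear twice.

```python
import pandas as pd

# Frames rebuilt from the question's example (values assumed to be strings).
df1 = pd.DataFrame({"col1": ["aa", "bb"], "col2": ["aa", "bb"]})
df2 = pd.DataFrame({"col2": ["cc"], "col3": ["dd"]})

# axis=0 is the default: rows are stacked and columns are aligned by name,
# so the shared "col2" is reused instead of being duplicated.
# ignore_index=True (optional) replaces the original row labels with 0..n-1.
new_df = pd.concat([df1, df2], axis=0, ignore_index=True)

print(new_df)
```

Cells with no value in the source frame (for example `col1` in the row that comes from `df2`) are filled with NaN.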
Upvotes: 3