Reputation: 180
I'm trying to merge two Excel sheets on the common field Serial, but the program throws an error. My program is below:
(user1_env)root@ubuntu:~/user1/test/compare_files# cat compare.py
import pandas as pd
source1_df = pd.read_excel('a.xlsx', sheetname='source1')
source2_df = pd.read_excel('a.xlsx', sheetname='source2')
joined_df = source1_df.join(source2_df, on='Serial')
joined_df.to_excel('/root/user1/test/compare_files/result.xlsx')
I get the error below:
(user1_env)root@ubuntu:~/user1/test/compare_files# python3.5 compare.py
Traceback (most recent call last):
File "compare.py", line 5, in <module>
joined_df = source1_df.join(source2_df, on='Serial')
File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/core/frame.py", line 4385, in join
rsuffix=rsuffix, sort=sort)
File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/core/frame.py", line 4399, in _join_compat
suffixes=(lsuffix, rsuffix), sort=sort)
File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/tools/merge.py", line 39, in merge
return op.get_result()
File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/tools/merge.py", line 223, in get_result
rdata.items, rsuf)
File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/core/internals.py", line 4445, in items_overlap_with_suffix
to_rename)
ValueError: columns overlap but no suffix specified: Index(['Serial'], dtype='object')
I'm following this SO question for the approach: "python compare two excel sheet and append correct record"
Upvotes: 0
Views: 617
Reputation: 1032
A small modification worked for me: use pd.merge instead of DataFrame.join.
import pandas as pd
source1_df = pd.read_excel('a.xlsx', sheetname='source1')
source2_df = pd.read_excel('a.xlsx', sheetname='source2')
joined_df = pd.merge(source1_df, source2_df, on='Serial', how='outer')
joined_df.to_excel('/home/gk/test/result.xlsx')
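To see what the outer merge does without needing the workbook, here is a minimal sketch on two small in-memory frames. The Name and Price columns and all sample values are made up for illustration; only Serial comes from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the two sheets; only 'Serial' is from the question.
source1_df = pd.DataFrame({'Serial': [1, 2, 3], 'Name': ['a', 'b', 'c']})
source2_df = pd.DataFrame({'Serial': [2, 3, 4], 'Price': [20.0, 30.0, 40.0]})

# how='outer' keeps every Serial found in either frame;
# cells with no match on the other side are filled with NaN.
joined_df = pd.merge(source1_df, source2_df, on='Serial', how='outer')
print(joined_df)
```

Because the key is passed with on='Serial', the result has a single Serial column, so no suffix is needed.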
Upvotes: 1
Reputation: 656
This happens because both frames contain a column named Serial, so the column names overlap after the join. You can either set Serial as the index and then join, or pass an lsuffix= or rsuffix= value to join so that the suffix is appended to the overlapping column names.
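Both options can be sketched on small in-memory frames. The Name and Price columns and the sample values are invented for illustration. One detail worth keeping in mind: join(on='Serial') matches the left frame's Serial column against the right frame's index, so the right frame should be indexed by Serial for the match to be meaningful.

```python
import pandas as pd

# Hypothetical stand-ins for the two sheets; only 'Serial' is from the question.
source1_df = pd.DataFrame({'Serial': [1, 2, 3], 'Name': ['a', 'b', 'c']})
source2_df = pd.DataFrame({'Serial': [2, 3, 4], 'Price': [20.0, 30.0, 40.0]})

# Option 1: index the right frame by Serial, then join on the left column.
# join defaults to a left join, so only serials from source1_df are kept.
by_index = source1_df.join(source2_df.set_index('Serial'), on='Serial')

# Option 2: keep both Serial columns and disambiguate them with suffixes.
# Without on=, rows are paired by the default row index, not by Serial value.
by_suffix = source1_df.join(source2_df, lsuffix='_left', rsuffix='_right')

print(by_index)
print(by_suffix)
```

Option 1 is usually what a lookup by Serial needs; option 2 only silences the error and leaves the rows aligned by position.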
Upvotes: 1