Reputation: 41
UPDATE: I solved this using awk on Windows. This command successfully joins the files side by side:
call awk -F"\t" "NR==FNR{a[NR]=$1; next} {print a[FNR], $0}" OFS="\t" test1.csv test2.csv
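For readers without awk, the logic of that one-liner can be sketched in Python: a first pass stores field 1 of every line of the first file, and a second pass prints the stored value, a tab, and the full line of the second file. This is only an illustration; the function name `paste_two` is hypothetical, UTF-8 is an assumed encoding, and the sketch assumes the first file has at least as many lines as the second (otherwise it raises IndexError).

```python
import io

def paste_two(first_path, second_path, out_path, sep=u'\t'):
    """Mimic the awk one-liner: field 1 of the first file, then each full line of the second."""
    # First pass (NR==FNR in awk): remember field 1 of every line of the first file.
    with io.open(first_path, encoding='utf-8') as first:
        saved = [line.rstrip(u'\r\n').split(sep)[0] for line in first]
    # Second pass: prefix each line of the second file with the saved field.
    with io.open(second_path, encoding='utf-8') as second:
        with io.open(out_path, 'w', encoding='utf-8') as out:
            for i, line in enumerate(second):
                out.write(saved[i] + sep + line.rstrip(u'\r\n') + u'\n')
```

Calling `paste_two('test1.csv', 'test2.csv', 'Test3.csv')` would then play the role of the awk command above. The `u''` literals and `io.open` are there so the same code runs on Python 2.7 as well as Python 3.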
I've tried this a few ways but still can't get it to work. I am guessing it has something to do with the special characters in one of the files. On Linux it's simple with the paste tool:
paste test1.csv test2.csv > Test3.csv
But I don't have access to anything Linux-related for this task.
My environment is Windows 7, with Python 2.7 (no Pandas) and Strawberry Perl installed.
I need to merge 2 (or more) csv files together side by side. The files will always have the same number of lines.
I tried this in Python, following "Join txt files side by side in python", and it didn't work.
I also tried it in Batch, following "Merge csv file side by side using batch file", and it didn't work.
test1.csv contains
python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\
python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\
python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\
test2.csv contains
123456.pdf
123457.pdf
124587.pdf
What I want the output (Test3.csv) to be is a tab-delimited file containing:
python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ 123456.pdf
python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ 123457.pdf
python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ 124587.pdf
Any help is greatly appreciated.
Thank you.
Upvotes: 0
Views: 2035
Reputation: 979
The Python pyexcel package has pyexcel.cookbook.merge_two_files (and pyexcel.cookbook.merge_files for merging N files). Install it with pip install pyexcel; see http://docs.pyexcel.org.
Upvotes: 0
Reputation: 4592
Here is an alternative and more intuitive solution using pyexcel:
>>> import pyexcel as p
>>> left=p.get_sheet(file_name='left.csv')
>>> left
left.csv:
+------------------------------------------------------------+
| python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ |
+------------------------------------------------------------+
| python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ |
+------------------------------------------------------------+
| python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ |
+------------------------------------------------------------+
>>> right=p.get_sheet(file_name='right.csv')
>>> right
right.csv:
+------------+
| 123456.pdf |
+------------+
| 123457.pdf |
+------------+
| 124587.pdf |
+------------+
>>> left.column+=right # that's it
>>> left
left.csv:
+------------------------------------------------------------+------------+
| python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ | 123456.pdf |
+------------------------------------------------------------+------------+
| python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ | 123457.pdf |
+------------------------------------------------------------+------------+
| python pdf2txt.py -o C:\Users\user\Desktop\Folder\Folder2\ | 124587.pdf |
+------------------------------------------------------------+------------+
>>> left.save_as('merged.csv') # save it
For huge data sets, the solution above will be slow or may not complete at all. Here is code that copes with them:
>>> import pyexcel as p
>>> left=p.iget_array(file_name='left.csv')
>>> right=p.iget_array(file_name='right.csv')
>>> p.isave_as(array=(a+b for a, b in zip(left, right)), dest_file_name='merged.csv')
>>> p.free_resources()
Upvotes: 0
Reputation: 60994
Here's a solution using zip. You may need to play around with the delimiter and quote characters depending on the exact setup of your csv files:
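import csv

# Python 2: the csv module expects files opened in binary mode
with open('test1.csv', 'rb') as t1, open('test2.csv', 'rb') as t2, open('output.csv', 'wb') as output:
    r1 = csv.reader(t1, delimiter=' ')
    r2 = csv.reader(t2, delimiter=' ')
    w = csv.writer(output, delimiter=' ')
    # zip pairs row i of each file; concatenating the two lists joins them side by side
    for a, b in zip(r1, r2):
        w.writerow(a + b)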
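The same zip idea extends to any number of files, which the question also asks about. The sketch below is one possible way to do it, assuming tab-delimited input and output; `open_csv` and `merge_side_by_side` are hypothetical helper names, not part of the csv module. The small `open_csv` wrapper exists only because the csv module wants binary files on Python 2 and text files opened with newline='' on Python 3.

```python
import csv
import sys

def open_csv(path, mode):
    # Hypothetical helper: pick the file mode the csv module expects on this Python version.
    if sys.version_info[0] < 3:
        return open(path, mode + 'b')
    return open(path, mode, newline='')

def merge_side_by_side(in_paths, out_path, delimiter='\t'):
    """Write row i of every input file, concatenated, as row i of out_path."""
    handles = [open_csv(p, 'r') for p in in_paths]
    try:
        readers = [csv.reader(h, delimiter=delimiter) for h in handles]
        with open_csv(out_path, 'w') as out:
            writer = csv.writer(out, delimiter=delimiter)
            # zip stops at the shortest input, so extra rows in longer files are dropped.
            for rows in zip(*readers):
                writer.writerow([cell for row in rows for cell in row])
    finally:
        for h in handles:
            h.close()
```

For the files in the question this would be called as `merge_side_by_side(['test1.csv', 'test2.csv'], 'Test3.csv')`. Because zip silently truncates to the shortest file, it relies on the stated guarantee that all inputs have the same number of lines.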
Upvotes: 4