Reputation: 593
I want to adapt a CSV file from comma-separated to tab-separated. Some fields contain commas inside quotes, so I need an exception for those. Some googling and Stack Overflow got me this:
import re
f1 = open('query_result.csv', 'r')
f2 = open('query_result_tab_separated.csv', 'w')
for line in f1:
    line = re.sub(',(?=(([^\"]*\"){2})*[^\"]*$)(?![^\[]*\])', '\t', line)
    f2.write(line)
f1.close()
f2.close()
However, between the quotes I also find escaped quotes \". An example of a line:
"01-003412467812","Drontmann B.V.",1,6420,"Expert in \"Social, Life and Tech Sciences\""
My current code also changes the comma after Social into a tab, which I don't want. How can I make an exception for quoted fields and, within that exception, another exception for escaped quotes?
Upvotes: 1
Views: 83
Reputation: 87154
The csv module can handle this. You can set the escape character and specify how quotes within a field are escaped using escapechar and doublequote:
import csv
with open('file.csv') as infile, open('file_tabs.csv', 'w') as outfile:
    r = csv.reader(infile, doublequote=False, escapechar='\\')
    w = csv.writer(outfile, delimiter='\t', doublequote=False, escapechar='\\')
    w.writerows(r)
This will create a new tab-delimited file that preserves the commas and escaped quotes within a field from the original file. Alternatively, the default settings will use "" (double quote) to escape the quotes:
w = csv.writer(outfile, delimiter='\t')
which would write data like this:
01-003412467812 Drontmann B.V. 1 6420 "Expert in ""Social, Life and Tech Sciences"""
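For anyone who wants to try both variants without touching files, here is a minimal sketch that runs the same reader and writer settings over the sample line from the question, held in memory. The helper name to_tabs and its keep_backslash_escapes flag are made up for illustration, and it assumes Python 3 and that every quote inside a field is escaped with a backslash.

```python
import csv
import io

# Sample line from the question; the raw string keeps the backslashes literal.
SAMPLE = r'"01-003412467812","Drontmann B.V.",1,6420,"Expert in \"Social, Life and Tech Sciences\""' + "\n"


def to_tabs(text, keep_backslash_escapes=True):
    """Convert comma-separated text with backslash-escaped quotes to tab-separated text.

    Hypothetical helper: the name and the flag are illustrative only.
    """
    # Reader side: quotes inside fields are escaped with a backslash, not doubled.
    reader = csv.reader(io.StringIO(text), doublequote=False, escapechar="\\")
    out = io.StringIO()
    if keep_backslash_escapes:
        # Keep the backslash convention in the output as well.
        writer = csv.writer(out, delimiter="\t", doublequote=False,
                            escapechar="\\", lineterminator="\n")
    else:
        # Default dialect settings: embedded quotes are written doubled ("").
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    print(to_tabs(SAMPLE), end="")
    print(to_tabs(SAMPLE, keep_backslash_escapes=False), end="")
```

Whichever variant you pick, whatever reads the tab-separated file later has to be configured with the same quoting convention.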
Upvotes: 0
Reputation: 4523
You can't reliably do this with a regexp.
Python has a csv module which is intended to do this:
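import csv
# Python 3: open in text mode with newline='' so the csv module handles line endings itself
with open('test.csv', newline='') as csvfile:
    data = csv.reader(csvfile, delimiter=',', quotechar='"', escapechar='\\')
    for row in data:
        # Each row is a list of field strings; commas inside quoted fields stay intact
        print(' | '.join(row))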
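To see the difference on the sample line from the question, here is a rough sketch that compares the question's regex with the csv reader settings above. The function names are invented for the example, the regex is the one from the question written as a raw string, and Python 3 is assumed.

```python
import csv
import re

# Sample line from the question; the raw string keeps the backslashes literal.
SAMPLE = r'"01-003412467812","Drontmann B.V.",1,6420,"Expert in \"Social, Life and Tech Sciences\""'

# The pattern from the question, written as a raw string.
QUESTION_REGEX = r',(?=(([^"]*"){2})*[^"]*$)(?![^\[]*\])'


def split_with_regex(line):
    """Replace commas with tabs using the question's regex, then split on tabs."""
    return re.sub(QUESTION_REGEX, "\t", line).split("\t")


def split_with_csv(line):
    """Parse one line with the csv module, treating backslash as the escape character."""
    reader = csv.reader([line], delimiter=",", quotechar='"', escapechar="\\")
    return next(reader)


if __name__ == "__main__":
    # The regex counts the escaped quotes as ordinary quotes, so the comma
    # after "Social" is treated as a separator and one extra column appears.
    print(len(split_with_regex(SAMPLE)), "columns via regex")
    print(len(split_with_csv(SAMPLE)), "columns via csv")
    print(" | ".join(split_with_csv(SAMPLE)))
```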
Upvotes: 2