CorneeldH

Reputation: 593

Python Exception for escaped quotes within exception for quotes

I want to convert a CSV file from comma-separated to tab-separated. Some commas appear inside quoted fields, so those need to be left alone. Some googling and Stack Overflow got me this:

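import re
f1 = open('query_result.csv', 'r')
f2 = open('query_result_tab_separated.csv', 'w')
for line in f1:
    # replace commas that are followed by an even number of quotes (i.e. outside quoted fields)
    line = re.sub(',(?=(([^\"]*\"){2})*[^\"]*$)(?![^\[]*\])', '\t', line)
    f2.write(line)  # write every converted line, not only the last one
f1.close()
f2.close()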

However, between the quotes I also find escaped quotes \". An example of a line:

"01-003412467812","Drontmann B.V.",1,6420,"Expert in \"Social, Life and Tech Sciences\""

My current code also changes the comma after Social into a tab, which I don't want. How can I make an exception for quotes and, within that exception, an exception for escaped quotes?

Upvotes: 1

Views: 83

Answers (2)

mhawke

Reputation: 87154

The csv module can handle this. You can set the escape character and specify how quotes within a field are escaped using escapechar and doublequote:

import csv

# newline='' (Python 3) lets the csv module handle line endings itself
with open('file.csv', newline='') as infile, open('file_tabs.csv', 'w', newline='') as outfile:
    r = csv.reader(infile, doublequote=False, escapechar='\\')
    w = csv.writer(outfile, delimiter='\t', doublequote=False, escapechar='\\')
    w.writerows(r)

This will create a new tab-delimited file that preserves the commas and escaped quotes within a field from the original file. Alternatively, with the writer's default settings, quotes inside a field are escaped by doubling them (""):

w = csv.writer(outfile, delimiter='\t')

which would write data like this:

01-003412467812 Drontmann B.V.  1   6420    "Expert in ""Social, Life and Tech Sciences"""
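
To try both writer configurations without touching any files, here is a minimal in-memory sketch. It assumes Python 3; the helper name to_tabs and its keep_backslash_escapes flag are made up for illustration, and the sample line is the one from the question.

```python
import csv
import io

# Sample line from the question (raw string so the backslashes stay literal).
SAMPLE = r'"01-003412467812","Drontmann B.V.",1,6420,"Expert in \"Social, Life and Tech Sciences\""' + "\n"


def to_tabs(text, keep_backslash_escapes=True):
    """Hypothetical helper: convert comma-separated text with \\" escapes to tab-separated text."""
    # Reader: quotes inside a field are escaped with a backslash, not doubled.
    reader = csv.reader(io.StringIO(text, newline=""), doublequote=False, escapechar="\\")
    out = io.StringIO(newline="")
    if keep_backslash_escapes:
        # Writer keeps the backslash escaping of the input.
        writer = csv.writer(out, delimiter="\t", doublequote=False, escapechar="\\")
    else:
        # Writer defaults: embedded quotes are doubled and the field is quoted.
        writer = csv.writer(out, delimiter="\t")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    print(to_tabs(SAMPLE), end="")
    print(to_tabs(SAMPLE, keep_backslash_escapes=False), end="")
```

In both cases the comma after Social stays inside the last field. Whichever variant you pick, whatever reads the tab-separated file later has to be told about the same escaping convention.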

Upvotes: 0

Eric Citaire

Reputation: 4523

You can't do this with regexp.

Python has a csv module which is intended to do this:

import csv
with open('test.csv', newline='') as csvfile:  # Python 3: text mode with newline=''
    data = csv.reader(csvfile, delimiter=',', quotechar='"', escapechar='\\')
    for row in data:
        print(' | '.join(row))
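
As a quick check of how the reader splits the line from the question, here is a small sketch that feeds it in directly instead of opening a file. It assumes Python 3; the list name lines is only for illustration.

```python
import csv

# One line in the question's format; csv.reader accepts any iterable of strings.
lines = [r'"01-003412467812","Drontmann B.V.",1,6420,"Expert in \"Social, Life and Tech Sciences\""']

rows = list(csv.reader(lines, delimiter=',', quotechar='"', escapechar='\\'))
for row in rows:
    print(' | '.join(row))
# prints: 01-003412467812 | Drontmann B.V. | 1 | 6420 | Expert in "Social, Life and Tech Sciences"
```

The row has five fields, and the comma after Social is part of the last one rather than a separator.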

Upvotes: 2
