tijko

Reputation: 8292

Custom Python CSV delimiter

How to ignore commas in between double quotes and remove commas that are not between double quotes?

Upvotes: 0

Views: 1094

Answers (2)

Li-aung Yip

Reputation: 12486

Just for your interest, you can (mostly) do this using regular expressions:

mystr = 'No quotes,"Quotes",1.0,42,"String, with, quotes",1,2,3,"",,""'
import pprint
import re
csv_field_regex = re.compile("""
(?:^|,)         # Start-of-string or a comma (consumed; not a true lookbehind)
(
    "[^"]*"     # If string is quoted: match everything up to next quote
    |
    [^,]*       # If string is unquoted: match everything up to the next comma
)
(?=$|,)         # Lookahead for end-of-string or comma
""", re.VERBOSE)

m = csv_field_regex.findall(mystr)

>>> pprint.pprint(m)
['No quotes',
 '"Quotes"',
 '1.0',
 '42',
 '"String, with, quotes"',
 '1',
 '2',
 '3',
 '""',
 '',
 '""']

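This handles everything except escaped quote marks appearing inside quoted strings. Note that the surrounding quotes are kept in each matched field, as the output above shows. It's possible to handle the escaped-quote case too, but the regex gets nastier, which is why we have the csv module.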
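For comparison, here is a minimal sketch of the same kind of line going through the csv module, with one field that contains an escaped (doubled) quote. The sample string and the names `line` and `fields` are made up for illustration; `csv.reader` accepts any iterable of strings, so a one-element list is enough here.

```python
import csv

# Hypothetical sample line: one field has embedded commas, one has a doubled quote.
line = 'No quotes,"String, with, quotes","He said ""hi""",,""'

# csv.reader takes any iterable of lines; next() pulls out the single parsed row.
fields = next(csv.reader([line]))

# Surrounding quotes are stripped and the doubled quote is unescaped.
print(fields)
```

Unlike the regex, this returns the field values without the surrounding quotes, and the doubled quote comes back as a single `"`.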

Upvotes: 1

Sean Vieira

Reputation: 159905

Batteries are included - simply use the csv module that comes with Python.

Example:

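import csv

if __name__ == '__main__':
    file_path = r"/your/file/path/here.csv"
    # newline="" (Python 3) lets the csv module handle line endings itself;
    # the with-block closes the file when done.
    with open(file_path, "r", newline="") as file_handle:
        csv_handle = csv.reader(file_handle)
        # Now you can work with the *values* in the csv file.
        for row in csv_handle:
            print(row)  # each row is a list of strings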
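If the goal is to drop the commas that separate fields while keeping the ones inside double quotes, one possible approach is to parse with `csv.reader` and write the fields back out with a different delimiter. This is only a sketch under assumptions: the `|` delimiter, the sample line, and the helper name `redelimit` are illustrative choices, not something from the question.

```python
import csv
import io

def redelimit(line, new_delimiter="|"):
    """Parse one comma-separated line and re-join its fields with new_delimiter."""
    # Parse the line; commas inside double quotes stay part of their field.
    fields = next(csv.reader([line]))
    # Write the fields back out; the writer only quotes a field if it needs to.
    out = io.StringIO()
    writer = csv.writer(out, delimiter=new_delimiter, lineterminator="\n")
    writer.writerow(fields)
    return out.getvalue()

# Hypothetical sample input.
result = redelimit('a,"b, c",d')
print(result)
```

The separating commas are gone, and the comma inside the quoted field survives as ordinary data.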

Upvotes: 3
