user984003

Reputation: 29597

Split string by comma, ignoring commas inside quoted strings. Am trying csv

I have a string like this:

s = '1,2,"hello, there"'

And I want to turn it into a list:

[1,2,"hello, there"]

Normally I'd use split:

my_list = s.split(",") 

However, that doesn't work if there's a comma in a string.

So, I've read that I need to use csv, but I don't really see how. I've tried:

from csv import reader
s = '1,2,"hello, there"'
ll = reader(s)
print ll 
for row in ll:
    print row

Which writes:

<_csv.reader object at 0x020EBC70>

['1']
['', '']
['2']
['', '']
['hello, there']

I've also tried with

ll = reader(s, delimiter=',')

Upvotes: 1

Views: 949

Answers (5)

Ivan Anishchuk

Reputation: 485

It's usually easier to re-use than to reinvent the bicycle... You just need to use the csv library properly. If you can't for some reason, you can always check out the source code and see how the parsing is done there.

Example for parsing a single string into a list. Notice that the string is wrapped in a list.

>>> import csv
>>> s = '1,2,"hello, there"'
>>> list(csv.reader([s]))[0]
['1', '2', 'hello, there']
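
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a minimal sketch, assuming Python 3; `split_csv_line` is a hypothetical name, not something the csv module provides.

```python
import csv

def split_csv_line(line):
    """Parse one CSV-formatted string into a list of fields (hypothetical helper)."""
    # csv.reader iterates over lines, so wrap the single string in a list
    return next(csv.reader([line]))

print(split_csv_line('1,2,"hello, there"'))  # ['1', '2', 'hello, there']
```

Using `next()` instead of `list(...)[0]` just avoids building an intermediate list of rows; the parsing is the same.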

Upvotes: 1

Ben

Reputation: 6777

You could also use ast.literal_eval if you want to preserve the integers:

>>> from ast import literal_eval
>>> literal_eval('[{}]'.format('1,2,"hello, there"'))
[1, 2, 'hello, there']
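
Note that literal_eval only works when every field is a valid Python literal, so an unquoted word such as abc raises an error. If that is a concern, one alternative is to parse with csv and convert the integer-looking fields afterwards. A rough sketch, assuming Python 3; `parse_fields` is a hypothetical helper name.

```python
import csv

def parse_fields(line):
    """Split a CSV line and convert integer-looking fields to int (hypothetical helper)."""
    fields = next(csv.reader([line]))
    result = []
    for field in fields:
        try:
            result.append(int(field))
        except ValueError:
            # Not an integer: keep the field as a string
            result.append(field)
    return result

print(parse_fields('1,2,"hello, there",abc'))  # [1, 2, 'hello, there', 'abc']
```

One caveat of this sketch: csv discards the quotes, so a quoted field like "3" is also converted to the integer 3.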

Upvotes: 0

meiamsome

Reputation: 2944

You can split first on the string delimiters (the double quotes), then split on commas only the pieces at even indices (the ones outside the quotes).

import itertools

s = '1,2,"hello, there"'
new_data = s.split('"')
for i in range(len(new_data)):
    if i % 2 == 1:  # Odd indices were inside quotes: keep them whole
        new_data[i] = [new_data[i]]
    else:  # Even indices were outside quotes: split on commas, drop empty pieces
        new_data[i] = [part for part in new_data[i].split(",") if part]
data = list(itertools.chain(*new_data))

Which goes something like:

'1,2,"hello, there"'
['1,2,', 'hello, there', '']
[['1', '2'], ['hello, there'], []]
['1', '2', 'hello, there']

But it's probably better to use the csv library if that's what you're working with.

Upvotes: 0

dkroy

Reputation: 2020

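You get that output because you pass the csv reader a plain string. The reader iterates over its input and treats each item as a line, so a string is read one character at a time. If you do not want to use a file or a StringIO object, just wrap your string in a list as shown below.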

>>> import csv
>>> s = ['1,2,"hello, there"']
>>> ll = csv.reader(s, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
>>> list(ll)
[['1', '2', 'hello, there']]
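
To see the difference directly, here is a small comparison of the two calls on a shorter input. It is a sketch assuming Python 3; the variable names are only for illustration.

```python
import csv

s = '1,2'

# Iterating over a string yields single characters,
# so each character is parsed as its own line
rows_from_string = list(csv.reader(s))
print(rows_from_string)  # [['1'], ['', ''], ['2']]

# Wrapping the string in a list makes it one line
rows_from_list = list(csv.reader([s]))
print(rows_from_list)  # [['1', '2']]
```

The lone comma is parsed as a line with two empty fields, which is where the `['', '']` rows in the question come from.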

Upvotes: 2

gabe

Reputation: 2511

It sounds like you probably want to use the csv module. To use the reader on a string, you want a StringIO object.

As an example:

>>> import csv, StringIO
>>> s = '1,2,"hello, there"'
>>> print list(csv.reader(StringIO.StringIO(s)))
[['1', '2', 'hello, there']]

To clarify, csv.reader expects an iterable that yields lines (such as a file-like object), not a single string, so StringIO does the trick. However, if you're reading this CSV from a file object (a typical use case), you can just as easily give the file object to the reader and it'll work the same way.
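
The snippet above is Python 2. On Python 3 the StringIO class lives in the io module, so an equivalent would look roughly like this. It is a sketch only, and the two-line input is made up to show that multi-line strings work too.

```python
import csv
import io

# Hypothetical two-line input for illustration
s = '1,2,"hello, there"\n3,4,"bye"'

# io.StringIO gives the string a file-like interface that yields lines
rows = list(csv.reader(io.StringIO(s)))
print(rows)  # [['1', '2', 'hello, there'], ['3', '4', 'bye']]
```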

Upvotes: 1
