Reputation: 15476
Consider the following string:
'538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'
I need to split this into a list so it looks like the following:
[538.48, 0.29, "533.59 - 540.00", "AZO", 102482, "+0.05%", "N/A", 0.00, 535.09, "AutoZone, Inc. Co", 538.77, "N/A"]
The problem is that I can't use str.split(',') because the 10th field has a comma within it. The question, then, is how best to split the original string into a list when arbitrary fields may contain a comma.
Upvotes: 1
Views: 739
Reputation: 1122242
Use the csv module rather than attempting to split this yourself. It handles quoted values, including quoted values that contain the delimiter, out of the box:
>>> import csv
>>> from pprint import pprint
>>> data = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'
>>> reader = csv.reader(data.splitlines())
>>> pprint(next(reader))
['538.48',
'0.29',
'533.59 - 540.00',
'AZO',
'102482',
'+0.05%',
'N/A',
'0.00',
'535.09',
'AutoZone, Inc. Co',
'538.77',
'N/A']
Note that the 'AutoZone, Inc. Co' column value is kept intact, embedded comma included.
If you are reading this data from a file, pass the file object to csv.reader() directly rather than handing it sequences of strings.
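As a rough sketch of the file-based case: the helper name read_rows and the temporary sample file below are made up for illustration, and the file is written only so the snippet runs on its own. Opening the file with newline='' is what the csv documentation recommends, so that the module can handle line endings itself.

```python
import csv
import os
import tempfile

def read_rows(path):
    """Return every row of the CSV file at `path` as a list of strings."""
    # newline='' leaves line-ending handling to the csv module
    with open(path, newline='') as f:
        return list(csv.reader(f))

# Hypothetical sample file, created here only to make the sketch self-contained
line = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write(line)
    sample_path = tmp.name

rows = read_rows(sample_path)
os.remove(sample_path)  # clean up the temporary file
```

Each element of rows is one parsed line, so rows[0][9] here is the company name with its comma still in place.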
You can even have the numeric values (anything not quoted) interpreted as floating point values by setting quoting=csv.QUOTE_NONNUMERIC:
>>> reader = csv.reader(data.splitlines(), quoting=csv.QUOTE_NONNUMERIC)
>>> pprint(next(reader))
[538.48,
0.29,
'533.59 - 540.00',
'AZO',
102482.0,
'+0.05%',
'N/A',
0.0,
535.09,
'AutoZone, Inc. Co',
538.77,
'N/A']
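Note that QUOTE_NONNUMERIC turns every unquoted field into a float, so 102482 comes back as 102482.0. If integers should stay integers, as in the list the question asks for, one possible workaround is to parse with the default settings and convert each field afterwards. This is only a sketch, and the convert helper is a made-up name. It also ignores quoting, so a quoted field that happens to look numeric (for example "123", or special strings such as "nan" and "inf" that float() accepts) would be converted as well.

```python
import csv

def convert(field):
    """Try int first, then float; fall back to the original string."""
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field

data = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'

# csv.reader accepts any iterable of lines, so a one-element list works
row = [convert(field) for field in next(csv.reader([data]))]
```

With this approach 102482 stays an int, values such as 538.48 become floats, and fields like '+0.05%' or 'N/A' that are not valid numbers remain strings.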
Upvotes: 2