Reputation: 1300
I am trying to split a string on ",". The split function works as expected for the following example1:
example1 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc'"
example1.split(",")
Result: ['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc'"]
But I have a scenario where there are commas inside the single quotes, and I do not want to split on those.
example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"
example2.split(",")
Result: ['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc", ' def', " xyz'"]
But I am trying to get this result instead:
['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc, def, xyz'"]
How can I achieve this with the string split function?
Upvotes: 4
Views: 1554
Reputation: 29985
Assuming that you want to keep the ' characters around the elements ("'aaa'" instead of 'aaa', as in your expected output), here's how you may do it with a function:
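def spl(st, ch):
    res = []          # completed fields
    temp = []         # characters of the field being built
    in_quote = False  # True while between a pair of single quotes
    for x in st:
        if x == "'":
            in_quote = not in_quote
        if not in_quote and x == ch:
            # Separator outside quotes: close the current field
            res.append("".join(temp))
            temp = []
        else:
            # Everything else, including the quotes themselves, is kept
            temp.append(x)
    res.append("".join(temp))  # last field has no trailing separator
    return res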
example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"
print(spl(example2, ','))
Output:
['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc, def, xyz'"]
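A regular-expression variant of the same idea, as a rough sketch rather than a replacement for the function above: split on a comma only when an even number of single quotes remains to its right. The name split_outside_quotes is made up for illustration, and the pattern assumes quotes are balanced and never escaped inside a field.

```python
import re

# A comma counts as a separator only if the rest of the string contains
# an even number of single quotes, i.e. the comma is not inside a quoted field.
# Assumption: quotes are balanced and there are no escaped quotes.
SEPARATOR = re.compile(r",(?=(?:[^']*'[^']*')*[^']*$)")

def split_outside_quotes(text):
    """Split text on commas outside single quotes, keeping the quotes."""
    return SEPARATOR.split(text)

example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"
print(split_outside_quotes(example2))
```

Because the lookahead rescans the remainder of the string for every comma, this can get slow on very long inputs; the character loop above does a single pass.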
Upvotes: 0
Reputation: 164823
You should first try to use built-ins or the standard library to read in your data as a list, for instance directly from a CSV file via the csv module.
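For illustration, reading rows straight from a file-like object might look like the sketch below. The io.StringIO object and its two rows are made-up stand-ins for a real file; with actual data you would pass the result of open(path, newline="") instead.

```python
import csv
import io

# Hypothetical stand-in for an open CSV file that uses single quotes
# around text fields.
fake_file = io.StringIO(
    "1,'aaa',337.5,'abc, def, xyz'\n"
    "2,'bbb',12.0,'no commas here'\n"
)

# quotechar="'" tells the reader that commas inside single quotes
# belong to the field; the quotes themselves are dropped.
rows = list(csv.reader(fake_file, quotechar="'"))
for row in rows:
    print(row)
```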
If your string is from a source you cannot control, adding opening and closing square brackets gives a valid list, so you can use ast.literal_eval:
from ast import literal_eval
example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"
res = literal_eval(f'[{example2}]')
# [1, 'aaa', 337.5, 17195, 0.02, 0, 0, 'yes', 'abc, def, xyz']
This does convert numeric data to integers / floats as appropriate. If you would like to keep them as strings, as per @JonClements' comment, you can pass the string to csv.reader:
import csv
res = next(csv.reader([example2], quotechar="'"))
# ['1', 'aaa', '337.5', '17195', '.02', '0', '0', 'yes', 'abc, def, xyz']
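One caveat with literal_eval is that it raises ValueError or SyntaxError when a field is not a valid Python literal (unquoted text, for example). A possible way to combine the two approaches, sketched under that assumption, is to try literal_eval first and fall back to csv.reader; parse_fields is a hypothetical helper name, not something from either module.

```python
import csv
from ast import literal_eval

def parse_fields(line):
    """Return typed values when every field is a Python literal,
    otherwise fall back to plain strings via csv.reader."""
    try:
        return literal_eval(f'[{line}]')
    except (ValueError, SyntaxError):
        # e.g. unquoted text such as abc is not a literal
        return next(csv.reader([line], quotechar="'"))

print(parse_fields("1,'aaa',.02,'abc, def, xyz'"))
print(parse_fields("1,abc,'x, y'"))
```

Note that the two branches return different element types (numbers versus strings), so callers that need consistency may prefer to use csv.reader alone.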
Upvotes: 7