Reputation: 61
I have a string like this:
"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"
And yes, the double quotes are part of the string.
Now I want to split this string into several parts with mystring.split(",").
What I get is this:
"BLAX"
"BLAY"
"BLAZ
BLUBB"
"BLAP"
But what I want is this:
"BLAX"
"BLAY"
"BLAZ, BLUBB"
"BLAP"
How can I achieve this while keeping the double quotes? I need this because I work with TOML files.
Solution: Thanks @Giacomo Alzetta
I used the split command with the regular expression. Thanks also for explaining this!
Upvotes: 2
Views: 802
Reputation: 25269
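You can use replace and then split, as long as the placeholder character (| here) never occurs in the data:
s = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'
s.replace('", ', '"|').split('|')
Out[672]: ['"BLAX"', '"BLAY"', '"BLAZ, BLUBB"', '"BLAP"']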
Upvotes: 0
Reputation: 82795
You can also use the csv module.
Ex:
import csv
s = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'
# Wrap s in a list so the reader sees a single row; skipinitialspace drops the blank after each comma.
res = next(csv.reader([s], delimiter=',', quotechar='"', skipinitialspace=True))
print(res)
Output:
['BLAX', 'BLAY', 'BLAZ, BLUBB', 'BLAP']
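Note that csv strips the double quotes, while the question asks to keep them. A minimal sketch of putting them back, assuming the fields themselves never contain a double quote (the names fields and quoted are only for illustration):

```python
import csv

s = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'

# skipinitialspace=True makes the reader ignore the blank after each comma,
# so the quote that follows is still treated as the start of a quoted field.
fields = next(csv.reader([s], skipinitialspace=True))

# Re-wrap every field in double quotes.
# Assumption: no field contains an embedded double quote.
quoted = ['"{}"'.format(f) for f in fields]
print(quoted)
```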
Upvotes: 2
Reputation: 5478
As I said in the comments, you can split on a separator that is longer than a single character. A bare comma matches both the commas inside the quotes and the ones outside, but we can split at '", ' instead (quote, comma, space; the space is included so that we don't have to strip it ;) ).
Then we add the missing quotation marks:
original = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'
[s if s.endswith('"') else s+'"' for s in original.split('", ')]
Output: ['"BLAX"', '"BLAY"', '"BLAZ, BLUBB"', '"BLAP"']
This approach doesn't use regexes, so it is probably faster. You also don't need to work out which regex is correct for your case (I generally like regexes, but I like smart splitting and simple string operations more).
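If you want to check the speed difference on your own machine, something like the following rough sketch should do. The function names split_plain and split_regex are made up for this comparison, the regex is just one possible equivalent, and the timings will vary from system to system:

```python
import re
import timeit

original = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'

# Comma plus space, but only when preceded by a closing quote.
pattern = re.compile(r'(?<="), ')

def split_plain(text):
    # Split on '", ' and restore the quote that the split removed.
    return [s if s.endswith('"') else s + '"' for s in text.split('", ')]

def split_regex(text):
    # The lookbehind keeps the quote, so nothing has to be restored.
    return pattern.split(text)

# Both variants should produce the same list.
assert split_plain(original) == split_regex(original)

for func in (split_plain, split_regex):
    seconds = timeit.timeit(lambda: func(original), number=100_000)
    print(func.__name__, round(seconds, 3))
```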
Upvotes: 1
Reputation: 195593
You can use ast.literal_eval and then add '"' manually:
s = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'
from ast import literal_eval
data = literal_eval('(' + s + ')')
for d in data:
    print('"{}"'.format(d))
Prints:
"BLAX"
"BLAY"
"BLAZ, BLUBB"
"BLAP"
Upvotes: 2
Reputation: 13413
You can split by " and then remove the unwanted leftovers and rewrap everything in quotes with a simple list comprehension.
string = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'
parts = ['"{}"'.format(s) for s in string.split('"') if s not in ('', ', ')]
for p in parts:
    print(p)
Output:
"BLAX"
"BLAY"
"BLAZ, BLUBB"
"BLAP"
Upvotes: 1
Reputation: 2479
You can use a regular expression and the re.split function:
>>> import re
>>> re.split(r'(?<="),', '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"')
['"BLAX"', ' "BLAY"', ' "BLAZ, BLUBB"', ' "BLAP"']
(?<=") is a lookbehind: it means the match must be preceded by ", but the " is not included in the actual match, so only the , is used to do the splitting.
You could split by ", but then you'd have to fix up the parts where the " is now missing:
>>> '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'.split('",')
['"BLAX', ' "BLAY', ' "BLAZ, BLUBB', ' "BLAP"']
>>> [el + ('' if el.endswith('"') else '"') for el in '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'.split('",')]
['"BLAX"', ' "BLAY"', ' "BLAZ, BLUBB"', ' "BLAP"']
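Both variants above leave a leading space on every element after the first. One way to avoid that, as a small sketch, is to let the pattern consume the whitespace after the comma as well. This assumes no element contains the sequence ", inside its own quotes:

```python
import re

s = '"BLAX", "BLAY", "BLAZ, BLUBB", "BLAP"'

# Split on a comma that follows a closing quote and let \s* swallow
# any whitespace that comes after the comma.
parts = re.split(r'(?<="),\s*', s)
print(parts)
```

The comma inside "BLAZ, BLUBB" is preceded by Z rather than by ", so the lookbehind does not match there and the element stays in one piece.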
Upvotes: 1