Deep

Reputation: 49

Split a string by comma in Python while ignoring commas inside quotes

I am struggling to split this string on commas while ignoring any comma inside double quotes.

cStr = 'aaaa,bbbb,"ccc,ddd"' 

Expected result: ['aaaa', 'bbbb', "ccc,ddd"]

Please help. I tried the different methods mentioned in the solution linked below, but couldn't resolve the issue. (I am not allowed to use the csv or pyparsing modules.)

A similar question has already been asked for the input below:

cStr = '"aaaa","bbbb","ccc,ddd"' 

solution

result = ['"aaaa"', '"bbbb"', '"ccc,ddd"']

Upvotes: 1

Views: 465

Answers (3)

rohetoric

Reputation: 354

This can be achieved in three steps:

cstr = 'aaaa,bbbb,"ccc,ddd","eee,fff,ggg"'

Step 1-

X = cstr.split(',"')

Step 2-

regular_list = [i if '"' in i else i.split(",") for i in X ]

Step 3-

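final_list = []
for i in regular_list:
    if isinstance(i, list):
        # Unquoted chunk that was split on commas: add each field
        final_list.extend(i)
    else:
        # Quoted chunk: restore the opening quote removed by split(',"')
        final_list.append('"' + i)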

Final output -

['aaaa', 'bbbb', '"ccc,ddd"', '"eee,fff,ggg"']
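
The three steps above can also be folded into one helper. This is only a sketch: the function name `split_outside_quotes` is made up for illustration, and it keeps the same assumption as the steps, namely that every quoted field is preceded by a comma and ends its chunk.

```python
def split_outside_quotes(text):
    """Split on commas, keeping double-quoted fields (quotes included) intact."""
    result = []
    # Splitting on ',"' puts each quoted field in its own chunk,
    # minus the opening quote.
    for chunk in text.split(',"'):
        if '"' in chunk:
            # Quoted chunk: restore the opening quote that split() consumed.
            result.append('"' + chunk)
        else:
            # Unquoted chunk: a plain comma split is safe here.
            result.extend(chunk.split(","))
    return result


print(split_outside_quotes('aaaa,bbbb,"ccc,ddd","eee,fff,ggg"'))
# ['aaaa', 'bbbb', '"ccc,ddd"', '"eee,fff,ggg"']
```

Note that this breaks when an unquoted field follows a quoted one: 'a,"b,c",d' comes back as ['a', '"b,c",d'].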

Upvotes: 0

Maurice Meyer

Reputation: 18106

You could use a list comprehension; no other libraries are needed:

cStr = 'aaaa,bbbb,"ccc,ddd"'

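# Split on ',"' first; an item ending with a double quote is a quoted field
# (drop the trailing quote), any other item is split again on ','
parts = [
    item.split(',') if not item.endswith('"') else [item[:-1]]
    for item in cStr.split(',"')
]
# Flatten the list of lists
print(sum(parts, []))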

Out:

['aaaa', 'bbbb', 'ccc,ddd']
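
Splitting on ',"' assumes a quoted field never comes first and is never followed by an unquoted one. If that might not hold, a character-by-character scan is one alternative that still avoids csv and pyparsing. A minimal sketch, where the function name `split_csv_line` and its parameters are hypothetical and escaped quotes inside a field are not handled:

```python
def split_csv_line(text, sep=",", quote='"'):
    """Split text on sep, ignoring separators inside quoted sections."""
    fields = []
    current = []       # characters of the field being built
    in_quotes = False  # True while between an opening and closing quote
    for ch in text:
        if ch == quote:
            # Toggle the quoted state; the quote character itself is dropped.
            in_quotes = not in_quotes
        elif ch == sep and not in_quotes:
            # A separator outside quotes ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    # The last field has no trailing separator.
    fields.append("".join(current))
    return fields


print(split_csv_line('aaaa,bbbb,"ccc,ddd"'))  # ['aaaa', 'bbbb', 'ccc,ddd']
print(split_csv_line('"a,b",c'))              # ['a,b', 'c']
```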

Upvotes: 0

Tim Biegeleisen

Reputation: 521178

The usual way I handle this is with a regex alternation that tries to match double-quoted terms first, and only then falls back to non-quoted CSV terms:

import re

cStr = 'aaaa,bbbb,"ccc,ddd"'
matches = re.findall(r'(".*?"|[^,]+)', cStr)
print(matches)  # ['aaaa', 'bbbb', '"ccc,ddd"']
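
The matches above keep the surrounding double quotes. If the expected result from the question (without the quotes) is wanted, they can be stripped afterwards. A small sketch along the same lines, where the variable name `fields` is just for illustration:

```python
import re

cStr = 'aaaa,bbbb,"ccc,ddd"'
# Same alternation as above; strip('"') removes the quotes around quoted terms.
fields = [m.strip('"') for m in re.findall(r'".*?"|[^,]+', cStr)]
print(fields)  # ['aaaa', 'bbbb', 'ccc,ddd']
```

Because `[^,]+` needs at least one character, empty fields (as in 'a,,b') are skipped rather than returned as empty strings.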

Upvotes: 2
