Reputation: 5780
In my case the delimiter string is '   ' (3 consecutive spaces, but the answer should work for any multi-character delimiter), and an edge-case text to search in could be this:
'Coord="GLOB"AL Axis=X Type="Y ZR" Color="Gray Dark" Alt="Q Z"qz Loc=End'
The solution should return the following strings:
Coord="GLOB"AL
Axis=X
Type="Y ZR"
Color="Gray Dark"
Alt="Q Z"qz
Loc=End
I've looked for regex solutions, also considering the inverse problem (match the multi-character delimiter unless it is inside quotes), since the re.split function of Python 3.4.3 makes it easy to split a text by a regex pattern. I'm not sure a regex solution exists, so I'm also open to (efficient) non-regex solutions.
I've seen some solutions to the inverse problem that use lookahead/lookbehind with a regex pattern inside, but they did not work because Python's lookbehind (unlike some other languages' engines) requires a fixed-width pattern.
This question is not a duplicate of Regex matching spaces, but not in "strings" or similar other questions, because:
Upvotes: 3
Views: 1362
Reputation: 67968
import re
x = 'Coord="GLOB"AL Axis=X Type="Y ZR" Color="Gray Dark" Alt="Q Z"qz Loc=End'
# Split on whitespace only when an even number of double quotes follows it
print(re.split(r'\s+(?=(?:[^"]*"[^"]*")*[^"]*$)', x))
You need to use a lookahead to check that the space is not between double quotes.
Output ['Coord="GLOB"AL', 'Axis=X', 'Type="Y ZR"', 'Color="Gray Dark"', 'Alt="Q Z"qz', 'Loc=End']
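To illustrate why the lookahead works, here is a minimal sketch of the same idea without the lookahead: a position is outside quotes when the number of double quotes after it is even. The helper name outside_quotes is hypothetical, and the sketch assumes the quotes in the text are balanced.

```python
import re

x = 'Coord="GLOB"AL Axis=X Type="Y ZR" Color="Gray Dark" Alt="Q Z"qz Loc=End'

def outside_quotes(text, pos):
    # Hypothetical helper: an even number of double quotes from pos to the
    # end of the text means pos is not inside a quoted run
    # (assumes balanced quotes).
    return text.count('"', pos) % 2 == 0

# Start offsets of every whitespace run in the text
space_positions = [m.start() for m in re.finditer(r'\s+', x)]

# Keep only the runs that the lookahead pattern would accept as split points
split_points = [p for p in space_positions if outside_quotes(x, p)]
print(split_points)
```

The lookahead (?=(?:[^"]*"[^"]*")*[^"]*$) performs this same parity check inside the regex engine, so re.split only cuts at those positions.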
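For a generalized version, if you want to split on any delimiter that is not inside double quotes, use the pattern below (escape the delimiter with re.escape if it contains regex metacharacters):
re.split(r'delimiter(?=(?:[^"]*"[^"]*")*[^"]*$)', x)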
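As a rough sketch, the generalized pattern can be wrapped in a small function. The name split_outside_quotes and the sample strings are made up for illustration, and the approach assumes the double quotes in the text are balanced. Because the lookahead rescans the rest of the string at every candidate delimiter, it may be slow on very long inputs.

```python
import re

def split_outside_quotes(text, delimiter):
    # Hypothetical helper: re.escape lets the delimiter be any literal
    # string, including one containing regex metacharacters.
    pattern = re.escape(delimiter) + r'(?=(?:[^"]*"[^"]*")*[^"]*$)'
    return re.split(pattern, text)

# Illustrative sample with a 3-space delimiter, one occurrence inside quotes
sample = 'Coord="GLOB"AL   Axis=X   Type="Y   ZR"   Loc=End'
parts = split_outside_quotes(sample, '   ')
print(parts)

# A delimiter made of regex metacharacters also works thanks to re.escape
print(split_outside_quotes('a="x||y"||b', '||'))
```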
Upvotes: 4