Reputation: 147
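I'm trying to split an extremely long string by commas. I have two requirements, however:
1) a comma followed by a space must not split the string
2) a comma followed by a '+' must not split the string
So, for example, the input would be: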
text = "hello,+how are you?,I am fine, thanks"
and the desired output is:
['hello,+how are you?', 'I am fine, thanks']
i.e. the only comma that separated the values was the one that was not followed by a '+' or a space.
I have managed requirement 1) as follows:
re.split(r',(?=[^\s]+)',text)
I cannot figure out how to add requirement 2)
Upvotes: 0
Views: 2916
Reputation: 27577
I suggest you go with @HampusLarsson's answer, but I'd like to squeeze in an answer that doesn't use imported modules:
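s = "hello,+how are you?,I am fine, thanks"
# Indices of the commas to split on: those not followed by a space or '+'.
# s[i+1:i+2] is '' for a trailing comma, so it cannot raise IndexError.
ind = [0] + [i for i, v in enumerate(s)
             if v == ',' and s[i+1:i+2] not in (' ', '+')]
# Slice between consecutive split points and drop the leading comma.
parts = [s[i:j].lstrip(',')
         for i, j in zip(ind, ind[1:] + [None])]
print(parts)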
Output:
['hello,+how are you?', 'I am fine, thanks']
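The same idea can be sketched as a reusable function. This is only a rough sketch: the name `split_on_commas` and the `blocked` parameter are made up for illustration. It tracks where each piece starts instead of stripping commas afterwards, so a piece that legitimately begins with a comma keeps it.

```python
def split_on_commas(s, blocked=(' ', '+')):
    """Split s on every comma that is not followed by a character in blocked.

    blocked should be a tuple or list, not a string: '' in ' +' is True,
    which would stop a trailing comma from splitting.
    """
    parts = []
    start = 0  # index where the current piece begins
    for i, ch in enumerate(s):
        # s[i+1:i+2] is '' at the end of the string, so no IndexError
        if ch == ',' and s[i + 1:i + 2] not in blocked:
            parts.append(s[start:i])
            start = i + 1  # skip the comma itself
    parts.append(s[start:])  # whatever is left after the last split
    return parts


print(split_on_commas("hello,+how are you?,I am fine, thanks"))
# ['hello,+how are you?', 'I am fine, thanks']
```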
Upvotes: 0
Reputation: 3100
The simplest solution is to look only for the pattern that you don't want and exclude it altogether. You do that with a negative lookahead in the regular expression.
>>> import re
>>> text = "hello,+how are you?,I am fine, thanks"
>>> re.split(r',(?![+ ])', text)
['hello,+how are you?', 'I am fine, thanks']
This matches a ',' unless it is followed by either a literal '+' or a space.
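For a very long string or repeated calls, the pattern could be compiled once and wrapped in a helper. This is a minimal sketch: `SPLIT_RE` and `split_values` are made-up names, and the `\s` variant at the end is an assumption about what "space" should cover, not something the question states. It also shows why the positive lookahead from the question is not enough on its own.

```python
import re

# Split on a comma unless the next character is '+' or a space.
SPLIT_RE = re.compile(r',(?![+ ])')


def split_values(text):
    """Return the pieces of text separated by 'bare' commas."""
    return SPLIT_RE.split(text)


text = "hello,+how are you?,I am fine, thanks"
print(split_values(text))
# ['hello,+how are you?', 'I am fine, thanks']

# The lookahead from the question only rules out whitespace,
# so the comma before '+' still splits:
print(re.split(r',(?=[^\s]+)', text))
# ['hello', '+how are you?', 'I am fine, thanks']

# Using \s in the character class also blocks tabs and newlines.
print(re.split(r',(?![+\s])', "a,\tb,c"))
# ['a,\tb', 'c']
```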
Upvotes: 3