Callum Brown

Reputation: 147

Split string by comma unless followed by a space or a '+'

I'm trying to split an extremely long string by commas. I have two requirements, however:

  1. the comma cannot be followed by a space
  2. the comma cannot be followed by a '+' symbol

So, for example, the input would be:

text = "hello,+how are you?,I am fine, thanks"

and the desired output is:

['hello,+how are you?', 'I am fine, thanks']

i.e. the only comma that separated the values was the one that was not followed by a '+' or a space.

I have managed requirement 1) as follows:

re.split(r',(?=[^\s]+)',text)

I cannot figure out how to add requirement 2).

Upvotes: 0

Views: 2916

Answers (3)

Red

Reputation: 27577

I suggest you go with @HampusLarsson's answer, but I'd like to squeeze in an answer that doesn't use imported modules:

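s = "hello,+how are you?,I am fine, thanks"

# Indices of commas that are not followed by a space or a '+'.
# The slice s[i+1:i+2] avoids an IndexError when the string ends with a comma.
ind = [i for i, v in enumerate(s)
       if v == ',' and s[i+1:i+2] not in (' ', '+')]

# Slice between consecutive split points, skipping only the splitting comma
# (lstrip(',') would also eat commas that belong to the next part).
parts = [s[i+1:j]
         for i, j in zip([-1] + ind, ind + [len(s)])]

print(parts)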

Output:

['hello,+how are you?', 'I am fine, thanks']

Upvotes: 0

Hampus Larsson

Reputation: 3100

The simplest solution is to look only for the pattern that you don't want and exclude it altogether. You can do that with a negative lookahead in the regular expression.

>>> import re
>>> text = "hello,+how are you?,I am fine, thanks"
>>> re.split(r',(?![+ ])', text)
['hello,+how are you?', 'I am fine, thanks']

This matches a comma unless it is followed by either a literal + or a space.
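If the split runs over many long strings, the same pattern can be precompiled and wrapped in a small function. This is only a minimal sketch: the names SPLIT_RE and split_fields are hypothetical, chosen for illustration, and the pattern is the one shown above.

```python
import re

# Hypothetical helper names; the pattern itself is unchanged.
# ,(?![+ ]) matches a comma only when the next character is not '+' or a space.
SPLIT_RE = re.compile(r',(?![+ ])')

def split_fields(text):
    """Split text on commas that are not followed by '+' or a space."""
    return SPLIT_RE.split(text)

print(split_fields("hello,+how are you?,I am fine, thanks"))
# ['hello,+how are you?', 'I am fine, thanks']
```

Note that a comma at the very end of the string also counts as a separator here, because nothing follows it, so the result ends with an empty string.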

Upvotes: 3

Pratyaksh Saini

Reputation: 113

Try this:

re.split(r',(?=[^\s +])',text)
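This positive lookahead gives the same result as a negative lookahead such as ,(?![+ ]) on the example string, but the two are not equivalent in general. The sketch below compares them on two edge cases; the variable names and the extra sample strings are made up for illustration.

```python
import re

NEGATIVE = r',(?![+ ])'     # comma NOT followed by '+' or a space
POSITIVE = r',(?=[^\s +])'  # comma followed by a character that is not whitespace or '+'

sample = "hello,+how are you?,I am fine, thanks"
assert re.split(NEGATIVE, sample) == re.split(POSITIVE, sample)

# A trailing comma has no following character, so only the negative
# lookahead treats it as a separator.
trailing = "a,b,"
print(re.split(NEGATIVE, trailing))  # ['a', 'b', '']
print(re.split(POSITIVE, trailing))  # ['a', 'b,']

# \s covers every whitespace character, so a comma followed by a tab
# is kept by the positive version but split by the negative one.
tabbed = "a,\tb"
print(re.split(NEGATIVE, tabbed))  # ['a', '\tb']
print(re.split(POSITIVE, tabbed))  # ['a,\tb']
```

Which behaviour is right depends on whether a trailing comma or other whitespace should count as a separator in your data.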

Upvotes: 0
