PuffedRiceCrackers

Reputation: 785

Regex on a string that may or may not be comma separated

I'm trying to capture parts of a line that may or may not contain a comma (there will be at most one comma per line). The data looks like the sample below, and the regex is applied line by line.

cake,strawberry
shortbread
english-muffin,blueberry

Desired capture of first group:

cake
shortbread
english-muffin

Desired capture of last group:

strawberry

blueberry

What I initially tried was (.*?)(,)?(.*), but that captured cake,strawberry as one group. I also tried several variations, with more or less the same result. Should I handle this with two separate patterns?

Upvotes: 1

Views: 798

Answers (1)

Mark Tolonen

Reputation: 177725

Use ([^,]*)(?:,(.*))?:

  • ([^,]*) matches zero or more non-comma characters and captures them
  • (?:,(.*))? optionally matches a comma and captures everything after it

Note: (?:...) is a non-capturing group, so the comma itself is matched but not captured.
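To see why the pattern from the question puts the whole line in one group, here is a minimal sketch comparing the two patterns on a single line. The variable names `attempt` and `fixed` are only illustrative.

```python
import re

line = 'cake,strawberry'

# Pattern from the question: the lazy (.*?) first matches nothing, the
# optional (,)? finds no comma at position 0 and also matches nothing,
# so the greedy (.*) consumes the entire line.
attempt = re.match(r'(.*?)(,)?(.*)', line).groups()

# Pattern from this answer: [^,]* cannot cross a comma, so it stops at 'cake'.
fixed = re.match(r'([^,]*)(?:,(.*))?', line).groups()

print(attempt)  # ('', None, 'cake,strawberry')
print(fixed)    # ('cake', 'strawberry')
```

Because every piece of the original pattern is allowed to match nothing, the regex engine is never forced to stop at the comma. The negated character class removes that freedom.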

Python demo:

import re

lines = ['cake,strawberry',
         'shortbread',
         'english-muffin,blueberry']

for line in lines:
    # groups() gives None for the second group when the line has no comma
    print(re.match(r'([^,]*)(?:,(.*))?', line).groups())

Output:

('cake', 'strawberry')
('shortbread', None)
('english-muffin', 'blueberry')
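If an empty string is preferred over None for lines without a comma (the question shows a blank for shortbread), one option is to pass a default to groups(). This is a sketch only: `PATTERN` and `split_line` are hypothetical names, not part of the original demo.

```python
import re

# Same pattern as above, compiled once for reuse.
PATTERN = re.compile(r'([^,]*)(?:,(.*))?')

def split_line(line):
    """Return (first, last); last is '' when the line has no comma."""
    # groups(default='') substitutes '' for any group that did not take part
    # in the match, instead of the usual None.
    return PATTERN.match(line).groups(default='')

for line in ['cake,strawberry', 'shortbread', 'english-muffin,blueberry']:
    print(split_line(line))
```

The pattern can match an empty string, so match() always returns a match object here and no None check is needed before calling groups().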

Upvotes: 2
