Richard Oswald

Reputation: 61

Regex parsing from delimited string with sequential groups

I'm trying to parse the words out of a delimited string and have the capture groups in sequential order. For example:

dog.cat.chicken.horse.whale

I know of ([^.]+), which can parse out each word, but it puts every word in capture group 1 of a separate match:

Match 1
Full match  0-3 `dog`
Group 1.    0-3 `dog`
Match 2
Full match  4-7 `cat`
Group 1.    4-7 `cat`
Match 3
Full match  8-15    `chicken`
Group 1.    8-15    `chicken`
Match 4
Full match  16-21   `horse`
Group 1.    16-21   `horse`
Match 5
Full match  22-27   `whale`
Group 1.    22-27   `whale`

What I really need is a single match with one group per word, something like:

Match 1
Full match  0-27    `dog.cat.chicken.horse.whale`
Group 1.    0-3 `dog`
Group 2.    4-7 `cat`
Group 3.    8-15    `chicken`
Group 4.    16-21   `horse`
Group 5.    22-27   `whale`

I've tried multiple iterations with no success. Does anyone know how to do this?

Upvotes: 6

Views: 230

Answers (1)

Wiktor Stribiżew

Reputation: 626926

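There is no good solution for this case: in most regex flavors, the number of capture groups is fixed by the pattern itself, and a repeated capturing group keeps only its last match. All you can do is add optional non-capturing groups, each wrapping a capturing group, to account for up to some set number of words.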

So, it might look like

([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)(?:\.([^.]+))?(?:\.([^.]+))?(?:\.([^.]+))?

and so on: just keep appending (?:\.([^.]+))? until you reach whatever limit you define.
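Writing the pattern out by hand gets tedious, so it may be easier to generate it. Here is a minimal sketch in Python; the question does not name a language or regex flavor, so Python is an assumption, and `build_pattern` with its `required` and `optional` parameters is a hypothetical helper made up for illustration:

```python
import re

WORD = r"([^.]+)"  # one dot-free word, captured


def build_pattern(required: int, optional: int) -> str:
    """Return a pattern with `required` mandatory groups and `optional` optional ones."""
    # Mandatory words are joined by literal dots.
    head = r"\.".join([WORD] * required)
    # Each optional word sits in its own non-capturing wrapper.
    tail = r"(?:\." + WORD + ")?"
    return head + tail * optional


pattern = build_pattern(5, 3)  # same pattern as the one written out above
m = re.match(pattern, "dog.cat.chicken.horse.whale")
words = [g for g in m.groups() if g is not None]  # drop unused optional groups
```

Optional groups that did not participate in the match come back as `None` in Python, which is why they are filtered out at the end.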

See the regex demo.

Note that you might want to anchor the pattern to avoid partial matches:

^([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)(?:\.([^.]+))?(?:\.([^.]+))?(?:\.([^.]+))?$

The ^ matches the start of the string and $ asserts the position at the end of the string.
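To see why the anchors matter, here is a small sketch, again assuming Python's `re` module since the question does not name a flavor. The input has nine words, one more than the pattern allows; `too_long` and the other variable names are made up for illustration:

```python
import re

UNANCHORED = (
    r"([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)\.([^.]+)"
    r"(?:\.([^.]+))?(?:\.([^.]+))?(?:\.([^.]+))?"
)
ANCHORED = "^" + UNANCHORED + "$"

too_long = "a.b.c.d.e.f.g.h.i"  # nine words; the pattern covers at most eight

# Without anchors, the first eight words match and the ninth is silently ignored.
partial = re.search(UNANCHORED, too_long)
partial_text = partial.group(0)

# With anchors, the whole string must fit the pattern, so there is no match.
strict = re.search(ANCHORED, too_long)

# A string within the limit still matches the anchored pattern.
ok = re.search(ANCHORED, "dog.cat.chicken.horse.whale")
```

Here `partial_text` is `a.b.c.d.e.f.g.h`, `strict` is `None`, and `ok.group(5)` is `whale`.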

Upvotes: 1
