Wesley Bland

Reputation: 9062

Trying to construct regex with predefined, comma-separated list of words to capture

I'm working on generating a Groovy script to parse a configuration string where I want to capture each of the words (for a combination of GitHub hooks and Jenkins scripts). I want to parse strings that look like this:

test:config1a,config1b/config2a/config3a,config3c

If I leave out support for comma-separated lists, I can get it working with a regex that looks like this:

configs = input_string =~ /^test:(config1a|config1b)\/(config2a|config2b)\/(config3a|config3b|config3c)/

However, allowing a comma-separated list for any of the individual configs throws a wrench in it. I can get it to match, but I can't get the full list of values out of the capture groups:

configs = input_string =~ /^test:((config1a|config1b),?)+\/((config2a|config2b),?)+\/((config3a|config3b|config3c),?)+/

The result of matching with the above regex is:

[test:config1a,config1b/config2a/config3a,config3c, config1b, config1b, config2a, config3c, config3c]

The output is the same without Groovy if I put it in regex101.com (for some reason I can't save the regex to link here).

Upvotes: 1

Views: 190

Answers (1)

Srdjan M.

Reputation: 3405

Regex: config(?:1[ab]|2[ab]|3[abc])(?=[,/]|$)

Details:

  • (?:) Non-capturing group
  • | Alternation (or)
  • [] Character class: matches a single character present in the list
  • (?=) Positive lookahead: asserts that the following text matches, without consuming it
  • $ End of the input
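
To see what the trailing `(?=[,/]|$)` contributes, here is a small sketch comparing the pattern with and without it. The `sample` string and its near-miss value `config1ab` are made up for illustration and are not from the question.

```groovy
// Made-up input: "config1ab" is not a valid name, but it starts with one.
def sample = "test:config1ab,config1b/config2a"

// With the lookahead, a name only counts if it is followed by ',', '/' or the end.
def withLookahead = sample.findAll(/config(?:1[ab]|2[ab]|3[abc])(?=[,\/]|$)/)

// Without it, the valid prefix "config1a" of "config1ab" is picked up too.
def withoutLookahead = sample.findAll(/config(?:1[ab]|2[ab]|3[abc])/)

println withLookahead     // [config1b, config2a]
println withoutLookahead  // [config1a, config1b, config2a]
```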

Groovy code:

def input = "test:config1a,config1b/config2a/config3a,config3c"
def configs = (input =~ /config(?:1[ab]|2[ab]|3[abc])(?=[,\/]|$)/).collect { it }

Output:

[config1a, config1b, config2a, config3a, config3c]
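
The flat list above loses which slash-separated slot each value came from. If that grouping matters, one possible approach (a sketch, not the only way) is to split on `/` and `,` instead of relying on repeated capture groups, which only retain their last iteration. The names `allowed` and `parseConfigs` are hypothetical helpers introduced here; the allowed values are copied from the alternations in the question.

```groovy
def input = "test:config1a,config1b/config2a/config3a,config3c"

// Allowed values per slot, taken from the question's regex (assumed complete).
def allowed = [
    ['config1a', 'config1b'],
    ['config2a', 'config2b'],
    ['config3a', 'config3b', 'config3c'],
]

// Hypothetical helper: returns one list of values per slash-separated slot.
def parseConfigs = { String s ->
    if (!s.startsWith('test:')) {
        throw new IllegalArgumentException("missing 'test:' prefix: ${s}")
    }
    // Split the remainder into slots on '/', then each slot into values on ','.
    def slots = s.substring('test:'.length()).split('/').collect { it.split(',').toList() }
    if (slots.size() != allowed.size()) {
        throw new IllegalArgumentException("expected ${allowed.size()} slots, got ${slots.size()}")
    }
    // Reject any value that is not allowed in its slot.
    slots.eachWithIndex { values, i ->
        def unknown = values - allowed[i]
        if (unknown) {
            throw new IllegalArgumentException("unknown values in slot ${i + 1}: ${unknown}")
        }
    }
    slots
}

def configs = parseConfigs(input)
println configs  // [[config1a, config1b], [config2a], [config3a, config3c]]
```

Each inner list corresponds to one position in the original string, so `configs[0]` holds the first slot's values, and so on.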

Upvotes: 1
