falcon

Reputation: 1434

Regex capturing groups within logical OR

I have a set of strings I need to parse and extract values from. They look like:

    /apple/1212d3fe 
    /cat/23224a2f4 
    /auto/445478eefd
    /somethingelse/1234fded

It should match only apple, cat and auto. The output I expect is:

1212, d3fe
23224, a2f4
445478, eefd
null

I need a regex with capturing groups that does this. I am able to extract the second part but not the first one. The closest I have come is:

String r2 = "^/(apple/[0-9]{4}|cat/[0-9]{5}|auto/[0-9]{6})([a-f0-9]{4})$";
System.out.println(r2);

Pattern pattern2 = Pattern.compile(r2);

Matcher matcher2 = pattern2.matcher("/apple/2323efff");
if (matcher2.find()) {
  System.out.println(matcher2.group(1));
  System.out.println(matcher2.group(2));
}

UPDATED QUESTION:

I have a set of strings I need to parse and extract values from. They look like:

    /apple/1212d3fe 
    /cat/23e24a2f4 
    /auto/df5478eefd
    /somethingelse/1234fded

It should match only apple, cat and auto. The output I expect is everything after the 2nd '/', split as follows: the first 4 characters if 'apple', 5 characters if 'cat' and 6 characters if 'auto', followed by the rest, like:

1212, d3fe
23e24, a2f4
df5478, eefd
null

I need a regex with capturing groups that does this. I am able to extract the second part but not the first one. The closest I have come is:

String r2 = "^/(apple/[0-9]{4}|cat/[0-9]{5}|auto/[0-9]{6})([a-f0-9]{4})$";
System.out.println(r2);

Pattern pattern2 = Pattern.compile(r2);

Matcher matcher2 = pattern2.matcher("/apple/2323efff");
if (matcher2.find()) {
  System.out.println(matcher2.group(1));
  System.out.println(matcher2.group(2));
}

I can do it without the regex OR(|) but it breaks when I include it. Any help with the right regex?

Upvotes: 1

Views: 331

Answers (3)

anubhava

Reputation: 785128

Updated Answer:

As per your updated question you can use this regex based on lookbehind assertions:

/((?<=apple/).{4}|(?<=cat/).{5}|(?<=auto/).{6})(.+)$


  • This regex uses 2 capture groups after matching /.
  • The 1st group contains 3 alternatives, each guarded by a lookbehind assertion.
  • (?<=apple/).{4} matches 4 characters only when apple/ is immediately to their left. Likewise, the other two alternatives match 5 characters after cat/ and 6 characters after auto/.
  • The 2nd capture group matches the remaining characters up to the end of the line.
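
The lookbehind regex above can be tried out in Java with a minimal sketch like the one below. The class name LookbehindSplit and the helper split are made up for illustration, and the inputs are the samples from the updated question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindSplit {
    // The regex from this answer, written as a Java string literal.
    private static final Pattern PATTERN =
            Pattern.compile("/((?<=apple/).{4}|(?<=cat/).{5}|(?<=auto/).{6})(.+)$");

    // Hypothetical helper: returns {first part, rest}, or null when no allowed prefix matches.
    static String[] split(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "/apple/1212d3fe", "/cat/23e24a2f4", "/auto/df5478eefd", "/somethingelse/1234fded"
        };
        for (String input : inputs) {
            String[] parts = split(input);
            System.out.println(parts == null ? "null" : parts[0] + ", " + parts[1]);
        }
    }
}
```

Note that the second group here is .+, so it accepts a remainder of any length; replacing it with [a-f0-9]{4} would restrict it to exactly 4 hex characters if that is required.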

Upvotes: 2

Samuel Philipp

Reputation: 11042

If you want the last group to have exactly 4 hex characters, you can use this regex:

/(apple|cat|auto)/([0-9a-f]+)([0-9a-f]{4})

Here is a working example:

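import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> strings = Arrays.asList("/apple/1212d3fe", "/cat/23224a2f4", "/auto/445478eefd");
// Group 1: the prefix, group 2: everything before the last 4 characters, group 3: the last 4 characters.
Pattern pattern = Pattern.compile("/(apple|cat|auto)/([0-9a-f]+)([0-9a-f]{4})");
for (String string : strings) {
    Matcher matcher = pattern.matcher(string);
    if (matcher.find()) {
        System.out.println(matcher.group(1));
        System.out.println(matcher.group(2));
        System.out.println(matcher.group(3));
    }
}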

If you want 4 characters after apple, 5 after cat and 6 after auto, you can split your algorithm into 2 parts:

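import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixSplitter {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("/apple/1212d3fe", "/cat/23224a2f4", "/auto/445478eefd", "/some/445478eefd");
        // Step 1: accept only the allowed prefixes and capture the hex run that follows.
        Pattern firstPattern = Pattern.compile("/(apple|cat|auto)/([0-9a-f]+)");
        for (String string : strings) {
            Matcher firstMatcher = firstPattern.matcher(string);
            if (firstMatcher.find()) {
                String first = firstMatcher.group(1);
                System.out.println(first);
                int length = getLength(first);
                // Step 2: split that hex run into a prefix-dependent head and a 4-character tail.
                Pattern secondPattern = Pattern.compile("([0-9a-f]{" + length + "})([0-9a-f]{4})");
                // matches() on the captured run (not find() on the whole string) enforces the exact lengths.
                Matcher secondMatcher = secondPattern.matcher(firstMatcher.group(2));
                if (secondMatcher.matches()) {
                    System.out.println(secondMatcher.group(1));
                    System.out.println(secondMatcher.group(2));
                }
            }
        }
    }

    // Number of leading characters to capture for each allowed prefix.
    private static int getLength(String key) {
        switch (key) {
            case "apple":
                return 4;
            case "cat":
                return 5;
            case "auto":
                return 6;
            default:
                throw new IllegalArgumentException("key not allowed");
        }
    }
}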

Upvotes: 0

SomeDude

Reputation: 14228

You could use the regex \/(?:apple|auto|cat)\/(\d*)(.*)

Upvotes: 0
