user634545

Reputation: 9419

Regular expression, match substring between pipes

I want to extract the sizes from the string "|XS|XL|S|M|" using a regular expression. In this particular case, the expected matches are XS, XL, S and M.

I have tried the following regular expressions without success.

\|(\w+)\|

Matches: XS, S

(?=.(\w+)) 

Matches: XS, S, XL, L, S, M

Upvotes: 4

Views: 14514

Answers (3)

Nicolas Lalevée

Reputation: 2013

This should work for you: ([^|]+). It matches one or more consecutive characters that are not pipes.
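A minimal Java sketch of this pattern, assuming the input from the question. The class and method names (NonPipeRuns, extract) are made up for illustration and are not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this illustration.
class NonPipeRuns {
    // Collects every maximal run of non-pipe characters in the input.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile("[^|]+").matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("|XS|XL|S|M|")); // prints [XS, XL, S, M]
    }
}
```

Note that this pattern does not require a pipe on either side, so text before the first pipe or after the last one would also be matched.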

Upvotes: 3

Bohemian

Reputation: 424983

You are consuming the pipes. Instead, use lookarounds:

(?<=\|).*?(?=\|)

To split the string instead, trim the leading and trailing pipes and then use a pipe as the delimiter.
In Java, this can be done in one line:

String[] sizes = str.replaceAll("(^\\|)|(\\|$)", "").split("\\|");
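To see why the trim step matters, here is a small sketch that wraps the one-liner above in a runnable class. The names SplitSizes and split are hypothetical and only serve the example:

```java
import java.util.Arrays;

// Hypothetical wrapper class, named only for this illustration.
class SplitSizes {
    // Strips one leading and one trailing pipe, then splits on the remaining pipes.
    static String[] split(String str) {
        return str.replaceAll("(^\\|)|(\\|$)", "").split("\\|");
    }

    public static void main(String[] args) {
        String input = "|XS|XL|S|M|";
        System.out.println(Arrays.toString(split(input)));         // prints [XS, XL, S, M]
        // Without trimming, the leading pipe produces an empty first element.
        System.out.println(Arrays.toString(input.split("\\|")));   // prints [, XS, XL, S, M]
    }
}
```

Java's String.split drops trailing empty strings but keeps a leading one, which is why only the untrimmed version starts with an empty element.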

Upvotes: 2

Boris the Spider

Reputation: 61128

Your problem with the first pattern is that it consumes the pipes, so they are not there for the next match.

The second pattern is a little convoluted, but what you are saying is: for each character in the string, grab all word characters that follow it, without consuming them. So at the first pipe that is XS. The engine then moves to the X, where the answer is S. The engine then moves to the S, where the pattern doesn't match because the next character is a pipe.
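If it helps, both attempts from the question can be reproduced with a short Java sketch. This assumes Java's regex engine, and the names AttemptsDemo and groupOne are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named only for this illustration.
class AttemptsDemo {
    // Returns the contents of capture group 1 for every match of regex in input.
    static List<String> groupOne(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "|XS|XL|S|M|";
        // Each match swallows its closing pipe, so the following size has no opening pipe left.
        System.out.println(groupOne("\\|(\\w+)\\|", input)); // prints [XS, S]
        // The lookahead is tried at every position, so it also captures tails such as S and L.
        System.out.println(groupOne("(?=.(\\w+))", input));  // prints [XS, S, XL, L, S, M]
    }
}
```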

You need to use positive lookarounds, so that you match and consume the text between pipes without consuming the pipes themselves. That is, for any group of word characters, you assert that it has a pipe preceding and following it, and only in that case do you consume it.

If your language supports it (you don't mention which regex engine you are using), this pattern will work:

(?<=\|)[^|]++(?=\|)
  • (?<=\|) asserts that there is a pipe behind the pattern
  • [^|]++ possessively matches all non-pipe characters
  • (?=\|) asserts that there is a pipe following the pattern

Here is a test case in Java (ignore the \\, they are just Java string escaping):

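import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PipeTest {
    public static void main(String[] args) throws Exception {
        final String test = "|XS|XL|S|M|";
        // Lookbehind and lookahead assert the pipes without consuming them.
        final Pattern pattern = Pattern.compile("(?<=\\|)[^|]++(?=\\|)");
        final Matcher matcher = pattern.matcher(test);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}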

Output:

XS
XL
S
M

Upvotes: 12
