Mark Renouf

Reputation: 30990

What's up with this regular expression not matching?

public class PatternTest {
  public static void main(String[] args) {
    System.out.println("117_117_0009v0_172_5738_5740".matches("^([0-9_]+v._.)"));
  }
}

This program prints "false". What?!

I am expecting to match the prefix of the string: "117_117_0009v0_1"

I know this stuff, really I do... but for the life of me, I've been staring at this for 20 minutes and have tried every variation I can think of and I'm obviously missing something simple and obvious here.

Hoping the many eyes of SO can pick it out for me before I lose my mind over this.

Thanks!


The final working version ended up as:

String text = "117_117_0009v0_172_5738_5740";
String regex = "[0-9_]+v._.";

Pattern p = Pattern.compile(regex);

Matcher m = p.matcher(text);
if (m.lookingAt()) {
  System.out.println(m.group());
}

One non-obvious discovery/reminder for me was that before accessing matcher groups, one of matches(), lookingAt(), or find() must be called. If not, an IllegalStateException is thrown with the unhelpful message "No match found". Despite this, groupCount() will still return non-zero when the pattern contains capturing groups, because it only counts the groups in the pattern and says nothing about whether a match exists. In that sense it lies. Do not believe it.
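A minimal sketch of that pitfall, assuming the capturing-group version of the pattern from the question. The class and method names here are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherStateDemo {
    // Calls group() on a fresh matcher, before matches(), lookingAt() or find().
    // Returns the exception message, or null if no exception was thrown.
    public static String groupBeforeMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        try {
            m.group();
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    // groupCount() is a property of the pattern, so it works without a match attempt.
    public static int groupCountBeforeMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).groupCount();
    }

    public static void main(String[] args) {
        String text = "117_117_0009v0_172_5738_5740";
        String regex = "([0-9_]+v._.)";
        System.out.println("group() before matching threw: " + groupBeforeMatch(regex, text));
        System.out.println("groupCount() before matching: " + groupCountBeforeMatch(regex, text));
    }
}
```

With one pair of parentheses in the pattern, groupCount() reports 1 even though no match has been attempted yet.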

I forgot how ugly this API is. Argh...

Upvotes: 1

Views: 421

Answers (3)

Vitalii Fedorenko

Reputation: 114420

If you want to check whether a string starts with a certain pattern, use the Matcher.lookingAt() method:

Pattern pattern = Pattern.compile("([0-9_]+v._.)");
Matcher matcher = pattern.matcher("117_117_0009v0_172_5738_5740");
if (matcher.lookingAt()) {
  int groupCount = matcher.groupCount();
  for (int i = 0; i <= groupCount; i++) {
    System.out.println(i + " : " + matcher.group(i));
  }
}

Javadoc:

boolean java.util.regex.Matcher.lookingAt()

Attempts to match the input sequence, starting at the beginning of the region, against the pattern. Like the matches method, this method always starts at the beginning of the region; unlike that method, it does not require that the entire region be matched. If the match succeeds then more information can be obtained via the start, end, and group methods.
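To make the difference between the three matching methods concrete, here is a small sketch using the pattern and input from the question. The class name, method names, and the "id=" sample input are invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchModes {
    static final Pattern PATTERN = Pattern.compile("[0-9_]+v._.");

    // matches(): the whole input must match the pattern.
    public static boolean wholeInput(String text) {
        return PATTERN.matcher(text).matches();
    }

    // lookingAt(): the match must start at the beginning, but may stop early.
    public static String prefix(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.lookingAt() ? m.group() : null;
    }

    // find(): the match may start anywhere in the input.
    public static String anywhere(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "117_117_0009v0_172_5738_5740";
        System.out.println(wholeInput(text));                 // false
        System.out.println(prefix(text));                     // 117_117_0009v0_1
        System.out.println(prefix("id=" + text));             // null
        System.out.println(anywhere("id=" + text));           // 117_117_0009v0_1
    }
}
```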

Upvotes: 1

npinti

Reputation: 52185

String.matches() only succeeds if the entire string matches the pattern, as if Java had implicitly added the ^ and $ anchors, so something like this should work:

public class PatternTest {
  public static void main(String[] args) {
    System.out.println("117_117_0009v0_172_5738_5740".matches("^([0-9_]+v._.).*$"));
  }
}

returns:

true

Match content:

117_117_0009v0_1

This is the code I used to extract the match:

Pattern p = Pattern.compile("^([0-9_]+v._.).*$");
String str = "117_117_0009v0_172_5738_5740";

Matcher m = p.matcher(str);
if (m.matches()) {
  System.out.println(m.group(1));
}
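As a quick check of the whole-string behaviour, the sketch below compares the pattern with and without explicit anchors. Since String.matches() already requires the full input to match, the ^ and $ should make no difference. The class and method names are hypothetical:

```java
class ImplicitAnchorsDemo {
    static final String TEXT = "117_117_0009v0_172_5738_5740";

    // Thin wrapper around String.matches(), which tests the entire string.
    public static boolean matchesWhole(String regex) {
        return TEXT.matches(regex);
    }

    public static void main(String[] args) {
        // Prefix-only pattern: fails, because the rest of the string is left over.
        System.out.println(matchesWhole("([0-9_]+v._.)"));      // false
        // Trailing .* consumes the remainder, so the whole string matches.
        System.out.println(matchesWhole("([0-9_]+v._.).*"));    // true
        // Explicit anchors are redundant with matches().
        System.out.println(matchesWhole("^([0-9_]+v._.).*$"));  // true
    }
}
```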

Upvotes: 3

Neel Basu

Reputation: 12904

I don't know the Java flavor of regular expressions, but this PCRE regular expression should work: ^([\d_]+v\d_\d).+ I don't know why you are using ._. instead of \d_\d.
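For reference, a sketch of how that expression could be written in Java, where each backslash has to be doubled inside a string literal. The class name, method names, and the "vX" sample input are assumptions added for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitPatternDemo {
    // The PCRE-style pattern from this answer, escaped for a Java string literal.
    static final Pattern STRICT = Pattern.compile("^([\\d_]+v\\d_\\d).+");
    // The question's pattern, where '.' accepts any character except a line terminator.
    static final Pattern LOOSE = Pattern.compile("[0-9_]+v._.");

    // Whole-string match; returns the captured prefix or null.
    public static String strictPrefix(String text) {
        Matcher m = STRICT.matcher(text);
        return m.matches() ? m.group(1) : null;
    }

    // Prefix match with the looser pattern; returns the matched text or null.
    public static String loosePrefix(String text) {
        Matcher m = LOOSE.matcher(text);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(strictPrefix("117_117_0009v0_172_5738_5740")); // 117_117_0009v0_1
        System.out.println(loosePrefix("117_117_0009vX_172"));            // 117_117_0009vX_1
        System.out.println(strictPrefix("117_117_0009vX_172"));           // null
    }
}
```

The last two lines show the practical difference: ._. accepts a non-digit after the v, while \d_\d rejects it.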

Upvotes: 0
