Omniwombat

Reputation: 744

How do I avoid the implicit "^" and "$" in Java regular expression matching?

I've been struggling with doing some relatively straightforward regular expression matching in Java 1.4.2. I'm much more comfortable with the Perl way of doing things. Here's what's going on:

I am attempting to match /^<foo>/ from "<foo><bar>"

I try:

Pattern myPattern= Pattern.compile("^<foo>");
Matcher myMatcher= myPattern.matcher("<foo><bar>");
System.out.println(myMatcher.matches());

And I get "false"

I am used to saying:

print "<foo><bar>" =~ /^<foo>/;

which does indeed return true.

After much searching and experimentation, I found a page that said:

"The String method further optimizes its search criteria by placing an invisible ^ before the pattern and a $ after it."

When I tried:

Pattern myPattern= Pattern.compile("^<foo>.*");
Matcher myMatcher= myPattern.matcher("<foo><bar>");
System.out.println(myMatcher.matches());

then it returns the expected true. I do not want that pattern though. The terminating .* should not be necessary.

Then I discovered the Matcher.useAnchoringBounds(boolean) method. I thought that expressly telling it to not use the anchoring bounds would work. It did not. I tried issuing a

myMatcher.reset();

in case I needed to flush it after turning the attribute off. No luck. Subsequently calling .matches() still returns false.

What have I overlooked?

Edit: Well, that was easy, thanks.

Upvotes: 4

Views: 1023

Answers (3)

erickson

Reputation: 269657

If you examine the "match", what part of the input string do you expect to find?

In other words,

Matcher myMatcher= myPattern.matcher("<foo><bar>");
if (myMatcher.matches()) {
  System.out.println(myMatcher.group(0));
}

… should print what?

If you are expecting it to print just "<foo>", use the find() method on Matcher instead of matches(). If you only want a match when the input starts with "<foo>", then you need to say so explicitly with a '^' in the pattern.

If you are expecting it to match "<foo><bar>", you need to include the trailing ".*".
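To make the difference concrete, here is a minimal sketch comparing the two calls on the input from the question. The class name FindVsMatches and the helpers firstMatch and wholeMatch are made up for illustration; only Pattern.compile, Matcher.find, Matcher.matches and Matcher.group come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class FindVsMatches {
    // Returns the text matched by find(), or null if the pattern is not found.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(0) : null;
    }

    // Returns true only if the pattern consumes the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "<foo><bar>";
        System.out.println(wholeMatch("^<foo>", input));        // false
        System.out.println(firstMatch("^<foo>", input));        // <foo>
        System.out.println(wholeMatch("^<foo>.*", input));      // true
        System.out.println(firstMatch("^<foo>", "<bar><foo>")); // null
    }
}
```

The last line shows that the explicit '^' still matters with find(): without it, find() would happily locate "<foo>" in the middle of the input.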

Upvotes: 3

matt b

Reputation: 139921

Matcher.useAnchoringBounds() was added in JDK1.5 so if you are using 1.4, I'm not sure that it would help you even if it did work (notice the @since 1.5 in the Javadocs).

The Javadocs for Matcher also state that the matches() method:

Attempts to match the entire region against the pattern.

(emphasis mine)

That explains why .matches() only returned true once you changed the pattern to end with .*.

To match against the region starting at the beginning, but not necessarily requiring that the entire region be matched, use either the find() or lookingAt() methods.
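As a rough sketch of how the three methods differ (the class AnchoredPrefix and its helper names are invented for this example; lookingAt(), find() and matches() have all been on Matcher since 1.4, so this should also apply to 1.4.2):

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class AnchoredPrefix {
    // lookingAt(): must match at the start of the input, need not reach the end.
    static boolean startsWithPattern(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // find(): may match anywhere in the input unless the pattern itself is anchored.
    static boolean containsPattern(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // matches(): must consume the entire input.
    static boolean isEntirely(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithPattern("<foo>", "<foo><bar>")); // true
        System.out.println(startsWithPattern("<foo>", "<bar><foo>")); // false
        System.out.println(containsPattern("<foo>", "<bar><foo>"));   // true
        System.out.println(isEntirely("<foo>", "<foo><bar>"));        // false
    }
}
```

Note that lookingAt() gives the Perl-style /^<foo>/ behaviour without writing the '^' at all, since it is always anchored at the start of the region.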

Upvotes: 3

jdigital

Reputation: 12276

Use the Matcher find() method instead of the matches() method.

Upvotes: 11
