Itay Maman

Reputation: 30733

Java regexp: exact match from a given point at the middle of the input

I have an input string on which I need to run several regexp patterns (some sort of a parser). When running these regexps I want to consider only a certain part of the string (from a given position until its end), and I want a pattern to match only if the match starts exactly at the given position.

Let's assume the input string is abcdefghij, that the current position is 1, and that I have these two patterns: p1 = [b-e]+ and p2 = [h-j]+.

Given that my position is 1, I want p1 to match and p2 not to match (p2 matches the hij part of the input, that is, its match starts at position 7 and not at position 1).

Using Matcher.find(offset) does not work as it does not require the match to start at the given position:

// Output: true (whereas I want it to be false)
System.out.println(Pattern.compile("[h-j]+").matcher("abcdefghij").find(1));

Note that adding a ^ to my patterns does not solve the problem:

// Output: false (whereas I want it to be true)
System.out.println(Pattern.compile("^[b-e]+").matcher("abcdefghij").find(1));

Other alternatives (that do not work):

(1) Applying .substring() to my input string (and adding ^ to all my patterns) will work, but the complexity of .substring() is O(n), which may be problematic for me (this is library code that will be used on potentially large inputs, in ways that I cannot predict upfront).

(2) I can use the matcher's object .start() method to determine where the match occurred, as follows:

Matcher matcher = Pattern.compile("[h-j]+").matcher("abcdefghij");
// Output: false (the match starts at position 7, not 1)
System.out.println(matcher.find(1) && matcher.start() == 1);

My problem with this is that the regexp engine will keep scanning the rest of the input string (which may be long), and only after it has found a match will the matcher.start() == offset check reject it if it is not at the desired position. This seems inefficient.

Upvotes: 2

Views: 219

Answers (1)

BeeOnRope

Reputation: 65046

Use Matcher.lookingAt(), which anchors the match at the start of the region but not at its end (unlike find(), which does not anchor at all, and matches(), which anchors at both ends).

Specifically:

Matcher m = Pattern.compile(".....").matcher(input);
m.region(offset, input.length());
if (m.lookingAt()) { 
  ...
}
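
As a rough, self-contained sketch of the same idea applied to the example from the question: the class name RegionMatchDemo and the helper matchEndAt are made up for illustration and are not part of java.util.regex. The helper restricts the matcher to the region starting at the offset and then calls lookingAt(), so a match is reported only if it begins exactly at that offset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionMatchDemo {

    // Hypothetical helper (not a JDK method): returns the end index of a match
    // that starts exactly at 'offset', or -1 if the pattern does not match there.
    static int matchEndAt(Pattern pattern, CharSequence input, int offset) {
        Matcher m = pattern.matcher(input);
        // Limit matching to [offset, input.length()); the input is not copied.
        m.region(offset, input.length());
        // lookingAt() must match at the region start but may stop before its end.
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        String input = "abcdefghij";
        Pattern p1 = Pattern.compile("[b-e]+");
        Pattern p2 = Pattern.compile("[h-j]+");

        System.out.println(matchEndAt(p1, input, 1)); // 5  ("bcde" starts at position 1)
        System.out.println(matchEndAt(p2, input, 1)); // -1 ("hij" starts at position 7, not 1)
        System.out.println(matchEndAt(p2, input, 7)); // 10
    }
}
```

Note that start() and end() still report indices relative to the whole input, not to the region, so the parser position can be advanced directly to the returned value.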

Upvotes: 2
