Reputation: 723
I'm trying to get the last match of a pattern without having to cycle through .find().
Here's my code:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
if (m.find()) {
in = m.group(1);
}
That will give me the first result. How do I find the LAST match without cycling through a potentially huge list?
Upvotes: 30
Views: 59359
Reputation: 1
Just use \Z, the end-of-input anchor:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)\\Z");
Matcher m = p.matcher(in);
if (m.find()) {
in = m.group(1);
}
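One caveat worth checking: anchoring with \Z only finds the last match when that match sits at the very end of the input. The sketch below illustrates this; the class name EndAnchorDemo and the helper lastAtEnd are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EndAnchorDemo {
    // Hypothetical helper: returns the digits captured by "num ([0-9]+)\Z",
    // or null when no match ends at the end of the input.
    public static String lastAtEnd(String input) {
        Matcher m = Pattern.compile("num ([0-9]+)\\Z").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The last "num" is the final thing in the input, so the anchor can match.
        System.out.println(lastAtEnd("num 123 num 2134"));      // 2134
        // Trailing text after the last "num" prevents any match.
        System.out.println(lastAtEnd("num 123 num 2134 done")); // null
    }
}
```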
Upvotes: 0
Reputation: 170158
You could prepend .*
to your regex, which will greedily consume all characters up to the last match:
import java.util.regex.*;
class Test {
public static void main (String[] args) {
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile(".*num ([0-9]+)");
Matcher m = p.matcher(in);
if(m.find()) {
System.out.println(m.group(1));
}
}
}
Prints:
2134
You could also reverse the string and change your regex to match the reversed text instead:
import java.util.regex.*;
class Test {
public static void main (String[] args) {
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("([0-9]+) mun");
Matcher m = p.matcher(new StringBuilder(in).reverse());
if(m.find()) {
System.out.println(new StringBuilder(m.group(1)).reverse());
}
}
}
But neither solution is better than just looping through all matches using while (m.find())
, IMO.
Upvotes: 21
Reputation: 24641
Compared to the currently accepted answer, this one does not blindly discard elements of the list using the ".*"
prefix. Instead, it uses "(element delimiter)*(element)"
to pick out the last element using .group(2)
. See the function magic_last
in code below.
To demonstrate the benefit of this approach I have also included a function to pick out the n-th element which is robust enough to accept a list that has fewer than n elements. See the function magic
in code below.
Filtering out the "num " text and only getting the number is left as an exercise for the reader (just add an extra group around the digits pattern: ([0-9]+)
and pick out group 4 instead of group 2).
package com.example;
import static java.lang.System.out;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Foo {
public static void main (String [] args) {
String element = "num [0-9]+";
String delimiter = ", ";
String input;
input = "here is a num bro: num 001; hope you like it";
magic_last(input, element, delimiter);
magic(1, input, element, delimiter);
magic(2, input, element, delimiter);
magic(3, input, element, delimiter);
input = "here are some nums bro: num 001, num 002, num 003, num 004, num 005, num 006; hope you like them";
magic_last(input, element, delimiter);
magic(1, input, element, delimiter);
magic(2, input, element, delimiter);
magic(3, input, element, delimiter);
magic(4, input, element, delimiter);
magic(5, input, element, delimiter);
magic(6, input, element, delimiter);
magic(7, input, element, delimiter);
magic(8, input, element, delimiter);
}
public static void magic_last (String input, String element, String delimiter) {
String regexp = "(" + element + delimiter + ")*(" + element + ")";
Pattern pattern = Pattern.compile(regexp);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
out.println(matcher.group(2));
}
}
public static void magic (int n, String input, String element, String delimiter) {
String regexp = "(" + element + delimiter + "){0," + (n - 1) + "}(" + element + ")(" + delimiter + element + ")*";
Pattern pattern = Pattern.compile(regexp);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
out.println(matcher.group(2));
}
}
}
Output:
num 001
num 001
num 001
num 001
num 006
num 001
num 002
num 003
num 004
num 005
num 006
num 006
num 006
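The exercise mentioned above (wrapping the digits in their own group and reading group 4) could look roughly like this. It is only a sketch: the class name LastElementDigits and the method lastDigits are invented for illustration, and the ", " delimiter is taken from the code above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastElementDigits {
    // Hypothetical helper: returns only the digits of the last element
    // in the first delimiter-separated run, or null if nothing matches.
    public static String lastDigits(String input) {
        String element = "num ([0-9]+)"; // extra group around the digits
        String delimiter = ", ";
        // Group 1: repeated "element + delimiter"; group 2: digits inside group 1;
        // group 3: the final element; group 4: digits of the final element.
        Pattern pattern = Pattern.compile("(" + element + delimiter + ")*(" + element + ")");
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group(4) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastDigits("here are some nums bro: num 001, num 002, num 003; hope you like them")); // 003
    }
}
```

Note that the match stops at the first place where the delimiter is missing, so for an input such as "num 1, num 2; num 3" this returns "2" rather than "3".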
Upvotes: 0
Reputation: 963
Use negative lookahead:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num (\\d+)(?!.*num \\d+)");
Matcher m = p.matcher(in);
if (m.find()) {
in= m.group(1);
}
The regular expression reads as "num followed by one space and at least one digit without any (num followed by one space and at least one digit) at any point after it".
You can get even fancier by combining it with positive lookbehind:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("(?<=num )\\d+(?!.*num \\d+)");
Matcher m = p.matcher(in);
if (m.find()) {
in = m.group();
}
That one reads as "at least one digit preceded by (num and one space) and not followed by (num followed by one space and at least one digit) at any point after it".
That way you don't have to mess with grouping and worry about the potential IndexOutOfBoundsException
thrown from Matcher.group(int)
.
Upvotes: 4
Reputation: 576
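Note that m.group(m.groupCount()) does not return the last match: groupCount() is the number of capturing groups in the pattern (1 here), so the call below is the same as m.group(1) on the first match. The pattern as originally posted also contained a stray apostrophe and a trailing space, which are removed here:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
if (m.find()) {
// groupCount() == 1, so this yields "123" (group 1 of the first match)
in = m.group(m.groupCount());
}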
Upvotes: 15
Reputation: 164
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
String result = "";
// Walk through every match; after the loop, result holds group 1 of the last one
while (m.find())
{
result = m.group(1);
}
Upvotes: 0
Reputation: 613
This seems like an equally plausible approach.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class LastMatchTest {
public static void main(String[] args) throws Exception {
String target = "num 123 num 1 num 698 num 19238 num 2134";
Pattern regex = Pattern.compile("(?:.*?num.*?(\\d+))+");
Matcher regexMatcher = regex.matcher(target);
if (regexMatcher.find()) {
System.out.println(regexMatcher.group(1));
}
}
}
The .*?
is a reluctant match so it won't gobble up everything. The ?:
forces a non-capturing group so the inner group is group 1. Matching multiples in a greedy fashion causes it to match across the entire string until all matches are exhausted leaving group 1 with the value of your last match.
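The behaviour this relies on, that a capturing group inside a repeated group keeps the text from its most recent iteration, can be seen in a smaller example. This is a minimal sketch; the class RepeatedGroupDemo, the method lastIteration, and the comma-separated input are assumptions made for the demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // Hypothetical helper: group 1 sits inside a repeated non-capturing group,
    // so after the match it holds whatever the final iteration captured.
    public static String lastIteration(String input) {
        Matcher m = Pattern.compile("(?:(\\d+),?)+").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastIteration("10,20,30")); // 30
    }
}
```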
Upvotes: 0
Reputation: 10544
Regular expression quantifiers are greedy by default:
Matcher m=Pattern.compile(".*num ([0-9]+)",Pattern.DOTALL).matcher("num 123 num 1 num 698 num 19238 num 2134");
After a successful m.find(), this gives you a Matcher
whose group 1 is the last match, and you can apply the same trick to most regexes by prepending ".*". Of course, if you can't use DOTALL
, you might want to use (?:\d|\D)
or something similar as your wildcard.
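To see why the DOTALL flag matters here, compare the two modes on input that contains a line break. This is a sketch under assumptions: the class DotallDemo, the helper lastNum, and the two-line input are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotallDemo {
    // Hypothetical helper: applies ".*num ([0-9]+)" with the given flags
    // and returns group 1 of the first find(), or null if nothing matches.
    public static String lastNum(String input, int flags) {
        Matcher m = Pattern.compile(".*num ([0-9]+)", flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "num 1\nnum 2";
        // Without DOTALL, '.' stops at the line break, so the first find()
        // is confined to the first line.
        System.out.println(lastNum(input, 0));              // 1
        // With DOTALL, ".*" can cross the line break and reach the last "num".
        System.out.println(lastNum(input, Pattern.DOTALL)); // 2
    }
}
```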
Upvotes: 0
Reputation: 81074
Java does not provide such a mechanism. The only thing I can suggest would be a binary search for the last index.
It would be something like this:
N = haystack.length();
if ( matcher.find(N/2) ) {
recursively try right side
} else {
recursively try left side
}
And here's code that does it since I found it to be an interesting problem:
import org.junit.Test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static org.junit.Assert.assertEquals;
public class RecursiveFind {
@Test
public void testFindLastIndexOf() {
assertEquals(0, findLastIndexOf("abcdddddd", "abc"));
assertEquals(1, findLastIndexOf("dabcdddddd", "abc"));
assertEquals(4, findLastIndexOf("aaaaabc", "abc"));
assertEquals(4, findLastIndexOf("aaaaabc", "a+b"));
assertEquals(6, findLastIndexOf("aabcaaabc", "a+b"));
assertEquals(2, findLastIndexOf("abcde", "c"));
assertEquals(2, findLastIndexOf("abcdef", "c"));
assertEquals(2, findLastIndexOf("abcd", "c"));
}
public static int findLastIndexOf(String haystack, String needle) {
return findLastIndexOf(0, haystack.length(), Pattern.compile(needle).matcher(haystack));
}
private static int findLastIndexOf(int start, int end, Matcher m) {
if ( start > end ) {
return -1;
}
int pivot = ((end-start) / 2) + start;
if ( m.find(pivot) ) {
//recurse on right side
return findLastIndexOfRecurse(end, m);
} else if (m.find(start)) {
//recurse on left side
return findLastIndexOfRecurse(pivot, m);
} else {
//not found at all between start and end
return -1;
}
}
private static int findLastIndexOfRecurse(int end, Matcher m) {
int foundIndex = m.start();
int recurseIndex = findLastIndexOf(foundIndex + 1, end, m);
if ( recurseIndex == -1 ) {
return foundIndex;
} else {
return recurseIndex;
}
}
}
I haven't found a breaking test case yet.
Upvotes: 3
Reputation: 5728
Java patterns are greedy by default, so the following should do it:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile( ".*num ([0-9]+).*$" );
Matcher m = p.matcher( in );
if ( m.matches() )
{
System.out.println( m.group( 1 ));
}
Upvotes: 2
Reputation: 30032
Why not keep it simple?
in.replaceAll(".*[^\\d](\\d+).*", "$1")
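Two things to keep in mind with this one-liner: it picks up the last run of digits that follows a non-digit, whether or not "num " precedes it, and it returns the input unchanged when the pattern does not match. A small sketch, with the class ReplaceAllDemo and the method lastNumber made up for the example:

```java
public class ReplaceAllDemo {
    // Hypothetical wrapper around the one-liner above.
    public static String lastNumber(String in) {
        return in.replaceAll(".*[^\\d](\\d+).*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(lastNumber("num 123 num 1 num 698 num 19238 num 2134")); // 2134
        // No match at all: replaceAll leaves the string as it is.
        System.out.println(lastNumber("no digits here")); // no digits here
    }
}
```

A caller that needs to tell "no match" apart from a real result would have to compare the return value with the input or use a Matcher instead.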
Upvotes: 6