Jens Schauder

Reputation: 81970

Extracting two numbers from a string

I have a String like the following one:

"some value is 25 but must not be bigger then 12"

I want to extract the two numbers from the string.

The numbers are integers.

There might be no text before the first number and some text after the second number.

I tried to do it with a regexp and groups, but failed miserably:

public MessageParser(String message) {
    Pattern stringWith2Numbers = Pattern.compile(".*(\\d?).*(\\d?).*");
    Matcher matcher = stringWith2Numbers.matcher(message);
    if (!matcher.matches()) {
        couldParse = false;
        firstNumber = 0;
        secondNumber = 0;
    } else {
        final String firstNumberString = matcher.group(1);
        firstNumber = Integer.valueOf(firstNumberString);
        final String secondNumberString = matcher.group(2);
        secondNumber = Integer.valueOf(secondNumberString);

        couldParse = true;
    }
}

Any help is appreciated.

Upvotes: 3

Views: 2238

Answers (3)

jjnguy

Reputation: 138912

Your pattern should look more like:

Pattern stringWith2Numbers = Pattern.compile("\\D*(\\d+)\\D+(\\d+)\\D*");

You need \\d+ rather than \\d? because each number can be one or more digits, and \\d? matches at most a single digit (or nothing at all).
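For illustration, here is a minimal sketch of how that pattern could be used with matches(). The class name TwoNumberParser and the parse helper are made up for this example and are not part of the question's code; returning null on failure is just one possible way to signal "could not parse".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoNumberParser {
    // Optional non-digits, first number, at least one non-digit, second number, optional non-digits.
    private static final Pattern TWO_NUMBERS =
            Pattern.compile("\\D*(\\d+)\\D+(\\d+)\\D*");

    // Hypothetical helper: returns the two numbers, or null if the whole
    // message does not consist of exactly two digit runs plus non-digit text.
    static int[] parse(String message) {
        Matcher matcher = TWO_NUMBERS.matcher(message);
        if (!matcher.matches()) {
            return null;
        }
        // Integer.parseInt throws NumberFormatException if a digit run overflows int.
        return new int[] {
            Integer.parseInt(matcher.group(1)),
            Integer.parseInt(matcher.group(2))
        };
    }

    public static void main(String[] args) {
        int[] numbers = parse("some value is 25 but must not be bigger then 12");
        System.out.println(numbers[0] + " and " + numbers[1]);
    }
}
```

Because matches() must consume the whole input, a message with only one number, or with three, is rejected rather than partially parsed.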

Upvotes: 8

BlairHippo

Reputation: 9658

Your ".*" patterns are being greedy, as is their wont, and are gobbling up as much as they can -- which is going to be the entire string. So that first ".*" is matching the entire string, rendering the rest moot. Also, your "\\d?" clauses indicate a single digit which happens to be optional, neither of which is what you want.
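To see the greediness concretely, here is a small sketch that runs the original pattern against the sample string and returns what the two groups captured. GreedyDemo and groupsOf are names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Hypothetical helper: applies regex to the whole input with matches()
    // and returns groups 1 and 2, or null if the input does not match.
    static String[] groupsOf(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String message = "some value is 25 but must not be bigger then 12";
        String[] groups = groupsOf(".*(\\d?).*(\\d?).*", message);
        // The pattern matches, but the first .* has taken the whole string,
        // so both optional digit groups capture the empty string.
        System.out.println("group 1 length: " + groups[0].length());
        System.out.println("group 2 length: " + groups[1].length());
    }
}
```

Since both groups come back as empty strings, the Integer.valueOf calls in the original constructor throw a NumberFormatException even though matches() returned true.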

This is probably more in line with what you're shooting for:

Pattern stringWith2Numbers = Pattern.compile(".*?(\\d+).*?(\\d+).*?");

Of course, since you don't really care about the stuff before or after the numbers, why bother with them?

Pattern stringWith2Numbers = Pattern.compile("(\\d+).*?(\\d+)");

That ought to do the trick.

Edit: Taking time out from writing butt-kickingly awesome comics, Alan Moore pointed out some problems with my solution in the comments. For starters, if you have only a single multi-digit number in the string, my solution gets it wrong. Applying it to "This 123 is a bad string" would cause it to return "12" and "3" when it ought to simply fail. A better regex would stipulate that there MUST be at least one non-digit character separating the two numbers, like so:

Pattern stringWith2Numbers = Pattern.compile("(\\d+)\\D+(\\d+)");

Also, matches() applies the pattern to the entire string, essentially bracketing it in ^ and $; find() would do the trick, but that's not what the OP was using. So sticking with matches(), we'd need to bring back in those "useless" clauses in front of and after the two numbers. (Though having them explicitly match non-digits instead of the wildcard is better form.) So it would look like:

Pattern stringWith2Numbers = Pattern.compile("\\D*(\\d+)\\D+(\\d+)\\D*");

... which, it must be noted, is damn near identical to jjnguy's answer.
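The difference between find() and matches() discussed above can be checked with a short sketch. FindVersusMatches and firstTwo are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVersusMatches {
    // Hypothetical helper: returns groups 1 and 2, or null on failure.
    // wholeString = true uses matches() (pattern must cover the entire input);
    // wholeString = false uses find() (pattern may match any substring).
    static String[] firstTwo(String regex, String input, boolean wholeString) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        boolean ok = wholeString ? matcher.matches() : matcher.find();
        return ok ? new String[] { matcher.group(1), matcher.group(2) } : null;
    }

    public static void main(String[] args) {
        String good = "some value is 25 but must not be bigger then 12";
        String bad = "This 123 is a bad string";

        // Short pattern with find(): locates the two numbers inside the text.
        System.out.println(String.join(",", firstTwo("(\\d+)\\D+(\\d+)", good, false)));
        // Same pattern with matches(): fails because of the surrounding text.
        System.out.println(firstTwo("(\\d+)\\D+(\\d+)", good, true));
        // Lazy wildcard between the groups: splits a single number in two.
        System.out.println(String.join(",", firstTwo("(\\d+).*?(\\d+)", bad, false)));
        // Requiring a non-digit separator: correctly fails on a single number.
        System.out.println(firstTwo("(\\d+)\\D+(\\d+)", bad, false));
    }
}
```

One caveat with the find() variant: on a string containing three or more numbers it would silently return the first two, whereas the full pattern used with matches() rejects such input.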

Upvotes: 3

Ben S

Reputation: 69362

Your regex matches, but everything gets eaten up by your first .* and the rest matches the empty string.

Change your regex to "\\D*(\\d+)\\D+(\\d+)\\D*".

This should be read as: any number of non-digit characters, then at least one numeric digit, followed by at least one character that isn't a numeric digit, followed by at least one numeric digit, then any number of non-digit characters.

Upvotes: 2
