Hazem Elraffiee

Reputation: 453

How to match a long String with a regex fast?

I have this regex "((\\-)?[0-9]+(.([0-9])+)? )+" that should match a sequence of numbers, each separated by a single space. For example, "5 4 1 2 2.4 3 7.8" or "5 4 1 2 2.4 8.001 7.8".

To check whether the string matches the regex, I do:

if((value+" ").matches("((\\-)?[0-9]+(.([0-9])+)? )+")){
    // anything
}

The thing is, when I give it a short string like the examples above, it works perfectly. But for longer strings like "2000000 2000000 2000000 2000000 2000000 2000000 2000000 2000000", it is fast if the string matches, but takes up to 5 seconds if it doesn't. Check this:

String value = "2000000 2000000 2000000 2000000 2000000 2000000 2000000 2000000 h";

System.out.println("Start: "+System.currentTimeMillis());
if((value+" ").matches("((\\-)?[0-9]+(.([0-9])+)? )+")){
    System.out.println("OK");
}else{
    System.out.println("NOK");
}
System.out.println("End: "+System.currentTimeMillis());

This takes up to 5 seconds, whereas if you remove the " h" from the end of the string, it takes less than 1 ms.

Any ideas?

Upvotes: 3

Views: 1402

Answers (2)

neworld

Reputation: 7793

First, you need to fix your regex:

"((\\-)?[0-9]+(\\.([0-9])+)? )+"

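because in your version the unescaped . matches any character between two numbers, including the space. It can also match a digit, so the engine has many ways to split each number and tries them all before giving up on a string that does not match. That is probably what hurts performance.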
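As a rough sketch of the corrected pattern in use (the class and method names here are made up for illustration, and compiling the pattern once is my own addition, not something the question does):

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class EscapedDotExample {
    // Same pattern as above with the dot escaped; compiled once and reused.
    private static final Pattern NUMBERS =
            Pattern.compile("((\\-)?[0-9]+(\\.([0-9])+)? )+");

    // Returns true if value is a sequence of numbers separated by single spaces.
    static boolean isNumberList(String value) {
        // The trailing space mirrors the (value + " ") trick from the question.
        return NUMBERS.matcher(value + " ").matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumberList("5 4 1 2 2.4 3 7.8")); // prints true
        System.out.println(isNumberList(
                "2000000 2000000 2000000 2000000 2000000 2000000 2000000 2000000 h")); // prints false
        // "5a4" is accepted by the unescaped version, but rejected here.
        System.out.println(isNumberList("5a4")); // prints false
    }
}
```

With the dot escaped, each number can only be matched one way, so a failing input should no longer trigger the long backtracking run.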

After that, you can first scan for any character that is not allowed and, if you find one, skip the regex check entirely, or split the string into smaller pieces as someone suggested before.
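The pre-check idea could look something like this. It is only a sketch: the class and method names are hypothetical, and the set of allowed characters (digits, '-', '.', and space) is my assumption based on the regex.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class PreCheckExample {
    private static final Pattern NUMBERS =
            Pattern.compile("((\\-)?[0-9]+(\\.([0-9])+)? )+");

    // Cheap linear scan: true only if every character could appear in a valid list.
    static boolean hasOnlyAllowedChars(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean allowed = (c >= '0' && c <= '9') || c == '-' || c == '.' || c == ' ';
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    // Runs the regex only when the cheap scan finds nothing suspicious.
    static boolean isNumberList(String value) {
        return hasOnlyAllowedChars(value) && NUMBERS.matcher(value + " ").matches();
    }
}
```

The scan rejects inputs such as "2000000 2000000 h" without touching the regex at all. Inputs like "1..2" pass the scan and are still rejected by the regex.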

Upvotes: 1

Brian Agnew

Reputation: 272257

I suspect you'll get much faster performance if you split the above into a sequence of numbers (by splitting on whitespace) and then apply a simpler regexp to each substring.
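A minimal sketch of that approach, assuming the separator is exactly one space as in the question (the class and method names are hypothetical):

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class SplitExample {
    // Matches a single number: optional minus, digits, optional fractional part.
    private static final Pattern NUMBER = Pattern.compile("-?[0-9]+(\\.[0-9]+)?");

    static boolean isNumberList(String value) {
        // The limit of -1 keeps empty tokens, so a double space or a
        // leading/trailing space yields an empty token and is rejected.
        for (String token : value.split(" ", -1)) {
            if (!NUMBER.matcher(token).matches()) {
                return false; // stop at the first bad token
            }
        }
        return true;
    }
}
```

Because each token is checked on its own, a bad token such as "h" fails immediately and the cost stays roughly proportional to the length of the input.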

Upvotes: 3
