littleK
littleK

Reputation: 20163

Poor performance of many if-else statements in Java

I have a method that checks all of the combinations of 5 different conditions with 32 if-else statements (think of the truth table). The 5 different letters represent methods that each run their own regular expressions on a string, and return a boolean indicating whether or not the string matches the regex. For example:

if(A,B,C,D,E){

}else if(A,B,C,D,!E){

}else if(A,B,C,!D,!E){

}...etc,etc.

However, it is really affecting the performance of my application (sorry, I can't go into too many details). Can anyone recommend a better way to handle such logic?

Each method using a regular expression looks like this:

String re1 = "regex here";
Pattern p = Pattern.compile(re1, Pattern.DOTALL);
Matcher m = p.matcher(value);
return m.find();

Thanks!

Upvotes: 8

Views: 4608

Answers (10)

toto2
toto2

Reputation: 5326

I have a solution with EnumSet. However it's too verbose and I guess I prefer @Peter Lawrey's solution.

In Effective Java by Bloch it's recommended to use EnumSet over bit fields, but I would make an exception here. Nonetheless I posted my solution because it could be useful for someone with a slightly different problem.

import java.util.EnumSet;

public enum MatchingRegex {
  Tall, Blue, Hairy;

  public static EnumSet<MatchingRegex> findValidConditions(String stringToMatch) {
     EnumSet<MatchingRegex> validConditions = EnumSet.noneOf(MatchingRegex.class);
     if (... check regex stringToMatch for Tall)
       validConditions.add(Tall);
     if (... check regex stringToMatch for Blue)
       validConditions.add(Blue);
     if (... check regex stringToMatch for Hairy)
       validConditions.add(Hairy);
     return validConditions;         
  }
}

and you use it like this:

Set<MatchingRegex> validConditions = MatchingRegex.findValidConditions(stringToMatch);

if (validConditions.equals(EnumSet.of(MatchingRegex.Tall, MathchingRegex.Blue, MatchingRegex.Hairy))
   ...
else if (validConditions.equals(EnumSet.of(MatchingRegex.Tall, MathchingRegex.Blue))
   ...
else if ... all 8 conditions like this

But it would be more efficient like this:

if (validConditions.contains(MatchingRegex.Tall)) {
  if (validConditions.contains(MatchingRegex.Blue)) {
     if (validConditions.contains(MatchingRegex.Hairy)) 
        ... // tall blue hairy
     else
        ... // tall blue (not hairy)
  } else {
     if (validConditions.contains(MatchingRegex.Hairy)) 
        ... // tall (not blue) hairy
     else
        ... // tall (not blue) (not hairy)
} else {
      ... remaining 4 conditions
}

Upvotes: 2

Andrew Spencer
Andrew Spencer

Reputation: 16514

All the above answers are wrong, because the correct answer to an optimisation question is: Measure! Use a profiler to measure where your code is spending its time.

Having said that, I'd be prepared to bet that the biggest win is avoiding compiling the regexes more than once each. And after that, as others suggested, only evaluate each condition once and store the results in boolean variables. So thait84 has the best answer.

I'm also prepared to bet jtahlborn and Peter Lawrey's and Salvatore Previti suggestions (essentially the same), clever though they are, will get you negligible additional benefit, unless you're running on a 6502...

(This answer reads like I'm full of it, so in the interests of full disclosure I should mention that I'm actually hopeless at optimisation. But measuring still is the right answer.)

Upvotes: 4

tjg184
tjg184

Reputation: 4686

Without knowing more details, it might be helpful to arrange the if statements in such a way that the ones which do the "heavy" lifting are executed last. This is making the assumption that the other conditionals will be true thereby avoiding the "heavy" lifting ones all together. In short, take advantage of short-circuits if possible.

Upvotes: 3

G_H
G_H

Reputation: 12019

Maybe partition it into layers, like so:

if(A) {
    if(B) {
        //... the rest
    } else {
        //... the rest
    }
} else {
    if(B) {
        //... the rest
    } else {
        //... the rest
    }
}

Still, feels like there must be a better way to do this.

Upvotes: 2

Salvatore Previti
Salvatore Previti

Reputation: 9080

One possible solution: use a switch creating a binary value.

int value = (a ? 1 : 0) | (b ? 2 : 0) | (c ? 4 : 0) | (d ? 8 : 0) | (e ? 16 : 0);

switch (value)
{
    case 0:
    case 1:
    case 2:
    case 3:
    case 4:
    ...
    case 31:
}

If you can avoid the switch and use an array it would be faster.

Upvotes: 2

PypeBros
PypeBros

Reputation: 2646

pre-generating A,B,C,D and E as booleans rather than evaluating them in if conditions blocks would provide both readability and performance. If you're also concerned about performance the different cases, you may organise them as a tree or combine them into a single integer (X = (A?1:0)|(B?2:0)|...|(E?16:0)) that you'd use in a switch.

Upvotes: 0

thait84
thait84

Reputation: 198

Run the regex once for each string and store the results in to booleans and just do the if / else on the booleans instead of running the regex multiple times. Also, if you can, try to re-use a pre-compiled version of your regex and re-use this.

Upvotes: 2

Peter Lawrey
Peter Lawrey

Reputation: 533880

You can try

boolean a,b,c,d,e;
int combination = (a?16:0) + (b?8:0) + (c?4:0) + (d?2:0) + (e?1:0);
switch(combination) {
   case 0:
        break;
   // through to
   case 31:
        break;
}

Upvotes: 18

Frank Tudor
Frank Tudor

Reputation: 4534

You could also adapt your if/else to a switch/case (which I understand is faster)

Upvotes: 0

jtahlborn
jtahlborn

Reputation: 53694

represent each condition as a bit flag, test each condition once, and set the relevant flag in a single int. then switch on the int value.

int result = 0;
if(A) {
  result |= 1;
}
if(B) {
  result |= 2;
}
// ...

switch(result) {
  case 0: // (!A,!B,!C,!D,!E)
  case 1: // (A,!B,!C,!D,!E)
  // ...
}

Upvotes: 8

Related Questions