Reputation: 20163
I have a method that checks all of the combinations of 5 different conditions with 32 if-else statements (think of the truth table). The 5 different letters represent methods that each run their own regular expressions on a string, and return a boolean indicating whether or not the string matches the regex. For example:
if(A,B,C,D,E){
}else if(A,B,C,D,!E){
}else if(A,B,C,!D,!E){
}...etc,etc.
However, it is really affecting the performance of my application (sorry, I can't go into too many details). Can anyone recommend a better way to handle such logic?
Each method using a regular expression looks like this:
String re1 = "regex here";
Pattern p = Pattern.compile(re1, Pattern.DOTALL);
Matcher m = p.matcher(value);
return m.find();
Thanks!
Upvotes: 8
Views: 4608
Reputation: 5326
I have a solution with EnumSet. However it's too verbose and I guess I prefer @Peter Lawrey's solution.
In Effective Java by Bloch it's recommended to use EnumSet over bit fields, but I would make an exception here. Nonetheless I posted my solution because it could be useful for someone with a slightly different problem.
import java.util.EnumSet;
public enum MatchingRegex {
Tall, Blue, Hairy;
public static EnumSet<MatchingRegex> findValidConditions(String stringToMatch) {
EnumSet<MatchingRegex> validConditions = EnumSet.noneOf(MatchingRegex.class);
if (... check regex stringToMatch for Tall)
validConditions.add(Tall);
if (... check regex stringToMatch for Blue)
validConditions.add(Blue);
if (... check regex stringToMatch for Hairy)
validConditions.add(Hairy);
return validConditions;
}
}
and you use it like this:
Set<MatchingRegex> validConditions = MatchingRegex.findValidConditions(stringToMatch);
if (validConditions.equals(EnumSet.of(MatchingRegex.Tall, MathchingRegex.Blue, MatchingRegex.Hairy))
...
else if (validConditions.equals(EnumSet.of(MatchingRegex.Tall, MathchingRegex.Blue))
...
else if ... all 8 conditions like this
But it would be more efficient like this:
if (validConditions.contains(MatchingRegex.Tall)) {
if (validConditions.contains(MatchingRegex.Blue)) {
if (validConditions.contains(MatchingRegex.Hairy))
... // tall blue hairy
else
... // tall blue (not hairy)
} else {
if (validConditions.contains(MatchingRegex.Hairy))
... // tall (not blue) hairy
else
... // tall (not blue) (not hairy)
} else {
... remaining 4 conditions
}
Upvotes: 2
Reputation: 16514
All the above answers are wrong, because the correct answer to an optimisation question is: Measure! Use a profiler to measure where your code is spending its time.
Having said that, I'd be prepared to bet that the biggest win is avoiding compiling the regexes more than once each. And after that, as others suggested, only evaluate each condition once and store the results in boolean variables. So thait84 has the best answer.
I'm also prepared to bet jtahlborn and Peter Lawrey's and Salvatore Previti suggestions (essentially the same), clever though they are, will get you negligible additional benefit, unless you're running on a 6502...
(This answer reads like I'm full of it, so in the interests of full disclosure I should mention that I'm actually hopeless at optimisation. But measuring still is the right answer.)
Upvotes: 4
Reputation: 4686
Without knowing more details, it might be helpful to arrange the if statements in such a way that the ones which do the "heavy" lifting are executed last. This is making the assumption that the other conditionals will be true thereby avoiding the "heavy" lifting ones all together. In short, take advantage of short-circuits if possible.
Upvotes: 3
Reputation: 12019
Maybe partition it into layers, like so:
if(A) {
if(B) {
//... the rest
} else {
//... the rest
}
} else {
if(B) {
//... the rest
} else {
//... the rest
}
}
Still, feels like there must be a better way to do this.
Upvotes: 2
Reputation: 9080
One possible solution: use a switch creating a binary value.
int value = (a ? 1 : 0) | (b ? 2 : 0) | (c ? 4 : 0) | (d ? 8 : 0) | (e ? 16 : 0);
switch (value)
{
case 0:
case 1:
case 2:
case 3:
case 4:
...
case 31:
}
If you can avoid the switch and use an array it would be faster.
Upvotes: 2
Reputation: 2646
pre-generating A,B,C,D and E as booleans rather than evaluating them in if
conditions blocks would provide both readability and performance. If you're also concerned about performance the different cases, you may organise them as a tree or combine them into a single integer (X = (A?1:0)|(B?2:0)|...|(E?16:0)) that you'd use in a switch
.
Upvotes: 0
Reputation: 198
Run the regex once for each string and store the results in to booleans and just do the if / else on the booleans instead of running the regex multiple times. Also, if you can, try to re-use a pre-compiled version of your regex and re-use this.
Upvotes: 2
Reputation: 533880
You can try
boolean a,b,c,d,e;
int combination = (a?16:0) + (b?8:0) + (c?4:0) + (d?2:0) + (e?1:0);
switch(combination) {
case 0:
break;
// through to
case 31:
break;
}
Upvotes: 18
Reputation: 4534
You could also adapt your if/else to a switch/case (which I understand is faster)
Upvotes: 0
Reputation: 53694
represent each condition as a bit flag, test each condition once, and set the relevant flag in a single int. then switch on the int value.
int result = 0;
if(A) {
result |= 1;
}
if(B) {
result |= 2;
}
// ...
switch(result) {
case 0: // (!A,!B,!C,!D,!E)
case 1: // (A,!B,!C,!D,!E)
// ...
}
Upvotes: 8