vitrums
vitrums

Reputation: 512

How to avoid duplicating nested switch-statements

Many of us encountered cases, where a top-level switch-statement would seem to be a great solution. And on a closer look you started to recognize some problems.

Suppose we parse a string manually, i.e. char-by-char. For the sake of simplicity suppose the string is comprised of a subset of latin letters and something else (e.g. parentheses). Latin characters ([abc]) make our lexer perform a very similar task (yet not completely primitive) and we instinctively want to group these cases. On the surface, there are two ways:

1) No grouping: code duplication.

case 'a':
  doOnA();
  break;
case 'b':
  doOnB();
  break;
case 'c':
  doOnC();
  break;
// Other cases
case '(':
  doOnOpening();
  break;
...

Methods doOnA, doOnB, doOnC are somewhat ugly due to it's copy-paste nature:

void doOnA() {
  // Do something specific to 'a'
  IntermediateResult ir = ...;
  // And then do something common
  ... // This code is copied to every method
}

To reduce the amount of copy-pasting common lines could be grouped in a method like:

void thisCodeRepeatsInEveryMethodNow(IntermediateResult ir) {
  ...
}

And then we hope, that JVM inlines it at some point.

2) Grouping: nested switch-statement.

case 'a':
case 'b':
case 'c':
  doOnLatinLetter(c);
  break;
case '(':
  doOnOpening();
  break;
...

Since we hear often that nested switch-statements are evil, we introduce the method doOnLatinLetter.

void doOnLatinLetter(char c) {
  IntermediateResult ir;
  switch (c) {
    case 'a':
      ir = ...;
      break;
    case 'b':
      ir = ...;
      break;
    case 'c':
      ir = ...;
      break;
  }

  // And then do something common
  ...
}

So the price is that now we have to check if c is indeed 'a', 'b' or 'c' twice. But is it the only price?

Notice that this method doOnLatinLetter doesn't help avoid any code duplication (it has a single call throughout the program) and therefore its "refactoring merits" are minimal (fall into the same category as moving static inner class to the new file etc.) Nevertheless, the introduction of such a method is a frequent advice in this case. In other words code looks fancier with it. My guess is that this kind of refactoring always had both supporters and opponents.

3) So eventually I came down to a mixture of IF and SWITCH blocks like this:

if (c >= 'a' && c <='c') { // grouping by latin letters
  IntermediateResult ir;
  switch (c) {
    case 'a':
      ir = ...;
      break;
    case 'b':
      ir = ...;
      break;
    case 'c':
      ir = ...;
      break;
  }

  // And then do something common
  ...
} else { // everything else, that doesn't require grouping
  switch (c) {
    case '(':
    doOnOpening();
    break;
  ...
  }
}

If semantically we have to fragment the input into more groups, then we'll just add more "else if" clauses.

But my problem is that both 2nd and 3rd solution still tend to look like hacks. Obviously polymorphism here will also feel awkward. Is there a true elegant way in Java to tackle this problem?

Upvotes: 1

Views: 1313

Answers (2)

Marc G. Smith
Marc G. Smith

Reputation: 886

You could consider having separate handlers for each case and using inheritance to cover grouped cases, with a provider to collect them together. For example (the inner classes are just for brevity I would break them out into their own files in the real world)...

public class HandlerProvider implements IntConsumer {
    private final Map<Character, Handler> handlers = new HashMap<>();
    private final Handler defaultHandler = new NoOpHandler();

    public HandlerProvider() {
        register('a', new LetterAHandler());
        // Other case handlers ...
        register('(', new OpeningHandler());
    }

    public void register(char ch, Handler handler) {
        handlers.put(ch, handler);
    }

    public void accept(int value) {
        Character ch = (char) value;
        get(ch).accept(ch);
    }

    public Handler get(char ch) {
        return handlers.getOrDefault(ch, defaultHandler);
    }

    public interface Handler {
        void accept(char c);
    }

    public abstract class LetterHandler implements Handler {
        public void accept(char character) {
            IntermediateResult ir = getIntermediateResult(character);
            // And then do something common
        }
        public abstract IntermediateResult getIntermediateResult(int character);
    }

    public class LetterAHandler extends LetterHandler {
        public IntermediateResult getIntermediateResult(int character) {
            IntermediateResult ir = new IntermediateResult();
            // Do something specific to 'character'
            return ir;
        }
    }

    // Other case handlers ...

    public class OpeningHandler implements Handler {
        public void accept(char character) {
        }
    }

    // Default handler if no other matches are found... 
    // could have different behaviour such throw an error
    public class NoOpHandler implements Handler {
        public void accept(char character) {
        }
    }
}

Then you can process the characters as a stream:

    HandlerProvider provider = new HandlerProvider();
    String input = ...;
    input.chars().forEachOrdered(provider);

Or alternatively each character individually

    for (char c : s.toCharArray()) {
       provider.get(c).accept(ch);        
    }

It's a bit more verbose up front but a more flexible and ultimately with a lot of cases becomes easier to maintain. Obviously you need to an extra parameter to the Handler method so your handlers can act on whatever the output is going to be or at least interrogate the current state. But you get the general idea.

Upvotes: 1

Alexandre Cassagne
Alexandre Cassagne

Reputation: 2463

This is something that will be specific to your coding style and that of your organisation. However, I think the 'blend' of solution 2 and 3 would throw me off. The reason for using a switch statement is typically to be exhaustive about the cases—in fact, many IDEs will give you a warning if that is not the case!

In your examples, the most parsimonious version of this code seems to be—to me—solution 2). However, if you frequently need to check whether the character is a letter, consider creating helper functions:

private boolean isLower(char c) {
    return c >= 'a' && c <= 'z';
}
private boolean isOpeningToken(char c) {
    return c == '(' || c == '{';
}

You should then be able to perform a simple if .. else statement which conveys easily readable semantics, like so:

char c = ...;
if (isLower(c)) {
    // perform some letter-specific code
}
else if (isOpeningToken(c)) {
    // '(' or '{' -- you could use an if or an else statement 
    //               to distinguish between these tokens and 
    //               run appropriate handler
}
else {
    throw new Exception("Unhandled token!");
}

Upvotes: 0

Related Questions