suprafly
suprafly

Reputation: 109

Erlang Compiler - Pattern Matching Performance and Underlying Data Structure

I am wondering what the underlying data structure used is, and what the performance of pattern matching is? In particular, compared to the performance of searching a Trie.

Update: I am looking for a concise and precise understanding of what pattern matching implemented by the Erlang compiler is. What is the underlying data structure, and how efficient is a search for a pattern?

Upvotes: 0

Views: 377

Answers (1)

RichardC
RichardC

Reputation: 10557

Pattern matching compilation does not have an "underlying data structure" in itself - it is just a strategy for decomposing any given data structure according to a set of patterns and minimizing the number of steps needed to tell whether a match is found, or whether no match is possible.

If your input is a string and the patterns are prefixes of that string, then the behaviour is just like that of trie search. Taking the example from https://en.wikipedia.org/wiki/Trie and expressing it as an Erlang case switch:

case String of
    "tea" ->  3;
    "ted" ->  4;
    "inn" ->  5;
    "to"  ->  7;
    "in"  ->  9;
    "i"   -> 11;
    "ten" -> 12;
    "A"   -> 15
end

Since there are no guard expressions to complicate the clauses, the compiler is free to reorder them for better efficiency (sorting them by type and value), so that all the patterns that share the same prefix are adjacent. This is convenient for the programmer, who doesn't have to care about manually keeping the list in order.

After that, the compiler will turn the set of clauses into a number of nested smaller case expressions performing a minimal number of tests. First, it would check whether the first character is an A, an i, or a t. If not, there can be no match, otherwise branch to examine the rest of the string; e.g., if the first character was an i, check if the next character is an n or the end of the string. Again, if it is neither, then there can be no match, otherwise branch again. And so on, generating code to examine all branches of all the patterns.

Upvotes: 1

Related Questions