Reputation: 17497
We all know that you can compile your much-used regular expressions into something that performs very good. But what is this witchcraft happening behind the curtains?
I guess that a finite state automaton gets built there, but you must know better than me.
Upvotes: 8
Views: 629
Reputation: 25704
The details of regular expression compilation vary by implementation. For example, compilation in Python or re2 simply creates an instance of a regular expression object. This object's state machine may be modeled as a graph or virtual machine. Without compilation (example: RE.match(expression, input)
), a new regular expression object is created behind the scenes each time match
is called. This is needless work if you're going to use an expression more than once.
In C#, one of three things can happen when you compile:
You mention an interest in algorithms. Take a look at Russ Cox's excellent articles for two approaches:
Upvotes: 6
Reputation: 798526
Compiling a regular expression is similar to compiling Java or Python code; the regular expression is converted to an intermediate representation that the RE engine then interprets in order to perform the appropriate operations upon a string.
Upvotes: 1