Reputation: 20245
The Django web framework uses regular expressions for dispatching incoming requests.
Let's say the application is huge and there are a lot of regular expressions, say hundreds.
For any incoming request, how can I determine which regular expressions are matched as quickly as possible? It is insane to iterate over the listed regular expressions one by one.
Upvotes: 4
Views: 223
Reputation: 9796
Combine the regular expressions into one by concatenating them together with | and named groups: (?<1>regex1)|(?<2>regex2)|(?<3>regex3...
Test the input once and determine which regular expression matched by checking the named groups.
Upvotes: 0
Reputation: 373052
One option would be to construct a single deterministic automaton that can match all of the regular expressions in parallel. One way to do this would be as follows:
Now, whenever you receive a new message, you can run the table-driven DFA on that message, which effectively runs every single regular expression in parallel and returns which regular expressions, if any, match. This may have a large cost in memory, since the generated DFA might be pretty big, but the time to match any incoming regular expression would be linear in the size of the string to match.
Upvotes: 5
Reputation: 15340
If you have control over the application, why not include some metadata that indicates the type of Regex to use? You can use that metadata to then select the correct RegEx.
Upvotes: 0