Reputation: 3218
re = new Regex("(.*?)someliteraltext(.*?moreliteral)", RegexOptions.Singleline);
re.Match(c);
Note that Singleline is used, so that "." matches newline.
I run this on a chunk of text that is about 100k characters and it runs for minutes.
Can it be faster?
Upvotes: 0
Views: 505
Reputation: 20664
It's slow because it requires a lot of backtracking. The article here:
http://www.regular-expressions.info/engine.html
might give you some idea of just how much work it's doing.
As @Wrikken suggested, you can speed it up by removing the initial "(.*?)". That capture group captures everything from the start of the string up to "someliteraltext", which you can recover from the position of the match instead.
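A minimal sketch of the pattern without the leading group, assuming the text in front of "someliteraltext" is still wanted. The helper name TryMatchNoPrefix and the sample input are made up for illustration; the prefix is taken from Match.Index rather than from a capture group.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: same pattern as in the question, minus the leading (.*?).
static bool TryMatchNoPrefix(string c, out string before, out string after)
{
    var re = new Regex("someliteraltext(.*?moreliteral)", RegexOptions.Singleline);
    Match m = re.Match(c);
    if (!m.Success)
    {
        before = string.Empty;
        after = string.Empty;
        return false;
    }

    // Everything before the match is what the removed group used to capture.
    before = c.Substring(0, m.Index);
    // Group 1 is now the part after "someliteraltext", up to and including "moreliteral".
    after = m.Groups[1].Value;
    return true;
}

string sampleText = "header text\nsomeliteraltext body\nmoreliteral trailer";
if (TryMatchNoPrefix(sampleText, out string prefix, out string rest))
{
    Console.WriteLine(prefix.Length);
    Console.WriteLine(rest);
}
```

Note that the group numbering shifts: what was group 2 in the original pattern is group 1 here.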
Alternatively, use "IndexOf" to find "someliteraltext" and then "moreliteral" after it. That should be faster.
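The IndexOf approach could look roughly like this. It is a sketch, not a drop-in replacement: the helper name TrySplit is hypothetical, and ordinal (case-sensitive) comparison is an assumption about what the original regex was meant to do.

```csharp
using System;

// Hypothetical helper: finds "someliteraltext", then the first "moreliteral" after it,
// and returns the two pieces the regex groups in the question would capture.
static bool TrySplit(string c, out string before, out string after)
{
    const string first = "someliteraltext";
    const string second = "moreliteral";
    before = string.Empty;
    after = string.Empty;

    int i = c.IndexOf(first, StringComparison.Ordinal);
    if (i < 0) return false;

    // Search for the second literal only after the end of the first one.
    int afterStart = i + first.Length;
    int j = c.IndexOf(second, afterStart, StringComparison.Ordinal);
    if (j < 0) return false;

    before = c.Substring(0, i);
    // Include "moreliteral" itself, as the second group in the question does.
    after = c.Substring(afterStart, j + second.Length - afterStart);
    return true;
}

string sample = "line one\nsomeliteraltext middle\nmoreliteral tail";
if (TrySplit(sample, out string g1, out string g2))
{
    Console.WriteLine(g1.Length);
    Console.WriteLine(g2);
}
```

Each IndexOf call is a single forward scan, so the whole thing makes at most two passes over the input no matter where the literals are or whether they occur at all.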
Upvotes: 2
Reputation: 7586
I agree with the comments that the leading (.*?) is most likely what slows it down the most. If there are 1000 characters in front of the first "someliteraltext", that is already 1001 attempts at that portion of the regex, and because the pattern is unanchored, a failed match is retried from every later starting position as well. @CodeInChaos' suggestion of prefixing the pattern with ^ (beginning of string) is a quick way to limit those attempts. If that isn't acceptable, you'll need to explain more about what you're trying to do with the matches to get a better answer.
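For illustration, the anchored version might look like the sketch below. It assumes RegexOptions.Multiline is not set, so ^ matches only at the very start of the input; the sample string is made up.

```csharp
using System;
using System.Text.RegularExpressions;

// With ^ in front, the pattern can only match starting at position 0,
// so a failed attempt cannot be repeated from every later start position.
var re = new Regex("^(.*?)someliteraltext(.*?moreliteral)", RegexOptions.Singleline);

string input = "first line\nsecond line someliteraltext\nthird line moreliteral rest";
Match m = re.Match(input);
if (m.Success)
{
    Console.WriteLine(m.Groups[1].Value.Length);
    Console.WriteLine(m.Groups[2].Value);
}
```

The groups keep the same numbers as in the original pattern, so the rest of the code should not need to change.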
Upvotes: 3