alexD

Reputation: 2364

Regex.Replace much slower than conditional statement using String.Contains

I have a list of 400 strings that all end in "_GONOGO" or "_ALLOC". When the application starts up, I need to strip off the "_GONOGO" or "_ALLOC" from every one of these strings.

I tried this: 'string blah = Regex.Replace(input, "(_GONOGO|_ALLOC)", "");'

but it is MUCH slower than a simple conditional statement like this:

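if (input.Contains("_GONOGO"))
    input = input.Substring(0, input.Length - "_GONOGO".Length);   // drop the trailing "_GONOGO"
else if (input.Contains("_ALLOC"))
    input = input.Substring(0, input.Length - "_ALLOC".Length);    // drop the trailing "_ALLOC"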

I'm new to regular expressions, so I'm hoping that someone has a better solution or can tell me whether I am doing something horribly wrong. It's not a big deal, but it would be nice to turn this four-line conditional into one simple regex line.

Upvotes: 2

Views: 1355

Answers (5)

Sam Harwell

Reputation: 99869

When you have that much information about your problem domain, you can make things pretty simple:

const int AllocLength = 6;
const int GonogoLength = 7;
string s = ...;
if (s[s.Length - 1] == 'C')
    s = s.Substring(0, s.Length - AllocLength);
else
    s = s.Substring(0, s.Length - GonogoLength);

This is theoretically faster than Abraham's solution, but not as flexible. If the strings have any chance of changing then this one would suffer from maintainability problems that his does not.
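If maintainability matters more than the last bit of speed, one possible middle ground is to keep the suffixes in a single list and check each with EndsWith. This is only a sketch: the class name SuffixStripper and the method Strip are hypothetical, and the suffix list is taken from the question.

```csharp
using System;

// Hypothetical helper; the name and shape are illustrative only.
static class SuffixStripper
{
    // Suffixes from the question. Add new ones here if the naming scheme changes.
    private static readonly string[] Suffixes = { "_GONOGO", "_ALLOC" };

    public static string Strip(string s)
    {
        foreach (string suffix in Suffixes)
        {
            // Ordinal comparison avoids culture-sensitive matching.
            if (s.EndsWith(suffix, StringComparison.Ordinal))
                return s.Substring(0, s.Length - suffix.Length);
        }
        return s; // no known suffix: return the input unchanged
    }
}
```

Unlike the last-character check, this leaves strings without a known suffix untouched instead of truncating them.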

Upvotes: 1

Av Pinzur

Reputation: 2228

If they all end in one of those patterns, it would likely be faster to drop replace altogether and use:

string result = source.Substring(0, source.LastIndexOf('_'));
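One caveat with the one-liner: LastIndexOf returns -1 when the string contains no underscore, and Substring then throws an ArgumentOutOfRangeException. A guarded variant might look like the sketch below; the class and method names are hypothetical.

```csharp
// Hypothetical helper; the name is illustrative only.
static class UnderscoreTrimmer
{
    public static string TrimAfterLastUnderscore(string source)
    {
        int index = source.LastIndexOf('_');

        // LastIndexOf returns -1 when no underscore exists; keep the input as-is then.
        return index < 0 ? source : source.Substring(0, index);
    }
}
```

Note that this removes whatever follows the last underscore, so it is only safe if every string really does end in one of the two suffixes.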

Upvotes: 1

Michael Petrotta

Reputation: 60902

This is expected; in general, manipulating a string by hand will be faster than using a regular expression. Using a regex involves compiling an expression down to a regex tree, and that takes time.

If you're using this regex in multiple places, you can use the RegexOptions.Compiled flag to reduce the per-match overhead, as David describes in his answer. Other regex experts might have tips for improving the expression. You might consider sticking with String.Replace, though; it's fast and readable.

Upvotes: 3

David Andres

Reputation: 31781

Regex replacements may work faster if you compile the regex first. As in:

Regex exp = new Regex(
    @"(_GONOGO|_ALLOC)",
    RegexOptions.Compiled);

string result = exp.Replace(input, String.Empty);
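Compiling only pays off if the same Regex instance is reused for all of the strings, so it probably belongs in a static field rather than inside the loop. The sketch below also anchors the pattern with $ so that only a trailing suffix is removed; both the anchoring and the names RegexStripper and Strip are my assumptions, not part of the original answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the name is illustrative only.
static class RegexStripper
{
    // Built once and reused for every call. The trailing $ restricts the
    // match to the end of the input (an assumption about the intent).
    private static readonly Regex SuffixPattern =
        new Regex(@"(?:_GONOGO|_ALLOC)$", RegexOptions.Compiled);

    public static string Strip(string input)
    {
        return SuffixPattern.Replace(input, string.Empty);
    }
}
```

For only 400 short strings the one-time cost of RegexOptions.Compiled may outweigh the savings, so it is worth measuring both variants.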

Upvotes: 4

Adam Robinson

Reputation: 185643

While it isn't RegEx, you could do

string blah = input.Replace("_GONOGO", "").Replace("_ALLOC", "");

RegEx is great for complex expressions, but the overhead can sometimes be overkill for very simple operations like this.
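Applied to the whole list at startup, the chained Replace could be wrapped roughly as follows. This is a sketch under assumptions: the class StartupCleanup and the method StripAll are hypothetical names, and the input is assumed to be any sequence of strings.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the name is illustrative only.
static class StartupCleanup
{
    public static List<string> StripAll(IEnumerable<string> names)
    {
        // String.Replace removes every occurrence of the text, not just a
        // trailing one, which is fine when the suffix appears only at the end.
        return names
            .Select(n => n.Replace("_GONOGO", "").Replace("_ALLOC", ""))
            .ToList();
    }
}
```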

Upvotes: 8
