Reputation: 8780
In cases where there aren't any characters to be escaped, would
if (s.Contains("\""))
s = s.Replace("\"", "\"\"");
really run any faster than
s = s.Replace("\"", "\"\"");
The Contains method has to search through the string just like the Replace method does, and if the Replace method doesn't find anything to escape then I'd think it shouldn't really take any longer than the Contains method takes to look through the string. But if there is a character to escape then you're having to search through the string twice.
Upvotes: 0
Views: 266
Reputation: 707
New version:
string test = "A quick brown fox jumps over a lazy dog.";
int count = 1000 * 1000;
Stopwatch watch = new Stopwatch();
for (int i = 0; i < 4; i++)
{
string result = String.Empty;
watch.Restart();
for (int c = 0; c < count; c++)
{
switch (i)
{
case 0: // warmup
break;
case 1:
if (test.Contains("\""))
{
result = test.Replace("\"", "\"\"");
}
break;
case 2:
result = test.Replace("\"", "\"\"");
break;
case 3:
if (test.IndexOf('\"') >= 0)
{
result = test.Replace("\"", "\"\"");
}
break;
}
}
watch.Stop();
Console.WriteLine("Test {0,16} {1,7:N3} ms {2}",
new string[]{"Base","Contains-Replace","Replace","IndexOf-Replace"}[i],
watch.Elapsed.TotalMilliseconds,
result);
Result:
Test Base 3.026 ms
Test Contains-Replace 284.780 ms
Test Replace 214.503 ms
Test IndexOf-Replace 64.447 ms
So Contains(string) itself is quite slow. The reason is due to NLS (nature language processing API. But IndexOf(char) is much faster.
Upvotes: 0
Reputation: 707
Andre Calil's micro-benchmark modified:
List<string> StringList = new List<string>();
for (int i = 0; i < 10000; i++)
{
StringList.Add(DateTime.Now.Ticks + " abc ");
}
string temp;
KeyValuePair<string, string>[] StringsToReplace = new KeyValuePair<string, string>[6];
// Use array to avoid dictionary access cost
StringsToReplace[0] = new KeyValuePair<string,string>("1", ".1.");
StringsToReplace[1] = new KeyValuePair<string,string>("a", "z");
StringsToReplace[2] = new KeyValuePair<string,string>("b", "x");
StringsToReplace[3] = new KeyValuePair<string,string>("c", "v");
StringsToReplace[4] = new KeyValuePair<string,string>("d", "u");
StringsToReplace[5] = new KeyValuePair<string,string>("e", "t");
int TotalIterations = 100;
Stopwatch stopWatch1 = new Stopwatch();
Stopwatch stopWatch2 = new Stopwatch();
GC.Collect(); // remove influence of garbage objects
for (int j = 0; j <= TotalIterations; j++)
{
stopWatch1.Start(); // StopWatch Start/Stop does its own accumation
for (int i = 0; i < StringList.Count; i++)
{
for (int k = 0; k < StringsToReplace.Length; k++)
{
temp = StringList[i].Replace(StringsToReplace[k].Value, StringsToReplace[k].Key);
}
}
stopWatch1.Stop();
stopWatch2.Start();
for (int i = 0; i < StringList.Count; i++)
{
for (int k = 0; k < StringsToReplace.Length; k++)
{
if (StringList[i].Contains(StringsToReplace[k].Value))
temp = StringList[i].Replace(StringsToReplace[k].Value, StringsToReplace[k].Key);
}
}
stopWatch2.Stop();
if (j == 0) // discard first run, warm only
{
stopWatch1.Reset();
stopWatch2.Reset();
}
}
// Elapsed.TotalMilliseconds return in double, more accurate than ElapsedMilliseconds
Console.WriteLine("Replace : {0:N3} ms", stopWatch1.Elapsed.TotalMilliseconds / TotalIterations);
Console.WriteLine("Contains > Replace: {0:N3} ms", stopWatch2.Elapsed.TotalMilliseconds / TotalIterations);
Results in favor of directly calling Replace because real replacement is needed 50% of time:
Replace : 7.453 ms
Contains > Replace: 8.381 ms
Upvotes: 1
Reputation: 8818
If anyone is curious (you aren't, trust me), the equation to find out the necessary percentage of strings which need a replacement for the OP's method to be more efficient is (roughly) this:
Where:
C = Time it takes to create a new string
R = average Ratio of Delimiter position to string length
n = average string length
p = average position of the delimiter (quote in this case) in the string
t = time it takes to search one character of a string
So if your percentage of strings that have a quote in them is greater than this -- by all means, use the first method.
This doesn't account for variables I considered negligible, such as time to return, the actual replacement of the character (though you can really include that in C
), or any kind of compiler optimization that I don't know about.
On a side note: the fact that I just spent half an hour solving the world's most useless equation is a perfect example of why I switched from Math to CS.
Upvotes: 0
Reputation: 7692
This method performs an ordinal (case-sensitive and culture-insensitive) search to find oldValue.
This method performs an ordinal (case-sensitive and culture-insensitive) comparison. The search begins at the first character position of this string and continues through the last character position.
so, I'd say that there's no pratical difference. Still, I'll run a micro-benchmarking here.
Micro benchmarking (man, I love this)
The code:
List<string> StringList = new List<string>();
for (int i = 0; i < 10000; i++)
{
StringList.Add(DateTime.Now.Ticks + " abc ");
}
string temp;
Dictionary<string, string> StringsToReplace = new Dictionary<string, string>();
StringsToReplace.Add("1", ".1.");
StringsToReplace.Add("a", "z");
StringsToReplace.Add("b", "x");
StringsToReplace.Add("c", "v");
StringsToReplace.Add("d", "u");
StringsToReplace.Add("e", "t");
long ReplaceElapsedTime = 0;
long ContainsReplaceElapsedTime = 0;
int TotalIterations = 10000;
for (int j = 0; j < TotalIterations; j++)
{
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
for (int i = 0; i < StringList.Count; i++)
{
foreach (KeyValuePair<string, string> CurrentPair in StringsToReplace)
{
temp = StringList[i].Replace(CurrentPair.Value, CurrentPair.Key);
}
}
stopWatch.Stop();
ReplaceElapsedTime += stopWatch.ElapsedMilliseconds;
stopWatch.Reset();
stopWatch.Start();
for (int i = 0; i < StringList.Count; i++)
{
foreach (KeyValuePair<string, string> CurrentPair in StringsToReplace)
{
if (StringList[i].Contains(CurrentPair.Value))
temp = StringList[i].Replace(CurrentPair.Value, CurrentPair.Key);
}
}
stopWatch.Stop();
ContainsReplaceElapsedTime += stopWatch.ElapsedMilliseconds;
}
Console.WriteLine(string.Format("Replace: {0} ms", ReplaceElapsedTime/TotalIterations));
Console.WriteLine(string.Format("Contains > Replace: {0} ms", ContainsReplaceElapsedTime/TotalIterations));
Console.ReadLine();
Results:
Replace: 14 ms
Contains > Replace: 14 ms
=)
Upvotes: 0
Reputation: 700272
Some basic performance testing, doing each 10 000 000 times (with a string about 30 characters with a quotation mark somewhere in the middle), gives:
Contains("\"") 1327 ms.
Contains('"') 2949 ms.
Replace("\"", "\"\"") 2528 m.s
The Contains
call takes about half the time of the Replace
call. I thought that looking for a character would be faster than looking for a string, but surprisingly it isn't.
So, comparing the cases of checking first and always replacing to get the break even, where x is the time of the Contains
call and y is how often you need to do the replacement:
x + 2xy = 2x
gives:
y = 0.5
This means that the two methods break even at about 50%. If you need to do the replacement more than half the time, there is no gain in checking for the existance first.
Upvotes: 4
Reputation: 437356
As with all performance-related questions, the answer is measure.
It depends on how intelligent Replace
is (does it allocate any memory "in advance"? if so, how big a performance hit is that?), on the ratio of strings that need replacements over the total number of strings you process, on the lengths of the strings, which index the first "
is located at (Contains
is faster if it can return true
sooner) etc.
Upvotes: 0
Reputation: 171188
That depends on the ratio of strings needing fixup to all strings. If very little strings must be fixed this is certainly a win. The win will be very small though as Replace surely doesn't allocate a new string if it doesn't find any match. This is a common optimization.
The break-even point cannot be predicted, though. Measure.
Upvotes: 0