Reputation: 6357
I want to know if
string.IndexOf(char)
is faster than
string.Contains(string)
Intention is to check if a single character is present in a string
. I know that as per requirement I should use string.Contains(string)
but that is not the point of this question. I did try to disassemble mscorlib.dll
in an attempt to compare their implementations but I could not find the implementation of
string.IndexOf(char)
as it is implemented within the CLR itself. Implementation of
string.Contains(string)
looks quite heavy though.
Upvotes: 0
Views: 6001
Reputation: 373
I ran a test similar to Dmitry Bychenko's, but ran each test 1000 times. For each iteration, I generated a new, pseudo-random 20,000,001-character long string to avoid any distortion of results caused by any caching that might be happening. Looking at the .NET source code, it seems like most of these methods are ultimately calling unmanaged code, as Konstantin K. pointed out. And it appears that there can be a big difference in performance depending on whether your string can be an ASCII string or whether it contains Unicode characters. Check out that source.IndexOf("範")
in .NET Core 6 below!
I ran the same tests in a .NET Framework 4.8 console application and a .NET Core 6.0 console application. Here are the results:
"ASCII" String:
source.IndexOf('b') Median: 2 ms Average: 2.462 ms
source.IndexOf("b") Median: 54 ms Average: 55.432 ms
source.Contains('b') Median: 46 ms Average: 47.869 ms
source.Contains("b") Median: 12 ms Average: 12.773 ms
String containing Unicode characters:
source.IndexOf('範') Median: 2 ms Average: 2.030 ms
source.IndexOf("範") Median: 116 ms Average: 116.324 ms
source.Contains('範') Median: 43 ms Average: 43.218 ms
source.Contains("範") Median: 7 ms Average: 6.621 ms
"ASCII" String:
source.IndexOf('b') Median: 0 ms Average: 0.305 ms
source.IndexOf("b") Median: 16 ms Average: 15.789 ms
source.Contains('b') Median: 0 ms Average: 0.099 ms
source.Contains("b") Median: 0 ms Average: 0.300 ms
String containing Unicode characters:
source.IndexOf('範') Median: 0 ms Average: 0.354 ms
source.IndexOf("範") Median: 259 ms Average: 259.086 ms
source.Contains('範') Median: 0 ms Average: 0.027 ms
source.Contains("範") Median: 0 ms Average: 0.351 ms
How many times are you going to search a 20 million character long string for a single character? And how many times would you find yourself doing that 1000 times in a row? The point being that an end user probably would never notice if something took 2 milliseconds or 40. No need to do micro-optimizations, unless you truly need to. (And avoid String.IndexOf(String)
with Chinese characters in .NET Core!)
Upvotes: 0
Reputation: 178
Source code for mscorlib.dll
is available (and has been at the time of the original post) at Microsoft Reference Source. You can see there in String
class, that Contains(string) is a wrapper
public bool Contains( string value ) {
return ( IndexOf(value, StringComparison.Ordinal) >=0 );
}
around IndexOf
function, which is using extern InternalFindNLSStringEx
function, that must be pretty fast. But not so fast, as extern IndexOf(char)
function, for obvious reasons.
This explains, why the calculation speed for IndexOf(string)
and Contains(string)
is nearly the same. But be careful with your becnhmarks, because deep down in CLR it uses some sort of caching, I think, so the speed of one function depends on in which order you call it - before or after another.
This question wasn't present in original post, but as someone above mentioned this, let's also look at Contains<char>
which is an Enumerable
extension:
public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value, IEqualityComparer<TSource> comparer)
{
if (comparer == null) comparer = EqualityComparer<TSource>.Default;
if (source == null) throw Error.ArgumentNull("source");
foreach (TSource element in source)
if (comparer.Equals(element, value)) return true;
return false;
}
So, it is going to call Equals
on every element in sequence. This is neat, and allows you to use custom comparer, but using IndexOf(char)
would be faster.
Upvotes: 3
Reputation: 186688
Just test and see
String source = new String('a', 10000000) + "b" + new String('c', 10000000);
Stopwatch sw = new Stopwatch();
sw.Start();
// Just an estimation
int result = source.IndexOf('b');
// int result = source.IndexOf("b");
// Boolean result = source.Contains("b");
// Boolean result = source.Contains('b');
sw.Stop();
int time = sw.ElapsedMilliseconds;
At my workstation (i5 3.2 GHz, .Net 5.0 64 bit) it takes about 10 ms for Char
and 38 ms for String
Edit: Performance's outcome is
IndexOf(Char) 10 -- Fastest
IndexOf(String) 38
Contains(Char) 100 -- Slowest
Contains(String) 41
so IndexOf(String)
and Contains(String)
are roughly the same
Upvotes: 9
Reputation: 9772
IndexOf(char)
is a simple loop and
Contains()
(and IndexOf(string)
) is usually two nested loops,
the former will be faster.Upvotes: 0