Hosam Aly
Hosam Aly

Reputation: 42474

Is String.IndexOf(char) really slower than manual searching?

When doing a simple performance measurement, I was astonished to see that calling String.IndexOf(char) was actually slower than doing it manually! Is this really true?! Here is my test code:

const string str = @"91023m lkajsdfl;jkasdf;piou-09324\\adf \asdf\45\ 65u\ 86\ 8\\\;";
static int testIndexOf() { return str.IndexOf('\\'); }
static int testManualIndexOf() {
    string s = str;
    for (int i = 0; i < s.Length; ++i)
        if (s[i] == '\\') return i;
    return -1;
}

I ran each method 25 million times, and measured the time using a Stopwatch. The manual version was consistently 25% faster than the other one.

Any thoughts?!


Edit 2: Running the tests on a different machine with .NET 4.0 yields results very similar to those in Marc Gravell's answer. String.IndexOf(char) is faster than manual searching.


Edit: Today (Jan 4 '09) I wanted to check this issue in a more comprehensive way, so I created a new project, and wrote the code you'll find below. I got the following results when running release mode from cmd for 100 million iterations:

  -   String.      IndexOf : 00:00:07.6490042
  - MyString.PublicIndexOf : 00:00:05.6676471
  - MyString.InternIndexOf : 00:00:06.1191796
  - MyString.PublicIndexOf2: 00:00:09.1363687
  - MyString.InternIndexOf2: 00:00:09.1182569

I ran these tests at least 20 times, getting almost the same results every time. My machine is XP SP3, VS 2008 SP1, P4 3.0 GHz with no hyper-threading and 1 GB RAM. I find the results really strange. As you can see, String.IndexOf was about 33% slower than my PublicIndexOf. Even stranger, I wrote the same method as internal, and it was about 8% slower than the public one! I do not understand what's happening, and I hope you could help me understand!

The testing code is below. (Sorry for the repeated code, but I found that using a delegate showed different timings, with the public and internal methods taking the same time.)

public static class MyString {
    public static int PublicIndexOf(string str, char value) {
        for (int i = 0; i < str.Length; ++i)
            if (str[i] == value) return i;
        return -1;
    }

    internal static int InternIndexOf(string str, char value) {
        for (int i = 0; i < str.Length; ++i)
            if (str[i] == value) return i;
        return -1;
    }

    public static int PublicIndexOf2(string str, char value, int startIndex) {
        if (startIndex < 0 || startIndex >= str.Length)
            throw new ArgumentOutOfRangeException("startIndex");
        for (; startIndex < str.Length; ++startIndex)
            if (str[startIndex] == value) return startIndex;
        return -1;
    }

    internal static int InternIndexOf2(string str, char value, int startIndex) {
        if (startIndex < 0 || startIndex >= str.Length)
            throw new ArgumentOutOfRangeException("startIndex");
        for (; startIndex < str.Length; ++startIndex)
            if (str[startIndex] == value) return startIndex;
        return -1;
    }
}

class Program {
    static void Main(string[] args) {
        int iterations = 100 * 1000 * 1000; // 100 millions
        char separator = '\\';
        string str = @"91023m lkajsdfl;jkasdf;piou-09324\\adf \asdf\45\ 65u\ 86\ 8\\\;";
        Stopwatch watch = new Stopwatch();

        // test String.IndexOf
        int sum = 0;
        watch.Start();
        for (int i = 0; i < iterations; ++i)
            sum += str.IndexOf(separator);
        watch.Stop();
        Console.WriteLine("  String.      IndexOf : ({0}, {1})", watch.Elapsed, sum);

        // test MyString.PublicIndexOf
        sum = 0;
        watch.Reset(); watch.Start();
        for (int i = 0; i < iterations; ++i)
            sum += MyString.PublicIndexOf(str, separator);
        watch.Stop();
        Console.WriteLine("MyString.PublicIndexOf : ({0}, {1})", watch.Elapsed, sum);

        // test MyString.InternIndexOf
        sum = 0;
        watch.Reset(); watch.Start();
        for (int i = 0; i < iterations; ++i)
            sum += MyString.InternIndexOf(str, separator);
        watch.Stop();
        Console.WriteLine("MyString.InternIndexOf : ({0}, {1})", watch.Elapsed, sum);

        // test MyString.PublicIndexOf2
        sum = 0;
        watch.Reset(); watch.Start();
        for (int i = 0; i < iterations; ++i)
            sum += MyString.PublicIndexOf2(str, separator,0);
        watch.Stop();
        Console.WriteLine("MyString.PublicIndexOf2: ({0}, {1})", watch.Elapsed, sum);

        // test MyString.InternIndexOf2
        sum = 0;
        watch.Reset(); watch.Start();
        for (int i = 0; i < iterations; ++i)
            sum += MyString.InternIndexOf2(str, separator,0);
        watch.Stop();
        Console.WriteLine("MyString.InternIndexOf2: ({0}, {1})", watch.Elapsed, sum);
    }
}

Upvotes: 4

Views: 2271

Answers (10)

Marc Gravell
Marc Gravell

Reputation: 1064204

(updated)

With your newly posted rig, I get the numbers below, which are in line with what I would expect:

String.      IndexOf : (00:00:06.3134669, -994967296)
MyString.PublicIndexOf : (00:00:07.0769368, -994967296)
MyString.InternIndexOf : (00:00:08.3463652, -994967296)
MyString.PublicIndexOf2: (00:00:12.0049268, -994967296)
MyString.InternIndexOf2: (00:00:12.4344756, -994967296)

(original)

I cannot reproduce your numbers in release at the console. I did 25M iterations, with results:

  • 1556ms, (825000000) (testIndexOf)
  • 1798ms, (825000000) (testManualIndexOf)

So IndexOf is quicker. I suspect your test rig is not being run in release mode at the command line (you can't run performance tests with the debugger attached; even the IDE adds too much overhead).

Here is my rig:

static void Main()
{
    const int LOOP = 25000000;
    int chk1 = 0, chk2 = 0;
    Stopwatch watch1 = Stopwatch.StartNew();
    for (int i = 0; i < LOOP; i++)
    {
        chk1 += testIndexOf();
    }
    watch1.Stop();
    Stopwatch watch2 = Stopwatch.StartNew();
    for (int i = 0; i < LOOP; i++)
    {
        chk2 += testManualIndexOf();
    }
    watch2.Stop();

    Console.WriteLine("{0}ms, ({1})", watch1.ElapsedMilliseconds, chk1);
    Console.WriteLine("{0}ms, ({1})", watch2.ElapsedMilliseconds, chk2);

}

Upvotes: 6

Bartek Szabat
Bartek Szabat

Reputation: 3014

There is definely sth wrong with your test or with your environment.

I just run this code:

    static void Main(string[] args)
    {
        {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (int i = 0; i < 25000000; i++)
                testIndexOf();
            sw.Stop();
            Console.WriteLine(sw.Elapsed); //2.27 s
        }
        {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (int i = 0; i < 25000000; i++)
                testManualIndexOf();
            sw.Stop();
            Console.WriteLine(sw.Elapsed); //9.81 s
        }

        Console.ReadLine();
    }

on two different machines, one Vista, one Linux/mono and found that internal IndexOf is 4 times faster.

Upvotes: 5

jason
jason

Reputation: 241789

I turned your code into an actual method, and added a test to check for null. I then wrote a loop to generate ten million random strings of random length (1 <= length <= 250) and compared the time using the instance method String.IndexOf and the custom testManualIndexOf. String.IndexOf took 5159 milliseconds and testManualIndexOf took 4838 milliseconds for a difference of 6.2%. Following is the not-so-pretty code:

public static void Main(string[] args) {
    int nTrials = 10000000;
    Random rg = new Random(1);
    Stopwatch sw = new Stopwatch();
    for (int i = 0; i < nTrials; i++) {
        int len = 1 + rg.Next(250);
        char[] randomChar = new char[len];
        for(int j = 0; j < len; j++) {
            randomChar[j] = GetRandomLetterOrDigit(rg);
        }
        string s = new string(randomChar);
        char findChar = GetRandomLetterOrDigit(rg);
        sw.Start();
        int index = s.IndexOf(findChar);
        //int index = testManualIndexOf(s, findChar);
        sw.Stop();
    }
    Console.WriteLine(sw.ElapsedMilliseconds);
}

private static char GetRandomLetterOrDigit(Random rg) {
    char c;
    int rc = rg.Next(62);

    if (rc < 26) {
        c = (char)('A' + rc);
    }
    else if (rc < 52) {
        c = (char)('a' + rc - 26);
    }
    else {
        c = (char)('0' + rc - 52);
    }
    return c;
}

private static int testManualIndexOf(string s, char c) {
    if (s == null) {
        throw new ArgumentNullException("s");
    }
    for (int i = 0; i < s.Length; ++i) {
        if (s[i] == c) return i;
    }
    return -1;
}

Upvotes: 1

Greg Beech
Greg Beech

Reputation: 136727

Your test isn't really a fair one, as you aren't implementing the same functionality:

  1. You aren't passing any arguments to your manual test, instead using a const reference, which requires fewer op-codes to be executed than when arguments are passed as in the real IndexOf method.
  2. Your IndexOf test actually makes two method calls rather than one. It's likely the outer method call will be inlined in a Release build not run under the debugger, but your test doesn't indicate the conditions under which it was run (if this is the case, then you can ignore this point).

But essentially, given that your method does less, I'm not surprised it runs faster.

Upvotes: 6

R. Martinho Fernandes
R. Martinho Fernandes

Reputation: 234654

As was already said, I think the function call may account for some of the difference. Parameter passing and checking takes time.

I tried to see the code for IndexOf with Reflector but... bad luck, it's an internal CLR call. I think that alone should be enough to pick it over the manual approach, since it may improve with CLR improvements.

Upvotes: 1

PEZ
PEZ

Reputation: 17004

Even if it is faster doing it "manually" it's not wise to do it. It's faster today, maybe, in your version of .net, your CLR setup. Maybe. At least for that particular string. But you can be pretty sure that the builtin will have better chances of improving with time. Go with your language. And IndexOf is much clearer too.

Upvotes: 8

tvanfosson
tvanfosson

Reputation: 532745

Well, you have an extra function call overhead in the first instance. Did you try wrapping your manual search in a method and calling it from your test -- that's really the best comparison since it's unlikely that you would want to write that loop every time you wanted to search for a character in a string.

Upvotes: 2

Bartek Szabat
Bartek Szabat

Reputation: 3014

Perhaps these 25% is the cost of calling function.

Your method (testManualIndexOf) does not call any function.

Upvotes: 6

ctacke
ctacke

Reputation: 67198

Is the time for this specific search actually critical, or are you doing the lethal "optimization without cause"? Did you try it with shorter and longer strings? With different search strings? The built-in function may be optimized for other cases where searching might be slower and your specific case ends up being slower. I wouldn't condemn the IndexOf API as being "always slower" until I did a bit more testing of a whole lot more cases.

Upvotes: 2

tuinstoel
tuinstoel

Reputation: 7316

Isn't it possible to download the code of the .net framework and to see what str.IndexOf really does?

Upvotes: 2

Related Questions