Extension methods on IEnumerable<T>: how do they perform?

Question

From my mentor: Prefer native methods (implemented directly on the collection) over extension methods of IEnumerable<T>, because:

The LINQ-to-Objects extension methods are implemented on IEnumerable<T>, meaning that in the worst-case scenario (when the item you search for does not exist in the collection) you will have to enumerate through all elements. If you have a Contains or Exists method implemented directly on the collection, it could make use of internal knowledge and maybe just do a hash table lookup or some other quick operation.

I was deeply confused, because I assumed Microsoft would already have implemented a hash table lookup for the IEnumerable<T> Contains/Exists methods. A quick benchmark with List<int> and IEnumerable<int> shows no difference:

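using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
static void Main(string[] args)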
{
    Console.Write("input the number of elements: ");
    int count = Convert.ToInt32(Console.ReadLine());
    Console.Write("input the number of loops: ");
    int loop = Convert.ToInt32(Console.ReadLine());

    Random r = new Random();

    Stopwatch sw = new Stopwatch();
    for (int i = 0; i < loop; i++)
    {
        var list = CreateListOfInt(count);
        sw.Start();
        for (int j = 0; j < count; j++)
        {
            DoContains(list, r.Next());
        }
        sw.Stop();
    }

    Console.WriteLine("List native method: Iterated {0} times on {1} elements, elapsed :{2}",loop,count,sw.Elapsed);

    sw.Reset();
    for (int i = 0; i < loop; i++)
    {
        var list = CreateListOfInt(count);
        sw.Start();
        for (int j = 0; j < count; j++)
        {
            DoContainsEnumerable(list, r.Next());
        }
        sw.Stop();
    }

    Console.WriteLine("IEnumerable extension method: Iterated {0} times on {1} elements, elapsed :{2}", loop, count, sw.Elapsed);

    sw.Reset();
    for (int i = 0; i < loop; i++)
    {
        var list = CreateListOfInt2(count);
        sw.Start();
        for (int j = 0; j < count; j++)
        {
            //make sure that the element is not in the list
            DoContains(list, r.Next(20000, 50000));
        }
        sw.Stop();
    }
    Console.WriteLine("List native method: element does not exist:Iterated {0} times on {1} elements, elapsed :{2}", loop, count, sw.Elapsed);

    sw.Reset();
    for (int i = 0; i < loop; i++)
    {
        var list = CreateListOfInt2(count);
        sw.Start();
        for (int j = 0; j < count; j++)
        {
            //make sure that the element is not in the list
            DoContainsEnumerable(list, r.Next(20000, 50000));
        }
        sw.Stop();
    }
    Console.WriteLine("IEnumerable extension method: element does not exist: Iterated {0} times on {1} elements, elapsed :{2}", loop, count, sw.Elapsed);


    Console.ReadKey();
}

static List<int> CreateListOfInt(int count)
{
    Random r = new Random(1000);
    List<int> numbers = new List<int>(count);
    for (int i = 0; i < count; i++)
    {
        numbers.Add(r.Next());
    }
    return numbers;
}

static bool DoContains(List<int> list, int number)
{
    return list.Contains(number);
}

static bool DoContainsEnumerable(IEnumerable<int> list, int number)
{
    return list.Contains(number);
}


// Restrict the generated values to [0, 10000) so that the lookup values (20000 to 49999) are never in the list
static List<int> CreateListOfInt2(int count)
{
    Random r = new Random(1000);
    List<int> numbers = new List<int>(count);
    for (int i = 0; i < count; i++)
    {
        numbers.Add(r.Next(0,10000));
    }
    return numbers;
}

}

Edit: I tried a HashSet<int> implementation, which greatly improves performance:

    sw.Reset();
    for (int i = 0; i < loop; i++)
    {
        var list = CreateListOfInt2(count);
        // Build the set outside the timed region so only the lookups are measured
        HashSet<int> hashtable = new HashSet<int>(list);
        sw.Start();
        for (int j = 0; j < count; j++)
        {
            //make sure that the element is not in the list
            hashtable.Contains(r.Next(20000, 50000));
        }
        sw.Stop();
    }
    Console.WriteLine("HashSet native method: element does not exist: Iterated {0} times on {1} elements, elapsed :{2}", loop, count, sw.Elapsed);

Still, what is your opinion of what my mentor said?

Can anyone clear this up for me? Is my mentor right? If he is, what is wrong with my code?

Thank you very much

dlev · Accepted Answer

List<T>.Contains just iterates the list, so it won't be faster than the extension method. If you were to use a HashSet<T> and try a series of Contains() operations, you would find a marked improvement.

Edit: the reason Microsoft didn't use a hash for the IEnumerable<T> extension methods is that they could not guarantee that the implementing class used a hash or something similar. They had to go with the naive approach because the IEnumerable<T> interface only guarantees that the implementing class can be enumerated.
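
To make the difference concrete without relying on timings, here is a rough sketch that counts equality comparisons instead. CountingComparer and ContainsDemo are hypothetical names made up for this illustration; they are not part of the question's benchmark. The scan uses the Enumerable.Contains overload that takes a comparer, which walks the sequence element by element, while the HashSet<int> lookup only compares against entries whose hash code matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for this illustration: counts calls to Equals.
sealed class CountingComparer : IEqualityComparer<int>
{
    public int EqualsCalls;

    public bool Equals(int x, int y)
    {
        EqualsCalls++;
        return x == y;
    }

    public int GetHashCode(int obj)
    {
        return obj;
    }
}

static class ContainsDemo
{
    // Looks for a value in a sequence by enumerating it,
    // and returns how many equality comparisons that took.
    public static int ComparisonsForScan(int count, int missing)
    {
        var comparer = new CountingComparer();
        IEnumerable<int> source = Enumerable.Range(0, count).ToList();
        source.Contains(missing, comparer);
        return comparer.EqualsCalls;
    }

    // Looks for the same value in a HashSet<int> built from the same numbers.
    public static int ComparisonsForHashSet(int count, int missing)
    {
        var comparer = new CountingComparer();
        var set = new HashSet<int>(Enumerable.Range(0, count), comparer);
        comparer.EqualsCalls = 0; // ignore comparisons made while building the set
        set.Contains(missing);
        return comparer.EqualsCalls;
    }

    public static void Main()
    {
        // 5000 is outside the range 0..999, so both lookups miss (the worst case).
        Console.WriteLine("scan comparisons:    {0}", ComparisonsForScan(1000, 5000));
        Console.WriteLine("HashSet comparisons: {0}", ComparisonsForHashSet(1000, 5000));
    }
}
```

The scan has to compare against every element before it can report a miss, whereas the set needs far fewer comparisons. That is the gap the HashSet benchmark in the question's edit picks up, and it is why List<T>.Contains and the extension method look the same: both are linear scans over the same list.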
