Yuriy Zaletskyy
Yuriy Zaletskyy

Reputation: 5151

List perfomance vs ArrayList memory allocation performance

I have following code:

namespace ConsoleCodeGenerator
{
    internal class Foo
    {
        public double F { get; set; }
    }

    internal class Program
    {
        private static void Main(string[] args)
        {
            //int size = 100000;
            int size = 70000000;

            List<Foo> list = new List<Foo>(size);
            ArrayList arrayList = new ArrayList(size);

            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (int i = 0; i < size; i++)
            {
                Foo f = new Foo();
                f.F = i;
                list.Add(f);
            }
            sw.Stop();
            Console.WriteLine("List: {0}", sw.ElapsedMilliseconds);

            Stopwatch sw2 = new Stopwatch();
            sw2.Start();
            for (int i = 0; i < size; i++)
            {
                Foo f = new Foo();
                f.F = i;
                arrayList.Add(f);
            }
            sw2.Stop();
            Console.WriteLine("arrayList: {0}", sw2.ElapsedMilliseconds);
        }
    }
}

if I use int size = 100000; then List outperfoms ArrayList with proportion 2:6 miliseconds. But if to make size = 70000000; Then ArrayList has better perfomance 5450:4809 at my computer. It looks like for processing huge ( around millions of items) ArrayList may be faster then List. Why boxing/unboxing matters at small memory allocation, and doesn't matter at big arrays

Upvotes: 0

Views: 303

Answers (1)

Luaan
Luaan

Reputation: 63722

Your misunderstanding is a bit deeper than that.

For one, making a good benchmark is hard - yours isn't good.

Second, boxing only occurs on value types - you're adding a class in both cases, so no boxing occurs even with ArrayList. In fact, by wrapping the double in a class, you've just manually boxed the value - that's what boxing means (of course, the IL box / unbox instructions are probably a bit more efficient). Try inserting the double directly, and you'll see the massive difference.

To expand a bit on the benchmark issues, you're completely ignoring memory allocation (and collection) patterns. While you're preallocating the arrays themselves (that's what the capacity argument is for), you're not preallocating the objects (Foo). This wouldn't be important with structs or doubles, for example, but in this case, you're simply pushing all the memory pressure into the relevant cycles.

The List is elligible for collection as soon as it's no longer used in the method, so the ArrayList will get a free, pre-prepared memory as soon as it needs a collection. So even the order of the tests will make a small difference.

Finally, you want repeatability - make a hundred tests with List, another hundred with ArrayList, in as much of an isolation as possible. And don't forget to pre-heat the benchmark to get rid of initialization times.

You can find a lot of information on making a decent benchmark in C#. It really isn't easy.

Upvotes: 4

Related Questions