maynull

Reputation: 2046

Is it true that the bigger a list gets, the more time it takes to add new values to it?

I'm making a program that continuously receives string data from the internet in real time. For better performance, it stores new data in an in-memory list and writes it to a file only once a day.

I'm wondering whether adding new values to a list takes longer as the list grows. For example, is there any performance difference between adding new data to a list of 10 elements and adding it to a list of more than 3,000,000? And I'd like to know whether there would be any performance difference if I set the initial capacity of the list from the start, like new List<string>(3000000).

I'd appreciate it if I could get some advice on better ways to do this job.

Upvotes: 2

Views: 574

Answers (2)

Lajos Arpad

Reputation: 76621

As Michael Randall pointed out in his wonderful answer (upvoted), the answer to the actual question is yes. However, even though we know that a large list becomes slower to add items to, we still have a problem, and one way to mitigate it is to create a list of lists.

For the sake of simplicity I will call the list of lists the "outer list" and the lists inside it the "inner lists". You would start by creating the first inner list and letting items enter it until it gets fairly large, say 10,000 elements. Then you create the next inner list and put new items there until that one reaches the limit, and so on. At the end of the day you would have maybe 300 lists, each holding 10,000 elements. This slightly complicates your work, but rids you of the performance drop when adding items.
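A minimal sketch of this idea, assuming a fixed chunk size of 10,000 (the class name, field names, and methods below are my own illustration, not from the original answer):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new ChunkedList();
for (int i = 0; i < 25_000; i++)
    log.Add(i.ToString());
Console.WriteLine($"stored {log.Items().Count()} items");

// Hypothetical helper illustrating the outer-list/inner-list idea:
// items are appended to the current inner list until it reaches
// ChunkSize, then a fresh inner list is started, so no single
// backing array ever has to grow past ChunkSize.
class ChunkedList
{
    const int ChunkSize = 10_000;
    readonly List<List<string>> _chunks =
        new List<List<string>> { new List<string>(ChunkSize) };

    public void Add(string item)
    {
        var current = _chunks[_chunks.Count - 1];
        if (current.Count == ChunkSize)
        {
            current = new List<string>(ChunkSize);
            _chunks.Add(current);
        }
        current.Add(item);
    }

    // Enumerate all items in insertion order, chunk by chunk.
    public IEnumerable<string> Items()
    {
        foreach (var chunk in _chunks)
            foreach (var item in chunk)
                yield return item;
    }
}
```

Writing the daily file is then a matter of walking Items() chunk by chunk instead of one big list.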

Upvotes: 1

TheGeneral

Reputation: 81513

This is the actual source code for adding an item to a List<T>, which you can find at list.cs - Reference Source - Microsoft:

public void Add(T item)
{
   if (_size == _items.Length) EnsureCapacity(_size + 1);
   _items[_size++] = item;
   _version++;
}

private void EnsureCapacity(int min)
{
   if (_items.Length < min)
   {
      int newCapacity = _items.Length == 0 ? _defaultCapacity : _items.Length * 2;
      // Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow.
      // Note that this check works even when _items.Length overflowed thanks to the (uint) cast
      if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
      if (newCapacity < min) newCapacity = min;
      Capacity = newCapacity;
   }
}

public int Capacity
{
   ...
   set
   {
      ...
      if (value != _items.Length)
      {
         if (value > 0)
         {
            T[] newItems = new T[value];
            if (_size > 0)
            {
               Array.Copy(_items, 0, newItems, 0, _size);
            }
            _items = newItems;
         }
         else
         {
            _items = _emptyArray;
         }
      }
   }
}

In summary, it doubles the capacity every time, which means it only expands the array a limited number of times. To do this it creates a new array and uses Array.Copy() to copy the data, which is extremely fast.
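You can watch the doubling happen by printing Capacity as items are added (a small sketch of my own; the starting capacity of 4 is _defaultCapacity in the reference source above, though exact values could differ in other runtimes):

```csharp
using System;
using System.Collections.Generic;

var list = new List<int>();
int lastCapacity = -1;
for (int i = 0; i < 100; i++)
{
    if (list.Capacity != lastCapacity)
    {
        // Capacity jumps from 0 to 4, then doubles: 8, 16, 32, 64, 128, ...
        Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}");
        lastCapacity = list.Capacity;
    }
    list.Add(i);
}
```

So reaching 100 elements costs only a handful of reallocations, not one per Add.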

To give you an example, here is an array of bytes with 100,000,000 elements, and it copies it in 75 milliseconds. Also remember it will only ever grow a maximum of about 32 times before it hits the max array limit of .NET.

// requires using System; and using System.Diagnostics;
var r = new Random();
var bytes = new byte[100000000];   // 100,000,000-element source
var bytes2 = new byte[100000000];  // destination of the same size
r.NextBytes(bytes);

var sw = Stopwatch.StartNew();
Array.Copy(bytes, bytes2, bytes.Length);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);

I'd appreciate it if I could get some advice on better ways to do this job

Well, if this were really mission-critical stuff and you wanted to save allocations and reduce pressure on the garbage collector and the Large Object Heap, just create a list (or an array) once with a sufficiently large capacity, and reuse it. However, there are probably other things you would need to worry about first, in my honest opinion.
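A sketch of that reuse pattern, assuming the 3,000,000 capacity from the question (the variable and the Flush helper are names I made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Preallocate once so Add() never has to grow the backing array,
// then clear and reuse the same list after the daily write.
var buffer = new List<string>(3_000_000);

void Flush(string path)
{
    File.WriteAllLines(path, buffer);
    // Clear() resets Count but keeps the allocated capacity,
    // so the next day's Add() calls reuse the same array.
    buffer.Clear();
}

buffer.Add("sample line");
Flush(Path.GetTempFileName());
Console.WriteLine($"after flush: Count={buffer.Count}, Capacity={buffer.Capacity}");
```

The key point is that Clear() does not shrink the list, so the one large allocation is paid exactly once.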

Upvotes: 5
