George2
George2

Reputation: 45771

C# List<T>.ToArray performance is bad?

I'm using .Net 3.5 (C#) and I've heard the performance of C# List<T>.ToArray is "bad", since it memory copies for all elements to form a new array. Is that true?

Upvotes: 34

Views: 46928

Answers (7)

Sorantis
Sorantis

Reputation: 14712

Reasons to call ToArray()

  • If the returned value is not meant to be modified, returning it as an array makes that fact a bit clearer.
  • If the caller is expected to perform many non-sequential accesses to the data, there can be a performance benefit to an array over a List<>.
  • If you know you will need to pass the returned value to a third-party function that expects an array.
  • Compatibility with calling functions that need to work with .NET version 1 or 1.1. These versions don't have the List<> type (or any generic types, for that matter).

Reasons not to call ToArray()

  • If the caller ever does need to add or remove elements, a List<> is absolutely required.
  • The performance benefits are not necessarily guaranteed, especially if the caller is accessing the data in a sequential fashion. There is also the additional step of converting from List<> to array, which takes processing time.
  • The caller can always convert the list to an array themselves.

Taken from here.

Upvotes: 23

J Gaspar
J Gaspar

Reputation: 21

This is what Microsoft's official documentation says about List.ToArray's time complexity

The elements are copied using Array.Copy, which is an O(n) operation, where n is Count.

Then, looking at Array.Copy, we see that it is usually not cloning the data but instead using references:

If sourceArray and destinationArray are both reference-type arrays or are both arrays of type Object, a shallow copy is performed. A shallow copy of an Array is a new Array containing references to the same elements as the original Array. The elements themselves or anything referenced by the elements are not copied. In contrast, a deep copy of an Array copies the elements and everything directly or indirectly referenced by the elements.

So in conclusion, this is a pretty efficient way of getting an array from a list.

Upvotes: 2

Curtis Yallop
Curtis Yallop

Reputation: 7309

For any kind of List/ICollection where it knows the length, it can allocate an array of exactly the right size from the start.

T[] destinationArray = new T[this._size];
Array.Copy(this._items, 0, destinationArray, 0, this._size);
return destinationArray;

If your source type is IEnumerable (not a List/Collection) then the source is:

items = new TElement[4];
..
if (no more space) {
    TElement[] newItems = new TElement[checked(count * 2)];
    Array.Copy(items, 0, newItems, 0, count);
    items = newItems;

It starts at size 4 and grows exponentially, doubling each time it runs out of space. Each time it doubles, it has to reallocate memory and copy the data over.

If we know the source-data size, we can avoid this slight overhead. However in most cases eg array size <=1024, it will execute so quickly, that we don't even need to think about this implementation detail.

References: Enumerable.cs, List.cs (F12ing into them), Joe's answer

Upvotes: 0

to StackOverflow
to StackOverflow

Reputation: 124696

No that's not true. Performance is good since all it does is memory copy all elements (*) to form a new array.

Of course it depends on what you define as "good" or "bad" performance.

(*) references for reference types, values for value types.

EDIT

In response to your comment, using Reflector is a good way to check the implementation (see below). Or just think for a couple of minutes about how you would implement it, and take it on trust that Microsoft's engineers won't come up with a worse solution.

public T[] ToArray()
{
    T[] destinationArray = new T[this._size];
    Array.Copy(this._items, 0, destinationArray, 0, this._size);
    return destinationArray;
}

Of course, "good" or "bad" performance only has a meaning relative to some alternative. If in your specific case, there is an alternative technique to achieve your goal that is measurably faster, then you can consider performance to be "bad". If there is no such alternative, then performance is "good" (or "good enough").

EDIT 2

In response to the comment: "No re-construction of objects?" :

No reconstruction for reference types. For value types the values are copied, which could loosely be described as reconstruction.

Upvotes: 57

chris166
chris166

Reputation: 4807

Yes, it's true that it does a memory copy of all elements. Is it a performance problem? That depends on your performance requirements.

A List contains an array internally to hold all the elements. The array grows if the capacity is no longer sufficient for the list. Any time that happens, the list will copy all elements into a new array. That happens all the time, and for most people that is no performance problem.

E.g. a list with a default constructor starts at capacity 16, and when you .Add() the 17th element, it creates a new array of size 32, copies the 16 old values and adds the 17th.

The size difference is also the reason why ToArray() returns a new array instance instead of passing the private reference.

Upvotes: 7

Daniel Earwicker
Daniel Earwicker

Reputation: 116664

Performance has to be understood in relative terms. Converting an array to a List involves copying the array, and the cost of that will depend on the size of the array. But you have to compare that cost to other other things your program is doing. How did you obtain the information to put into the array in the first place? If it was by reading from the disk, or a network connection, or a database, then an array copy in memory is very unlikely to make a detectable difference to the time taken.

Upvotes: 1

TimM
TimM

Reputation:

it creates new references in an array, but that's just the only thing that that method could and should do...

Upvotes: 1

Related Questions