Reputation: 4965
I was reading the code for List<T>.InsertRange()
in the .NET 4.0 framework, when I noticed a strange peculiarity. Here is the code for reference:
public void InsertRange(int index, IEnumerable<T> collection) {
if (collection==null) {
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
}
if ((uint)index > (uint)_size) {
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_Index);
}
Contract.EndContractBlock();
ICollection<T> c = collection as ICollection<T>;
if( c != null ) { // if collection is ICollection<T>
int count = c.Count;
if (count > 0) {
EnsureCapacity(_size + count);
if (index < _size) {
Array.Copy(_items, index, _items, index + count, _size - index);
}
// If we're inserting a List into itself, we want to be able to deal with that.
if (this == c) {
// Copy first part of _items to insert location
Array.Copy(_items, 0, _items, index, index);
// Copy last part of _items back to inserted location
Array.Copy(_items, index+count, _items, index*2, _size-index);
}
else {
T[] itemsToInsert = new T[count];
c.CopyTo(itemsToInsert, 0);
itemsToInsert.CopyTo(_items, index);
}
_size += count;
}
}
else {
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
Insert(index++, en.Current);
}
}
}
_version++;
}
In particular, notice that _version is always incremented at the end of the function. This means that in-progress enumerations over the list will be invalidated at any non-exceptional call to InsertRange, even if the List was not changed. For instance, the following code throws:
static void Main(string [] args) {
var list = new List<object>() {1, 2 };
using(var enumerator = list.GetEnumerator()) {
if(enumerator.MoveNext())
Console.WriteLine(enumerator.Current);
list.InsertRange(1, new object[]{});
if(enumerator.MoveNext()) // ** InvalidOperationException
Console.WriteLine(enumerator.Current);
}
}
Modifying the method so that enumeration is not invalidated in this way would not increase execution time at all because the code already checks the size of count
. It could be rewritten as follows:
public void InsertRange(int index, IEnumerable<T> collection) {
if (collection==null) {
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
}
if ((uint)index > (uint)_size) {
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_Index);
}
Contract.EndContractBlock();
ICollection<T> c = collection as ICollection<T>;
if( c != null ) { // if collection is ICollection<T>
int count = c.Count;
if (count > 0) {
EnsureCapacity(_size + count);
if (index < _size) {
Array.Copy(_items, index, _items, index + count, _size - index);
}
// If we're inserting a List into itself, we want to be able to deal with that.
if (this == c) {
// Copy first part of _items to insert location
Array.Copy(_items, 0, _items, index, index);
// Copy last part of _items back to inserted location
Array.Copy(_items, index+count, _items, index*2, _size-index);
}
else {
T[] itemsToInsert = new T[count];
c.CopyTo(itemsToInsert, 0);
itemsToInsert.CopyTo(_items, index);
}
_size += count;
_version++;
}
}
else {
var inserted = false;
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
inserted = true;
Insert(index++, en.Current);
}
}
if (inserted) _version++;
}
}
Which has as its only disadvantage an extra local variable (which will probably be JITed into a register), maybe 20 bytes increase in the working set, and an irrelevant amount of extra CPU work when inserting IEnumerable
s. If the extra bool or the in-loop assignment need to be avoided, the insertion for IEnumerable
s could be performed as
if(en.MoveNext()) {
Insert(index++, en.Current);
_version++;
}
while(en.MoveNext()) {
Insert(index++, en.Current);
}
So...
Is the .NET implementation the intended behavior, or is it a mistake?
EDIT:
I realize that if you are enumeration on one thread while modifying the thread on another, you are doing something wrong. According to the documentation, the behavior in these cases is undefined. However, List<T>
does the programmer a favor and throws an exception in these cases. I am not asking if List<T>
follows the documentation correctly: it does. I am asking if it is implemented in a way that is not what was intended by Microsoft.
If InsertRange()
is behaving as intended, then List<T>
has inconsistent behavior. The RemoveRange()
method only invalidates enumeration if items were actually removed:
static void Main(string [] args) {
var list = new List<object>() {1, 2 };
using(var enumerator = list.GetEnumerator()) {
if(enumerator.MoveNext())
Console.WriteLine(enumerator.Current);
list.RemoveRange(1, 0);
if(enumerator.MoveNext()) // ** Does not throw
Console.WriteLine(enumerator.Current);
}
}
Upvotes: 3
Views: 1328
Reputation: 6259
I would guess that this is intentional. C# follows a "pit of success" design - they want it to be hard to make a mistake.
Ultimately, the existing design makes it easier to analyze the use of that method. What do I mean by that? Well:
The example you cite is trivial, at a glance you can tell it isn't really modifying the list. But in almost all real-world code that isn't the case. The sequence being inserted would almost certainly be dynamically created, and could be an empty sequence almost randomly. Should an empty sequence really behave differently? Your code would work if the sequence inserted was empty, but the moment you put something real in there, ka-boom.
Imagine if you first write this code and all your sequences are empty; looks like it works. Then, some arbitrary time later, you have a non-empty insert. Now you get exceptions. The distance between you inserting the problem and detecting it is potentially very large.
With it throwing on any successful call, that failure mode becomes easier to detect.
Upvotes: 3
Reputation: 93424
The idea here is that you should not write code that reads from the list while modifying it, even if your call doesn't actually modify the list, a different call might. It's safest to simply invalidate the list every time, even if once in a while it might be less efficient.
Upvotes: 0
Reputation: 4992
It's probably intended. The idea when calling a function to insert items is to modify the list. The case where the list ends up not being modified is an exception, but the intention was to modify it. I guess here the intention counts. If someone calls InsertRange
while simultaneously iterating over the list, then there's a conceptual problem already.
Personally I'm more a fan of the Java collection framework here - the ListIterator
class can modify the list through its own Add
, Set
and Remove
methods. Even the general Iterator
can remove items from the iterated collection. But as far as I know, Java also considers any modification attempts other than those from the iterator itself as operations that invalidate the (list)iterator.
Upvotes: 1