crip
crip

Reputation: 145

.ToList() thread-safe

I've have the following situation:

I know when we directly expose the list, I will ran into concurrent modification while enumerating.

But when I call .ToList() it makes a instance of the List and copies all items with the Array.Copy function of the .NET Framework.

What do you mean, is it safe when using this approach when dirty reads are ok?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var container = new Container();
            Thread writer = new Thread(() =>
            {
                for (int i = 0; i < 1000; i++)
                {
                    container.Add(new Item("Item " + i));
                }
            });

            writer.Start();

            Thread[] readerThreads = new Thread[10];
            for (int i = 0; i < readerThreads.Length; i++)
            {
                readerThreads[i] = new Thread(() =>
                {
                    foreach (Item item in container.Items)
                    {
                        Console.WriteLine(item.Name);
                    }
                });
                readerThreads[i].Start();
            }

            writer.Join();

            for (int i = 0; i < readerThreads.Length; i++)
            {
                readerThreads[i].Join();
            }

            Console.ReadLine();
        }
    }

    public class Container
    {
        private readonly IList<Item> _items;

        public Container()
        {
            _items = new List<Item>();
        }

        public IReadOnlyList<Item> Items
        {
            get { return _items.ToList().AsReadOnly(); }
        }

        public void Add(Item item)
        {
            _items.Add(item);
        }
    }

    public class Item
    {
        private readonly string _name;

        public Item(string name)
        {
            _name = name;
        }

        public string Name
        {
            get { return _name; }
        }
    }
}

Upvotes: 8

Views: 6367

Answers (3)

Ivan Stoev
Ivan Stoev

Reputation: 205549

is it safe when using this approach when dirty reads are ok?

Easy answer - no, it's not.

First, the method does not guarantee that by definition. Even if you look at the current implementation (which you should not do), you'll see that it's using List<T> constructor accepting the IEnumerable<T>. Then it checks for ICollection<T> and if yes, uses CopyTo method or just iterates the enumerable. Both are unsafe, because the first one relies on the (unknown) implementation of CopyTo method while the second will receive exception when (the standard behavior) the collection enumerator is invalidated during the Add. And even if the source is a List<T> and is using the Array method you mentioned http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,d2ac2c19c9cf1d44

Array.Copy(_items, 0, array, arrayIndex, _size);

still is unsafe because it may call copy with _items and _size beyond the _items.Length which of course will throw exception. Your assumption would have been correct only if this code is somehow getting both members atomically, but it doesn't.

So, simple don't do that.

EDIT: All above applies to your concrete question. But I think there is a simple solution for the situation you have explained. If single thread add/multiple thread read with acceptable stale reads is the requirement, then it could be achieved by using the ImmutableList<T> class from the System.Collections.Immutable package like this:

public class Container
{
    private ImmutableList<Item> _items = ImmutableList<Item>.Empty;
    public IReadOnlyList<Item> Items { get { return _items; } }
    public void Add(Item item) { _items = _items.Add(item); }
}

Upvotes: 7

Scott Chamberlain
Scott Chamberlain

Reputation: 127543

What I would do is ditch your current implementation and switch to a BlockingCollection

static void Main(string[] args)
{
    var container = new BlockingCollection<Item>();
    Thread writer = new Thread(() =>
    {
        for (int i = 0; i < 1000; i++)
        {
            container.Add(new Item("Item " + i));
        }
        container.CompleteAdding();
    });

    writer.Start();

    Thread[] readerThreads = new Thread[10];
    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i] = new Thread(() =>
        {
            foreach (Item item in container.GetConsumingEnumerable())
            {
                Console.WriteLine(item.Name);
            }
        });
        readerThreads[i].Start();
    }

    writer.Join();

    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i].Join();
    }

    Console.ReadLine();
}

What a BlockingCollection will do is give you a thread safe queue (By default it uses a ConcurrentQueue as the backing storage, you can change the behavior to a stack by passing a ConcurrentStack to the constructor) that will block the foreach loops on the readers when the queue is empty waiting for more data to show up. Once the producer calls container.CompleteAdding() that lets all of the readers stop blocking and exit their foreach loops.

This does change your program's behavior, your old way would have given the same Item to multiple threads, this new code will only give the item inserted to a single "worker" thread. This implementation will also give new items to workers that where inserted in to the list after the worker entered its foreach loop, your old implementation uses a "Snapshot" of the list and does its work on that snapshot. I am willing to bet this is the behavior you actually wanted but you did not realize your old way would have duplicated the processing.

If you did want to preserve the old behavior of multiple threads all using "snapshot in time" lists (multiple threads process the same item, new items added to the list after the snapshot are not processed) switch to using the ConcurrentQueue directly instead of the BlockingCollection, it's GetEnumerator() creates a "moment in time" snapshot of the queue when you create the enumerator and will use that list for the foreach.

static void Main(string[] args)
{
    var container = new ConcurrentQueue<Item>();
    Thread writer = new Thread(() =>
    {
        for (int i = 0; i < 1000; i++)
        {
            container.Enqueue(new Item("Item " + i));
        }
    });

    writer.Start();

    Thread[] readerThreads = new Thread[10];
    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i] = new Thread(() =>
        {
            foreach (Item item in container)
            {
                Console.WriteLine(item.Name);
            }
        });
        readerThreads[i].Start();
    }

    writer.Join();

    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i].Join();
    }

    Console.ReadLine();
}

Upvotes: 5

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 726479

Your implementation is not thread-safe, despite the fact that you are returning a brand-new copy of your list each time the Items property is accessed.

Consider this situation:

Container c = new Container();

// Code in thread1
c.Add(new Item());

// Code in thread2
foreach (var item in c.Items) {
    ...
}

When the call to c.Items is made, the _items list is enumerated inside ToList() in order to construct a copy. This enumeration is not different from an enumeration that you would perform yourself. Just like your enumeration, this enumeration is not thread-safe.

You could make your implementation thread-safe by adding a lock object, and locking it when adding to the container or when requesting a copy of the items in the container:

public class Container {
    private readonly IList<Item> _items;
    private readonly synclock = new object();

    public Container() {
        _items = new List<Item>();
    }

    public IReadOnlyList<Item> Items {
        get {
            lock (synclock) {
                return _items.ToList().AsReadOnly();
            }
        }
    }

    public void Add(Item item) {
        lock (synclock) {
            _items.Add(item);
        }
    }
}

Upvotes: 3

Related Questions