Roland Ebner

Reputation: 31

Different results between foreach and Parallel.ForEach

I am trying to process every file in a directory tree with Parallel.ForEach, using the following code:

    List<string> _files = Directory.EnumerateFiles(baseDirectory, "*", SearchOption.AllDirectories).ToList();

    Parallel.ForEach(_files, (file) => { ReadFileIntoList(file); i++; });

_files contains 28015 entries, but after the loop finishes, i is only 27944 and the resulting list also contains only 27944 entries.

But if I use the following code:

    List<string> _files = Directory.EnumerateFiles(baseDirectory, "*", SearchOption.AllDirectories).ToList();

    foreach (string file in _files)
    {
        ReadFileIntoList(file); 
        i++;
    }

then i is 28015 and the resulting list also contains 28015 entries.

Can someone please explain or check where the error is?

Upvotes: 0

Views: 261

Answers (2)

V0ldek

Reputation: 10563

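You have two race conditions: one on i++ and one on whatever list ReadFileIntoList fills. i++ is a read-modify-write sequence, not an atomic operation, so increments made by different threads at the same time can be lost; use Interlocked.Increment(ref i) instead. List<T> is not safe for concurrent writes either, so use a ConcurrentBag<FileInfo> for the results. Since you don't care about the order of the files (you can't if you're using Parallel.ForEach), an unordered concurrent collection fits and is likely to perform well here.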
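A minimal sketch of both fixes together. ReadFileIntoList is not shown in the question, so the ReadAll method here is a hypothetical stand-in that only wraps each path in a FileInfo, and the file names are generated rather than read from a real directory:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelReadSketch
{
    // Hypothetical stand-in for the question's loop around ReadFileIntoList.
    public static (int Count, ConcurrentBag<FileInfo> Results) ReadAll(IEnumerable<string> files)
    {
        var results = new ConcurrentBag<FileInfo>();
        int i = 0;

        Parallel.ForEach(files, file =>
        {
            // ConcurrentBag<T>.Add may be called from many threads at once.
            results.Add(new FileInfo(file));

            // Atomic increment: no update is lost, unlike a plain i++.
            Interlocked.Increment(ref i);
        });

        return (i, results);
    }

    public static void Main()
    {
        // Assumed input: generated paths. They need not exist on disk,
        // because constructing a FileInfo does not open the file.
        List<string> files = Enumerable.Range(0, 28015)
            .Select(n => Path.Combine("data", $"file{n}.txt"))
            .ToList();

        var (count, results) = ReadAll(files);

        // All three numbers should now match.
        Console.WriteLine($"{files.Count} {count} {results.Count}");
    }
}
```

If the counter only exists to check the result, it can probably be dropped in favour of results.Count once the loop has finished.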

Upvotes: 0

Roland Ebner

Reputation: 31

I've found the answer. Using a

    SynchronizedCollection<FileInfo> 

instead of a

    List<FileInfo> 

did it for me.
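This works because SynchronizedCollection<T> takes a lock around each operation. For projects that cannot reference the assembly that provides it, a rough sketch of the same idea with a plain List<FileInfo> and an explicit lock could look like this. The ReadAll method and the generated file names are assumptions for illustration, not the original ReadFileIntoList:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class LockedListSketch
{
    // Hypothetical stand-in for the question's loop around ReadFileIntoList.
    public static List<FileInfo> ReadAll(IEnumerable<string> files)
    {
        var results = new List<FileInfo>();
        var gate = new object(); // private lock object guarding 'results'

        Parallel.ForEach(files, file =>
        {
            // Do the per-file work outside the lock so threads rarely wait.
            var info = new FileInfo(file);

            // Only one thread at a time may touch the list.
            lock (gate)
            {
                results.Add(info);
            }
        });

        return results;
    }

    public static void Main()
    {
        // Assumed input: generated paths that need not exist on disk.
        List<string> files = Enumerable.Range(0, 28015)
            .Select(n => Path.Combine("data", $"file{n}.txt"))
            .ToList();

        List<FileInfo> results = ReadAll(files);
        Console.WriteLine($"{files.Count} {results.Count}");
    }
}
```

Note that a synchronized list only fixes the missing entries; a separate counter such as i would still need Interlocked.Increment, or can be replaced by the list's Count.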

Upvotes: 1
