ManInMoon

Reputation: 7005

Why does my Parallel.ForEach appear not to be using 10 threads?

There does not seem to be a MinDegreeOfParallelism. The following code only seems to use 1% CPU, so I suspect it is NOT using the cores properly:

Parallel.ForEach(lls, new ParallelOptions { MaxDegreeOfParallelism = 10 }, GetFileSizeFSO);

Is there a way to FORCE using 10 cores/Threads?

Additional information:

private void GetFileSizeFSO(List<string> l)
{
    foreach (var dir in l)
    {
        var ds = GetDirectorySize3(dir);
        Interlocked.Add(ref _size, ds);
    }
}

public static long GetDirectorySize3(string parentDirectory)
{
    Scripting.FileSystemObject fso = new Scripting.FileSystemObject();
    Scripting.Folder folder = fso.GetFolder(parentDirectory);
    Int64 dirSize = (Int64)folder.Size;

    Marshal.ReleaseComObject(fso);

    return dirSize;
}

Upvotes: 1

Views: 1428

Answers (4)

Craig Brunetti

Reputation: 595

ManInMoon, your CPU usage is probably low because the bulk of the work you're doing is bound by your storage, not by your processor. Ten cores hitting the same hard drive for file sizes may be no faster than two, because each trip to the drive is vastly more expensive than the surrounding C# logic you have there.

So, you don't have a parallelism problem, you have an I/O problem.

Side note: consider dropping FSO and using .NET's FileInfo instead.
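A minimal sketch of what that side note might look like in practice, assuming that summing FileInfo.Length over every file under the directory is an acceptable substitute for FSO's folder.Size. The names DirectorySize and GetDirectorySizeManaged are made up for illustration, and the sketch does not handle subfolders the process cannot read (those would throw).

```csharp
using System.IO;
using System.Linq;

public static class DirectorySize
{
    // Hypothetical replacement for GetDirectorySize3: sums FileInfo.Length
    // over every file under the directory, including subdirectories.
    public static long GetDirectorySizeManaged(string parentDirectory)
    {
        var dir = new DirectoryInfo(parentDirectory);
        return dir.EnumerateFiles("*", SearchOption.AllDirectories)
                  .Sum(file => file.Length);
    }
}
```

This is still disk-bound work, so it should not be expected to raise CPU usage; it only removes the COM interop.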

Upvotes: 0

Gediminas Masaitis

Reputation: 3212

The simple answer is - you can't.

But why would you want to? .NET is pretty good at choosing a sensible number of threads. MaxDegreeOfParallelism exists to limit parallelism, not to force it, for example when you don't want to give all system resources to the loop.

As a side note, judging from your function name GetFileSizeFSO, I would guess that it reads file sizes from your persistent storage, which would explain why your CPU is not being fully used.
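One way to see that MaxDegreeOfParallelism is only an upper bound is to count how many loop bodies actually run at once. The following is a rough sketch, not part of the question's code: ParallelismProbe and ObservePeakConcurrency are hypothetical names, and Thread.Sleep stands in for a blocking disk call.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelismProbe
{
    // Runs a Parallel.ForEach over itemCount dummy items and returns the
    // highest number of loop bodies observed running at the same time.
    public static int ObservePeakConcurrency(int itemCount, int maxDegree)
    {
        int current = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, _ =>
        {
            int now = Interlocked.Increment(ref current);

            // Lock-free "peak = max(peak, now)": retry while another
            // thread changes peak between our read and our write.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
            }

            Thread.Sleep(5); // stand-in for a blocking call such as a disk read
            Interlocked.Decrement(ref current);
        });

        return peak;
    }
}
```

Calling ParallelismProbe.ObservePeakConcurrency(200, 10) returns some value between 1 and 10. The exact number depends on the machine and on how quickly the thread pool decides to add threads, which is the point: the option caps the count, it does not set it.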

Upvotes: 0

Luaan

Reputation: 63722

It's called MaxDegreeOfParallelism, not MinDegreeOfParallelism. Parallel is designed for CPU-bound work - there's no point whatsoever in using more threads than you have CPUs. It sounds like your work is I/O bound, rather than CPU-bound, so Parallel simply isn't the right tool for the job.

Ideally, find an asynchronous API to do what you're trying to do - this is the best way to use the resources you have. If there's no asynchronous API, you'll have to spawn those threads yourself - don't expect to see CPU usage, though. And most importantly, measure - it's very much possible that parallelizing the workload doesn't improve throughput at all (for example, the I/O might already be saturated).
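If you do end up spawning the threads yourself, a minimal sketch could look like the following. It assumes a blocking size function is passed in as a delegate (for example the question's GetDirectorySize3); DedicatedWorkers, SumSizes and Measure are illustrative names, not an existing API.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class DedicatedWorkers
{
    // Starts threadCount threads that drain a shared queue, applying the
    // blocking sizeOf function to each directory and summing the results.
    public static long SumSizes(IEnumerable<string> dirs, int threadCount,
                                Func<string, long> sizeOf)
    {
        var queue = new ConcurrentQueue<string>(dirs);
        long total = 0;
        var threads = new List<Thread>();

        for (int i = 0; i < threadCount; i++)
        {
            var worker = new Thread(() =>
            {
                while (queue.TryDequeue(out var dir))
                {
                    Interlocked.Add(ref total, sizeOf(dir));
                }
            });
            worker.Start();
            threads.Add(worker);
        }

        // Wait for every worker before reading the total.
        foreach (var thread in threads) thread.Join();
        return total;
    }

    // Times one run so that different thread counts can be compared.
    public static TimeSpan Measure(Action work)
    {
        var sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Comparing Measure(() => SumSizes(dirs, 1, GetDirectorySize3)) against the same call with 10 threads shows whether the extra threads buy anything on your storage. Repeat the runs, since the operating system's file cache can make a second pass look faster than the first.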

Upvotes: 2

Sasha

Reputation: 8850

What does your function GetFileSizeFSO do? If it accesses files on disk, that is most likely your main time consumer. The processor is simply much faster than the disk, so it spends most of its time idle, waiting for the HDD to complete its job.

If you need to optimize your code, you'd be better off looking into accessing files more efficiently than trying to load the processor to 100%.

Upvotes: 2
