Parallel.ForEach keeps spawning new threads

Leon Bouquiet · Dec 26, 2012

While I was using Parallel.ForEach in my program, I found that some threads never seemed to finish. In fact, the loop kept spawning new threads over and over, a behaviour that I wasn't expecting and definitely don't want.

I was able to reproduce this behaviour with the following code which, just like my 'real' program, is both processor- and memory-intensive (.NET 4.0 code):

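using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Node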
{
    public Node Previous { get; private set; }

    public Node(Node previous)
    {
        Previous = previous;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        DateTime startMoment = DateTime.Now;
        int concurrentThreads = 0;

        var jobs = Enumerable.Range(0, 2000);
        Parallel.ForEach(jobs, delegate(int jobNr)
        {
            Interlocked.Increment(ref concurrentThreads);

            int heavyness = jobNr % 9;

            //Give the processor and the garbage collector something to do...
            List<Node> nodes = new List<Node>();
            Node current = null;
            for (int y = 0; y < 1024 * 1024 * heavyness; y++)
            {
                current = new Node(current);
                nodes.Add(current);
            }

            TimeSpan elapsed = DateTime.Now - startMoment;
            int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
            Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.", elapsed, jobNr, threadsRemaining);
        });
    }
}

When run on my quad-core machine, it initially starts off with 4 concurrent threads, just as you would expect. However, over time more and more threads are created. Eventually, the program throws an OutOfMemoryException:

[00:00] Job    0 complete. 3 threads remaining.
[00:01] Job    1 complete. 4 threads remaining.
[00:01] Job    2 complete. 4 threads remaining.
[00:02] Job    3 complete. 4 threads remaining.
[00:05] Job    9 complete. 5 threads remaining.
[00:05] Job    4 complete. 5 threads remaining.
[00:05] Job    5 complete. 5 threads remaining.
[00:05] Job   10 complete. 5 threads remaining.
[00:08] Job   11 complete. 5 threads remaining.
[00:08] Job    6 complete. 5 threads remaining.
...
[00:55] Job   67 complete. 7 threads remaining.
[00:56] Job   81 complete. 8 threads remaining.
...
[01:54] Job  107 complete. 11 threads remaining.
[02:00] Job  121 complete. 12 threads remaining.
...
[02:55] Job  115 complete. 19 threads remaining.
[03:02] Job  166 complete. 21 threads remaining.
...
[03:41] Job  113 complete. 28 threads remaining.
<OutOfMemoryException>

The memory usage graph for the experiment above is as follows:

[Image: processor and memory usage]

(The screenshot is in Dutch; the top part represents processor usage, the bottom part memory usage.) As you can see, it looks like a new thread is being spawned almost every time the garbage collector gets in the way (as can be seen in the dips of memory usage).

Can anyone explain why this is happening, and what I can do about it? I just want .NET to stop spawning new threads, and finish the existing threads first...

Answer

Douglas · Dec 26, 2012

You can limit the maximum number of threads that get created by specifying a ParallelOptions instance with the MaxDegreeOfParallelism property set:

var jobs = Enumerable.Range(0, 2000);
ParallelOptions po = new ParallelOptions
{ 
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.ForEach(jobs, po, jobNr =>
{
    // ...
});
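To check that the cap actually holds, here is a minimal, self-contained sketch that records the highest number of loop bodies running at the same time. The class name CappedLoopDemo, the helper MeasurePeakConcurrency, and the Thread.Sleep stand-in for real work are assumptions made for illustration; they are not part of the original program.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CappedLoopDemo
{
    // Hypothetical helper: runs the loop with the given cap and returns the
    // highest number of loop bodies observed running at the same time.
    public static int MeasurePeakConcurrency(int maxDegree, int jobCount)
    {
        int running = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, jobCount), options, jobNr =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new peak with a compare-and-swap retry loop.
            while (true)
            {
                // CompareExchange(ref x, 0, 0) is used here as an atomic read.
                int seen = Interlocked.CompareExchange(ref peak, 0, 0);
                if (now <= seen) break;
                if (Interlocked.CompareExchange(ref peak, now, seen) == seen) break;
            }

            Thread.Sleep(5); // stand-in for the real (possibly blocking) work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }

    public static void Main()
    {
        int cap = Environment.ProcessorCount;
        int peak = MeasurePeakConcurrency(cap, 200);
        Console.WriteLine("Cap: {0}, observed peak: {1}", cap, peak);
    }
}
```

The observed peak should never exceed the cap, although it can be lower if the scheduler decides to use fewer workers.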

As for why you're seeing this behaviour: the TPL (which Parallel.ForEach is part of, and which also underlies PLINQ) is, by default, at liberty to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to maintain progress. In your case, the blocking might be happening implicitly; for example, through the Console.WriteLine call, or (as you observed) during garbage collection.

From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?):

Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking) - TPL will take the liberty to add threads as appropriate.
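The effect described in the quote can be observed with a rough sketch like the one below, which runs an uncapped loop whose body does nothing but wait and then counts the distinct threads that executed it. The names ThreadInjectionDemo and CountThreadsUsed are made up for this illustration, and the result depends on the runtime version, the machine, and the thread pool's heuristics, so no particular number should be expected; it may or may not exceed the processor count on a given run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadInjectionDemo
{
    // Hypothetical helper: runs an uncapped Parallel.ForEach whose body only
    // waits, and returns how many distinct managed thread IDs ran loop bodies.
    public static int CountThreadsUsed(int jobCount, int blockMilliseconds)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();

        Parallel.ForEach(Enumerable.Range(0, jobCount), jobNr =>
        {
            threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
            Thread.Sleep(blockMilliseconds); // almost all waiting, almost no working
        });

        return threadIds.Count;
    }

    public static void Main()
    {
        int used = CountThreadsUsed(400, 50);
        Console.WriteLine("Processors: {0}, threads used: {1}",
            Environment.ProcessorCount, used);
    }
}
```

Passing a ParallelOptions instance with MaxDegreeOfParallelism set, as in the answer above, is what keeps this count bounded.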