Starting Tasks In foreach Loop Uses Value of Last Item

Wonko the Sane picture Wonko the Sane · Jan 13, 2011 · Viewed 26.5k times · Source

I am making a first attempt at playing with the new Tasks, but something is happening that I don't understand.

First, the code, which is pretty straight-forward. I pass in a list of paths to some image files, and attempt to add a task to process each of them:

public Boolean AddPictures(IList<string> paths)
{
    Boolean result = (paths.Count > 0);
    List<Task> tasks = new List<Task>(paths.Count);

    foreach (string path in paths)
    {
        var task = Task.Factory.StartNew(() =>
            {
                Boolean taskResult = ProcessPicture(path);
                return taskResult;
            });
        task.ContinueWith(t => result &= t.Result);
        tasks.Add(task);
    }

    Task.WaitAll(tasks.ToArray());

    return result;
}

I've found that if I just let this run with, say, a list of 3 paths in a unit test, all three tasks use the last path in the provided list. If I step through (and slow down the processing of the loop), each path from the loop is used.

Can somebody please explain what is happening, and why? Possible workarounds?

Answer

Jon Skeet picture Jon Skeet · Jan 13, 2011

You're closing over the loop variable. Don't do that. Take a copy instead:

foreach (string path in paths)
{
    string pathCopy = path;
    var task = Task.Factory.StartNew(() =>
        {
            Boolean taskResult = ProcessPicture(pathCopy);
            return taskResult;
        });
    // See note at end of post
    task.ContinueWith(t => result &= t.Result);
    tasks.Add(task);
}

Your current code is capturing path - not the value of it when you create the task, but the variable itself. That variable changes value each time you go through the loop - so it can easily change by the time your delegate is called.

By taking a copy of the variable, you're introducing a new variable each time you go through the loop - when you capture that variable, it won't be changed in the next iteration of the loop.

Eric Lippert has a pair of blog posts which go into this in a lot more detail: part 1; part 2.

Don't feel bad - this catches almost everyone out :(


Note about this line:

task.ContinueWith(t => result &= t.Result);

As pointed out in comments, this isn't thread-safe. Multiple threads could execute it at the same time, potentially stamping on each other's results. I haven't added locking or anything similar as it would distract from the main issue that the question is interested, namely variable capture. However, it's worth being aware of.