Enumerating Collections that are not inherently IEnumerable?

BillW picture BillW · Nov 29, 2009 · Viewed 7k times · Source

When you want to recursively enumerate a hierarchical object, selecting some elements based on some criteria, there are numerous examples of techniques like "flattening" and then filtering using Linq : like those found here :

link text

But, when you are enumerating something like the Controls collection of a Form, or the Nodes collection of a TreeView, I have been unable to use these types of techniques because they seem to require an argument (to the extension method) which is an IEnumerable collection : passing in SomeForm.Controls does not compile.

The most useful thing I found was this :

link text

Which does give you an extension method for Control.ControlCollection with an IEnumerable result you can then use with Linq.

I've modified the above example to parse the Nodes of a TreeView with no problem.

public static IEnumerable<TreeNode> GetNodesRecursively(this TreeNodeCollection nodeCollection)
{
    foreach (TreeNode theNode in nodeCollection)
    {
        yield return theNode;

        if (theNode.Nodes.Count > 0)
        {
            foreach (TreeNode subNode in theNode.Nodes.GetNodesRecursively())
            {
                yield return subNode;
            }
        }
    }
}

This is the kind of code I'm writing now using the extension method :

    var theNodes = treeView1.Nodes.GetNodesRecursively();

    var filteredNodes = 
    (
        from n in theNodes
            where n.Text.Contains("1")
                select n
    ).ToList();

And I think there may be a more elegant way to do this where the constraint(s) are passed in.

What I want to know if it is possible to define such procedures generically, so that : at run-time I can pass in the type of collection, as well as the actual collection, to a generic parameter, so the code is independent of whether it's a TreeNodeCollection or Controls.Collection.

It would also interest me to know if there's any other way (cheaper ? fastser ?) than that shown in the second link (above) to get a TreeNodeCollection or Control.ControlCollection in a form usable by Linq.

A comment by Leppie about 'SelectMany in the SO post linked to first (above) seems like a clue.

My experiments with SelectMany have been : well, call them "disasters." :)

Appreciate any pointers. I have spent several hours reading every SO post I could find that touched on these areas, and rambling my way into such exotica as the "y-combinator." A "humbling" experience, I might add :)

Answer

mrydengren picture mrydengren · Nov 29, 2009

This code should do the trick

public static class Extensions
{
    public static IEnumerable<T> GetRecursively<T>(this IEnumerable collection,
        Func<T, IEnumerable> selector)
    {
        foreach (var item in collection.OfType<T>())
        {
            yield return item;

            IEnumerable<T> children = selector(item).GetRecursively(selector);
            foreach (var child in children)
            {
                yield return child;
            }
        }
    }
}

Here's an example of how to use it

TreeView view = new TreeView();

// ...

IEnumerable<TreeNode> nodes = view.Nodes.
    .GetRecursively<TreeNode>(item => item.Nodes);

Update: In response to Eric Lippert's post.

Here's a much improved version using the technique discussed in All About Iterators.

public static class Extensions
{
    public static IEnumerable<T> GetItems<T>(this IEnumerable collection,
        Func<T, IEnumerable> selector)
    {
        Stack<IEnumerable<T>> stack = new Stack<IEnumerable<T>>();
        stack.Push(collection.OfType<T>());

        while (stack.Count > 0)
        {
            IEnumerable<T> items = stack.Pop();
            foreach (var item in items)
            {
                yield return item;

                IEnumerable<T> children = selector(item).OfType<T>();
                stack.Push(children);
            }
        }
    }
}

I did a simple performance test using the following benchmarking technique. The results speak for themselves. The depth of the tree has only marginal impact on the performance of the second solution; whereas the performance decreases rapidly for the first solution, eventually leadning to a StackOverflowException when the depth of the tree becomes too great.

benchmarking