Expressing recursion in LINQ

Rex M · Apr 9, 2009

I am writing a LINQ provider to a hierarchical data source. I find it easiest to design my API by writing examples showing how I want to use it, and then coding to support those use cases.

One thing I am having trouble with is an easy/reusable/elegant way to express "deep query" or recursion in a LINQ statement. In other words, what is the best way to distinguish between:

from item in immediate-descendants-of-current-node where ... select item

versus:

from item in all-descendants-of-current-node where ... select item

(Edit: please note that neither of the examples above necessarily reflects the structure of the query I want. I am interested in any good way to express recursion/depth.)

Please note I am not asking how to implement such a provider, or how to write my IQueryable or IEnumerable in such a way that allows recursion. I am asking from the standpoint of a person writing the LINQ query and utilizing my provider - what is an intuitive way for them to express whether they want to recurse or not?

The data structure resembles a typical file system: a folder can contain a collection of subfolders, and a folder can also contain a collection of items. So myFolder.Folders represents all the folders that are immediate children of myFolder, and myFolder.Items contains all the items immediately within myFolder. Here's a basic example of a site hierarchy, much like a filesystem with folders and pages:

(F)Products
    (F)Light Trucks
        (F)Z150
            (I)Pictures
            (I)Specs
            (I)Reviews
        (F)Z250
            (I)Pictures
            (I)Specs
            (I)Reviews
        (F)Z350
            (I)Pictures
            (I)Specs
            (I)Reviews
        (I)Splash Page
    (F)Heavy Trucks
    (F)Consumer Vehicles
    (I)Overview 

If I write:

from item in lightTrucks.Items where item.Title == "Pictures" select item

What is the most intuitive way to express an intent that the query get all items underneath Light Trucks, or only the immediate ones? What is the least intrusive, lowest-friction way to distinguish between the two intents?

My #1 goal is to be able to turn this LINQ provider over to other developers who have an average understanding of LINQ and allow them to write both recursive and list queries without giving them a tutorial on writing recursive lambdas. Given a usage that looks good, I can code the provider against that.

Additional clarification: (I am really sucking at communicating this!) This LINQ provider targets an external system; it is not simply walking an object graph, nor, in this specific case, does a recursive expression actually translate into any kind of true recursive activity under the hood. I just need a way to distinguish between a "deep" query and a "shallow" one.

So, what do you think is the best way to express it? Or is there a standard way of expressing it that I've missed out on?

Answer

Frank Schwieterman · Apr 9, 2009

LINQ to XML handles this fine: there are XElement.Elements()/.Nodes() operations to get immediate children, and XElement.Descendants()/.DescendantNodes() operations to get all descendants. Would you consider that as an example?
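As a rough illustration of how that distinction reads at the call site, here is a minimal sketch that renders part of the question's site hierarchy as XML. The `folder` and `item` element names and the `title` attribute are assumptions made up for this example, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical XML rendering of part of the question's site hierarchy.
var lightTrucks = XElement.Parse(@"<folder title='Light Trucks'>
  <folder title='Z150'>
    <item title='Pictures' />
    <item title='Specs' />
  </folder>
  <folder title='Z250'>
    <item title='Pictures' />
  </folder>
  <item title='Splash Page' />
</folder>");

// Shallow: only <item> elements that are immediate children of the node.
var shallow = lightTrucks.Elements("item")
    .Select(i => (string)i.Attribute("title"))
    .ToList();

// Deep: <item> elements at any depth below the node, filtered by title,
// reporting the title of the folder each match lives in.
var deep = lightTrucks.Descendants("item")
    .Where(i => (string)i.Attribute("title") == "Pictures")
    .Select(i => (string)i.Parent.Attribute("title"))
    .ToList();

Console.WriteLine(string.Join(", ", shallow)); // Splash Page
Console.WriteLine(string.Join(", ", deep));    // Z150, Z250
```

The only difference between the shallow and the deep query is the name of the navigation method, which is the property the question is asking for.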

To summarize LINQ to XML's behavior: the navigation functions each correspond to an axis type in XPath (http://www.w3schools.com/xpath/xpath_axes.asp). If the navigation function selects elements, the axis name is used. If the navigation function selects nodes, the axis name is used with Nodes appended.

For instance, the functions Descendants() and DescendantNodes() correspond to XPath's descendant axis, returning a sequence of XElement or of XNode respectively.

The exception, not surprisingly, is the most used case: the child axis. In XPath, this is the axis used if no axis is specified. For this axis, the LINQ to XML navigation functions are not Children() and ChildrenNodes() but rather Elements() and Nodes().

XElement is a subtype of XNode. XNodes include things like HTML tags, but also HTML comments, CDATA, or text. XElements refer specifically to the tags, so they have a tag name and support the navigation functions.
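To make the node/element distinction concrete, the following small sketch compares Nodes() and Elements() on the same parent. The markup is invented for the example, and the code again assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical fragment with a comment, two elements, and a text node.
var div = XElement.Parse("<div><!-- note --><p>one</p>text<p>two</p></div>");

// Nodes() returns every child node: the comment, both <p> elements, and the text.
var nodeKinds = div.Nodes().Select(n => n.NodeType.ToString()).ToList();

// Elements() returns only the child elements.
var elementNames = div.Elements().Select(e => e.Name.LocalName).ToList();

Console.WriteLine(string.Join(", ", nodeKinds));    // Comment, Element, Text, Element
Console.WriteLine(string.Join(", ", elementNames)); // p, p
```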

Now, it's not as easy to chain navigations in LINQ to XML as it is in XPath. The catch is that each navigation function is called on a single node but returns a collection. Consider the XPath expression that selects a table tag as an immediate child and then any descendant table data tag. I think this would look like "./child::table/descendant::td" or "./table/descendant::td".

Using Enumerable.SelectMany() allows one to call the navigation functions on each member of a collection. The equivalent to the above looks something like .Elements("table").SelectMany(t => t.Descendants("td"))
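A minimal runnable sketch of that chain follows, using made-up markup and top-level statements (C# 9 or later). It also shows the same query written with the Descendants overload that System.Xml.Linq.Extensions defines on sequences of elements, which may be the lower-friction form for callers.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical document: two immediate child tables and one nested inside a div.
var body = XElement.Parse(@"<body>
  <table><tr><td>a</td><td>b</td></tr></table>
  <div><table><tr><td>nested</td></tr></table></div>
  <table><tr><td>c</td></tr></table>
</body>");

// Roughly ./table/descendant::td: immediate child tables, then td cells at any depth.
var viaSelectMany = body.Elements("table")
    .SelectMany(t => t.Descendants("td"))
    .Select(td => td.Value)
    .ToList();

// Same query through the extension method on IEnumerable<XElement>.
var viaExtension = body.Elements("table")
    .Descendants("td")
    .Select(td => td.Value)
    .ToList();

Console.WriteLine(string.Join(", ", viaSelectMany)); // a, b, c
Console.WriteLine(string.Join(", ", viaExtension));  // a, b, c
```

The table nested inside the div is skipped in both forms because the first step only looks at immediate children.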