Using XPath in SelectSingleNode: Retrieving an individual element from XML if it's present

user108324 · May 19, 2009 · Viewed 54.8k times

My XML looks like this:

<?xml version="1.0"?>
<itemSet>
       <Item>one</Item>
       <Item>two</Item>
       <Item>three</Item>
       .....maybe more Items here.
</itemSet>

Any of the individual Item elements may or may not be present. Say I want to retrieve the element <Item>two</Item> if it's present. I've tried the following XPaths (in C#).

  • XmlNode node = myXMLdoc.SelectSingleNode("/itemSet[Item='two']") --- If Item two is present, this returns only the first element, one. Maybe this query just points to the first element in itemSet if itemSet has an Item with value two somewhere as a child. Is this interpretation correct?

So I tried:

  • XmlNode node = myXMLdoc.SelectSingleNode("/itemSet[Item='two']/Item[1]") --- I read this query as: return the first <Item> element within itemSet that has value = 'two'. Am I correct?

This still returns only the first element, one. What am I doing wrong? In both cases I can traverse the child nodes via the siblings and get to two, but that's not what I am looking for. Also, if two is absent then SelectSingleNode returns null. So the very fact that I get a node back indicates that element two is present; had I wanted only a boolean test for the presence of two, either of the above XPaths would suffice. But I actually need the full element <Item>two</Item> as my return node.

[My first question here, and my first time working with web programming, so I just learned the above XPaths and related xml stuff on the fly right now from past questions in SO. So be gentle, and let me know if I am a doofus or flouting any community rules. Thanks.]

Answer

Jon Skeet · May 19, 2009

I think you want:

myXMLdoc.SelectSingleNode("/itemSet/Item[text()='two']")

In other words, you want the Item which has text of two, not the itemSet containing it.

You can also use a single dot to indicate the context node, in your case:

myXMLdoc.SelectSingleNode("/itemSet/Item[.='two']")
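
To make the difference concrete, here is a minimal sketch using the sample document from the question. The ItemLookup class and the FindItem helper are hypothetical names made up for illustration; only XmlDocument, XmlNode and SelectSingleNode come from System.Xml. It prints which node each of the queries discussed here selects, and shows the null result when the item is absent.

```csharp
using System;
using System.Xml;

static class ItemLookup
{
    // Hypothetical helper: returns the <Item> whose string-value equals
    // 'value', or null if there is no such element.
    // Assumption: 'value' contains no apostrophe, since it is embedded
    // in a single-quoted XPath string literal.
    public static XmlNode FindItem(XmlDocument doc, string value)
    {
        return doc.SelectSingleNode("/itemSet/Item[.='" + value + "']");
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<itemSet><Item>one</Item><Item>two</Item><Item>three</Item></itemSet>");

        // The predicate filters itemSet itself, so the parent element is selected.
        XmlNode parent = doc.SelectSingleNode("/itemSet[Item='two']");
        Console.WriteLine(parent.Name);      // itemSet

        // Item[1] is simply the first Item child of that itemSet.
        XmlNode first = doc.SelectSingleNode("/itemSet[Item='two']/Item[1]");
        Console.WriteLine(first.InnerText);  // one

        // Moving the predicate onto Item selects the element itself.
        XmlNode two = FindItem(doc, "two");
        Console.WriteLine(two.OuterXml);     // <Item>two</Item>

        // No match: SelectSingleNode returns null.
        Console.WriteLine(FindItem(doc, "four") == null);  // True
    }
}
```

Since the result is null when nothing matches, check for null before using the returned node.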

EDIT: The difference between . and text() is that . means "this node" effectively, and text() means "all the text node children of this node". In both cases the comparison will be against the "string-value" of the LHS. For an element node, the string-value is "the concatenation of the string-values of all text node descendants of the element node in document order" and for a collection of text nodes, the comparison will check whether any text node is equal to the one you're testing against.

So it doesn't matter when the element content only has a single text node, but suppose we had:

<root>
  <item name="first">x<foo/>y</item>
  <item name="second">xy<foo/>ab</item>
</root>

Then an XPath expression of "root/item[.='xy']" will match the first item, but "root/item[text()='xy']" will match the second.
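
If you want to check this yourself, the following is a rough sketch run against the two-item document above. StringValueDemo and MatchName are hypothetical names used only for this illustration. It reports the name attribute of the first item that each expression selects.

```csharp
using System;
using System.Xml;

static class StringValueDemo
{
    // Hypothetical helper: evaluates 'xpath' against the sample document and
    // returns the name attribute of the first matching node, or null if none.
    public static string MatchName(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<root>" +
            "<item name=\"first\">x<foo/>y</item>" +
            "<item name=\"second\">xy<foo/>ab</item>" +
            "</root>");
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.Attributes["name"].Value;
    }

    public static void Main()
    {
        // '.' compares the element's string-value: "x" + "y" = "xy".
        Console.WriteLine(MatchName("root/item[.='xy']"));       // first

        // text() compares each text node child on its own: "xy" or "ab".
        Console.WriteLine(MatchName("root/item[text()='xy']"));  // second

        // The string-value of the second item is "xy" + "ab" = "xyab".
        Console.WriteLine(MatchName("root/item[.='xyab']"));     // second
    }
}
```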