HtmlAgilityPack selecting childNodes not as expected

Sheff · May 13, 2009 · Viewed 24.1k times

I am trying to use the HtmlAgilityPack library to parse some links in a page, but I am not getting the results I expect from its methods. In the code below I have an HtmlNodeCollection of links. For each link I want to check whether it contains an image node and then parse its attributes, but the SelectNodes and SelectSingleNode methods of linkNode seem to search the parent document rather than the child nodes of linkNode. What gives?

HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(content);
HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");

foreach(HtmlNode linkNode in linkNodes)
{
    string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
    if (linkTitle == string.Empty)
    {
        HtmlNode imageNode = linkNode.SelectSingleNode("/img[@alt]");     
    }
}

Is there any other way I could get the alt attribute of the image child node of linkNode, if it exists?

Answer

Richard Szalay · May 13, 2009

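Remove the leading forward slash from "/img[@alt]". In XPath a leading "/" makes the expression absolute, so it is evaluated from the root of the document rather than from linkNode. Without the slash the path is relative to linkNode and matches only its direct img children: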

HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
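
To answer the second part of the question, here is a minimal sketch of the full loop with the relative XPath applied and the alt value read as a fallback title. The sample markup in content and the Console output are assumptions made up for illustration; the rest follows the code in the question. The null checks are there because SelectNodes and SelectSingleNode return null when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup standing in for `content` from the question.
        string content =
            "<div><img alt='outside' src='x.png'>" +
            "<a href='/one'><img alt='first' src='1.png'></a>" +
            "<a href='/two' title='has title'>two</a>" +
            "<a href='/three'>no image</a></div>";

        HtmlDocument htmldoc = new HtmlDocument();
        htmldoc.LoadHtml(content);

        HtmlNodeCollection linkNodes = htmldoc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null) return; // no links found in the document

        foreach (HtmlNode linkNode in linkNodes)
        {
            string linkTitle = linkNode.GetAttributeValue("title", string.Empty);
            if (linkTitle == string.Empty)
            {
                // Relative path: only direct <img> children of this link.
                // ".//img[@alt]" would match nested descendants as well.
                HtmlNode imageNode = linkNode.SelectSingleNode("img[@alt]");
                if (imageNode != null)
                {
                    // Fall back to the image's alt text when the link has no title.
                    linkTitle = imageNode.GetAttributeValue("alt", string.Empty);
                }
            }

            Console.WriteLine(linkNode.GetAttributeValue("href", string.Empty) + " -> " + linkTitle);
        }
    }
}
```

If the image may be wrapped in another element inside the link (a span, for example), the descendant form ".//img[@alt]" is the safer choice. The leading dot keeps the search anchored at linkNode, whereas "//img[@alt]" would again search the whole document.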