HTML Agility Pack Null Reference

tohereknowswhen · Apr 27, 2011

I'm having some trouble with the HTML Agility Pack.

I get a NullReferenceException when I use this method on HTML that does not contain the specific node. It worked at first, but then it stopped working. This is only a snippet; there are about 10 more foreach loops that select different nodes.

What am I doing wrong?

public string Export(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // exception gets thrown on below line
    foreach (var repeater in doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']"))
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }

    var sw = new StringWriter();
    doc.Save(sw);
    sw.Flush();

    return sw.ToString();
}

Answer

Alex · Apr 27, 2011

AFAIK, DocumentNode.SelectNodes returns null, not an empty collection, if no nodes are found.

This is the default behaviour; see the discussion thread on CodePlex: "Why DocumentNode.SelectNodes returns null".

So the workaround is to check the result for null before the foreach block:

// SelectNodes returns null when nothing matches, so check before iterating
var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");
if (repeaters != null)
{
    foreach (var repeater in repeaters)
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }
}
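
Since the method has about 10 more loops of the same shape, one option is to move the null check into a small extension method so each loop stays a plain foreach. This is only a sketch: SelectNodesOrEmpty and the Exporter class are hypothetical names, not part of HTML Agility Pack, and the helper assumes SelectNodes returns null when nothing matches, as described above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlNodeExtensions
{
    // Hypothetical helper, not part of HTML Agility Pack:
    // yields an empty sequence when SelectNodes returns null.
    public static IEnumerable<HtmlNode> SelectNodesOrEmpty(this HtmlNode node, string xpath)
    {
        IEnumerable<HtmlNode> nodes = node.SelectNodes(xpath);
        return nodes ?? Enumerable.Empty<HtmlNode>();
    }
}

public static class Exporter
{
    public static string Export(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Safe even when the HTML contains no matching table:
        // the loop body simply never runs.
        foreach (var repeater in doc.DocumentNode.SelectNodesOrEmpty("//table[@class='mceRepeater']"))
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }

        var sw = new StringWriter();
        doc.Save(sw);
        return sw.ToString();
    }
}
```

The other loops can call the same helper with their own XPath expressions, so no per-loop null check is needed.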