Remove an HTML node from an HtmlDocument: HtmlAgilityPack

Priya · Aug 24, 2012 · Viewed 23.1k times

In my code, I want to remove every img tag that has no src value. I am using HtmlAgilityPack's HtmlDocument object. I find each img without a src value and try to remove it, but I get the error "Collection was modified; enumeration operation may not execute." Can anyone help me with this? The code I am using is:

foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
{
    if (node.Name.ToLower() == "img")
    {
        string src = node.Attributes["src"].Value;
        if (string.IsNullOrEmpty(src))
            node.ParentNode.RemoveChild(node, false);
        ..........// I am performing other operations on the document
    }
}


Alex · Aug 30, 2012

It seems you're modifying the collection while enumerating it, by calling the HtmlNode.RemoveChild method inside the loop.

To fix this, you need to copy your nodes to a separate list or array first, e.g. by calling Enumerable.ToList<T>() or Enumerable.ToArray<T>().

var nodesToRemove = doc.DocumentNode

foreach (var node in nodesToRemove)

If I'm right, the problem will disappear.
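
The snippet above is cut off, so here is a minimal sketch of the same idea written out in full. It is not necessarily the exact code from the answer: the class and method names (ImgCleaner, RemoveImagesWithoutSrc) are hypothetical, and selecting the nodes with Descendants("img") plus GetAttributeValue is an assumption about how to pick out the img tags with a missing or empty src.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, named for illustration only.
public static class ImgCleaner
{
    // Removes every <img> whose src attribute is missing or blank.
    // Returns the number of nodes removed.
    public static int RemoveImagesWithoutSrc(HtmlDocument doc)
    {
        // ToList() takes a snapshot of the matching nodes, so removing
        // them below does not touch the sequence being enumerated.
        var nodesToRemove = doc.DocumentNode
            .Descendants("img")
            // GetAttributeValue returns the default ("") when src is absent,
            // so a missing attribute is treated the same as an empty one.
            .Where(img => string.IsNullOrWhiteSpace(img.GetAttributeValue("src", "")))
            .ToList();

        foreach (var node in nodesToRemove)
        {
            node.Remove();
        }

        return nodesToRemove.Count;
    }
}
```

Calling ImgCleaner.RemoveImagesWithoutSrc(doc) after doc.LoadHtml(...) should strip those tags without the enumeration error. Note that node.Attributes["src"].Value in the question is likely to throw a NullReferenceException when the attribute is absent altogether, which is why the sketch uses GetAttributeValue with a default instead.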