I am trying to remove paragraph (I'm using some placeholder text to do generation from docx template-like file) from .docx file using OpenXML, but whenever I remove paragraph it breaks the foreach loop which I'm using to iterate trough.
MainDocumentPart mainpart = doc.MainDocumentPart;
IEnumerable<OpenXmlElement> elems = mainPart.Document.Body.Descendants();
foreach(OpenXmlElement elem in elems){
if(elem is Text && elem.InnerText == "##MY_PLACE_HOLDER##")
{
Run run = (Run)elem.Parent;
Paragraph p = (Paragraph)run.Parent;
p.RemoveAllChildren();
p.Remove();
}
}
This works, removes my place holder and paragraph it is in, but foreach loop stops iterating. And I need more things to do in my foreach loop.
Is this ok way to remove paragraph in C# using OpenXML and why is my foreach loop stopping or how to make it not stop? Thanks.
This is the "Halloween Problem", so called because it was noticed by some developers on Halloween, and it looked spooky to them. It is the problem of using declarative code (queries) with imperative code (deleting nodes) at the same time. If you think about it, you are iterating though a linked list, and if you start deleting nodes in the linked list, you totally mess up the iterator. A simpler way to avoid this problem is to "materialize" the results of the query in a List, and then you can iterate through the list, and delete nodes at will. The only difference in the following code is that it calls ToList after calling the Descendants axis.
MainDocumentPart mainpart = doc.MainDocumentPart;
IEnumerable<OpenXmlElement> elems = mainPart.Document.Body.Descendants().ToList();
foreach(OpenXmlElement elem in elems){
if(elem is Text && elem.InnerText == "##MY_PLACE_HOLDER##")
{
Run run = (Run)elem.Parent;
Paragraph p = (Paragraph)run.Parent;
p.RemoveAllChildren();
p.Remove();
}
}
However, I have to note that I see another bug in your code. There is nothing to stop Word from splitting up that text node into multiple text elements from multiple runs. While in most cases, your code will work fine, sooner or later, you or a user is going to take some action (like selecting a character, and accidentally hitting the bold button on the ribbon) and then your code will no longer work.
If you really want to work at the text level, then you need to use code such as what I introduce in this screen-cast: http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2011/08/04/introducing-textreplacer-a-new-class-for-powertools-for-open-xml.aspx
In fact, you could probably use that code verbatim to handle your use case, I believe.
Another approach, more flexible and powerful, is detailed in:
While that screen-cast is about PresentationML, the same principles apply to WordprocessingML.
But even better, given that you are using WordprocessingML, is to use content controls. For one approach to document generation, see:
http://ericwhite.com/blog/map/generating-open-xml-wordprocessingml-documents-blog-post-series/
And for lots of information about using content controls in general, see:
http://www.ericwhite.com/blog/content-controls-expanded
-Eric