How to modify a huge XML file by StAX?

Eugene picture Eugene · May 10, 2013 · Viewed 13.2k times · Source

I have a huge XML (~2GB) and I need to add new Elements and modify the old ones. For example, I have:

<books>
    <book>....</book>
    ...
    <book>....</book>
</books>

And want to get:

<books>
   <book>
      <index></index>
      ....
   </book>
   ...
   <book>
      <index></index>
      ....
   </book>
</books>

I used the following code:

XMLInputFactory inFactory = XMLInputFactory.newInstance();
XMLEventReader eventReader = inFactory.createXMLEventReader(new FileInputStream(file));
XMLOutputFactory factory = XMLOutputFactory.newInstance();
XMLStreamWriter writer = factory.createXMLStreamWriter(new FileWriter(file, true));
while (eventReader.hasNext()) {
   XMLEvent event = eventReader.nextEvent();
   if (event.getEventType() == XMLEvent.START_ELEMENT) {
      if (event.asStartElement().getName().toString().equalsIgnoreCase("book")) {
          writer.writeStartElement("index");
          writer.writeEndElement();
       }
    }
}
writer.close();

But the result was the following:

<books>
   <book>....</book>
   ....
   <book>....</book>
</books><index></index>

Any ideas?

Answer

Evgeniy Dorofeev picture Evgeniy Dorofeev · May 10, 2013

Try this

    XMLInputFactory inFactory = XMLInputFactory.newInstance();
    XMLEventReader eventReader = inFactory.createXMLEventReader(new FileInputStream("1.xml"));
    XMLOutputFactory factory = XMLOutputFactory.newInstance();
    XMLEventWriter writer = factory.createXMLEventWriter(new FileWriter(file));
    XMLEventFactory eventFactory = XMLEventFactory.newInstance();
    while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();
        writer.add(event);
        if (event.getEventType() == XMLEvent.START_ELEMENT) {
            if (event.asStartElement().getName().toString().equalsIgnoreCase("book")) {
                writer.add(eventFactory.createStartElement("", null, "index"));
                writer.add(eventFactory.createEndElement("", null, "index"));
            }
        }
    }
    writer.close();

Notes

new FileWriter(file, true) is appending to the end of the file, you hardly really need it

equalsIgnoreCase("book") is bad idea because XML is case-sensitive