What is the most efficient way to iterate through all DOM elements in Java?
Something like this but for every single DOM elements on current org.w3c.dom.Document
?
for(Node childNode = node.getFirstChild(); childNode!=null;){
Node nextChild = childNode.getNextSibling();
// Do something with childNode, including move or delete...
childNode = nextChild;
}
Basically you have two ways to iterate over all elements:
1. Using recursion (the most common way I think):
public static void main(String[] args) throws SAXException, IOException,
ParserConfigurationException, TransformerException {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document document = docBuilder.parse(new File("document.xml"));
doSomething(document.getDocumentElement());
}
public static void doSomething(Node node) {
// do something with the current node instead of System.out
System.out.println(node.getNodeName());
NodeList nodeList = node.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
Node currentNode = nodeList.item(i);
if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
//calls this method for all the children which is Element
doSomething(currentNode);
}
}
}
2. Avoiding recursion using getElementsByTagName()
method with *
as parameter:
public static void main(String[] args) throws SAXException, IOException,
ParserConfigurationException, TransformerException {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document document = docBuilder.parse(new File("document.xml"));
NodeList nodeList = document.getElementsByTagName("*");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// do something with the current element
System.out.println(node.getNodeName());
}
}
}
I think these ways are both efficient.
Hope this helps.