Difference between "getDocumentElement" and "getFirstChild"

URL87 · Jun 7, 2012

I have a Document object, myDoc, which holds a parsed XML file:

myDoc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(file);

Now I want to get the root of the XML file. Is there any difference between

Node firstChild = this.myDoc.getFirstChild();

and

Node firstChild = myDoc.getDocumentElement();

As I understand it, with the first approach firstChild holds the root node of the XML file but without its descendants, whereas with the second approach firstChild is the root together with its whole subtree.

For example, I have the following XML

<inventory>
    <book num="b1">
    </book>
    <book num="b2">
    </book>
    <book num="b3">
    </book>
</inventory>

and file contains it.

In the first case, int count = firstChild.getChildNodes().getLength() gives count = 0.

In the second case, it gives count = 3.

Am I right?

Answer

dragon66 · Jun 7, 2012

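The Node returned by myDoc.getFirstChild() is not necessarily the document root: if other nodes, such as a comment, precede the root element, getFirstChild() returns the first of those instead. myDoc.getDocumentElement() always returns the root element. Consider the example below: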

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;

public class ReadXML {

    public static void main(String[] args) throws Exception {

        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

        // Parse the XML file whose path is given as the first argument
        Document doc = docBuilder.parse(new File(args[0]));

        // First child of the document: may be a comment rather than the root
        Node firstChild = doc.getFirstChild();
        System.out.println(firstChild.getChildNodes().getLength());
        System.out.println(firstChild.getNodeType());
        System.out.println(firstChild.getNodeName());

        // Root element of the document
        Node root = doc.getDocumentElement();
        System.out.println(root.getChildNodes().getLength());
        System.out.println(root.getNodeType());
        System.out.println(root.getNodeName());

    }
}

When parsing the following XML file:

<?xml version="1.0"?>
<!-- Edited by XMLSpy -->
<catalog>
   <product description="Cardigan Sweater" product_image="cardigan.jpg">
      <catalog_item gender="Men's">
         <item_number>QWZ5671</item_number>
         <price>39.95</price>
         <size description="Medium">
            <color_swatch image="red_cardigan.jpg">Red</color_swatch>
            <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
         </size>
         <size description="Large">
            <color_swatch image="red_cardigan.jpg">Red</color_swatch>
            <color_swatch image="burgundy_cardigan.jpg">Burgundy</color_swatch>
         </size>
      </catalog_item>    
   </product>
</catalog>

gives the following result (node type 8 is Node.COMMENT_NODE, node type 1 is Node.ELEMENT_NODE):

0
8
#comment
3
1
catalog

But if I remove the comment, it gives:

3
1
catalog
3
1
catalog
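
To try the same comparison without a file on disk, here is a minimal sketch that parses an in-memory XML string with a leading comment. The class name RootLookup and the helpers parse and firstElementChild are hypothetical names introduced for this illustration; everything else uses the standard javax.xml.parsers and org.w3c.dom APIs. The XML string is written without whitespace between elements, so the child counts are not affected by whitespace text nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class RootLookup {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Hypothetical helper: walks the document's direct children and returns
    // the first one that is an element, skipping comments and other nodes.
    static Element firstElementChild(Document doc) {
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!-- leading comment -->"
                + "<inventory><book num=\"b1\"/><book num=\"b2\"/><book num=\"b3\"/></inventory>";
        Document doc = parse(xml);

        System.out.println(doc.getFirstChild().getNodeName());                    // #comment
        System.out.println(doc.getFirstChild().getChildNodes().getLength());      // 0
        System.out.println(doc.getDocumentElement().getNodeName());               // inventory
        System.out.println(doc.getDocumentElement().getChildNodes().getLength()); // 3
        System.out.println(firstElementChild(doc).getNodeName());                 // inventory
    }
}
```

In practice getDocumentElement() is the simpler choice for reaching the root; the loop in firstElementChild only shows what it would take to reach the same node starting from getFirstChild().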