DOM getElementsByTagName() returning Nodes with NULL Values

gene b. · Jan 19, 2012 · Viewed 19.3k times

I have an XML file as follows.

When I use getElementsByTagName("LEVEL2_ID"), I do get a NodeList with Nodes, but those Nodes have null values (in other words, getNodeValue() on each result node returns null). Why is this? I need to get the text content of each node, in this case 2000.

XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
    <Date>01/17/2012</Date>
    <LEVEL1>
        <LEVEL1_ID>1000</LEVEL1_ID>

        <LEVEL2>
           <LEVEL2_ID>2000</LEVEL2_ID>
        </LEVEL2>
    </LEVEL1>
</Root>

In Java, printing the value of the first node obtained with getElementsByTagName() prints null:

NodeList nodes = document.getElementsByTagName("LEVEL2_ID");

System.out.println("Value of 1st node: " + nodes.item(0).getNodeValue());

Answer

Felix Kling · Jan 19, 2012

That is defined in the DOM specification: the nodeValue of an Element node is null.

nodeValue of type DOMString: The value of this node, depending on its type; see the table above. When it is defined to be null, setting it has no effect.

If you want to get the text content of each node, you have to iterate over all text node descendants and concatenate their values.
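As a rough illustration of that traversal, here is a minimal Java sketch. The class name TextCollector and the helpers collectText, textOf, and parse are hypothetical names made up for this example, and the XML is a shortened inline copy of the file from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the original answer.
public class TextCollector {

    // Recursively appends the value of every text or CDATA descendant of `node`.
    static void collectText(Node node, StringBuilder out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                out.append(child.getNodeValue());
            } else {
                // Elements are descended into; comments and similar nodes have no children.
                collectText(child, out);
            }
        }
    }

    // Returns the concatenated text of all text descendants of `node`.
    static String textOf(Node node) {
        StringBuilder out = new StringBuilder();
        collectText(node, out);
        return out.toString();
    }

    // Parses an XML string into a DOM Document (assumption: input is UTF-8 text).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(
                "<Root><LEVEL1><LEVEL2><LEVEL2_ID>2000</LEVEL2_ID></LEVEL2></LEVEL1></Root>");
        NodeList nodes = document.getElementsByTagName("LEVEL2_ID");
        System.out.println("Value of 1st node: " + textOf(nodes.item(0)));
    }
}
```

Calling textOf on an element with nested children returns the text of all of them in document order, which may include whitespace-only text nodes from indentation.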

That said, the API implementation you are using might offer a method to directly retrieve the text content of an element. For example, PHP's DOMNode has a $textContent property.
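In Java, the org.w3c.dom.Node interface has offered getTextContent() since DOM Level 3 (Java 5). The sketch below assumes a DOM Level 3 capable parser, which the JDK's built-in one is. The class name TextContentDemo and the helper firstText are hypothetical, and the XML is a shortened inline copy of the file from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class, not part of the original answer.
public class TextContentDemo {

    // Returns the text content of the first element named `tagName`,
    // or null if the document contains no such element.
    static String firstText(String xml, String tagName) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = document.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Root><LEVEL1><LEVEL2><LEVEL2_ID>2000</LEVEL2_ID></LEVEL2></LEVEL1></Root>";
        System.out.println("Value of 1st node: " + firstText(xml, "LEVEL2_ID"));
    }
}
```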

If, as in your case, the element's only child is actually the text node you want, you can simply access its value:

element.getFirstChild().getNodeValue()
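One caveat with this shortcut: an empty element such as <LEVEL2_ID/> has no children, so getFirstChild() returns null and the chained call would throw a NullPointerException. A minimal sketch with a null guard follows. FirstChildValue, firstChildValue, and firstByTag are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class, not part of the original answer.
public class FirstChildValue {

    // Returns the value of the element's first child, or null if it has no children.
    static String firstChildValue(Node element) {
        Node first = element.getFirstChild();
        return first == null ? null : first.getNodeValue();
    }

    // Parses `xml` and returns the first element named `tagName` (null if absent).
    static Node firstByTag(String xml, String tagName) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return document.getElementsByTagName(tagName).item(0);
    }

    public static void main(String[] args) throws Exception {
        Node node = firstByTag("<Root><LEVEL2_ID>2000</LEVEL2_ID></Root>", "LEVEL2_ID");
        // The element itself has no value; its text child does.
        System.out.println("Element nodeValue: " + node.getNodeValue());
        System.out.println("First child value: " + firstChildValue(node));

        Node empty = firstByTag("<Root><LEVEL2_ID/></Root>", "LEVEL2_ID");
        System.out.println("Empty element: " + firstChildValue(empty));
    }
}
```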