How to get tagname of a TEXT_NODE in java's org.w3c.dom.Node

Question 1

java xml-parsing w3c xmlnode

Lucas Ou-Yang · Jul 31, 2013 · Viewed 8.4k times · Source

Answer

I think you've misunderstood what nodes are involved. This XML:

<country>US</country>

... contains two nodes:

The country element
The text node, with content of US

The element is not a text node, and the text node doesn't have an element name, because it's not an element. It's important to understand that these are different nodes. That's the source of all your confusion, I believe.

If you're currently looking at the text node, you can call node.getParentNode().getNodeName() to get the name of the element that contains it. Going the other way, from the element node you can call getTextContent() to get the text.
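As a minimal sketch of both directions, the snippet below parses the one-line XML from above and prints the text node's own name, its parent element's name, and the element's text content. The class name TextNodeParentDemo and the helpers parse and enclosingTagName are hypothetical names introduced for illustration; only the org.w3c.dom and javax.xml.parsers calls come from the standard API.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextNodeParentDemo {

    // Hypothetical helper: parse an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: for a text node, return the name of the element
    // that contains it; for any other node, return the node's own name.
    static String enclosingTagName(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE && node.getParentNode() != null) {
            return node.getParentNode().getNodeName();
        }
        return node.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<country>US</country>");

        Element country = doc.getDocumentElement(); // the country element
        Node text = country.getFirstChild();        // the text node inside it

        System.out.println(text.getNodeName());       // prints "#text"
        System.out.println(enclosingTagName(text));   // prints "country"
        System.out.println(country.getTextContent()); // prints "US"
    }
}
```

The node type check matters because only text nodes report "#text"; element nodes already return their tag name from getNodeName(), so the helper leaves them alone.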

Question 2

The documentation for this interface states that text nodes all return "#text" as their name instead of the actual tag name. But for what I'm doing, the tag name is necessary.

// I'm using the following imports
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;


// In the .xml input file
<country>US</country>  // This is a "text node" .getTextContent()
                       // returns "US", I need "country" and .getNodeName() 
                       // only returns "#text"

How can I access the tag name? This must be possible somehow, and I don't mind a hackish solution.

Docs:

http://www.w3schools.com/dom/dom_nodetype.asp

http://www.w3.org/2003/01/dom2-javadoc/org/w3c/dom/Node.html

Thank you.
