What is the simplest way to extract an XML node for JAXB.unmarshal()?

neu242 · Jun 10, 2013

I use the wsdl2java goal of cxf-codegen-plugin to generate Java from a WSDL. Then, in my tests, I use JAXB.unmarshal() to populate classes from a raw webservice XML result.

A typical example is GetAllResponseType response = unmarshal("get-all.xml", GetAllResponseType.class), using the following method:

<T> T unmarshal(String filename, Class<T> clazz) throws Exception {
    InputStream body = getClass().getResourceAsStream(filename);
    return javax.xml.bind.JAXB.unmarshal(body, clazz);
}

The problem is this: the raw XML response always has enclosing Envelope and Body tags, which wsdl2java does not generate as classes:

<n4:Envelope xmlns:http="http://schemas.xmlsoap.org/wsdl/http/" xmlns:n="http://www.informatica.com/wsdl/"
         xmlns:n4="http://schemas.xmlsoap.org/soap/envelope/" xmlns:n5="http://schemas.xmlsoap.org/wsdl/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <n4:Body>
    <n:getAllResponse xmlns:n="http://www.informatica.com/wsdl/">
        <n:getAllResponseElement>
           ...
        </n:getAllResponseElement>
    </n:getAllResponse>
  </n4:Body>
</n4:Envelope>

So, in order to use JAXB.unmarshal() I have to

  1. either strip away the surrounding Envelope/Body tags manually in get-all.xml
  2. or extract the getAllResponse node and re-convert it to an InputStream
  3. or create the Envelope and Body classes

Currently I do 2, but it's a lot of code:

<T> T unmarshal(String filename, Class<T> clazz) throws Exception {
    InputStream is = getClass().getResourceAsStream(filename);
    InputStream body = nodeContent(is, "n4:Body");
    return javax.xml.bind.JAXB.unmarshal(body, clazz);
}

InputStream nodeContent(InputStream is, String name) throws Exception {
    DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
    Document doc = docBuilder.parse(is);
    Node node = firstNonTextNode(doc.getElementsByTagName(name).item(0).getChildNodes());
    return nodeToStream(node);
}

Node firstNonTextNode(NodeList nl) {
    for (int i = 0; i < nl.getLength(); i++) {
        if (!(nl.item(i) instanceof Text)) {
            return nl.item(i);
        }
    }
    throw new RuntimeException("Couldn't find nontext node");
}

InputStream nodeToStream(Node node) throws Exception {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    Source xmlSource = new DOMSource(node);
    Result outputTarget = new StreamResult(outputStream);
    TransformerFactory.newInstance().newTransformer().transform(xmlSource, outputTarget);
    return new ByteArrayInputStream(outputStream.toByteArray());
}

My questions are:

  • Is there an easier way to do the extraction in 2? I am tempted to just use a regexp. I tried XPath, but I couldn't get it to work. Code examples would be helpful.
  • Can I get wsdl2java to create the Body / Envelope classes (3), or is it easy to create them myself?

Answer

tkolleh · Apr 20, 2017

Use a DOMSource to pass a Node as input, so the node does not have to be converted back to an InputStream. The following method takes an org.w3c.dom.Node as input and returns the unmarshalled object.

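// Needs javax.xml.bind.JAXBContext, JAXBException and Unmarshaller,
// javax.xml.transform.Source, javax.xml.transform.dom.DOMSource and org.w3c.dom.Node.
private <T> T unmarshal(Node node, Class<T> clazz) throws JAXBException {
    // Wrap the DOM node so JAXB reads it directly.
    Source xmlSource = new DOMSource(node);
    Unmarshaller unmarshaller = JAXBContext.newInstance(clazz).createUnmarshaller();
    // unmarshal(Source, Class) returns a JAXBElement<T>; getValue() yields the populated object.
    return unmarshaller.unmarshal(xmlSource, clazz).getValue();
}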
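
The Node handed to this method still has to be located inside the SOAP envelope. A minimal sketch of that step, using only the JDK's DOM parser, might look like the following. The class name SoapBodyExtractor and the method name firstBodyElement are made up for illustration. The sketch assumes the SOAP 1.1 envelope namespace from the question's sample, and it matches Body by namespace URI instead of the n4 prefix, so a response that uses a different prefix should still be found.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: finds the payload element inside a SOAP Body.
class SoapBodyExtractor {

    // SOAP 1.1 envelope namespace, as used in the question's sample XML.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Returns the first element child of the first SOAP Body in the stream.
    static Element firstBodyElement(InputStream in) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this, getElementsByTagNameNS cannot match by namespace.
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }

        NodeList bodies = doc.getElementsByTagNameNS(SOAP_NS, "Body");
        if (bodies.getLength() == 0) {
            throw new IllegalArgumentException("No SOAP Body element found");
        }
        // Skip whitespace text nodes and comments; return the first element.
        for (Node child = bodies.item(0).getFirstChild();
                child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                return (Element) child;
            }
        }
        throw new IllegalArgumentException("SOAP Body has no element child");
    }

    public static void main(String[] args) {
        String xml =
            "<n4:Envelope xmlns:n4=\"http://schemas.xmlsoap.org/soap/envelope/\">\n"
            + "  <n4:Body>\n"
            + "    <n:getAllResponse xmlns:n=\"http://www.informatica.com/wsdl/\">\n"
            + "      <n:getAllResponseElement/>\n"
            + "    </n:getAllResponse>\n"
            + "  </n4:Body>\n"
            + "</n4:Envelope>";
        Element payload = firstBodyElement(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(payload.getNamespaceURI() + " " + payload.getLocalName());
    }
}
```

The returned Element can then be passed straight to the unmarshal(Node, Class) method above, for example unmarshal(SoapBodyExtractor.firstBodyElement(is), GetAllResponseType.class), which replaces the nodeContent, firstNonTextNode and nodeToStream helpers from the question.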