How to get the element of an invalid XML file that failed XSD validation

bubblebath · Aug 12, 2012

I am currently using my XSD to validate my XML. This part works fine; my problem is that I want to obtain the element (tag or value) that is invalid.

    InputSource is = new InputSource();
    is.setCharacterStream(new StringReader(xml));
    XMLStreamReader reader = null;
    SchemaFactory factory=SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = factory.newSchema(xsdschemalocation);
    Validator validator = schema.newValidator(); 
    try
    {
        reader = XMLInputFactory.newInstance().createXMLStreamReader(new StreamSource(new StringReader(xml)));
    } catch (XMLStreamException ex)
    {
        LogController.getLogger().logSEVERE("Unable to create the streamreader from the xml source", ex.getLocalizedMessage());
        return false;
    }
    try
    {
        validator.validate(new StAXSource(reader));
    }
    catch (IOException ex)
    {
        LogController.getLogger().logSEVERE("IOException in the validation has been caused as the reader has become null", ex.getLocalizedMessage());
        return false;
    }
    catch (SAXException saxe)
    {
        LogController.getLogger().logWARNING("There is a validation error with the xml", saxe.getLocalizedMessage());
        //*****HERE I WANT THE TAG THAT HAS THE ERROR
        ClientCommunication.ErrorMessageForClient(VALIDATION_ERROR, socket);
        CloseClientConnection();
        return false; // the other branches return a boolean, so a bare "return;" would not compile
    }

The idea I had, which is not practical, is to search the message for the word "type" or "end-tag" and take the value after it; however, I know this is not good practice. I find this frustrating, as I can see the invalid tag in the message but can't get hold of it!

Here are some example messages that contain the element I want:

1. Message: Element type "first" must be followed by either attribute specifications, ">" or "/>".

2. javax.xml.stream.XMLStreamException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 353; cvc-pattern-valid: Value '079e989989' is not facet-valid with respect to pattern '([0-9])+' for type 'phoneNumber'.

3. Message: The element type "firstLine" must be terminated by the matching end-tag "</firstLine>".

Answer

bdoughan · Aug 12, 2012

Below is a way that you could implement your use case using an ErrorHandler:

MyErrorHandler

I would recommend implementing an ErrorHandler that maintains a reference to the XMLStreamReader, so that when a SAXParseException occurs you can interrogate the XMLStreamReader for information about the current element. If you want parsing to stop as soon as an error is reported, simply rethrow the SAXParseException at the end of each of the methods.

package forum11921190;

import javax.xml.stream.XMLStreamReader;
import org.xml.sax.*;

public class MyErrorHandler implements ErrorHandler {

    private XMLStreamReader reader;

    public MyErrorHandler(XMLStreamReader reader) {
        this.reader = reader;
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        warning(e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        warning(e);
    }

    @Override
    public void warning(SAXParseException e) throws SAXException {
        System.out.println(reader.getLocalName());
        System.out.println(reader.getNamespaceURI());
        e.printStackTrace(System.out);
    }

}
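
The rethrow variant mentioned above could look roughly like the sketch below. It is only an illustration: the class name FailFastErrorHandler and the helper firstInvalidElement are made up for this example, and it assumes the JDK's built-in validator, where schema errors are reported while the reader is still positioned on the offending element.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical handler: remembers the element the reader was on when the
// first error was reported, then rethrows so that validation stops.
public class FailFastErrorHandler implements ErrorHandler {

    private final XMLStreamReader reader;
    private QName failedElement; // stays null if no error was seen on an element

    public FailFastErrorHandler(XMLStreamReader reader) {
        this.reader = reader;
    }

    public QName getFailedElement() {
        return failedElement;
    }

    @Override
    public void warning(SAXParseException e) {
        // Warnings are ignored here; they do not stop validation.
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        recordAndRethrow(e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        recordAndRethrow(e);
    }

    private void recordAndRethrow(SAXParseException e) throws SAXException {
        // getName() is only legal on START_ELEMENT and END_ELEMENT events.
        if (failedElement == null && (reader.isStartElement() || reader.isEndElement())) {
            failedElement = reader.getName();
        }
        throw e;
    }

    // Hypothetical helper: returns the first invalid element, or null if none was reported.
    public static QName firstInvalidElement(String xsd, String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        Validator validator = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
        FailFastErrorHandler handler = new FailFastErrorHandler(reader);
        validator.setErrorHandler(handler);
        try {
            validator.validate(new StAXSource(reader));
        } catch (SAXException e) {
            // Expected for invalid input; the handler already holds the element.
        }
        return handler.getFailedElement();
    }
}
```

In the question's code, the catch (SAXException saxe) block could then call handler.getFailedElement() to get the tag. Well-formedness problems (such as a missing end tag) are raised by the stream reader itself and may never reach the handler, so a null result should be expected for those.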

Demo

You set an instance of ErrorHandler on the Validator.

package forum11921190;

import javax.xml.XMLConstants;
import javax.xml.stream.*;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.*;

public class Demo {

    private static final StreamSource XSD = new StreamSource("src/forum11921190/schema.xsd");
    private static final StreamSource XML = new StreamSource("src/forum11921190/input.xml");

    public static void main(String[] args) throws Exception {
        SchemaFactory factory=SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(XSD);

        XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(XML);

        Validator validator = schema.newValidator();
        validator.setErrorHandler(new MyErrorHandler(reader));
        validator.validate(new StAXSource(reader));

    }

}
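
Since the question validates XML held in a String rather than a file, here is a hedged sketch of the same idea for that case. The class InvalidElementCollector and its method invalidElements are hypothetical names; unlike the demo above, it does not print anything but collects the local names of the elements on which errors were reported and lets validation run to the end.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

public class InvalidElementCollector {

    // Hypothetical helper: validates xml against xsd (both passed as strings) and
    // returns the local names of the elements the reader was on when errors were reported.
    public static Set<String> invalidElements(String xsd, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));

        final XMLStreamReader reader =
                XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));

        // A set, because one bad value can trigger more than one error on the same element.
        final Set<String> names = new LinkedHashSet<>();

        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // Warnings are not treated as invalid content in this sketch.
            }

            @Override
            public void error(SAXParseException e) {
                record();
            }

            @Override
            public void fatalError(SAXParseException e) {
                record();
            }

            // Nothing is rethrown, so validation continues after each error.
            private void record() {
                // getLocalName() is only legal on START_ELEMENT and END_ELEMENT events.
                if (reader.isStartElement() || reader.isEndElement()) {
                    names.add(reader.getLocalName());
                }
            }
        });

        validator.validate(new StAXSource(reader));
        return names;
    }
}
```

An empty set means no schema errors were reported. If the input is not well-formed, validate may still throw, so callers would keep a catch block for SAXException as in the question.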

schema.xsd

Below is a sample XML schema I used when writing the demo code.

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" 
    targetNamespace="http://www.example.com"
    xmlns:tns="http://www.example.com"
    elementFormDefault="qualified">
    <element name="root">
        <complexType>
            <sequence>
                <element name="foo" type="string"/>
                <element name="bar" type="int"/>
            </sequence>
        </complexType>
    </element>
</schema>

input.xml

Below is some sample input. The bar element has invalid content.

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.example.com">
    <foo>valid</foo>
    <bar>invalid</bar>
</root>

Output

Below is the output from running the demo code:

bar
http://www.example.com
org.xml.sax.SAXParseException: cvc-datatype-valid.1.2.1: 'invalid' is not a valid value for 'integer'.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:131)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:384)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:318)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(XMLSchemaValidator.java:423)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.reportSchemaError(XMLSchemaValidator.java:3188)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.elementLocallyValidType(XMLSchemaValidator.java:3103)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.processElementContent(XMLSchemaValidator.java:3013)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleEndElement(XMLSchemaValidator.java:2156)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.endElement(XMLSchemaValidator.java:824)
    at com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorHandlerImpl.endElement(ValidatorHandlerImpl.java:565)
    at com.sun.org.apache.xml.internal.serializer.ToXMLSAXHandler.endElement(ToXMLSAXHandler.java:261)
    at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.handleEndElement(StAXStream2SAX.java:295)
    at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.bridge(StAXStream2SAX.java:167)
    at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.parse(StAXStream2SAX.java:120)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:674)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:723)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:336)
    at com.sun.org.apache.xerces.internal.jaxp.validation.StAXValidatorHelper.validate(StAXValidatorHelper.java:94)
    at com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl.validate(ValidatorImpl.java:118)
    at javax.xml.validation.Validator.validate(Validator.java:127)
    at forum11921190.Demo.main(Demo.java:26)
bar
http://www.example.com
org.xml.sax.SAXParseException: cvc-type.3.1.3: The value 'invalid' of element 'bar' is not valid.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:131)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:384)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:318)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(XMLSchemaValidator.java:423)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.reportSchemaError(XMLSchemaValidator.java:3188)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.elementLocallyValidType(XMLSchemaValidator.java:3104)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.processElementContent(XMLSchemaValidator.java:3013)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleEndElement(XMLSchemaValidator.java:2156)
    at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.endElement(XMLSchemaValidator.java:824)
    at com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorHandlerImpl.endElement(ValidatorHandlerImpl.java:565)
    at com.sun.org.apache.xml.internal.serializer.ToXMLSAXHandler.endElement(ToXMLSAXHandler.java:261)
    at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.handleEndElement(StAXStream2SAX.java:295)
    at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.bridge(StAXStream2SAX.java:167)
    at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.parse(StAXStream2SAX.java:120)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:674)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:723)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:336)
    at com.sun.org.apache.xerces.internal.jaxp.validation.StAXValidatorHelper.validate(StAXValidatorHelper.java:94)
    at com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl.validate(ValidatorImpl.java:118)
    at javax.xml.validation.Validator.validate(Validator.java:127)
    at forum11921190.Demo.main(Demo.java:26)