How can I validate an XML file against a DTD that is stored locally as a file? The XML file does not have a DOCTYPE declaration (or it may have one, which should then be overridden). I had a look at this thread, but apart from the fact that it uses .NET, I doubt that it is a good solution.
Any input appreciated!
In an ideal world, you'd be able to validate using a Validator. Something like this:
    SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.XML_DTD_NS_URI);
    Schema schema = schemaFactory.newSchema(new File("xmlValidate.dtd"));
    Validator validator = schema.newValidator();
    validator.validate(new StreamSource("xmlValidate.xml"));
Unfortunately, the Sun implementation (at least as of Java 6) does not support creating a Schema instance from a DTD. You might be able to track down a third-party implementation that does.
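If you want to check what your own runtime supports before going further, a small probe along these lines may help. It is only a sketch: the class name SchemaLanguageProbe and the method isSupported are made up for illustration, and it relies on the documented behaviour that SchemaFactory.newInstance(String) throws IllegalArgumentException when no implementation for the schema language is available.

```java
import javax.xml.validation.SchemaFactory;

// Hypothetical helper, not part of any library.
public class SchemaLanguageProbe {

    /**
     * Returns true if a SchemaFactory for the given schema language URI
     * can be obtained in the current runtime, false otherwise.
     */
    public static boolean isSupported(String schemaLanguage) {
        try {
            SchemaFactory.newInstance(schemaLanguage);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when no implementation for this schema language is found.
            return false;
        }
    }
}
```

Calling SchemaLanguageProbe.isSupported(XMLConstants.XML_DTD_NS_URI) tells you whether the Validator approach above has any chance of working with the libraries on your classpath.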
Your best bet may be to use some other mechanism to add the DOCTYPE declaration to the document before parsing it.
You can use a transformer to insert a DTD declaration:
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer();
    transformer.setOutputProperty(
            OutputKeys.DOCTYPE_SYSTEM, "xmlValidate.dtd");
    transformer.transform(new StreamSource("xmlValidate.xml"),
            new StreamResult(System.out));
...but this does not seem to replace an existing DTD declaration.
This StAX event reader can do the job:
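    import javax.xml.stream.XMLEventReader;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.events.XMLEvent;
    import javax.xml.stream.util.EventReaderDelegate;

    // Nested helper class; peek() is not overridden, so use nextEvent() only.
    public static class DTDReplacer extends EventReaderDelegate {

        private final XMLEvent dtd;
        private boolean sendDtd = false;

        public DTDReplacer(XMLEventReader reader, XMLEvent dtd) {
            super(reader);
            if (dtd.getEventType() != XMLEvent.DTD) {
                throw new IllegalArgumentException("Not a DTD event: " + dtd);
            }
            this.dtd = dtd;
        }

        @Override
        public XMLEvent nextEvent() throws XMLStreamException {
            if (sendDtd) {
                // emit the replacement DTD right after the document start
                sendDtd = false;
                return dtd;
            }
            XMLEvent evt = super.nextEvent();
            if (evt.getEventType() == XMLEvent.START_DOCUMENT) {
                sendDtd = true;
            } else if (evt.getEventType() == XMLEvent.DTD) {
                // discard old DTD
                return super.nextEvent();
            }
            return evt;
        }
    }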
It will send a given DTD declaration right after the document start and discard any from the old document.
Demo usage:
    XMLEventFactory eventFactory = XMLEventFactory.newInstance();
    XMLEvent dtd = eventFactory
            .createDTD("<!DOCTYPE Employee SYSTEM \"xmlValidate.dtd\">");

    XMLInputFactory inFactory = XMLInputFactory.newInstance();
    XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
    XMLEventReader reader = inFactory
            .createXMLEventReader(new StreamSource("xmlValidate.xml"));
    reader = new DTDReplacer(reader, dtd);
    XMLEventWriter writer = outFactory.createXMLEventWriter(System.out);
    writer.add(reader);
    writer.flush();
    // TODO error and proper stream handling
Note that the XMLEventReader could form the source for some other transformation mechanism that performed validation.
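Once the document carries the DOCTYPE you want, one way to do the actual validation is a validating DOM parser with an EntityResolver that hands back the local DTD. The following is a minimal sketch, not a definitive implementation: the class name LocalDtdValidation and the method validate are hypothetical, and it takes the XML and the DTD as strings purely to keep the example self-contained (in practice you would read them from files or streams). It assumes the XML text already contains a DOCTYPE declaration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of any library.
public class LocalDtdValidation {

    /**
     * Parses the XML with DTD validation switched on and returns the
     * reported problems. An empty list means the document is valid.
     */
    public static List<String> validate(String xml, String dtdText) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(true); // DTD validation
            DocumentBuilder builder = dbf.newDocumentBuilder();

            // Serve the local DTD for every external entity the parser asks
            // for, whatever system ID the DOCTYPE names. Too coarse if the
            // DTD itself pulls in further external entities.
            builder.setEntityResolver((publicId, systemId) ->
                    new InputSource(new StringReader(dtdText)));

            // Without an ErrorHandler, validity errors are not thrown.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not well-formed; stop parsing
                }
            });

            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            problems.add("fatal: " + e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }
}
```

The ErrorHandler is the important part: validity violations are reported through error(), and the parser carries on regardless unless the handler collects or rethrows them.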
It would be much easier to validate using a W3C XML Schema if you have that option.
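For comparison, here is roughly what the schema route looks like with the standard javax.xml.validation API. Again this is just a sketch under assumptions: the class name XsdValidation and the method firstProblem are invented for illustration, and the inputs are strings only so that the example stands alone.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of any library.
public class XsdValidation {

    /**
     * Returns null if the XML is valid against the schema, otherwise the
     * message of the first problem (which may also be a schema error).
     */
    public static String firstProblem(String xml, String xsd) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                    new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // The default error handling throws on the first validity error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

No DOCTYPE juggling is needed here, because the schema is supplied to the Validator directly rather than being referenced from inside the document.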