xmllint : validate an XML file against two XSD schemas (envelope / payload)

Marcus Junius Brutus · Jun 8, 2013 · Viewed 20.6k times

I am using xmllint to do some validations and I have an XML instance document which needs to validate against two schemas: one for the outer "envelope" (which includes an any element) and one for the particular payload. Say A.xsd is the envelope schema, B.xsd a payload schema (there are different possible payloads) and ab.xml a valid XML instance document (I provide an example at the end of the post).

I have all three files locally available in the same directory and am using xmllint to perform the validation, providing as the schema argument the location of the outer (envelope) schema:

xmllint -schema A.xsd ab.xml

... yet, although I provide the locations of both A.xsd and B.xsd in the instance document (using the xsi:schemaLocation attribute), xmllint fails to find B.xsd and complains:

ab.xml:8: element person: Schemas validity error : Element '{http://www.example.org/B}person': No matching global element declaration available, but demanded by the strict wildcard.
ab.xml fails to validate

So apparently xmllint is not reading the xsi:schemaLocation attribute. I understand that xmllint can be configured with catalogs, but I could not get it to find both schemas that way. How can I get xmllint to take both schemas into account when validating the instance document? Alternatively, is there another command-line utility or graphical tool I could use instead?

SSCCE

A.xsd - envelope schema

<?xml version="1.0" encoding="UTF-8"?>
<schema elementFormDefault="qualified" 
        xmlns               ="http://www.w3.org/2001/XMLSchema"
        xmlns:a             ="http://www.example.org/A"
        targetNamespace ="http://www.example.org/A">

        <element name="someType" type="a:SomeType"></element>

        <complexType name="SomeType">
            <sequence>
                <any namespace="##other" processContents="strict"/>
            </sequence>
        </complexType>
</schema>

B.xsd - payload schema

<?xml version="1.0" encoding="UTF-8"?>
<schema elementFormDefault="qualified"
    xmlns          ="http://www.w3.org/2001/XMLSchema"
    xmlns:b        ="http://www.example.org/B"
    targetNamespace="http://www.example.org/B"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <element name="person" type="b:PersonType"></element>
    <complexType name="PersonType">
        <sequence>
                <element name="firstName" type="string"/>
                <element name="lastName"  type="string"/>
        </sequence>
    </complexType>
</schema>

ab.xml - instance document

<?xml version="1.0" encoding="UTF-8"?>
<a:someType xmlns:a="http://www.example.org/A"
        xmlns:b="http://www.example.org/B"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.example.org/A A.xsd
                            http://www.example.org/B B.xsd">

            <b:person>
                <b:firstName>Mary</b:firstName>
                <b:lastName>Bones</b:lastName>
            </b:person>

</a:someType>

Answer

Marcus Junius Brutus · Jun 8, 2013

I gave up on xmllint and used Xerces instead.

I downloaded the Xerces tarball, unpacked it into a local folder, and created the following validate script based on this suggestion (from a web archive, since the original link is now dead):

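#!/bin/bash
# Adjust to wherever the Xerces tarball was unpacked.
XERCES_HOME=~/software-downloads/xerces-2_11_0
echo "$XERCES_HOME"
# Run the sax.Counter sample, forwarding all arguments unchanged.
java -classpath "$XERCES_HOME/xercesImpl.jar:$XERCES_HOME/xml-apis.jar:$XERCES_HOME/xercesSamples.jar" sax.Counter "$@"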

The ab.xml file is then validated, against both schemas, with the following command:

validate -v -n -np -s -f ab.xml

Xerces reads the schema locations from the xsi:schemaLocation attribute in ab.xml, so they do not need to be provided on the command line.
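
For anyone who would rather stay with xmllint, one approach that may work is a small wrapper schema that imports both namespaces, so that xmllint is handed a single schema covering the envelope and the payload. This is only a sketch and not something I used above: the file name AB.xsd is made up for illustration, and it assumes A.xsd and B.xsd sit in the same directory as the wrapper.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- AB.xsd (hypothetical name): declares nothing itself, only imports both schemas -->
<schema xmlns="http://www.w3.org/2001/XMLSchema">
    <!-- envelope schema -->
    <import namespace="http://www.example.org/A" schemaLocation="A.xsd"/>
    <!-- payload schema; a different payload would need its own import -->
    <import namespace="http://www.example.org/B" schemaLocation="B.xsd"/>
</schema>
```

The instance document would then be checked with xmllint --noout --schema AB.xsd ab.xml. The trade-off compared with Xerces is that the wrapper has to list every payload schema explicitly, instead of relying on the locations given in the instance document.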