schema validation with msxml in delphi

Miel picture Miel · Jan 15, 2009 · Viewed 17.3k times · Source

I'm trying to validate an XML file against the schemas it references. (Using Delphi and MSXML2_TLB.) The (relevant part of the) code looks something like this:

procedure TfrmMain.ValidateXMLFile;
var
    xml: IXMLDOMDocument2;
    err: IXMLDOMParseError;
    schemas: IXMLDOMSchemaCollection;
begin
    xml := ComsDOMDocument.Create;
    if xml.load('Data/file.xml') then
    begin
        schemas := xml.namespaces;
        if schemas.length > 0 then
        begin
            xml.schemas := schemas;
            err := xml.validate;
        end;
    end;
end;

This has the result that cache is loaded (schemas.length > 0), but then the next assignment raises an exception: "only XMLSchemaCache-schemacollections can be used."

How should I go about this?

Answer

Miel picture Miel · Feb 19, 2009

I've come up with an approach that seems to work. I first load the schema's explicitly, then add themn to the schemacollection. Next I load the xml-file and assign the schemacollection to its schemas property. The solution now looks like this:

uses MSXML2_TLB  
That is:  
// Type Lib: C:\Windows\system32\msxml4.dll  
// LIBID: {F5078F18-C551-11D3-89B9-0000F81FE221}

function TfrmMain.ValidXML(
    const xmlFile: String; 
    out err: IXMLDOMParseError): Boolean;
var
    xml, xsd: IXMLDOMDocument2;
    cache: IXMLDOMSchemaCollection;
begin
    xsd := CoDOMDocument40.Create;
    xsd.Async := False;
    xsd.load('http://the.uri.com/schemalocation/schema.xsd');

    cache := CoXMLSchemaCache40.Create;
    cache.add('http://the.uri.com/schemalocation', xsd);

    xml := CoDOMDocument40.Create;
    xml.async := False;
    xml.schemas := cache;

    Result := xml.load(xmlFile);
    if not Result then
      err := xml.parseError
    else
      err := nil;
end;

It is important to use XMLSchemaCache40 or later. Earlier versions don't follow the W3C XML Schema standard, but only validate against XDR Schema, a MicroSoft specification.

The disadvantage of this solution is that I need to load the schema's explicitly. It seems to me that it should be possible to retrieve them automatically.