Parsing XSD Schema with XSOM in Java. How to access element and complex types

DUFF picture DUFF · Oct 30, 2011 · Viewed 17.3k times · Source

I’m having a lot of difficuly parsing an .XSD file with a XSOM in Java. I have two .XSD files one defines a calendar and the second the global types. I'd like to be able to read the calendar file and determine that:

calendar has 3 properties

  • Valid is an ENUM called eYN
  • Cal is a String
  • Status is a ENUM called eSTATUS

Calendar.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:gtypes="http://www.btec.com/gtypes"
 elementFormDefault="qualified">
<xs:import namespace="http://www.btec.com/gtypes"
 schemaLocation="gtypes.xsd"/>
<xs:element name="CALENDAR">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Valid" type="eYN" minOccurs="0"/>
      <xs:element name="Cal" minOccurs="0">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="gtypes:STRING">
              <xs:attribute name="IsKey" type="xs:string" fixed="Y"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="Status" type="eSTATUS" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:complexType name="eSTATUS">
  <xs:simpleContent>
    <xs:extension base="gtypes:ENUM"/>
  </xs:simpleContent>
</xs:complexType>
<xs:complexType name="eYN">
  <xs:simpleContent>
    <xs:extension base="gtypes:ENUM"/>
  </xs:simpleContent>
</xs:complexType>

gtypes.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.btec.com/gtypes"
 elementFormDefault="qualified">
<xs:complexType name="ENUM">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="TYPE" fixed="ENUM"/>
      <xs:attribute name="derived" use="optional"/>
      <xs:attribute name="readonly" use="optional"/>
      <xs:attribute name="required" use="optional"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
<xs:complexType name="STRING">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="TYPE" use="optional"/>
      <xs:attribute name="derived" use="optional"/>
      <xs:attribute name="readonly" use="optional"/>
      <xs:attribute name="required" use="optional"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
</xs:schema>

The code from my attempt to access this information is below. I'm pretty new to Java so any style criticism welcome.

I really need to know

  1. How to I access the complex type cal and see that it's a string?
  2. How do I access the definition of Status to see it's a enumeration of type eSTATUS emphasized text

I've has several attempts to access the right information via ComplexType and Elements and Content. However I'm just don't get it and I cannot find any examples that help. I expect (hope) the best method is (relatively) simple when you know how. So, once again, if anyone could point me in the right direction that would be a great help.

xmlfile = "Calendar.xsd"
XSOMParser parser = new XSOMParser();

parser.parse(new File(xmlfile));
XSSchemaSet sset = parser.getResult();
XSSchema s = sset.getSchema(1);
if (s.getTargetNamespace().equals("")) // this is the ns with all the stuff
       // in
{
  // try ElementDecls
  Iterator jtr = s.iterateElementDecls();
  while (jtr.hasNext())
  {
    XSElementDecl e = (XSElementDecl) jtr.next();
    System.out.print("got ElementDecls " + e.getName());
    // ok we've got a CALENDAR.. what next?
    // not this anyway
    /*  
     *
     * XSParticle[] particles = e.asElementDecl() for (final XSParticle p :
     * particles) { final XSTerm pterm = p.getTerm(); if
     * (pterm.isElementDecl()) { final XSElementDecl ed =
     * pterm.asElementDecl(); System.out.println(ed.getName()); }
     */
  }

  // try all Complex Types in schema
  Iterator<XSComplexType> ctiter = s.iterateComplexTypes();
  while (ctiter.hasNext())
  {
    // this will be a eSTATUS. Lets type and get the extension to 
    // see its a ENUM
    XSComplexType ct = (XSComplexType) ctiter.next();
    String typeName = ct.getName();
    System.out.println(typeName + newline);

    // as Content
    XSContentType content = ct.getContentType();
    // now what?
    // as Partacle?
    XSParticle p2 = content.asParticle();
    if (null != p2)
    {
      System.out.print("We got partical thing !" + newline);
      // might would be good if we got here but we never do :-(
    }

    // try complex type Element Decs
    List<XSElementDecl> el = ct.getElementDecls();
    for (XSElementDecl ed : el)
    {
      System.out.print("We got ElementDecl !" + ed.getName() + newline);
      // would be good if we got here but we never do :-(
    }

    Collection<? extends XSAttributeUse> c = ct.getAttributeUses();
    Iterator<? extends XSAttributeUse> i = c.iterator();
    while (i.hasNext())
    {
      XSAttributeDecl attributeDecl = i.next().getDecl();
      System.out.println("type: " + attributeDecl.getType());
      System.out.println("name:" + attributeDecl.getName());
    }
  }
}

Answer

DUFF picture DUFF · Nov 1, 2011

Well after a lot googling I think I've answered my own question. My proposed solution was hopelessly wide of the mark. The main problem was that the XSD has three namespaces and I was looking in the wrong one for the wrong thing.

If you're looking to parse an XSD in XSOM be sure you understand the structure of the XSD and what the tags mean before you start - it will save you a lot of time.

I'll post my version below as I'm sure it can be improved!

Some links that were helpful:

http://msdn.microsoft.com/en-us/library/ms187822.aspx

http://it.toolbox.com/blogs/enterprise-web-solutions/parsing-an-xsd-schema-in-java-32565

http://www.w3schools.com/schema/el_simpleContent.asp

package xsom.test

import com.sun.xml.xsom.parser.XSOMParser;
import com.sun.xml.xsom.XSComplexType;
import com.sun.xml.xsom.XSContentType;
import com.sun.xml.xsom.XSElementDecl;
import com.sun.xml.xsom.XSModelGroup;
import com.sun.xml.xsom.XSParticle;
import com.sun.xml.xsom.XSSchema;
import com.sun.xml.xsom.XSSchemaSet;
import com.sun.xml.xsom.XSTerm;

import java.util.Iterator;
import java.io.File;
import java.util.HashMap;

public class mappingGenerator
{
  private HashMap mappings;

  public mappingGenerator()
  {
    mappings = new HashMap();
  }

  public void generate(String xmlfile) throws Exception
  {

    // with help from
    // http://msdn.microsoft.com/en-us/library/ms187822.aspx
    // http://it.toolbox.com/blogs/enterprise-web-solutions/parsing-an-xsd-schema-in-java-32565
    // http://www.w3schools.com/schema/el_simpleContent.asp
    XSOMParser parser = new XSOMParser();

    parser.parse(new File(xmlfile));
    XSSchemaSet sset = parser.getResult();

    // =========================================================
    // types namepace
    XSSchema gtypesSchema = sset.getSchema("http://www.btec.com/gtypes");
    Iterator<XSComplexType> ctiter = gtypesSchema.iterateComplexTypes();
    while (ctiter.hasNext())
    {
      XSComplexType ct = (XSComplexType) ctiter.next();
      String typeName = ct.getName();
      // these are extensions so look at the base type to see what it is
      String baseTypeName = ct.getBaseType().getName();
      System.out.println(typeName + " is a " + baseTypeName);
    }

    // =========================================================
    // global namespace
    XSSchema globalSchema = sset.getSchema("");
    // local definitions of enums are in complex types
    ctiter = globalSchema.iterateComplexTypes();
    while (ctiter.hasNext())
    {
      XSComplexType ct = (XSComplexType) ctiter.next();
      String typeName = ct.getName();
      String baseTypeName = ct.getBaseType().getName();
      System.out.println(typeName + " is a " + baseTypeName);
    }

    // =========================================================
    // the main entity of this file is in the Elements
    // there should only be one!
    if (globalSchema.getElementDecls().size() != 1)
    {
      throw new Exception("Should be only elment type per file.");
    }

    XSElementDecl ed = globalSchema.getElementDecls().values()
        .toArray(new XSElementDecl[0])[0];
    String entityType = ed.getName();
    XSContentType xsContentType = ed.getType().asComplexType().getContentType();
    XSParticle particle = xsContentType.asParticle();
    if (particle != null)
    {

      XSTerm term = particle.getTerm();
      if (term.isModelGroup())
      {
        XSModelGroup xsModelGroup = term.asModelGroup();
        term.asElementDecl();
        XSParticle[] particles = xsModelGroup.getChildren();
        String propertyName = null;
        String propertyType = null;
        XSParticle pp =particles[0];
        for (XSParticle p : particles)
        {
          XSTerm pterm = p.getTerm();
          if (pterm.isElementDecl())
          {            
            propertyName = pterm.asElementDecl().getName();
            if (pterm.asElementDecl().getType().getName() == null)
            {
              propertyType = pterm.asElementDecl().getType().getBaseType().getName();
            }
            else
            {
              propertyType = pterm.asElementDecl().getType().getName();              
            }
            System.out.println(propertyName + " is a " + propertyType);
          }
        }
      }
    }
    return;
  }
}   

The output from this is:

ENUM is a string
STRING is a string
eSTATUS is a ENUM
eYN is a ENUM
Valid is a eYN
Cal is a STRING
Status is a eSTATUS