Friday, October 1, 2010

Trang: Creating Schemas from XML

Does anyone like writing XML schemas? They can be frustrating to write, yet they always end up feeling simple once you're done. Given the choice, I prefer to start a schema from an example instance document, and of course there are plenty of tools to help.

A lot of those tools cost money, but Trang is free, and it helps me with the writer's block I tend to get when I'm handed an XML document and asked to make a schema from scratch.

Trang is a Java app that can be downloaded in zip form. Luckily (for me at least), it is also available in the Ubuntu package repositories:

$ sudo apt-get install trang

Now that we have Trang installed, let's generate a schema from a simple XML document, languages.xml:

<?xml version='1.0' encoding='UTF-8'?>
<languages>
  <language>
    <name>Groovy</name>
    <platform>JVM</platform>
    <appeared>2003</appeared>
  </language>
  <language>
    <name>Scala</name>
    <platform>JVM</platform>
    <appeared>2003</appeared>
  </language>
  <language>
    <name>Boo</name>
    <platform>CLR</platform>
    <appeared>2003</appeared>
  </language>
</languages>

Now let's ask Trang to make us a schema:

$ trang languages.xml languages.xsd

And a schema is generated for us:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="languages">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="language"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="language">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="platform"/>
        <xs:element ref="appeared"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="name" type="xs:NCName"/>
  <xs:element name="platform" type="xs:NCName"/>
  <xs:element name="appeared" type="xs:integer"/>
</xs:schema>

This is a really good start, but I feel like I should make a couple of changes. To make things a little easier to understand for consumers, I think I'll change the uses of NCName to string:

<xs:element name="name" type="xs:string"/>
<xs:element name="platform" type="xs:string"/>
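
There's also a practical reason to prefer string here. Trang presumably picked NCName because every sample value happened to look like an XML name, but a language name such as "C#" or "Visual Basic" is not a valid NCName. Here's a rough sketch that checks this with the javax.xml.validation classes that ship with the JDK. The class name NameTypeCheck and its helper methods are my own invention for illustration, and the one-element schema is a cut-down stand-in for the real one:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class NameTypeCheck {

    // Builds a one-element schema declaring <name> with the given built-in type.
    // This is a simplified stand-in for the generated languages.xsd.
    static String schemaFor(String type) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='name' type='" + type + "'/>"
             + "</xs:schema>";
    }

    // Returns true if <name>value</name> validates when <name> has the given type.
    static boolean accepts(String type, String value) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                new StreamSource(new StringReader(schemaFor(type))));
            schema.newValidator().validate(
                new StreamSource(new StringReader("<name>" + value + "</name>")));
            return true;
        } catch (SAXException e) {
            return false; // the value did not match the declared type
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String value : new String[] {"Groovy", "C#", "Visual Basic"}) {
            System.out.println(value
                + ": NCName=" + accepts("xs:NCName", value)
                + ", string=" + accepts("xs:string", value));
        }
    }
}
```

Running it should show that all three values pass as xs:string, while only "Groovy" passes as xs:NCName.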

And spell out that at least one language element is required (minOccurs already defaults to 1, so this only makes the rule explicit for readers):

<xs:element minOccurs="1" maxOccurs="unbounded" ref="language"/>
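
If you want to convince yourself of what minOccurs does here, a quick experiment helps. The sketch below, again using the JDK's javax.xml.validation, tries an empty languages element against three variations of the occurrence attributes. The class MinOccursCheck and its methods are made up for this post, and I've reduced language to a plain string element to keep the schema short:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class MinOccursCheck {

    // Schema for <languages> whose <language> reference carries the given
    // occurrence attributes. <language> is simplified to xs:string here.
    static String schemaWith(String occurs) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='languages'><xs:complexType><xs:sequence>"
             + "<xs:element " + occurs + " ref='language'/>"
             + "</xs:sequence></xs:complexType></xs:element>"
             + "<xs:element name='language' type='xs:string'/>"
             + "</xs:schema>";
    }

    // Returns true if the document validates against the schema variant.
    static boolean accepts(String occurs, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                new StreamSource(new StringReader(schemaWith(occurs))));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // content model violated
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String empty = "<languages/>";
        String[] variants = {
            "maxOccurs='unbounded'",               // what Trang generated
            "minOccurs='1' maxOccurs='unbounded'", // the explicit version
            "minOccurs='0' maxOccurs='unbounded'"  // for contrast
        };
        for (String occurs : variants) {
            System.out.println(occurs + " accepts empty: " + accepts(occurs, empty));
        }
    }
}
```

Only the minOccurs='0' variant should accept the empty document. The first two behave the same, which is why I think of the attribute as documentation rather than a change in behavior.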

My edited schema becomes:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="languages">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="1" maxOccurs="unbounded" ref="language"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="language">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="platform"/>
        <xs:element ref="appeared"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="platform" type="xs:string"/>
  <xs:element name="appeared" type="xs:integer"/>
</xs:schema>
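
After hand-editing a generated schema, it seems worth checking that the original document still validates against it. One way to do that, sketched here with the JDK's javax.xml.validation (the ValidateLanguages class and its problems method are hypothetical names of mine, and Files.readString assumes Java 11 or newer), is to collect every validation error rather than stopping at the first:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ValidateLanguages {

    // Returns one message per validation problem; an empty list means valid.
    static List<String> problems(String xsd, String xml) {
        List<String> found = new ArrayList<>();
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
            // Record recoverable errors instead of stopping at the first one.
            validator.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Fatal problems (a broken schema or malformed XML) end the run early.
            found.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumes both files sit in the current directory.
        String xsd = Files.readString(Path.of("languages.xsd"));
        String xml = Files.readString(Path.of("languages.xml"));
        List<String> found = problems(xsd, xml);
        System.out.println(found.isEmpty() ? "valid" : String.join("\n", found));
    }
}
```

With the files from this post in the working directory, I'd expect this to report the document as valid. Changing one of the appeared values to something non-numeric should produce error messages instead.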

Even with this simple example, Trang has saved me a lot of typing. Trang can also create RELAX NG and DTD documents if you need them.
