Saturday, October 3, 2009

XML Schema Validation with a Simple Groovy Script

I'm sure there are many ways to validate XML documents against schemas out there in Java-land. I wanted a relatively simple way to do it from the command line with a Groovy script. Since the script needs a few libraries to do the job, I also decided to run it from Maven: once the dependencies are declared in the POM, I don't have to worry about setting them up by hand.

To start, I generated a simple project, selecting the basic GMaven archetype:

  $ mvn archetype:generate
45: internal -> gmaven-archetype-basic (Groovy basic archetype)
Which is located in the Codehaus repository at:

Next, I edited the POM to include dom4j. Here is the final version:

<project xmlns="" xmlns:xsi="" xsi:schemaLocation="">
<name>Schema Validation</name>

I wrote up this little script, src/main/groovy/Validator.groovy:

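import org.dom4j.io.SAXReader

// First argument: the schema; second argument: the instance document
def (schema, document) = args
def schemaStream = new File(schema).newInputStream()
def documentReader = new File(document).newReader()
SAXReader reader = new SAXReader()
setupSaxReader(reader, schemaStream)
// Parsing the document is what triggers validation; an error surfaces as an exception
reader.read(documentReader)
println "Document valid.\n"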

def setupSaxReader(reader, stream) {
    // Configure the parser features and properties; the last property receives the schema stream
    reader.setFeature("", true)
    reader.setFeature("", true)
    reader.setProperty("", "")
    reader.setProperty("", stream)
}

As you can see, the script takes two arguments: the first is the schema to validate against, and the second is an XML instance document. I copied a test schema from another test project into my project directory, along with an instance document.

I can run the script by first compiling:

  $ mvn compile
Then I run it with the exec plugin in offline mode (-o), which speeds things up: I already have all the jars I need, so there's no need for Maven to check for them:
  $ mvn -o exec:java -Dexec.mainClass=Validator -Dexec.args="order.xsd order.xml"
If the validation succeeds, I'm given the normal BUILD SUCCESSFUL output. If not, the error is presented on the screen, for example:
  Error on line 7 of document  : cvc-complex-type.2.4.b: The content of element 'customer' is not complete.
One of '{country}' is expected. Nested exception: cvc-complex-type.2.4.b:
The content of element 'customer' is not complete.
One of '{country}' is expected.
To finish up: the mvn command is a little hard for me to remember, so I wrapped it in a shell script:
mvn -o exec:java -Dexec.mainClass=Validator -Dexec.args="$1 $2"
Which I can call with:
  $ ./ order.xsd order.xml
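
The one-line wrapper passes its arguments straight through, so a missing or mistyped file name only shows up once Maven has started. A slightly more defensive variant might look like the sketch below. The function name validate and the messages it prints are my own inventions for illustration; only the mvn invocation comes from the script above.

```shell
#!/bin/sh
# Sketch of a more defensive wrapper. The function name "validate" and the
# messages are hypothetical; the mvn command line is the one used above.
validate() {
  # Expect exactly two arguments: the schema and the instance document.
  if [ "$#" -ne 2 ]; then
    echo "usage: validate <schema.xsd> <document.xml>" >&2
    return 2
  fi

  # Fail early if either file is missing or unreadable.
  for f in "$1" "$2"; do
    if [ ! -r "$f" ]; then
      echo "cannot read file: $f" >&2
      return 1
    fi
  done

  # Same invocation as before; Maven's exit status becomes the function's.
  mvn -o exec:java -Dexec.mainClass=Validator "-Dexec.args=$1 $2"
}
```

Adding validate "$@" as the last line turns this into a standalone script. Paths containing spaces are not handled here, since both file names are packed into a single exec.args value.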

This little project has proved pretty useful, as I can edit the schema or the XML instance and run it through rather quickly. I'm sure there are plenty of other methods out there.
