Friday, January 28, 2011

Perl: Retrieve URLs with LWP and LWP::Simple

With Perl there are many ways to make requests over the web. One method is to use the LWP module. Below is an example of using it to grab the contents of a web page:

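use strict;
use warnings;
use Carp;
use LWP;

my $url = 'http://prystash.blogspot.com';
my $contents = get_contents_from($url);

print $contents;

# Fetch the given URL and return the response body, or croak on failure.
sub get_contents_from {
    my ($url) = @_;

    my $agent    = LWP::UserAgent->new;
    my $request  = HTTP::Request->new(GET => $url);
    my $response = $agent->request($request);

    if (!$response->is_success) {
        # Include the HTTP status line so the failure is easier to diagnose.
        croak "Could not get URL '$url': " . $response->status_line;
    }

    return $response->content;
}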
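Another, simpler method is to use the LWP::Simple module:
use strict;
use warnings;
use Carp;
use LWP::Simple;

my $url = 'http://prystash.blogspot.com';
my $contents = get_contents_from($url);

sub get_contents_from {
    my ($url) = @_;
    # get() returns undef on failure; check definedness so that an
    # empty page is not mistaken for an error.
    my $contents = get($url);
    croak "Could not get URL '$url'" unless defined $contents;
    return $contents;
}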

Thursday, January 27, 2011

Removing Files Older than a Certain Number of Days

Using the find command, we can remove files that have not been modified in a certain number of days with the -mtime option:
  -mtime n
      File's data was last modified n*24 hours ago. See the comments for -atime
      to understand how rounding affects the interpretation of file modification times.
To delete files that are older than about 6 months (180 days), we can use:
  $ find . -type f -mtime +180 | xargs rm
Or alternatively:
  $ find . -type f -mtime +180 -exec rm {} \;
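
One caveat with the xargs form is that file names containing spaces or newlines are split into separate arguments. Below is a minimal sketch of a more defensive variant. The helper names list_older_than and remove_older_than are made up for illustration. The first previews what would be deleted, and the second lets find hand the names to rm directly with -exec ... {} +, which also batches many files into few rm calls:

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the original post).

# List regular files under DIR that were last modified more than DAYS days ago.
list_older_than() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" -print
}

# Remove the same set of files. "-exec ... {} +" passes the names to rm
# directly, so a space in a file name does not split it into two arguments
# the way a plain "| xargs rm" pipeline would.
remove_older_than() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" -exec rm -f -- {} +
}
```

Running the listing first is a cheap way to double check the match before anything is removed:
  $ list_older_than . 180
  $ remove_older_than . 180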

Tuesday, January 25, 2011

GMaven: A Couple Early Problems Building a Plugin

My first attempt to build a Maven plugin using GMaven got off to a rough start. I don't want to use this space to complain by any means, but I would like to share what I learned in case anyone else runs into something similar.

My first problem had to do with my use of a newer version of GMaven:

<plugin>
  <groupId>org.codehaus.gmaven</groupId>
  <artifactId>gmaven-plugin</artifactId>
  <version>1.3</version>
  <configuration>
    <providerSelection>1.7</providerSelection>
  </configuration>
  <extensions>true</extensions>
  <inherited>true</inherited>
  <executions>
    <execution>
      <goals>
        <goal>generateStubs</goal>
        <goal>compile</goal>
        <goal>generateTestStubs</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
</plugin>

During the build of the plugin, I was given a deprecation warning stating that no mojo descriptors were found in the project:

[WARNING] Deprecation Alert:
[WARNING] No mojo descriptors were found in this project which has a packaging type of maven-plugin.

I found that the reason for the warning was that the stub generation was not retaining the Javadoc annotations used to mark a Mojo. By downgrading to version 1.2 of GMaven and changing the providerSelection to 1.6, the warning went away.

Next, when trying to use the plugin in another build, I was presented with something like:

This realm = plexus.core
urls[0] = file:/opt/apache-maven-2.2.1/lib/maven-2.2.1-uber.jar
Number of imports: 10
import: org.codehaus.classworlds.Entry@a6c57a42
import: org.codehaus.classworlds.Entry@12f43f3b
import: org.codehaus.classworlds.Entry@20025374
import: org.codehaus.classworlds.Entry@f8e44ca4
import: org.codehaus.classworlds.Entry@92758522
import: org.codehaus.classworlds.Entry@ebf2705b
import: org.codehaus.classworlds.Entry@bb25e54
import: org.codehaus.classworlds.Entry@bece5185
import: org.codehaus.classworlds.Entry@3fee8e37
import: org.codehaus.classworlds.Entry@3fee19d8
-----------------------------------------------------
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Internal error in the plugin manager executing goal 'org.prystasj.plugins:jms-testing:1.0-SNAPSHOT:hello': Unable to find the mojo 'hello' (or one of its required components) in the plugin 'org.prystasj.plugins:jms-testing'
org.codehaus.groovy.runtime.GroovyCategorySupport.getCategoryNameUsage(Ljava/lang/String;)Ljava/util/concurrent/atomic/AtomicInteger;

I found the solution to this problem was to exclude the groovy-all-minimal JAR (version 1.5.7) with:

<dependency>
  <groupId>org.codehaus.groovy.maven</groupId>
  <artifactId>gmaven-mojo</artifactId>
  <version>1.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.codehaus.groovy</groupId>
      <artifactId>groovy-all-minimal</artifactId>
    </exclusion>
  </exclusions>
</dependency>

I also had another, similar error related to the class CallSiteArray, which was alleviated by ensuring I was using Groovy 1.6 everywhere.

Monday, January 10, 2011

Groovy: Sorting a Map by Values

Here's a real quick Groovy snippet demonstrating one way to sort a Map by the values stored in its entries (mostly so I don't forget how to do it):

def map = ["ghi": 6, "abc": 4, "def": 5]
def sortedByValue = map.sort { a, b -> a.value <=> b.value }
println sortedByValue.keySet()
The output from this snippet is:
[abc, def, ghi]
Anyone have any other methods for doing the same thing?

Saturday, January 8, 2011

Using Maven to Publish and Verify Schemas

We use Maven to publish schemas and other documents, like WSDLs, that we would like to share across projects. This makes both schema releases and in-development schemas easy for clients to consume. Using Maven also allows us to easily publish the schema together with example documents as a resource bundle, and to validate the example messages against the schema during the build, so we know the schema and what we expect a message to look like are in sync.

Here I'm going to demonstrate our base project structure and the minimum POM we use for our schema projects. I'll follow that up by adding in additional plugins to make the project more worthwhile.

To start, we'll use a real simple schema, people.xsd, describing a list of people with each person being described with a first name, last name, and his/her age:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="people">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="1" maxOccurs="unbounded" ref="person"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
        <xs:element name="age" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

An example document containing two people:

<people>
  <person>
    <firstName>George</firstName>
    <lastName>Costanza</lastName>
    <age>40</age>
  </person>
  <person>
    <firstName>Cosmo</firstName>
    <lastName>Kramer</lastName>
    <age>42</age>
  </person>
</people>

At the root of the project, we of course have the pom.xml. The schema and example documents are kept in the src/main/resources directory:

pom.xml
src/main/resources/people.xsd
src/main/resources/twoPeople.xml
src/main/resources/onePerson.xml

The POM itself is relatively simple. By default, Maven will look to package a JAR. We have no classes in this project, but the files in the src/main/resources directory will be packaged. Since we want to publish the schema as a separate artifact, we can use the Build Helper plugin to attach the schema to the build for publishing. The location of the schema is defined by the schema property, which we will reuse later:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.prystasj.schemas</groupId>
  <artifactId>people</artifactId>
  <version>1.0-SNAPSHOT</version>
  <name>People Schema</name>
  <properties>
    <schema>src/main/resources/${project.artifactId}.xsd</schema>
  </properties>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <version>1.5</version>
        <executions>
          <execution>
            <phase>package</phase>
            <id>attach-artifacts</id>
            <goals>
              <goal>attach-artifact</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <artifacts>
            <artifact>
              <file>${schema}</file>
              <type>xsd</type>
            </artifact>
          </artifacts>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Now we can run the install phase and see that the schema is installed to our local repository:

$ mvn install
[INFO] Scanning for projects...
...
[INFO] [install:install {execution: default-install}]
[INFO] Installing /home/prystasj/workspace/prystasj/writing/maven-xsd/target/people-1.0-SNAPSHOT.jar to /home/prystasj/.m2/repository/org/prystasj/schemas/people/1.0-SNAPSHOT/people-1.0-SNAPSHOT.jar
[INFO] Installing /home/prystasj/workspace/prystasj/writing/maven-xsd/src/main/schemas/people.xsd to /home/prystasj/.m2/repository/org/prystasj/schemas/people/1.0-SNAPSHOT/people-1.0-SNAPSHOT.xsd
...

The schema and the JAR are installed as separate artifacts in the local repository, and of course we can deploy them for public consumption by running the deploy phase.
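
To confirm where an artifact ended up, the path can be derived from its coordinates. Here is a small sketch that follows the layout visible in the install output above. The helper name local_repo_path is hypothetical, and the repository root is assumed to be the default ~/.m2/repository unless a fifth argument overrides it:

```shell
#!/bin/sh
# Hypothetical helper: print where an installed artifact should land in the
# local repository, following the layout seen in the install output:
#   <repo>/<groupId with dots as slashes>/<artifactId>/<version>/<artifactId>-<version>.<ext>
# The repository root defaults to ~/.m2/repository; pass a fifth argument
# to point somewhere else.
local_repo_path() {
    group=$1
    artifact=$2
    version=$3
    ext=$4
    repo=${5:-$HOME/.m2/repository}
    # Turn the dotted groupId into nested directories.
    group_dir=$(printf '%s' "$group" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s.%s\n' \
        "$repo" "$group_dir" "$artifact" "$version" "$artifact" "$version" "$ext"
}
```

For example, to check that the schema was installed:
  $ ls -l "$(local_repo_path org.prystasj.schemas people 1.0-SNAPSHOT xsd)"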

Since we have a couple of sample messages, we can add a step to the build to validate them against the schema to help insulate us against any inconsistencies if the schema ever changes. To do just that, we'll add an execution of the XML Maven Plugin:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>xml-maven-plugin</artifactId>
  <version>1.0-beta-3</version>
  <executions>
    <execution>
      <goals>
        <goal>validate</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <validationSets>
      <validationSet>
        <systemId>${schema}</systemId>
        <dir>src/main/resources</dir>
        <excludes>
          <exclude>${project.artifactId}.xsd</exclude>
        </excludes>
      </validationSet>
    </validationSets>
  </configuration>
</plugin>

The systemId element defines the location of the schema we want to validate against, relative to the base of the project. Here I'm using the value of the schema property from the original version of the POM. The dir element describes the directory containing the instance documents to validate. I've added an exclude element to ensure the schema itself is not validated as if it were an instance document.

When we run the install phase now, we should see the following build output if the validation succeeds:

[INFO] [surefire:test {execution: default-test}]
[INFO] No tests to run.
[INFO] [xml:validate {execution: default}]
[INFO] [jar:jar {execution: default-jar}]

To make sure things are set up correctly, let's add an instance document that we know should be considered invalid, src/main/resources/invalidPerson.xml, which defines one person without the required age element:

<people>
  <person>
    <firstName>George</firstName>
    <lastName>Costanza</lastName>
  </person>
</people>

Now our build fails, citing:

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] While parsing /home/prystasj/workspace/prystasj/writing/maven-xsd/src/main/resources/invalidPerson.xml, at file:/home/prystasj/workspace/prystasj/writing/maven-xsd/src/main/resources/invalidPerson.xml, line 5, column 12: cvc-complex-type.2.4.b: The content of element 'person' is not complete. One of '{age}' is expected.
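
For a quick check outside the Maven build, the same validation can be approximated from the command line. This is only a sketch, assuming xmllint (part of libxml2) is installed. The helper name validate_examples is made up for illustration and is not something the plugin provides:

```shell
#!/bin/sh
# Hypothetical helper: validate every *.xml file in DIR against SCHEMA
# using xmllint (assumed to be installed; it ships with libxml2).
# Prints one line per invalid document and returns 0 only if all validate.
validate_examples() {
    schema=$1
    dir=$2
    rc=0
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found" >&2
        return 2
    fi
    for doc in "$dir"/*.xml; do
        [ -e "$doc" ] || continue   # no matches: the glob stays literal
        # --noout suppresses the echoed document;
        # --schema validates against a W3C XML Schema file.
        if ! xmllint --noout --schema "$schema" "$doc" >/dev/null 2>&1; then
            echo "INVALID: $doc"
            rc=1
        fi
    done
    return "$rc"
}
```

From the project root it could be run as:
  $ validate_examples src/main/resources/people.xsd src/main/resources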

Finally, we can add the Remote Resources Plugin to create a resource bundle for clients to take advantage of. We do have the JAR that is built, which contains both the schema and example documents and which clients can unpack using the Dependency Plugin, but providing a resource bundle makes things easier, as the contents of the bundle are unpacked as a result of depending on the bundle. An example use case for a client would be ensuring that the messages it creates and sends are valid.

The resource bundle can be created by adding the plugin:

<plugin>
  <artifactId>maven-remote-resources-plugin</artifactId>
  <version>1.1</version>
  <executions>
    <execution>
      <goals>
        <goal>bundle</goal>
      </goals>
      <configuration>
        <includes>
          <include>**/*.xml</include>
          <include>**/*.xsd</include>
        </includes>
      </configuration>
    </execution>
  </executions>
</plugin>

The build will now create a file describing the resources to be included in the bundle:

<?xml version="1.0" encoding="UTF-8"?>
<remoteResourcesBundle xsi:schemaLocation="http://maven.apache.org/plugins/maven-remote-resources-plugin/remote-resources/1.1.0 http://maven.apache.org/plugins/maven-remote-resources-plugin/xsd/remote-resources-1.1.0.xsd"
    xmlns="http://maven.apache.org/plugins/maven-remote-resources-plugin/remote-resources/1.1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <remoteResources>
    <remoteResource>twoPeople.xml</remoteResource>
    <remoteResource>people.xsd</remoteResource>
    <remoteResource>onePerson.xml</remoteResource>
  </remoteResources>
</remoteResourcesBundle>

A client can now depend on the bundle using the same plugin:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-remote-resources-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>process</goal>
      </goals>
      <configuration>
        <resourceBundles>
          <resourceBundle>org.prystasj.schemas:people:1.0-SNAPSHOT</resourceBundle>
        </resourceBundles>
      </configuration>
    </execution>
  </executions>
</plugin>

With the addition of the plugin in the client build, the schema and example documents will be available on the classpath:

target/maven-shared-archive-resources/twoPeople.xml
target/maven-shared-archive-resources/people.xsd
target/maven-shared-archive-resources/onePerson.xml
target/classes/twoPeople.xml
target/classes/people.xsd
target/classes/onePerson.xml

Hope this helps demonstrate some potential uses for Maven that might be a little outside the box, but useful nonetheless.