Tuesday, October 19, 2010

Bash: Learning about Arrays

Last year, I wrote about a little Groovy script to help me validate XML documents against a schema. I ended up with a small wrapper script so I wouldn't have to remember how to run it through Maven. The script takes a schema location and an example instance document as arguments:

  #!/bin/bash
  mvn -o exec:java -Dexec.mainClass=Validator -Dexec.args="$1 $2"

Which I can call with:

  $ ./validate.sh order.xsd order.xml

We store example XML documents along with our internal schemas, and I sometimes find myself running the same script several times, once for each document in the project. A script that loops through the documents in a directory and reports the results seemed like it would also be helpful.

In order to report both the 'good' instances and the 'bad' instances at the end of the script's run, I needed to learn a little about bash arrays.

I found that in bash there are multiple ways to create an array. You can start by simply assigning a value to an element of a not-yet-used array:

  #!/bin/bash
  good[0]="my.xml"
  echo "${good[0]}"

Or you can declare an array using:

  #!/bin/bash
  # no spaces around =, and elements are separated by spaces, not commas
  declare -a good=("my.xml" "your.xml")
  echo "${good[0]}"
  echo "${good[1]}"
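
Once an array exists, a couple of expansions are handy for inspecting it. This is only a small sketch with placeholder file names: ${#good[@]} gives the number of elements, while ${#good[0]} gives the length in characters of the first element, which is easy to mix up.

```shell
#!/bin/bash
# Hypothetical file names, used only for illustration.
declare -a good=("my.xml" "your.xml")

echo "${#good[@]}"   # number of elements in the array: 2
echo "${#good[0]}"   # length in characters of the first element: 6

# Quoting "${good[@]}" keeps each element as one word, even with spaces.
for f in "${good[@]}"; do
  echo "file: $f"
done
```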

In my case, I will be iterating through the files and adding them to the appropriate array, so I won't know the array's contents or size up front. Inside the loop, an if statement first checks whether the array is empty, using ${#bad[@]} to get the number of elements, and if so declares and initializes it. If the array already has elements, the else clause appends one by expanding the existing elements with "${bad[@]}" and adding the new one after them:

  #!/bin/bash
  # $i holds the current file name
  if [ ${#bad[@]} -eq 0 ]; then
    declare -a bad=("$i")
  else
    # elements are separated by spaces; a comma would become part of the value
    bad=("${bad[@]}" "$i")
  fi
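
As an alternative to rebuilding the array on every append, Bash 3.1 and later also has a += operator for arrays, which as far as I can tell makes the empty check unnecessary. A minimal sketch, with made-up file names standing in for the real ones:

```shell
#!/bin/bash
# Sketch: build an array one element at a time with +=.
# The file names are hypothetical placeholders.
bad=()                      # start with an explicitly empty array

for i in "bad1.xml" "bad 2.xml" "bad3.xml"; do
  bad+=("$i")               # append one element; quoting preserves spaces
done

echo "${#bad[@]} failed"    # prints: 3 failed
printf '%s\n' "${bad[@]}"   # one element per line
```

Initializing with bad=() up front also means the array is defined even when nothing is ever appended to it.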

To roll everything up, my new script takes the schema location and a directory as arguments. For every file in the directory, I run the Groovy code against it. If the validation fails, setting $? to 1, I add the file to the 'bad' list. Otherwise, it goes in the 'good' list:

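  #!/bin/bash
  xsd=$1
  dir=$2

  # glob instead of parsing ls output, so odd file names are not split
  for path in "$dir"/*
  do
    i=${path##*/}    # file name without the directory part
    echo -n "$i..."
    mvn -o exec:java -Dexec.mainClass=Validator -Dexec.args="$xsd $dir/$i"

    if [ $? -eq 1 ]; then
      # validation failed: add the file to the 'bad' list
      if [ ${#bad[@]} -eq 0 ]; then
        declare -a bad=("$i")
      else
        bad=("${bad[@]}" "$i")
      fi
    else
      # validation passed: add the file to the 'good' list
      if [ ${#good[@]} -eq 0 ]; then
        declare -a good=("$i")
      else
        good=("${good[@]}" "$i")
      fi
    fi
  done

  echo
  echo "Good files: "
  echo "${good[@]}"
  echo
  echo "Failed files: "
  echo "${bad[@]}"
  echo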

At the end of the script, I simply echo both lists. An example of the output for the report:

  Good files:
  good1.xml good2.xml good3.xml

  Failed files:
  bad1.xml bad2.xml bad3.xml
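
To try the sorting logic without Maven, here is a rough sketch that swaps the validator for a stand-in. The validate function and its naming rule are hypothetical, invented just for this dry run, and are not part of the real script. It also tests the command directly in the if, which treats any non-zero exit status as a failure rather than only 1.

```shell
#!/bin/bash
# Stand-in for the Maven call: a hypothetical rule that "succeeds" (returns 0)
# for names starting with "good" and fails (returns 1) otherwise.
validate() {
  case "$1" in
    good*) return 0 ;;
    *)     return 1 ;;
  esac
}

good=()
bad=()

for i in good1.xml bad1.xml good2.xml; do
  # Testing the command directly treats any non-zero exit status as failure.
  if validate "$i"; then
    good+=("$i")
  else
    bad+=("$i")
  fi
done

echo "Good files: ${good[*]}"     # prints: Good files: good1.xml good2.xml
echo "Failed files: ${bad[*]}"    # prints: Failed files: bad1.xml
```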

I'm sure I could remove some of the duplication, but since I have only two arrays and this is just a helper, I think I'll leave things be for now. If you have any more array advice, please leave a comment!
