Friday, April 15, 2011

Adding Maven Dependencies (for Google App-Engine)

Maven provides a decent system for managing builds involving gnarly dependencies. If you're starting from scratch with an app-engine project, just go here. However, newer developers and clients with whom I work consistently complain that Maven becomes substantially harder to use once an archetype has been chosen and developers then want to absorb another largish development or deployment framework into the same build. Prognosis for static analysis isn't good in these cases either.

Challenges
Though many SpringSource frameworks work well on Google's app-engine without change, I found blog postings on adding GAE support to an existing Maven pom.xml lacking. I had to field emails when "stuff didn't just work". In particular, two nagging problems arose: 1) it was unclear which framework jars builds depended on, and 2) the blog entries providing clean cut-and-paste solutions were all out of date.

No sooner had I shown someone how to augment a Maven POM than Google upgraded their app-engine. Sure enough, I passed them code and it didn't work. They had two versions of GAE on their system, but Maven was building against the old one.
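One way to spot this situation is to look at which versions of an artifact are sitting in the local repository. The following is only a sketch: the function name `installed_versions` is made up for illustration, and it assumes the default repository location of ~/.m2/repository unless a different root is passed in.

```shell
#!/usr/bin/env bash
# List the versions of one artifact present in a Maven repository directory.
# Arguments: <groupId> <artifactId> [repository root]
# (installed_versions is a hypothetical helper, not part of Maven.)
installed_versions() {
  local group_id=$1 artifact_id=$2
  local repo=${3:-$HOME/.m2/repository}
  # Maven stores com.google:appengine-api under com/google/appengine-api/<version>/
  local dir="$repo/$(printf '%s' "$group_id" | tr . /)/$artifact_id"
  local d
  for d in "$dir"/*/; do
    # Only version directories count; metadata files sit beside them.
    if [ -d "$d" ]; then
      basename "$d"
    fi
  done
  return 0
}
```

Running `installed_versions com.google appengine-api` and seeing more than one line is a hint to check which version the POM actually references.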

Another individual fought an issue because they had cut and pasted a scheme from a well-formulated (but broken) blog entry on the topic. Specifically, when the same groupId, artifactId tuple is repeated for multiple files at the same version, Maven silently overwrites the existing repository data with the newly supplied file. This means:

mvn install:install-file -Dfile=${LIB}/appengine-tools-api.jar \
  -DgroupId=com.google \
  -DartifactId=appengine-tools \
  -Dversion=${VERS} \
  -Dpackaging=jar \
  -DgeneratePom=true
 
mvn install:install-file -Dfile=${LIB}/shared/appengine-local-runtime-shared.jar \
  -DgroupId=com.google \
  -DartifactId=appengine-tools \
  -Dversion=${VERS} \
  -Dpackaging=jar \
  -DgeneratePom=true

Does not do what you want it to. Instead of having two jars included in your build as part of the com.google:appengine-tools dependency, you get only the runtime-shared jar, because the second install overwrote the first.
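A way around the overwrite is to give every jar its own artifactId. The sketch below derives the artifactId from the jar's file name; that naming scheme, the helper names `artifact_id_for` and `install_jar`, and the `MVN` override variable are my assumptions for illustration, not what any particular blog entry or SDK prescribes.

```shell
#!/usr/bin/env bash
# Derive a distinct artifactId from a jar's file name, e.g.
# .../shared/appengine-local-runtime-shared.jar -> appengine-local-runtime-shared
artifact_id_for() {
  basename "$1" .jar
}

# Install one jar into the local repository under its own artifactId.
# Arguments: <path to jar> <version>
# MVN is a hypothetical override so the command can be dry-run with MVN=echo.
install_jar() {
  local jar=$1 version=$2
  ${MVN:-mvn} install:install-file -Dfile="$jar" \
    -DgroupId=com.google \
    -DartifactId="$(artifact_id_for "$jar")" \
    -Dversion="$version" \
    -Dpackaging=jar \
    -DgeneratePom=true
}
```

Calling `install_jar "${LIB}/appengine-tools-api.jar" "${VERS}"` and then `install_jar "${LIB}/shared/appengine-local-runtime-shared.jar" "${VERS}"` leaves two separate artifacts in the repository. The POM then needs one dependency entry per artifactId.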

Finally, those beset by the need to do things themselves often need to look up several XSLT syntactic constructs to get correct transforms capable of scraping a POM for all the dependencies that may need to be upgraded, removed, or otherwise tweaked. This is particularly the case because matching a POM dependency requires a multi-element match, governed in the case of my code by parameters passed to the stylesheet.
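To give a flavor of that multi-element match, here is a minimal sketch and not the stylesheet from my utility. It assumes the POM declares the usual http://maven.apache.org/POM/4.0.0 namespace, and the helper names `write_match_stylesheet` and `dependency_versions` are invented for the example. XSLT 1.0 does not allow parameters inside a template's match pattern, so the test on both elements lives in a select expression instead.

```shell
#!/usr/bin/env bash
# Write a stylesheet that prints the version of every dependency whose
# groupId AND artifactId both equal the parameters passed to the sheet.
write_match_stylesheet() {
  cat > "$1" <<'XSL'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pom="http://maven.apache.org/POM/4.0.0">
  <xsl:output method="text"/>
  <xsl:param name="groupId"/>
  <xsl:param name="artifactId"/>
  <xsl:template match="/">
    <xsl:for-each select="//pom:dependency[pom:groupId = $groupId and pom:artifactId = $artifactId]">
      <xsl:value-of select="pom:version"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL
}

# Print the version(s) a POM declares for one groupId, artifactId tuple.
# Arguments: <pom.xml> <groupId> <artifactId>
dependency_versions() {
  local pom=$1 group_id=$2 artifact_id=$3 sheet
  sheet=$(mktemp) || return 1
  write_match_stylesheet "$sheet"
  xsltproc --stringparam groupId "$group_id" \
           --stringparam artifactId "$artifact_id" \
           "$sheet" "$pom"
  local status=$?
  rm -f "$sheet"
  return $status
}
```

For example, `dependency_versions pom.xml com.google appengine-api` prints one line per matching dependency. A POM without the namespace declaration would need the `pom:` prefixes dropped.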

Solution
On the POM side of things, you want to look for scenarios where an old version of a dependency exists, such as below:

<dependency>
 <groupId>com.google</groupId>
 <artifactId>appengine-api</artifactId>
 <version>1.4.2</version>
 <scope>compile</scope>
</dependency>


Note the explicit version number. Managing your dependencies in a parameterized fashion is preferable. Comparing x.y.z-format version numbers is obnoxious, as is chasing down each related dependency in a file to check its version number. For these (among other) reasons, a more maintainable idea is to replace each explicit version with a property reference and change that single property on upgrade.

For instance:

<dependency>
        <groupId>com.google</groupId>
        <artifactId>appengine-api</artifactId>
        <version>${google.app-engine.version}</version>
        <scope>compile</scope>
</dependency>


Followed, of course, by:


  <properties>
    <google.app-engine.version>1.4.3</google.app-engine.version>
  </properties>
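With the version held in a single property, an upgrade can become a one-line edit. The sketch below shows one way to script that with sed. The helper name `set_pom_property` is hypothetical, and it assumes the property's opening and closing tags sit on one line, as in the snippet above, and that the new value contains no `|` or `&` characters.

```shell
#!/usr/bin/env bash
# Rewrite the value of one <properties> entry in a POM.
# Arguments: <pom.xml> <property name> <new value>
# (set_pom_property is a hypothetical helper for illustration.)
set_pom_property() {
  local pom=$1 name=$2 value=$3 pattern
  # Escape the dots in the property name so sed matches them literally.
  pattern=$(printf '%s' "$name" | sed 's/\./\\./g')
  # Write to a temporary file and move it into place, since in-place
  # editing flags differ between GNU and BSD sed.
  sed "s|<${pattern}>[^<]*</${pattern}>|<${name}>${value}</${name}>|" "$pom" > "$pom.tmp" \
    && mv "$pom.tmp" "$pom"
}
```

For example, `set_pom_property pom.xml google.app-engine.version 1.4.4` changes the property and leaves every `${google.app-engine.version}` reference untouched.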


Since I'm building kit to help security folk work with and inject secure snippets into existing development projects, I decided to build a utility to help folk get things up and running on GAE too (the average familiarity with Maven in security isn't as high as it typically is among developers) by doing the above. The utility I wrote to handle these situations does the following:
  1. Parses an existing pom.xml looking for existing outdated or half-working dependencies
  2. Iterates through specified dependencies, modifying the POM
    • Adds unmentioned groupId, artifactId tuples
    • Modifies existing groupId, artifactId tuples
    • Comments out replaced collisions for later inspection
    • Points out collisions for further inspection
    • Controls version references using properties, which are also added/modified as necessary
  3. Installs listed dependency files in the user's local maven repository
Ideally, the utility would also do proper version parsing, comparison, and collision detection, but that would take real time. The script supplied here can be used with any dependency, not just Google's app-engine. In fact, if your project already uses a different archetype, you can use it to manage any dependency you expect to change regularly. The prerequisites show how.
    Pre-Requisites
    Pre-requisites are part of the problem here: a snag people run into when they're new to build configuration has always been that they're missing some tool they need to get things compiling. This was true of gmake and autoconf, and in my experience it's true of Maven builds nearly as often. So, when I wrote this utility, I purposefully used only bash, mvn, xsltproc, javac, and sed. All of these utilities are available on Linux and OS X out of the box.

    When gearing up to use inject-gae-dependency, simply:
    • Use an existing pom.xml
    • Modify the list of dependencies (by editing the script's enumerations)
    • Download Google's app-engine SDK, or whichever other dependency is to be managed
    • Optionally salt-and-pepper to taste
    Running
    Using the tool is straightforward:

    
    ./bin/install-google-appengine-dependencies.sh <path to Google app-engine Java SDK>

    Status is printed to STDOUT, and Maven's logging goes to ./mvn.log.

    Download
    Download the current code base from its Google Code repository, or use SVN to do a checkout using the URL:

    svn checkout http://code-poetic.googlecode.com/svn/trunk/mvn-inject-gae mvn-inject-gae-read-only