Tag: build process
Pimp My Build
by Chris Wash on Jul.21, 2008, under Software Engineering
In terms of the sheer number of moving parts, extravagance, and complexity, the build-tool landscape is one of bells and whistles that can be a little tough to stay on top of. There are a few important pieces of a robust and valuable build, no matter what tools you use to achieve it, and by breaking the build into modules or categories, we can make a little bit of sense out of the chaos.
In our vast build-tool landscape, there tend to be some zealous cults out there, entrenched deeply in their own camps and unwilling to venture out to explore what the rest of the landscape has to offer. Thus, there have been some almost religious differences of opinion about which build tools are better and why. Sadly, you run across these types of biases all too often, which only pollutes the landscape and makes it more difficult to wrap your head around the most basic questions underlying all tools:
- What does it accomplish?
- When is it (or isn’t it) right for the job?
- How do I use it effectively and efficiently?
With a pragmatic view, we can try to cut through a lot of the BS and figure out if a tool is worth its weight in:
- dependencies/design complexity.
- ease of use/interface complexity.
- architectural decisions/compromises that it brings to the table.
Does the tool get us to the goal of a single-button build/deploy? If the answer is yes, I’ll welcome it with open arms. Does it give me a form of feedback or some kind of statistic that could serve as a gauge on a dashboard? Then I may be interested in it, but it’s not nearly as vital.
Here’s an outline of what I typically look to include in a project’s build before I am satisfied:
- Compile – Obviously your build script would be for naught if you didn’t include this piece. There’s also something to be said about some dynamic/interpreted languages (that run on the JVM) taking the hit on this step and producing compiled versions. The argument is akin to compiling your JSPs back in the day.
- Unit test suite – Another one of those no-brainers! You’ll obviously want to integrate your ability to run a single test or test suite from a build tool quickly and easily.
- Test reports – Test reports are an important way of visualizing the health of your project and are usually the first place you’ll go when you see something wrong. A must have.
- Transitive dependency management – This is one of those sticky subjects that people like to argue about religiously. The important thing to note here is that a crucial part of keeping your build clean and manageable is figuring out how to represent and resolve transitive dependencies, or second-level dependencies. If your code directly depends on a library, odds are that library has its own run-time dependencies that need to be satisfied in order for it to work properly. Since managing the different versions of many different libraries can become tough when you start to look at different “stacks” of software, many people advocate having a tool help you resolve these dependencies so it’s not as daunting a task to get the entire stack to work again when you need to upgrade a single library. Maven and Ivy solve this problem by introducing repositories that house the libraries and dynamically resolve these versioning issues, in much the same manner as yum or apt-get does in the Linux world. A lot of newbs will make the mistake of thinking this type of functionality will let them turn off their brain when it comes to thinking about their project’s dependencies. They will be sorely mistaken! The point of these tools is to make resolving conflicts in transitive dependencies easier. They do not replace your brain, nor your obligation to know the dependencies in your project and how they fit together!
- Utility Ant tasks – The daunting part of learning how to write good builds has a bit to do with how much you know about the Ant task landscape. Nearly all libraries have tools that plug into Ant, but which ones do you really need to know about? I’ve found a few are quite powerful, particularly replace, uptodate, cvstagdiff (and the corresponding svn equivalents), xmltask and dbdeploy. I think these tools should be looked at when you are trying to figure out how to accomplish something specific, and for that reason, they’re a bit more like “bells and whistles” than anything else. The same goes for the next category, which arguably doesn’t even need to live in your build (but if you’re a team player, it’s a definite nice-to-have):
- Static analysis – There has been a barrage of slick static-analysis tools entering the Java landscape recently. A few are FindBugs, PMD and CheckStyle. While most of these tools have plugins that integrate nicely into an IDE, it’s also helpful to have them plugged into your build, just to make sure everyone is following the rules. A few others I’ve seen that focus on cyclomatic complexity and cohesion/coupling are Metrics and JDepend.
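To make the transitive dependency item in the list above a bit more concrete, here is a minimal Python sketch of the core job a resolver has to do: walk the graph of declared dependencies and flag any library that is requested at more than one version. The library names, versions and the `DECLARED` table are invented for illustration. Maven and Ivy use their own metadata formats and conflict-resolution rules, which this sketch does not try to reproduce.

```python
from collections import defaultdict

# Hypothetical metadata: (library, version) -> its direct dependencies.
DECLARED = {
    ("app", "1.0"): [("web-framework", "2.5"), ("orm", "3.2")],
    ("web-framework", "2.5"): [("logging", "1.1")],
    ("orm", "3.2"): [("logging", "1.0"), ("collections", "3.1")],
    ("logging", "1.1"): [],
    ("logging", "1.0"): [],
    ("collections", "3.1"): [],
}


def resolve(root):
    """Return every (library, version) reachable from root, excluding root."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for dep in DECLARED.get(node, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def find_conflicts(resolved):
    """Map each library requested at more than one version to those versions."""
    versions = defaultdict(set)
    for name, version in resolved:
        versions[name].add(version)
    return {name: sorted(v) for name, v in versions.items() if len(v) > 1}


resolved = resolve(("app", "1.0"))
conflicts = find_conflicts(resolved)
print(sorted(resolved))
print(conflicts)
```

In this made-up graph, `logging` is pulled in at two different versions through two different paths. A tool can surface that for you, but deciding which version the whole stack should actually run on is still your job.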
What build tools fit into your pimped out script?
Continuous Integration Dissected
by Chris Wash on Mar.13, 2008, under Software Engineering
Setting the Record Straight
A lot gets written about Continuous Integration, particularly on which is the best visual cue to let you know your build is broken or that a test is failing: lava lamps, Beta Brights, Ambient Orbs, and some even suggest traffic lights. But aside from this extraneous (at least to business) nerd-banter, a lot of what I find written about the actual topic of CI is fluffy, ivory-tower, or pie-in-the-sky jibber-jabber that leaves out important parts of the big picture or confuses people more than it helps. In hopes of clearing up confusion on what exactly CI is and how it’s supposed to work, I’m ripping out a description that I wrote for a client proposal recently (so my apologies for the dry tone). I hope it sheds some light on the true nature of CI, why it’s important and how to implement it from a bird’s-eye point of view.

Continuous Integration Dissected
Any large-scale development project needs an automated, repeatable build process. A build process developed according to best practices properly separates environment-specific configuration concerns from the codebase. This allows new environments to be created quickly and easily by simply overriding any environment-specific configuration values when first executing the build process. Whatever build tool is being used, builds should share a common, consistent process and interface. A consistent, repeatable build knows all of its dependencies; the goal is to be able to build any given module anywhere, independently, at any time.
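As a rough illustration of separating environment-specific configuration from the codebase, the following Python sketch layers a small set of overrides on top of shared defaults. The property names, values and the `effective_config` helper are all hypothetical. In an Ant or Maven build the same idea is usually expressed with property files rather than code.

```python
# Hypothetical defaults that would live alongside the codebase.
DEFAULTS = {
    "db.url": "jdbc:hsqldb:mem:dev",
    "db.user": "sa",
    "deploy.dir": "build/deploy",
}


def effective_config(defaults, overrides):
    """Layer environment-specific overrides on top of shared defaults."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Fail fast on typos rather than silently ignoring a misspelled key.
        raise KeyError(f"unknown configuration keys: {sorted(unknown)}")
    merged = dict(defaults)  # copy, so the shared defaults are never mutated
    merged.update(overrides)
    return merged


# Hypothetical overrides for a new "staging" environment.
staging = effective_config(
    DEFAULTS,
    {"db.url": "jdbc:postgresql://staging-db/app", "db.user": "app"},
)
print(staging["db.url"])
print(staging["deploy.dir"])
```

Only the values that differ per environment need to be supplied. Everything else falls through to the defaults, which is what makes standing up a new environment cheap.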
Automated, repeatable build processes typically begin by obtaining dependencies (which can be specified using a dependency management tool) and a specific working copy of the codebase (“checking out”) from an SCM system like CVS or Subversion. It is important to note that this codebase includes any code that is responsible for performing automated testing in addition to source code and configuration (and possibly other source-like artifacts).
Once the checkout has completed, the process will compile code and run automated unit test suites for each module in the system. At this point, all automated unit tests should pass, and custom development can begin. Any changes to code must be adequately covered by unit tests (either by changing existing tests or creating new ones), must fully compile without any errors and pass all automated unit test suites before being committed to the repository. The practice of always keeping the code committed to the SCM repository in this state (no compilation or unit test errors) is known as Continuous Integration and ensures that new development is safe to proceed at any point without fear of integration errors.
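The commit rule described above amounts to a simple gate: run the build steps in order and refuse to commit at the first failure. Here is a minimal Python sketch of that idea. The step functions are hypothetical stand-ins that always succeed; a real build would invoke the compiler and the test runner through whatever build tool the project uses.

```python
def run_gate(steps):
    """Run (name, callable) steps in order; stop at the first failure.

    Each callable returns True on success.
    Returns (ok, name_of_failed_step_or_None).
    """
    for name, step in steps:
        if not step():
            return False, name
    return True, None


# Hypothetical stand-ins for the real compile and unit test steps.
def compile_code():
    return True


def run_unit_tests():
    return True


GATE = [("compile", compile_code), ("unit tests", run_unit_tests)]

ok, failed = run_gate(GATE)
print("safe to commit" if ok else f"do not commit: {failed} failed")
```

Keeping the gate as an ordered list makes it easy for every developer to run exactly the same checks locally before committing.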
Subversion (and CVS) support concurrent development by following a Copy-Modify-Merge paradigm; any contention over files is usually caught when a developer tries to commit their changes and notices the underlying files have changed since they obtained their copy. In many cases, Subversion is capable of performing a merge automatically, if there was no contention over the same piece of a file. Sometimes, however, a manual merge will be required. Merging becomes more painful as the number of differences in the conflicting files increases. A good rule of thumb is that every developer should commit their changes at least daily.
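To show why a merge can often be automatic, and when it cannot, here is a deliberately simplified three-way merge in Python. It assumes both developers only edited lines in place, so all three versions have the same number of lines. Real version control tools use diff-based algorithms that also cope with inserted and deleted lines, so treat the `merge_lines` helper and the sample file contents as illustrative assumptions only.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of equal-length line lists.

    Returns (merged, conflicts), where conflicts lists the indexes of
    lines that both sides changed in different ways.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both sides agree (untouched or changed identically)
            merged.append(m)
        elif m == b:      # only the other developer changed this line
            merged.append(t)
        elif t == b:      # only I changed this line
            merged.append(m)
        else:             # both changed it differently: manual merge needed
            merged.append(m)  # keep my version as a placeholder
            conflicts.append(i)
    return merged, conflicts


base = ["timeout=30", "retries=3", "debug=false"]
mine = ["timeout=60", "retries=3", "debug=false"]
theirs = ["timeout=30", "retries=5", "debug=false"]

merged, conflicts = merge_lines(base, mine, theirs)
print(merged)
print(conflicts)

# Now both developers touch the same line in different ways.
_, clash = merge_lines(base, mine, ["timeout=45", "retries=3", "debug=false"])
print(clash)
```

The first merge succeeds without help because the two edits touch different lines. The second reports a conflict on the first line, which is the case that needs a human. The longer developers wait between commits, the more lines fall into that second category.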
Designating a single machine as a Continuous Integration (CI) environment provides many added benefits to a large scale development project. There are many operations which are good candidates to have run “continuously” but quite often are too expensive for developers to perform before every commit. Examples include executing automated in-browser system tests (which, if maintained over multiple releases, can serve as a “mini” regression test suite), performance tests/profiling, producing test metrics, generating documentation, etc. CI servers are an ideal place to schedule these processes to occur in an automated fashion.
