The Holy Grail of software development


English is an interesting language. It has lots of rules, and over 170,000 words in common usage. That seems like a lot, until you realize that there are even more concepts that are described using multiple words. Those words are arranged and rearranged to create millions upon millions of books.

One of the things I find most interesting about English is that it's a very idiomatic language. There are many phrases that, if you pick them apart into their individual words, do not carry the same meaning as when they are used together. "In her element," "red herring," and "chip on your shoulder" are just a few examples.

These idioms allow you to deliver a significant amount of meaning in a small number of words. Each one is a conceptual shortcut. It's natural for people to want to communicate as much information in as small a space as possible because, most of the time, there is more to communicate than there is time to communicate it.

In that respect, programming is very similar. We can think far faster than we can generate code. Therefore, the less we have to write and the more meaning and function we can deliver per line, the more we'll be able to accomplish.

In the Perl world, this value is commonly referred to as DRY, or Don't Repeat Yourself. The concept behind DRY is that we all know there are activities our programs must do over and over, but that doesn't mean we should have to write them over and over. Also in the Perl world, there is a temple for the worship of this value. It's called CPAN.

CPAN is really the Holy Grail of software development. CPAN represents the work of thousands of developers working together to harness DRY and to refine their day-to-day development down to writing the most meaningful lines possible.

After you work with CPAN for a while, it can be tempting to pick at it, to get sidetracked thinking about the occasional issue or problem you encounter. There have been a lot of posts lately about CPAN, long dependency chains, test failures and such. There are some legitimate concerns here, but for the most part, the issues are relatively minor.

The truth is that CPAN is the envy of all the other languages for a reason. The problems CPAN has are the kinds of problems that are inevitable when your software is deployed on literally thousands of different combinations of system libraries and operating systems.

For those who don't know what CPAN is, and for those who do but have lost sight of how amazing it really is, let's take a look at what CPAN is and what you really get by using it.

CPAN is a collection of free bits of functionality that you can include into any project.

Modern software is complicated. Let's take a web application. There are literally thousands of things that have to happen in any given web application, from HTTP request parsing to form validation to data storage to integrating dynamic data into HTML (a.k.a. templating).

CPAN gives you the option to pick the functionality rather than create the functionality. There are choices to be made, but choosing a lamp for your bedside table generally requires a lot less effort than making a lamp for your bedside table.

CPAN modules are generally high quality.

Let's think about this for a moment because it's easy to miss this fact. CPAN modules are not the same as some code you found on some web forum. In order to make it to CPAN, the module author has to be registered with CPAN. They have to have refactored the code to a point where it is relatively self-contained and not tied to any application where it may have originated. It also has to have had a module build suite created for it that is compatible with CPAN. Finally it has to have at least one test.

This is a fair amount of effort to put in just to share code you already wrote. It shows a level of commitment by the author and does a fair job of forcing them to answer the question 'is this code even worth releasing?' When you are looking at a module on CPAN, you can pretty much rely on the fact that it does what it says.

CPAN modules have a common build and distribution system.

Once again, let's think about this. If you have done any software development in other languages, or even just on Linux, you know that each library or framework you might want to use has, if you're lucky, its own web site. To start a project that includes it, you have to go find the tar or zip file, download it, and read its INSTALL instructions. If you are lucky, the INSTALL file lists the libraries that it depends on so you can go get them and install them first. Once they are installed, you can run your original library's configure and build process according to whatever rules are described in the INSTALL file (which often differ from library to library). If you are very, very lucky, you can install your OS repository's binary package of the library, but even then, you often wind up with an older version.

Let's compare this to CPAN. First, all the CPAN modules are listed at search.cpan.org, which has a functional search to help you find the module you want. Once you've decided, you enter a single command on your machine: cpan TheModule::Name. It's always the same command. The cpan program then goes and retrieves the most up-to-date version of the module, automatically determines its dependencies, and retrieves and installs them, all the while performing tests to make sure they function properly on your machine.

Let me repeat that in clearer terms: You type one command and walk away for a coffee. When you return, it's a safe bet that your module is installed and has been verified to work on your system.
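As a rough sketch of that workflow, the shell snippet below wraps the one command described above and then asks perl which version of the module it can load. The function names install_module and module_version are invented for this illustration, and Try::Tiny is only an arbitrary example module; the cpan client and the perl -M switch are the real pieces.

```shell
# Hypothetical helper: install one module with the stock cpan client.
# The client fetches the distribution, resolves its dependencies,
# and runs each distribution's tests before installing it.
install_module() {
    cpan "$1"
}

# Hypothetical helper: load the named module and print its version.
# Fails (non-zero exit status) if perl cannot load the module.
module_version() {
    perl -M"$1" -e 'print $ARGV[0]->VERSION, "\n"' "$1"
}

# Example usage (not run here, since installing needs network access):
#   install_module Try::Tiny && module_version Try::Tiny
```

Keeping the install and the check as separate steps means the second one can be rerun at any time, for instance on a freshly provisioned machine.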

Which brings us to....

CPAN modules are tested.

Every module that goes up on CPAN is tested. I don't mean it's been tested by the author. It is automatically tested on hundreds of machines running hundreds of different combinations of Perl version, OS and OS version. The results of those tests are public and linked directly from the CPAN distribution page for the module.

The vast majority of CPAN modules pass all tests on all platforms. Anyone who has written code for a living knows how hard it can be to get software to work on systems that are significantly different from the one the code was written on. CPAN modules are expected to work on all the platforms tested by the CPAN testing service. As a CPAN author, if your module fails on some platform, the CPAN testing service will email you to tell you so.

Which is a long way of saying that you can rely on the fact that the code works.

CPAN modules are tested on your system.

When you install a library from a binary package system such as YUM or the various Debian package repositories, you are operating on faith that those libraries will work with your machine. To the repository maintainers' credit, they usually do. However, if you have used or managed a Linux machine (or any OS, really), you know that your faith in that system is balanced on a very thin wire. If you manually update anything on your system, or you use a package from a different repository, you are taking your chances as to whether you will end up with hundreds of broken packages on your system.

With CPAN, each module is tested on your machine before it is installed. Let's think about this again. When cpan is finished installing your module, you can be confident that it will work and you can move on to the next thing. This is a BIG DEAL. It's a big deal because it means that if you see a bug or failure in the way your code is working, you can pretty much rely on the fact that the bug is in your own code and you don't have to go digging through library code to find the cause. (Yes, there is always the possibility of an edge case the module doesn't test for, but that's a relatively rare occurrence.)
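For the curious, what the cpan client does for each distribution is roughly the classic manual sequence below. This is a sketch that assumes an ExtUtils::MakeMaker-style distribution (one that ships a Makefile.PL) already unpacked in the current directory. The name build_test_install is made up here, and distributions built with Module::Build ship a Build.PL instead.

```shell
# Hypothetical helper: build, test, and install the distribution
# unpacked in the current directory. Each step runs only if the
# previous one succeeded, so a failing test suite blocks the install.
build_test_install() {
    perl Makefile.PL &&   # generate a Makefile for this machine's perl
    make &&               # build the module
    make test &&          # run the distribution's test suite locally
    make install          # reached only if every earlier step succeeded
}
```

The chaining with && is the whole point: installation is the last link, so nothing lands on the system unless the tests passed on that system first.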

CPAN lets you have confidence in your deployments.

CPAN's testing process means that you can rely on the fact that when your code is deployed to a new server, you will know of issues BEFORE you try to turn your application on. The cpan installation utility will not install a module if ANY of its tests fail, so you, or your system admin, are forced to resolve any issues before your code is deployed.

This eliminates a HUGE source of potential problems at launch time. There is nothing worse than spending months on a project only to fail on deployment because of a critical missing library or server configuration problem. CPAN goes a long way to prevent that from happening by making sure that all the pieces of code that your software stands on are solid.
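One way to act on this at deployment time is a pre-flight check that refuses to continue when a required module cannot be loaded. The sketch below is an assumption about how you might wire that up, not something CPAN provides; preflight is a hypothetical name and the module list in the usage line is only an example.

```shell
# Hypothetical helper: try to load every module named on the command
# line, report each one that fails, and return non-zero if any did.
preflight() {
    missing=0
    for module in "$@"; do
        if ! perl -M"$module" -e 1 2>/dev/null; then
            echo "cannot load: $module" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage near the top of a deploy script:
#   preflight DBI Template JSON::PP || exit 1
```

Reporting every missing module in one pass, rather than stopping at the first, saves a round trip for each problem when a new server is being set up.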

The Short Version™

Above are just a few of the reasons that CPAN is great.

I encourage all Perl programmers to take a moment to realize that you are extremely fortunate to have such a fantastic tool available to you. We all need to realize that there will always be issues in software this complex, and the ones we have are relatively minor. Yes, we test during installation. I think that's great. It gives us a level of comfort in our deployments that few others get to have. Yes, some of our modules require many others. But complaining about this is akin to complaining that our dictionary uses too many other words to describe the meaning of a word.

These issues are real; they exist. Dwelling on them, however, is a line of thinking that somewhat misses the point. It's somewhat akin to saying, "yeah, my free car is no good because when I drive it off of a cliff, the doors and trunk pop open, and I have to get out and close them before I can keep driving." The point is that you get it for free, it works REALLY well and even when you drive it off the cliff and crash horribly, taking a moment to close the trunk and the doors is usually enough to rectify the issue.

I do not mean to disregard that there are areas where CPAN in general could be improved. I do, however, mean to remind you that when you're holding the Holy Grail, it's somewhat shortsighted to say 'This cup sucks.'


8 Comments


Great post, well written.

In my opinion the two most under-rated tools in Perl are (indeed) CPAN and (don't forget) the amazing Perl debugger.

I recently purchased and sawed through the book 'Writing Perl Modules for CPAN' by Sam Tregar, and although it is slightly out of date I would recommend it highly for an enjoyable read.

I agree, Sam's book is great. It is a bit dated now, especially since the advent of Moose. It is also available free from Apress as a PDF.

Tests are not mandatory. There is no requirement for even one test.

Let's call it a convention then. I've never seen a CPAN module without a single test.

Well, unfortunately Sam's book is no longer available as an ebook (I even contacted the fine folks at Apress and they confirmed it). But if you're lucky like me, you'll happen to find an edition on some obscure back shelf of a bookstore.

Is that trait specific to English? I would have guessed it holds true for (almost?) every human language.

It does at least for the ones I have been in contact with (Danish, Swedish, German, French.)

It's definitely unusual to have no tests. I once uploaded a dist with no tests and started getting UNKNOWN results... so I added a compiles-ok test. Shame on me...
