Getting Bioinformatics Done: Creating and maintaining Perl command-line scripts

In bioinformatics, 90% of Perl scripts are command-line scripts, and they have to be written fast (because our bosses want results by the end of the day). Of course, Perl is perfect for this kind of "quick and dirty" script, and you may think it is fine to write a messy script that will be used just once, right?

But the problem with "quick and dirty" scripts is that they usually live longer than one run. Sometimes one becomes part of your mainstream pipeline, and maintaining that script is a pain.
So how can we avoid unmaintainable command-line scripts?

The answer is: CREATE A HABIT.

So you should always start a new script assuming it will become a mainstream script. The problem with mainstream code is that we spend time thinking about how to create a good code structure that will be extensible in the future. This is the main reason we always prefer to create a "quick and dirty" script.

Instead of wasting time thinking about a good structure for your script, why not use a framework for command-line scripts? That's the idea behind the App::Cmd module.

This module lets you create toolkit scripts. For example, instead of creating two scripts called 'create_dna_sequence.pl' and 'create_protein_sequence.pl', we could create a single toolkit called 'create_sequence.pl' and pass the command to run after the script name.
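As a rough sketch of such a toolkit, following the conventions described in the App::Cmd documentation: an application class, one class per command, and a tiny launcher script. The names CreateSequence and dna and the --size option are made up for illustration, and the listing shows three separate files rather than one script.

```perl
# --- file: lib/CreateSequence.pm (hypothetical application class) ---
package CreateSequence;
use strict;
use warnings;
use App::Cmd::Setup -app;    # turns this package into an App::Cmd application
1;

# --- file: lib/CreateSequence/Command/dna.pm (one class per command) ---
package CreateSequence::Command::dna;
use strict;
use warnings;
use CreateSequence -command; # makes this class a command of the application

# One-line summary shown by the built-in "commands" listing.
sub abstract { 'print a random DNA sequence' }

# Options, in Getopt::Long::Descriptive format.
sub opt_spec {
    return ( [ 'size|s=i', 'sequence length in bases', { default => 60 } ] );
}

# Reject bad input before execute() runs.
sub validate_args {
    my ( $self, $opt, $args ) = @_;
    $self->usage_error('--size must be a positive integer')
        unless $opt->size > 0;
}

# The actual work: pick random bases and print them on one line.
sub execute {
    my ( $self, $opt, $args ) = @_;
    my @bases = qw(A C G T);
    print join( '', map { $bases[ rand @bases ] } 1 .. $opt->size ), "\n";
}
1;

# --- file: create_sequence.pl (the launcher script) ---
use strict;
use warnings;
use lib 'lib';
use CreateSequence;
CreateSequence->run;
```

With those files in place, 'perl create_sequence.pl dna --size 12' should print 12 random bases and 'perl create_sequence.pl commands' should list the available commands. A protein command would be one more file, lib/CreateSequence/Command/protein.pm, with no changes to the launcher.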

But maybe you prefer the Modern Perl style. Don't worry: the MooseX::App::Cmd module marries App::Cmd with MooseX::Getopt, so you can define command-line options as Moose attributes.
I always use Moose and MooseX::Declare, so a simple command-line script like the previous one becomes a few class declarations whose attributes are the options.
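A minimal sketch of what such a script might look like, assuming plain Moose syntax instead of MooseX::Declare and following the pattern from the MooseX::App::Cmd documentation. CreateSequence, protein and size are hypothetical names, and the listing again shows three separate files.

```perl
# --- file: lib/CreateSequence.pm (hypothetical application class) ---
package CreateSequence;
use Moose;
extends 'MooseX::App::Cmd';
1;

# --- file: lib/CreateSequence/Command/protein.pm ---
package CreateSequence::Command::protein;
use Moose;
extends 'MooseX::App::Cmd::Command';

# The attribute becomes the --size option; the Getopt trait is what
# allows the short alias -s via cmd_aliases.
has size => (
    traits        => ['Getopt'],
    is            => 'ro',
    isa           => 'Int',
    default       => 60,
    cmd_aliases   => 's',
    documentation => 'sequence length in residues',
);

# One-line summary shown by the built-in "commands" listing.
sub abstract { 'print a random protein sequence' }

# Option values are read from the object itself instead of from $opt.
sub execute {
    my ( $self, $opt, $args ) = @_;
    my @residues = split //, 'ACDEFGHIKLMNPQRSTVWY';    # 20 standard amino acids
    print join( '', map { $residues[ rand @residues ] } 1 .. $self->size ), "\n";
}
1;

# --- file: create_sequence.pl (the launcher script) ---
use strict;
use warnings;
use lib 'lib';
use CreateSequence;
CreateSequence->run;
```

The main difference from the plain App::Cmd version is that there is no opt_spec: the type constraint, default and help text all live on the attribute, and execute reads the value through the accessor.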

Really simple, right? And we can always extend the script by adding more functions.

When I create an empty file with a '.pl' extension in Vim, it always loads a template with this kind of structure. So even for simple scripts I start with a well-structured script that is easy to maintain.

Wednesday, November 9, 2011
