Friday, September 16, 2005

What if Garrison Keillor did bioinformatics?

Okay - I wrote this a few years ago and some of the issues have sorted themselves out, but not so many as one might think.

What is bioinformatics? A biologist's perspective.

Imagine this. You've been sequencing DNA for a few years now, perhaps ESTs, or something else, and storing files on your local network. Your system administrator makes backup files for you and all is well.

One day you learn about interesting results from assembling sequence data and decide to try it yourself.

Watch out! You are about to descend into bioinformatics hell.

Soon you learn that the assembly program has complicated requirements and demands that all files entering the system be given an incomprehensible name to comply with sequencing procedures from the last decade.

You beg someone to do something with the computer and rename your files. Meanwhile, the backup files with the original names, which were referenced in experimental procedures and linked to experimental data, languish on the system, forgotten. A few months later, no one knows why those files are there. Your new files with their new names are backed up. More new files enter the system and quickly acquire two sets of names. More months pass, the server is loaded down with files, and no one knows why.

Your department head, frustrated with the slow network, hires an expert to analyze the system and determine if you need a Linux cluster. Oops, it turns out that many files contain the same information. Naturally, the older files are deleted. Now all information connecting the files to the original experiments is lost.

Your lab director says to quit fooling around and hires someone to move all of your data into a database. But the next few weeks find you ranting at your computer. Why? You don't know how to use SQL and you have important research to do, dammit! The last thing you want to do is fight with your computer to get it to tell you something you don't already know. And, you start to wonder, what exactly is in those tables? And why tables? And how are you going to get your data back and do something useful with it?

Perhaps, you decide, it's time to hire a programmer.

The first person you interview is very enthusiastic. You ask about programming experience. Apparently, he can program in more languages than a UN interpreter can speak. And he's especially excited about some language called "open source" and some snake language. Confused already, you ask what he's done. It turns out that he's written games and designed something sticky or gooey (you think) and knows lots about cold fusion. You're a little worried about using gooey stuff around your computer and puzzled by the remark about cold fusion (especially since it was a fraud), but you smile and nod, not wanting to betray your ignorance.

Time to switch to your domain.

Do you know anything about biology? you ask. The candidate smiles. Oh yes! He took biology in high school and read "Genome", too!

You hire him, pay him twice the salary of any of the post-docs, and have him start with something simple. You ask him to write a program to translate DNA into open reading frames. You're met with a blank stare. Is there a problem, you ask? What's an open reading frame? is the reply.

To quote Garrison Keillor, "Wouldn't this be a great time for a slice of rhubarb pie?"

