BLASTing through the kingdom of life

This popular activity, designed to accompany the BLAST for beginners tutorial, has been updated to incorporate student comments and teacher requests. Originally developed for the BIO 99 teacher workshop, this activity has been one of the most popular items on Geospiza's web site. We have seen it used in venues ranging from high school courses to workshops for researchers at the Lawrence Livermore National Laboratory.

Students BLAST through the kingdom of life by using blastn to identify 16 "unknown" sequences. The 16 sequences were chosen to represent diverse organisms, from RNA viruses that infect yeast to humans. To minimize confusion, the set was compiled from a mixture of cDNA sequences and intronless sequences from bacteria or viruses. Further, every sequence in the set codes for a protein that students might recognize, such as amylase (an enzyme found in spit that breaks down starch) or DNA polymerase (which makes DNA). In this update, I added a new example sequence along with answers. Ideally, the strategies I used to answer the questions for this sequence will serve as a good example for students completing the activity.
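For readers who want to handle the sequence set programmatically before pasting each sequence into blastn, here is a minimal sketch. It assumes the data set is distributed as FASTA-formatted text, which is not stated above, and the function name parse_fasta, the record names, and the bases in the example are all made up for illustration.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence} for FASTA-formatted lines."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A header line starts a new record.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines are joined and upper-cased.
            records[header] += line.upper()
    return records


# Hypothetical two-record data set (not the real activity sequences).
example = """>unknown_1
ATGGCGTTA
GGCTAA
>unknown_2
atgaaacgt
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    # Print each name with its length, one record per line.
    print(name, len(seq))
```

Each value in records is a single unbroken sequence string that can be pasted into the blastn query box one at a time.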

The data set, worksheet, and answer key for this activity are all available on-line at:

Unlike "canned" activities, this one has students use real sequences and real databases. Since new information is continually added to the databases, the answers to the questions may change, too. I once saw the contents of a database change between the beginning and the end of a three-hour workshop. On one hand, this can be disconcerting when it's unexpected. On the other hand, knowing that these are living and changing resources is exciting. Students know that when they use these resources and programs, they're not using old or simplified techniques that are only employed in a classroom setting.

An unfortunate consequence is that grading gets a bit more challenging. The continual addition of information to the NCBI databases used in this activity means that some information that's unknown today might be known tomorrow. The majority of the answers in our key will not change, but new information might be added. Our current plan is to update the answers on a yearly basis to incorporate new information.

