Monday, February 27, 2006

"I hear the cottonwoods whispering above.."

Some people say that science takes the magic out of everyday life.

Not me!

I've learned some things by reading Science (1) that might give nightmares to some people, especially young children.

Remember that scene in "The Wizard of Oz" when the trees start hurling apples at poor Dorothy?

Now we're learning that the trees would really defend themselves by giving poor Dorothy a tummy ache.

Pardon me a moment while I apologize to enforcers of precise scientific language.

Okay, okay, the trees probably have an evolutionary benefit if Dorothy eats the apples and kindly deposits the seeds in a nice rich pile of human fertilizer after her body is done with them. The trees would really only fight back if Dorothy chomped a bit on the leaves and bark.

Never mind, this is even better. The trees talk to each other.


We learned last week that plants move more than you think, but it's not that. Even if a tree waved its branches frantically around, the other trees wouldn't see it. After all, they don't have any eyes. (Well, someone had to say it.)

But trees have other abilities that we don't see.

Trees (and other plants) can communicate by passing gas. You may not have gotten the point, but you've probably received some of the messages.

For example, if you walk around on the Berkeley, CA, campus, you know there are Eucalyptus trees around. You can't escape the smell. (The first time I visited, I thought someone must really be into cough-drops.)

Eucalyptus trees make over 30 volatile compounds (2) that contribute to that unusual smell. At least some of those compounds are messages to other trees. And a pine forest, what is it saying?

The first time I heard this fanciful idea of trees talking to each other was when I was in graduate school. This was back in the days when we had to walk five miles in the snow to get to the lab (Oops, wrong story!).

One of the students and a post-doc in our lab made subtractive cDNA libraries to try and find wound-inducible genes. They were hoping to find some of the genes for making those tree-talking molecules, but their tools were a bit too crude.

Now, in the Feb. 10th issue of Science, there are some great articles that describe more recent experiments showing that the talking tree hypothesis is correct, at least for a few plants.

Baldwin et al. describe some wonderful experiments where researchers observed that tobacco plants get eaten less when they live near sagebrush. Using microarray technology and sensitive analytical instruments, they found that the sagebrush produced all kinds of volatile compounds that traveled through the air. They also found that nearby tobacco plants responded by turning on new genes.

When tobacco plants heed the warning, they make a proteinase inhibitor. Since the proteinase inhibitor makes it harder to digest protein, the leaf-eating animals probably feel a bit ill after chewing those tobacco plants. Maybe they swear off chewing tobacco for good! Or maybe they decide that next time, they'll try a different brand.

Whatever goes through the murky minds of the herbivores:

When sagebrush talks, tobacco listens.


1. Baldwin IT, Halitschke R, Paschold A, von Dahl CC, Preston CA. Volatile signaling in plant-plant interactions: "talking trees" in the genomics era. 2006. Science. 311:812-5.

2. Zini CA, Zanin KD, Christensen E, Caramao EB, Pawliszyn J. Solid-phase microextraction of volatile compounds from the chopped leaves of three species of Eucalyptus. 2003. J Agric Food Chem. 51:2679-86.



Friday, February 24, 2006

I want my plant TV!

A long time ago, I saw a Star Trek episode where the crew encountered aliens who lived at a different frequency. I may have this backwards, but I think the aliens moved so quickly that no one knew they were there. And until problems struck, our heroes were happily oblivious to the existence of the others.

The Plants In Motion movies remind me of that episode. Since plant movement occurs much more slowly than movements we can easily observe, we tend to think that plants don't move. These movies prove that idea wrong. Filmed with time-lapse photography, these short movies show seeds germinating, beans dancing to circadian rhythms, twirling vines, flowers bursting out with passion, and many other activities that we simply don't see because they happen at a different rate.

I think a collection of these movies on a DVD, with a good soundtrack, could be a lot of fun. It wouldn't have quite the impact of Jumanji or Invasion of the Body Snatchers, but it still might inspire a deeper appreciation for the plants in our lives.

As for the background music, my vote goes to: "Feed me, Seymour!"


Thursday, February 23, 2006

Digital Biology course info is posted

A few months ago, I posted a note about the two courses that I will be teaching this summer in Austin, together with Dr. Linnea Fletcher (Austin Community College).

The information is now on-line, so you can go ahead and register. The courses and dates are:

A Hands-On Tour Through the World of Bioinformatics

LINNEA FLETCHER, Austin Community College and SANDRA G. PORTER, Geospiza, Inc.
June 8-10, 2006 in Austin, TX

Studying Evolution with Bioinformatics

LINNEA FLETCHER, Austin Community College and SANDRA G. PORTER, Geospiza, Inc.
June 12-14, 2006 in Austin, TX

If you have a topic request, let me know. I've been working on some things with influenza virus, HIV, genome browsing, mutant structures, and green fluorescent protein. So we should have lots of fun.


Friday, February 17, 2006

Sequencing the campus at the Johns Hopkins University

A few years ago, the General Biology students at the Johns Hopkins University began to interrogate the unseen world. During this semester-long project, they study the ecosystems of the Homewood campus, and engage in novel research by exploring the microbial ecosystems in different sections of the campus. Biology lab students gather environmental samples from different campus ecosystems, isolate DNA, amplify 16S ribosomal DNA by PCR, and check their PCR results by gel electrophoresis.

DNA samples are next sent to the university's Genetic Resources Core Facility, where scientific staff in the DNA Analysis Facility prepare the DNA templates for sequencing and load the completed reactions onto an Applied Biosystems 3730 Genetic Analyzer.

The past few years have seen some changes in this process. Data used to be retrieved by logging into an FTP site that allowed anyone to access data from any investigator.

In recent years, this JHU core facility obtained a Geospiza Finch Server, so now, instead of using an FTP site, they upload experimental data, in the form of electropherogram files (aka chromatogram or trace files), into a secure system (the Finch Server) for analysis and delivery. Students can log in, but now they can only access student data. During the past two years, almost 500 JHU students have logged into the Finch Server to retrieve and view their data.

In the next part of the project, students use BLAST (and our BLAST for beginners tutorial) to query GenBank at the NCBI and determine which bacterial species were isolated.

All about Phred
One issue that comes up, though, is the quality of the data. Data quality can be a problem when students do PCR for the first time in a lab class. The image below shows a screen shot from a Finch Server illustrating the distribution of high quality and lower quality data from this year's set of 87 chromatograms.

You can see from the histogram that about 20% of the chromatograms have fewer than 50 high quality bases. We're defining high quality bases as base calls with a Phred score greater than 20. Phred, KB, and TraceTuner are programs that estimate the probability of an incorrect base call. A Phred score of 20 corresponds to a 1% chance of a base-calling error.
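If you'd like to play with the numbers, the relationship between a Phred score and the chance of a wrong base call can be sketched in a few lines of Python. This is only a minimal illustration, assuming the standard definition Q = -10 × log10(P). The function names and the sample quality values are made up for the example; they don't come from the Finch Server or from the students' data.

```python
import math

def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p):
    """Convert an error probability back to a Phred quality score."""
    return -10 * math.log10(p)

def count_high_quality_bases(qualities, cutoff=20):
    """Count base calls whose Phred score is greater than the cutoff."""
    return sum(1 for q in qualities if q > cutoff)

# Hypothetical per-base quality values for a very short read (not real data).
example_qualities = [8, 12, 25, 31, 40, 38, 20, 19, 45, 10]

print(phred_to_error_probability(20))
print(count_high_quality_bases(example_qualities))
```

Note that the cutoff is "greater than 20," so a base with a score of exactly 20 is not counted as high quality here.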

What this histogram fails to show, though, is how the high quality bases are distributed in a DNA sequence and where they're located. The Finch Suite has programs that will trim poor quality regions of a sequence, but sometimes it's still nice to see what your data look like.

I want my FinchTV
In the next step, students look at their chromatogram data in yet another Geospiza program, available for free, called FinchTV. As you can see, below, they select the high quality region of the sequence, and they can query different databases at the NCBI, just by choosing BLAST sequence.

(Warning, potential bias alert: I do work for Geospiza, but I still think this is cool!)

Through this process, students learn, first-hand, about the diversity of microbial life in the campus all around them and the genetic code that's used to store information in DNA. They also learn about DNA sequence analysis and bioinformatics. Since many of these students plan to attend medical school, this lab serves a critical need in acquainting future doctors with molecular diagnostics.

Both the Finch Server and the core lab staff in the DNA Analysis Facility were important for the success of the project. "We never could have done this without the advice and help we got from the people in core lab," said Dr. Rebecca Pearlman, course instructor. "We were able to get our data, talk about quality, and complete a BLAST search in a single class period."

Pearlman adds, "Our students relish the opportunity to do genuine research. They get really excited when they learn they're using the same techniques for bacterial identification that are used by the public health departments."

Of course, I'm writing about this, partly because I get to help out, too. We're making custom BLAST-formatted databanks from each session of the course, so we can do quick comparisons between different data sets, among other things. Over time, these data will allow students at JHU to study changes in bacterial composition from year to year.

And who knows? There are loads of bacteria in every little bit of dirt. What could be cooler than discovering a new species in your first quarter of college biology?



Wednesday, February 15, 2006

Experiments on Peeps

Science fair season is getting close and it's time to come up with interesting experiments that can be done at home. My youngest child wants to do some kind of psychology experiment with the cats, but I'm more intrigued with the idea of experimenting on peeps.

Much of my peep inspiration comes from Peep Research.

After all, haven't you always wanted to know about the biochemistry and anatomy of these simple almost life forms?

Where else can you go to find out the answers to those really troubling questions, like:

  • What happens when you autoclave a peep?

  • or drop it into sulfuric acid?

  • or maybe, drop it into liquid nitrogen?

  • And what about peeps in space? (or at least in a vacuum?)

Sigh. Some days I do miss having all those lab toys at my disposal.

I guess we just have to be more creative and see what sort of peep tests we can conduct around the house.


Tuesday, February 14, 2006

Pacific Northwest ASM meeting

If you live in the Pacific Northwest and want to learn more about microbiology, my alma mater is hosting the:

2006 Northwest Branch Meeting of the American Society For Microbiology
March 10 - 12

It's very affordable for those of you who are teachers and it's even on a weekend!

I'm biased, of course, but I think the meeting will be very informative and lots of fun!

There are some fascinating topics on the agenda.

Microbes, after all, were the first forms of life on our planet and have been voted the form of life most likely to be found out in the solar system. (Wouldn't that be a good one in your high school yearbook!)

True to form, the meeting will have Roger Buick (Earth and Space Sciences, UW) talking about the Earliest Life on Earth and Tom Quinn (Astronomy, UW) talking about the formation of planets.

Since microbes live everywhere, we'll hear talks on those that live in squid and those that colonize volcanoes (before and after eruption).

Unfortunately, the microbes that colonize our bodies aren't always the best tenants. So we'll get to learn new things about the ones that make us sick and the ones where we're having a hard time getting acquainted with the culture (technique).

Last, there are lots of interesting talks about genomics and computational biology. I'm kind of unlucky, though: my talk is at the same time as Maynard Olson's. So, you either get to hear one of the pioneers of the human genome project or you can come hear me.

I'll have cooler slides.


Monday, February 13, 2006

Hunting for huntingtin, part IV: What did you expect to find?

Hunting for ways to do the experiment

In our last episode, we were stopped in our tracks by some glutamines that were missing from our positive control. Sometimes it's hard enough to find the sequences we want; when the sequences are intentionally hidden, it's impossible.

We trudged onward, though, and identified the problem. All we had to do was to turn off the option to hide low complexity sequences and lo and behold, we found our glutamines ... at least we found them in a match to the positive control.

Next, I tried searching with a sequence of ten glutamines.

What did I get?


Does that mean GenBank is devoid of proteins with 10 glutamines?

No. In this case, absence of evidence is not evidence of absence.

Huntingtin has at least 10 glutamines and sometimes more. Just because I didn't find a match, doesn't mean that there isn't a match to be found.

Hunting for ways around assumptions

This is a good time to talk about assumptions.

We've learned how to turn off the low complexity filter, but it looks like we will need to do a bit more. I've found this technique to be helpful before and we're going to have to do it again. We're going to have to think like programmers in order to guess what assumptions were made in setting up the blastp web form at the NCBI server.

The most important assumption is given here:
To assess whether a given alignment constitutes evidence for homology, it helps to know how strong an alignment can be expected from chance alone.
From "The Statistics of Similarity Scores"

It appears that the BLAST server forms at the NCBI were set up with the assumption that most people would use the collection of BLAST programs to look for homologous sequences.

Homologous protein or nucleic acid sequences are defined as sequences that share a common evolutionary origin. If two sequences are sufficiently similar, according to the statistical parameters in a BLAST search, they are likely to be homologous; that is, they are likely to have been derived from a common ancestor. Naturally, the people who set up the web server at the NCBI made the assumption that this is the reason why biologists would come to the site to do BLAST searches.

There you have it.

The most commonly used set of bioinformatics programs in the world (outside of Microsoft Excel) works by calculating the probability that two sequences are related through evolution. (If you have problems with evolution, you probably want to look for another field. Take my advice; bioinformatics will not be a good fit.)

Because the BLAST web form was designed with the assumption that we're looking for statistically relevant homologous sequences, features like low complexity filtering are carried out by default. Low complexity sequences can be found in lots of proteins so their presence would not indicate a common evolutionary origin.

But that doesn't mean that we're not interested in finding them.

We have our own reasons for hunting, so what do we expect?

We are doing this search for an entirely different reason. We're not trying to use statistical significance to support the case that proteins with lots of glutamines share a common evolutionary ancestor. We're using BLAST as a surrogate for Google. We just want to find other proteins with polyglutamine sequences and see if those proteins share the characteristic, like huntingtin, that a genetic disease occurs when we have a few extra glutamines.

Right, but how do we find those proteins with extra glutamines?

We go back to the blastp web form and look more closely at the default parameters.

Looking below the set of filters, we see that the default parameter for the Expect (E) value is set at 10. An E value of 10 means that we would expect to find 10 matches that score at least this well purely by chance, if we searched a database of the same size that was filled with random sequences. When the cutoff is set to 10, any matching sequence with an E value over 10 is hidden.

Hunting with blastp and to heck with the statistics!

To heck with statistical relevance! I want my sequences!!!

I arbitrarily raised the setting on the E value to 50. A stretch of 10 glutamines is pretty short, and so we would expect to get a high E value even if we have a perfect match. (A quick explanation of E values.)
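To see why a short query earns a high E value, here is a rough sketch of the relationship described in the NCBI statistics tutorial: E = m × n × 2^(-S'), where m is the query length, n is the total length of the database, and S' is the bit score of the match. The bit scores and the database size below are invented for illustration. They are not the numbers from my search, and real BLAST applies length corrections that this sketch ignores.

```python
def expected_chance_hits(bit_score, query_length, database_length):
    """Approximate E value: hits expected by chance with at least this bit score."""
    return query_length * database_length * 2 ** (-bit_score)

def passes_cutoff(e_value, cutoff=10):
    """A hit is reported only if its E value does not exceed the cutoff."""
    return e_value <= cutoff

# Hypothetical values, chosen only to illustrate the idea.
database_length = 1_000_000_000  # total residues in an imaginary database

# A short, perfect match can only reach a modest bit score.
short_hit = expected_chance_hits(bit_score=28, query_length=10,
                                 database_length=database_length)

# A long match can reach a much higher bit score.
long_hit = expected_chance_hits(bit_score=280, query_length=100,
                                database_length=database_length)

print(short_hit, passes_cutoff(short_hit), passes_cutoff(short_hit, cutoff=50))
print(long_hit, passes_cutoff(long_hit))
```

With these made-up numbers, the short hit is hidden at the default cutoff of 10 and only shows up once the cutoff is raised, even though nothing about the match itself has changed.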

This time, I got results. Over 3400 protein sequences matched my 10 glutamine query.

And check this out:

The best match has an E value of 33.

This explains why I didn't see any sequences match when the cut-off value was set at 10.

But how good a match was this, anyway?

Look at the results:

It was a perfect match.

Sometimes digital biology can be just like working in the lab. We spend lots of time just trying to get our controls to work.

Next time, we WILL do some experiments.

In the meantime, if you'd like to skip ahead, here's your assignment:

Find out if there are other cases where extra glutamines are linked to genetic disease.

And for extra credit - explain why this might be so. Use evidence to support your hypothesis.


Friday, February 10, 2006

Plants that make crystals that look like plants

Chemical & Engineering News published this fascinating article called "The Secret Life of Plant Crystals" with some wonderful photos of calcium oxalate crystals. These crystals are produced by special cells (called "idioblasts") and the shape of the crystals is unique to each plant.

No one knows exactly why plants make these crystals and the crystals could serve multiple functions. Some of the sharp, needle-like, crystals might help defend plants from being eaten. Some crystals might bind heavy metals or harmful substances and keep them in kind of a molecular toxic waste containment center.

The best feature of this story, though, is the gallery of photos. The Botanic Crystal Fashion Show is composed of excellent pictures, all taken by Harry T. (Jack) Horner at Iowa State University. It's well worth taking a look.


Multiple views of integration

Here are some snapshots from one of our favorite friendly phage. Yep, lambda phage, scourge of E. coli everywhere and decades-long friend to molecular biologists.

I used to spend lots of time with shoelaces and bits of string trying to understand how lambda got cut apart, how the E. coli chromosome got cut apart, and how the four free ends, with two strands of DNA at each end, got joined back together. The process of lambda integration is also intriguing because of the way the genes get flipped around during integration, setting up a new gene order.

Now, we're luckier. We get to see and interact with colorful images, derived from the coordinates of the lambda integrase protein itself. The image below shows the two subunits of lambda integrase, colored in blue and purple, hanging on to three different strands of DNA (green, brown, and grey).

In the next image, I zoomed in and hid the integrase protein, so that I could get a closer look at the DNA. This image shows where the integrase protein has cut one strand of DNA so there are two free ends.

Exploring DNA Structure has over 70 structures like these with DNA molecules alone or DNA molecules bound to proteins, anti-tumor drugs, or other substances. It's truly amazing to hear surprised high school kids say they never knew that working with DNA could be as much fun as a video game.

With resources like these, I don't see any reason why high school kids and college students can't use computers for doing science. Why not learn about DNA structure by working with real data?



Saturday, February 04, 2006

Hunting for huntingtin, part III

Dealing with distractions

It's been a couple of months since our last installment. "Next week" has come and gone, and like Odysseus on his journey back from Troy, we've experienced delays in getting back to the story. We've likewise faced our own distractions from lotus-eaters, sirens, harpies, the dreaded cyclops. A typical holiday season.

We've even had our own trials with the cyclops's angry father. If you follow the weather, you know that Neptune unleashed his wrath on the Pacific Northwest this month. Some people tough it out and ride their bikes to work anyway. Most of us just huddle down inside our raincoats, grimly clench our umbrellas and double-tall lattes, and try to hang on until we see sun-breaks or spring.

But I saw the sun today, so it's time to get back to our story.

In our first episode, we learned that Huntington's disease results from the presence of extra CAGs (in the DNA), which are translated to glutamines in the huntingtin protein.

In our second episode, we wanted to know why the extra glutamines were a problem. So we looked for structures with extra glutamines but couldn't find them.

Too many glutamines

Then, we decided to look for other proteins with extra glutamines.


We looked for two reasons.

First, we could learn something about the structure of polyglutamine (a fancy way of saying lots of glutamines) if we could find a bunch of glutamines in a different protein. Earlier, we wondered if glutamines formed hydrogen bonds with other glutamines or other amino acids. The availability of a different structure might let us test that idea.

Second, we know that people with Huntington's disease get sick because of the extra glutamines. Well, if extra glutamines in the huntingtin protein lead to disease, extra glutamines in other proteins might lead to other genetic diseases. This is just plain interesting stuff to know about.

This is why doing science is kind of like living in "Alice in Wonderland." It's easy to fall down rabbit holes.

Down we go!

Last time, we tried to use blastp to search the protein database for proteins with at least 15 glutamines but couldn't find anything.

So, we searched again with the sequence of the huntingtin protein itself. We know that this entire sequence is in the database and it can serve as a positive control since it has to match itself.

It almost did.

But look at the image. The matching sequences are a bit short on the amino end of the protein sequence (this part of the protein maps towards the 5' end of the mRNA).

That's odd.

So I looked at the sequence alignments themselves to see what was happening at the amino end of the protein.

The alignment (above) shows that our Query sequence (huntingtin) begins to match a database sequence (identified as Sbjct) at residue 77.

This raises the question: Why isn't there a match to the first 76 amino acids?

So, I looked up the amino acid sequence for huntingtin in GenBank. Part of the missing section, residues 1-70, is shown below. Q stands for glutamine.

matleklmka feslksfqqq qqqqqqqqqq qqqqqqqqpp pppppppppq lpqpppqaqp llpqpqpppp

My conclusion:

The glutamines were missing from the alignment!
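If you don't feel like counting Qs by eye, a few lines of Python can do it. This is just a small sketch that works on the 70 residues printed above; the function name is my own invention and it isn't part of BLAST or any NCBI tool.

```python
from itertools import groupby

# The stretch of huntingtin printed above, with the spaces removed.
HUNTINGTIN_START = (
    "matleklmka" "feslksfqqq" "qqqqqqqqqq" "qqqqqqqqpp"
    "pppppppppq" "lpqpppqaqp" "llpqpqpppp"
)

def longest_run(sequence, residue):
    """Return (start, length) of the longest run of residue; start is 1-based."""
    best_start, best_length = 0, 0
    position = 1
    for letter, group in groupby(sequence):
        length = len(list(group))
        if letter == residue and length > best_length:
            best_start, best_length = position, length
        position += length
    return best_start, best_length

start, length = longest_run(HUNTINGTIN_START, "q")
print(f"Longest glutamine run: {length} residues starting at position {start}")
```

The same function works for the prolines that follow the glutamines: just pass "p" instead of "q".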

Call out John Wayne; it's time for some troubleshooting

We were unable to match glutamines, even with our positive control.

As with any experiment, if the positive control doesn't work, you need to recheck your procedure and find out if something went wrong.

The answer is on the original page of the web form where we began our blastp search.

Notice the box that's checked, next to the phrase "Low complexity." The default setting with blastp filters and hides low complexity sequences like pppppp and qqqqqq. In general, this is a good thing, but not when we're trying to find proteins that contain those sequences.
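To get a feel for what a low complexity filter does, here is a toy version in Python. It is a much simpler stand-in for the real filter (which, as far as I know, is a program called SEG for protein searches), and the window size, the entropy threshold, and the function names are all assumptions I picked for the example. The idea is simply to slide a window along the sequence and mask any window whose residue composition is too monotonous.

```python
import math
from collections import Counter

def shannon_entropy(window):
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def mask_low_complexity(sequence, window=12, threshold=1.5):
    """Replace residues in low-entropy windows with 'x' (a toy filter, not SEG)."""
    masked = list(sequence)
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) < threshold:
            masked[start:start + window] = "x" * window
    return "".join(masked)

# The start of huntingtin: 17 ordinary residues, 21 glutamines, 11 prolines, and a bit more.
example = "matleklmkafeslksf" + "q" * 21 + "p" * 11 + "qlpqpppqaqp"

print(example)
print(mask_low_complexity(example))
```

Once the glutamines have been replaced by x's, there is nothing left in that region for a query to match, which is exactly the behavior we ran into with our positive control.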

(We use a similar kind of program with DNA sequences, too, called "RepeatMasker.")

Testing the parameters
Let's remove the filter and find out if our positive control (huntingtin) will work, now.

You can check this yourself. Either type or copy and paste the accession number NP_002102 into the blastp web form, and uncheck the low complexity filter before you do the search.

We're on our way. Join us next time, when we do an actual experiment!


Thursday, February 02, 2006


Announcements: Courses, meetings, podcasts, and other events
Birds: Biology is not just for the birds
Chemistry: Chemistry makes biology come to life!
Classroom activities and educational materials available from Geospiza Education
Disclosure statement and good questions to ask about any web site
Digital biology: How do people use bioinformatics resources to learn about biology and discover new things?
Education: Random thoughts on education and information about summer courses
Insects: Some creepy, crawly things have their own web sites
Microbiology: Closer than you might expect
Plants: Green, lean, and sometimes mean
The life and times of a digital biologist
Web resources: Diverse web resources for learning about the diversity of life. These cover biology, bioinformatics, biotechnology, microbiology and more.
Weird Science and Silly things

Can I be more organized on-line than I am in real life?

As they might say somewhere:

Some are born to be organized and others have organization thrust upon them.
I fall into the latter camp. The last memes made me aware that it's high time to organize topics into more comprehensible subject groups.