Monday, December 19, 2005

White people are mutants

50 points for digital biology! Lamason et al. found a new gene that controls human skin color while studying pigmentation in zebra fish (1).

No red herrings, here! These zebra fish had an unusual golden color that turned out to be an important clue. Lamason and collaborators found that the golden zebra fish lost their normal color because of a mutation in the slc24a5 gene. When the zebra fish have the mutant form, they produce fewer melanosomes.

A short language lesson
Fewer melanosomes, right. What on earth is a melanosome? Melanosomes are special compartments that store pigment. You can think of them as clear containers that hold brown paint. The brown paint corresponds to the pigment, melanin. There are also different shades of brown paint (melanin), one lighter (phaeomelanin) and one darker (eumelanin). Melanosomes are only found in special kinds of cells called "melanocytes."

Fewer melanosomes mean a lighter color.

So why do we care about zebra fish?
We're very, very, very distant cousins and we share a common ancestor somewhere way, way, back in time. So, if zebra fish have this gene that controls melanosome production, humans probably have it, too. (Evolution is not controversial in my field; it's fundamental.)

So, just like all biologists have been doing for the past twenty-some years, Lamason and friends went fishing in GenBank (a database of nucleotide sequences), using the zebra fish gene as bait.

What did they catch?
They found the human version of the gene and looked to see if there were any differences that were associated with skin color. Indeed, there were. Europeans had the mutant gene (i.e. fewer melanosomes, lighter skin), while Africans had the gene that makes more melanosomes.

Other research
This isn't the first study to look at the genetics of skin color. Other researchers have run across skin color genes while studying cancer biology. Bonilla et al. found a single nucleotide change in the 3' untranslated region of the ASIP gene that is associated with skin color (2). (This changes the sequence of the messenger RNA but NOT the sequence of the protein.) In Americans of African descent, one nucleotide is associated with a lighter skin color; with the other nucleotide, the skin color is darker.

The melanocortin 1 receptor (MC1R) also accounts for some variation in skin pigmentation and hair color (3). In this case, variation in the MC1R protein results in red hair.

What do we conclude?

First, for those of you with a paranoid bent, your worst fears are confirmed. We can tell skin color with a simple genetic test. Hopefully, this ability won't be misused.

Second, the differences that determine skin color are very small. Statistically, it would seem that changing a few nucleotides in a 3 billion nucleotide genome would be insignificant.

So it would seem.....

Read the abstracts from the original papers:

1. Lamason RL, Mohideen MA, Mest JR, Wong AC, Norton HL, Aros MC, Jurynec MJ, Mao X, Humphreville VR, Humbert JE, Sinha S, Moore JL, Jagadeeswaran P, Zhao W, Ning G, Makalowska I, McKeigue PM, O'Donnell D, Kittles R, Parra EJ, Mangini NJ, Grunwald DJ, Shriver MD, Canfield VA, Cheng KC. 2005. SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science. 310:1782-6.

2. Bonilla C, Boxill LA, Donald SA, Williams T, Sylvester N, Parra EJ, Dios S, Norton HL, Shriver MD, Kittles RA. 2005. The 8818G allele of the agouti signaling protein (ASIP) gene is ancestral and is associated with darker skin color in African Americans. Hum Genet. 116:402-6.

3. Rees JL. 2004. The genetics of sun sensitivity in humans. Am J Hum Genet. 75:739-51.



Friday, December 16, 2005

People who look like their dogs

Now that the dog genome is done, maybe we need a new project in genetic variation.

What genotypes make people look like their dogs?


Thursday, December 15, 2005

Part II. Hunting for huntingtin

In which we're reminded that database searches are experiments, too.

Playing catch-up with the latecomers

Hi, for those of you who've just joined us, we've gotten lost in some databases while hunting for information on huntingtin. If you'd like to catch up a bit and come back later, you might want to read Hunting for huntingtin (part I).

If not, here's a brief synopsis of the plot and what we've done so far:

  • learned about Woody Guthrie and Nancy Wexler

  • found a couple of reviews describing Huntington Disease

  • got the HD gene sequence and counted the number of CAGs

  • learned that CAG codes for glutamine and that glutamine can form hydrogen bonds

Then, we got curious about those extra CAGs and wanted to know if they result from the disease or cause the disease. So we looked up huntingtin at the UCSC genome browser and saw that there are similar genes in mice, pigs, and zebra fish (plus a few other members of the animal kingdom that were not discussed).

Ah hah!
Mice have a similar gene, and we know that the Jackson Lab is the place to go for all things mouse. Sure enough, the Jackson mouse breeders have made mice with extra CAGs, and... the mice get the symptoms of HD.

So, you guessed it, the extra CAGs are the problem, not the result.

As the fearless leader of this expedition, I vote that we now look at those extra CAGs a little more closely.

Searching for the lost glutamines

You might remember, in part I, I mentioned looking for 3-D structures with polyglutamine. I did find one structure with a polyglutamine sequence, but it looks like the crystallographers weren't able to resolve the part in the structure where the glutamines are supposed to be. Cn3D shows the missing glutamines in grey in the sequence window. The structure window shows this:

Looking for other structures

Okay, so what can I do? What would you do?

I decided to do a blastp search, since NCBI has this cool new feature where protein sequences, with a corresponding structure, are linked to the structure record in the MMDB.

So I used blastp to search the human protein database with a sequence of 15 glutamines.

What did I find? My result was this: No significant similarity found.

Database searches are experiments, too.
Some of you might be rolling on the floor laughing about now, because you know why I got this kind of result. And some of you might be puzzling over this because it seems kind of contradictory. Didn't I write something earlier about proteins with lots of glutamines? Shouldn't there be some in the database?

Right! Database searches are experiments even if we don't have a pipette anywhere in the room. And there are some fundamental rules for lab experiments that are important to keep in mind.

Okay, okay, what did we miss?

Close your eyes (oops, no, read the sentence first). Let your mind relax and imagine your elementary school science lab. What were the important things to remember about doing experiments?

Disinfect your bench before you start work?

No, not that!



Experiment Rule 1. Always include control samples, usually both positive and negative.

Personally, I find positive controls to be the most important for database searches. But if I were testing a new program, I would include negative controls, as well.
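For the programmers in the audience, here is a minimal sketch of what Rule 1 might look like in code. Everything in it is made up for illustration: the tiny "database," the protein names, and the `toy_search` function are hypothetical stand-ins, and a plain substring match is nothing like what blastp actually does. The only point is the pattern: check that a query you know should hit does hit, and that one that shouldn't, doesn't, before trusting the real answer.

```python
# A toy "database" of made-up protein sequences (hypothetical, for illustration only).
toy_db = {
    "protA": "MKTLLQQQQQQPPA",
    "protB": "MSTNNDEAGK",
}

def toy_search(query):
    """Stand-in for a real search tool: exact substring match, NOT a BLAST search."""
    return [name for name, seq in toy_db.items() if query in seq]

def run_with_controls(search, query, positive_control, negative_control):
    """Run the controls first; only report hits for the query if both behave."""
    if not search(positive_control):
        raise RuntimeError("Positive control found nothing: the search itself is suspect.")
    if search(negative_control):
        raise RuntimeError("Negative control returned hits: the results can't be trusted.")
    return search(query)

hits = run_with_controls(
    toy_search,
    query="QQQQ",
    positive_control="MKTLL",   # known to be in the toy database
    negative_control="WWWWW",   # known NOT to be in the toy database
)
print(hits)  # ['protA']
```

If the positive control comes back empty, the sketch stops and complains instead of quietly reporting "nothing found," which is exactly the trap to avoid.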

So I went back and repeated the experiment with a positive control. (I wish this were the only time I've had to do that.) I looked for the huntingtin protein in GenBank and found a reference sequence, NP_002102. And I repeated the blastp search with the reference sequence. (I'm so grateful that I can do this in five minutes on my computer rather than spending a couple of weeks or more trying to get the samples and a couple of days doing a PCR experiment or Southern blot.)

I did a blastp search again. And this time I got a result.

or did I??

There's definitely something odd about this image.

I'm going to stop writing now and give you a chance to solve the puzzle and find out what's wrong.

And yes, this is one experiment that you can try at home. Take the accession number for huntingtin, NP_002102, type it or paste it into the web form for blastp, and see if you can come up with the answer.

Look for more episodes next week in the continuing saga of the search for huntingtin. In the meantime, feel free to post your guesses or results. I won't give away the answer until next week.

As Tigger says, TTFN!


Tuesday, December 13, 2005

My humble efforts to submit

Maybe we should just experiment on ourselves
I've come to the conclusion that writing a grant is by far easier than submitting one, at least to the NIH. For the past two weeks, I've been trying to submit a phase I SBIR, and now that we're approaching the second deadline for resolving grant problems (Dec. 15th), it's getting harder to suppress that feeling of impending panic, as the NIH is insisting that the second deadline is for real.

If you've read other postings here, you might know that I usually write about more teacherly topics. After all, this is not an anonymous blog and there are plenty of sites where you can read writers' rants. However, in light of my current mood of quiet desperation, I really couldn't pass this up. What better way to deal with pain than to try to find the funny side?

A phase I SBIR, for those of you new to the lingo, is a small grant that small businesses can apply for to help jump-start new research activities. My company, Geospiza, has had these kinds of grants in the past, so we usually know how to handle these sorts of things. But for this submission period the NIH decided to try a large, uncontrolled experiment on a few thousand (and probably unwilling) human participants.

That's correct. After a long-standing tradition of voraciously consuming trees by making people send in several photocopies of large applications, the NIH will no longer accept anything but electronic grant submissions. I'm okay with that. It's a good thing to do, with the potential to make life a bit easier. The NSF has done this for several years and it works. It's just that the rapid transition, the multiple web sites, and the lack of useful information on the most challenging site are making this really, really hard. So, I've decided to share my pain and hopefully spare a few of you from the same path of suffering and anxiety.

Maybe I'm just not the submissive type
I'm optimistic that the bugs will get worked out, but if you plan to submit a proposal in the next year, pay attention.
  • Do as much as you can in advance. I've spent two weeks now on the electronic submission step and the end is not yet in sight. If you do the math, that means at least 4 weeks to write your proposal, two weeks to get it through the proper channels, and possibly two or more weeks to get it submitted correctly. (That's right, at least 8 weeks.)

  • Don't waste time e-mailing the commons support group (unless you really don't like looking around web sites for information). The e-mails didn't tell me anything that wasn't already posted.

  • Do call the help desk, but be warned: the toll-free phone number is only on the front page of the commons. You won't find it by clicking the tab labeled "Help." Help is most helpful when nothing's gone wrong.

  • Stock up on crossword puzzles or some blogs to read while you're waiting on the phone.

How long must I submit? My personal odyssey

Weds., Nov. 30th, 8-11 pm
First let me say this: I use a Mac. Many people groan when they hear that and decide right away that I'm a stereotypical troublemaker and that all the problems are due to my poor titanium G4 laptop. So, usually when I need to ask for help, people want to blame everything on my Mac. But my Mac is not the problem. I've learned to adapt when it's necessary. And just in time, too, since the NIH has decreed that we shall all use either Windows or Virtual PC if we wish to apply for funding. So I began the odyssey by installing the government-mandated PureEdge program on Virtual PC on my Mac. First hurdle, successfully jumped!

Then, I hit the SUBMIT button in PureEdge. Something seemed to happen. Firefox opened up. And then everything stopped.

No luck. I try again, no luck. I notice that the copyright date on PureEdge is a few years old (2002), so on a hunch I open up Internet Explorer and the submission process starts working! Yeah!!! It's submitted. I can go to sleep now.

Thurs., Dec. 1st, 5:30 am AAAHHH! There's an automated e-mail from the NIH saying that there are errors in the application (but no indication of what they are) and today is the deadline and I have to leave town tomorrow morning!! Plus, once the grant has been submitted I have two days to log in to the commons and approve it. (This step seems kind of redundant to me since it's quite a lot of work to submit something that I don't approve of.)

Really, it would be nice if the subject line in the e-mail said something, like the Hitchhiker's Guide to the Galaxy does, in large friendly letters: DON'T PANIC!

But that would be wrong. Panic is exactly what's called for in this situation. Among the helpful bits of information in the e-mail is the statement:
To view the messages, log in with your username and password to the NIH era Commons website at .... Then select the Status menu item, retrieve the grant application, and click on the Application Identifier (TN) link next to the submitted application.
And a bunch of other stuff that didn't really matter because I logged in and couldn't find my grant application. Naturally, I couldn't retrieve it.

I e-mailed the help desk but then got nervous and called support. Ah good, only 8 people in the queue. The nice and very cordial support person told me the problem was that my user name was missing from the proposal. Was that in the instructions? I added my user name and uploaded the grant again, guessing from the time stamps on the e-mails that it would be about 8 hours before I would find out if this worked.

Thurs., Dec 1st 7 pm I get another e-mail with the same error message as before. I still don't see my grant when I log in, but I'm a little relieved to see that the deadline for corrected applications has been extended to the 8th (of course we have to have a written note from our parents... no, just joking, but we do have to include a cover letter with an explanation for why the proposal is late, I kid you not!). And I have to catch a shuttle at 5 am to get to the airport to go to California and give a professional development workshop for high school teachers on using computers for doing biology (yes, you sense irony).

This time I sent an e-mail to the support group asking for their help.

Tues., Dec 6th I'm back in town and at 11 am receive a return e-mail from the commons telling me to log in and retrieve the grant (that I can't find or see) with a list of the most common problems and a nice note stating that my issue has been closed.

Okay. I get on the phone once again (only 20 people in the queue this time!) and once again talk to a very nice, patient help desk guy, who tells me how to find submitted grants that have errors. He also points out that I'm not a PI! What!!! Okay, I suppose that one must be my fault since I was the person in our company who took the initiative, a year ago, and applied for a commons account in the first place. I don't know how I could have done this and not assigned myself a role as a PI, unless I thought I had to have an NIH grant at the time. Never mind that. Help desk guy tells me it's not too late. I can make myself a PI. And bless him, he tells me how to find grants that have e-submission errors so I can find and fix them myself.

Oh yeah, and I need to fix a math error. Rounding up by 20 cents is an unforgivable sin.

No problem. I do it. Except this time, I forgot about the browser issue. I spend two hours trying to figure out why the submission isn't working and talking with a very nice (but clueless) person at who tells me that our T1 connection must not be fast enough.

Luckily I remembered the solution before someone took my stapler.

Tues. night, Dec. 6th It turns out that I had a typo in my congressional district and apparently the NIH cannot figure out where I live from an address and zip code. Plus, Microsoft Excel cannot calculate 7% accurately. The value was off by 0.000625 cents and that discrepancy triggered more error messages. I fix the errors again and once more I SUBMIT.

Weds., Dec 7th, 1 am I get an e-mail saying that the grant has been retrieved. Yeah! This time it has to work!

Weds., Dec. 7th, 6 am I log in to the NIH commons to check and find out if there's a way that I can see the grant has been received. Nada. I can find the two previous grants with errors but not the latest version.

I sent an e-mail to support asking them if I should be able to see if the grant has been retrieved. I can see the others that had errors by using the method I learned from the help desk guy, but not the most recent submission.

I start checking the site on a daily basis.

Fri., Dec. 9th I get a canned answer from support with a list of the most common submission problems. I e-mail a reply that I still can't tell if the grant was okay or not. I try calling the help desk. No luck. But the help desk phone message has morphed into something new and amusing. Along with the usual bit about the call being monitored, I now hear:
"The help desk is experiencing a high volume of calls and not taking any e-mails at this time. If you have left a voice mail please e-mail us at.... If you wish to remain on the line you may do so. If you wish to hang up, press 1 now...." and so on.
Apparently, (to paraphrase the taped message) the commons is experiencing such a high volume of calls that they've lost all their voice mail capacity. You cannot leave a voice mail and if you thought you left one earlier, you were wrong. (Okay, my confidence in the system has been restored.)

I especially like the part of the message where you're told to press 1 to hang up. (huh? The usual method works just fine. I've been doing this every day for a week so I know this).

I send an e-mail to support.

After all this fuss, don't I deserve a little reassurance?

Mon., Dec. 12th I'm getting worried about not seeing the application anywhere in the commons. I spend half an hour waiting on the phone to talk to a help desk person. Time to press 1.

Tues., Dec. 13th I get an automated e-mail at 3 pm from the NIH commons saying:

Our records indicate that you may have started, but not completed, the submission process for an SBIR or STTR grant application to the National Institutes of Health (NIH)..... etc.

Yikes! But at least I think I know what's wrong. I find something new in the list of submission problems on the commons website mentioning that I need to edit my account so that I'm both a PI and an SO (some administrative thing). I make myself a new PI account. BUT here's the worst part. TIME IS RUNNING OUT. This is all supposed to be resolved by the 15th, and after I set up my second new PI account (with a new user name), I get an e-mail saying that it will take the Commons 2-5 days to verify my account. Plus, since I have a different user name associated with that account, I will have to upload the whole enchilada again through and the validation process seems to take 8 or so hours.

I am getting very worried about this. I now have 3 accounts with the NIH commons, with 3 different user names and 3 different passwords. And one account with with another set of user names and passwords. And I still don't know where my proposal is.

Weds. morning, Dec. 14th I found a link last night on the era commons web site for sending feedback on the submissions process. I sent a long panicky diatribe and this morning got a very nice e-mail from someone at the NIH. This was followed up by a phone call and some help. It's now 10 am and we're still not done with the process but I'm staying optimistic. I need to resubmit through one more time but I'm getting messages that say Bad Gateway. Right. Bad, bad gateway, you be nice to that upstream server.

Wish me luck! I'll keep you posted when I've finally submitted.

FYI: I want to emphasize that both the and NIH help desk people have been very nice and cordial throughout this ordeal. Their task is not easy.

Weds. Dec. 14th 5:45 pm It's in and it's done and I owe many thanks to the kind and helpful people at the NIH help desk.

For those of you thinking about submitting a proposal soon, give yourself plenty of time, and consider yourself warned!


Saturday, December 10, 2005

The birth of a hummingbird

A friend sent me this link since I manage the Hummingbirds soccer team. Hopefully, the person who posted these shots of baby hummingbirds will leave them up for a while.

My husband told me about this page at Nature Photography with other truly incredible hummingbird photos. There are some great photos of hummingbirds in mid flight plus some pics of hummingbirds sitting quietly at bird feeders. Incredible!


Friday, December 09, 2005

Hunting for huntingtin

A bit of background
Alice's Restaurant is a movie with an unforgettable song that mostly revolves around Arlo Guthrie hanging out with his friends. Somewhere in the movie, the conversation turns to Woody, and someone asks the question that no one wants to touch. Does Arlo's girlfriend know about Huntington's? ...dead silence... Now, I did see the movie quite a few years ago, so my memory of the plot is kind of fuzzy but, as I recall, no one in the movie was prepared for that kind of discussion.

It has been more than three decades since Alice's Restaurant was made. Woody Guthrie is long dead, but kids still sing This Land is Your Land in elementary school, and people with Huntington's disease (HD) are still without a cure.

HD is a terribly debilitating disease that strikes people in the prime of life and, unfortunately, after they're likely to have had children. The disease is inherited in an autosomal dominant manner, which boils down to a 50:50 chance of getting it if one of your parents has it. All it takes is the wrong copy of chromosome 4, with a few dozen extra nucleotides, and you're SOL.
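If the 50:50 figure seems to come out of nowhere, it falls out of simply listing the possibilities. Here's a minimal sketch in Python, assuming the usual textbook setup: one parent carries one HD allele and one typical allele, the other parent carries two typical alleles, and each parent passes on one allele at random. The labels "H" and "h" and the function name are just my own shorthand.

```python
from fractions import Fraction
from itertools import product

def offspring_risk(parent_a: str, parent_b: str) -> Fraction:
    """Fraction of equally likely offspring genotypes with at least one 'H' allele.

    'H' = HD-causing allele (dominant), 'h' = typical allele.
    These labels are shorthand for this sketch only.
    """
    # Each child gets one allele from each parent; list every combination.
    genotypes = [a + b for a, b in product(parent_a, parent_b)]
    affected = sum("H" in genotype for genotype in genotypes)
    return Fraction(affected, len(genotypes))

# One parent with HD ("Hh"), one parent without ("hh")
print(offspring_risk("Hh", "hh"))  # 1/2
```

Because the allele is dominant, a single "H" is enough, so half of the four equally likely combinations are affected.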

The difference between now and then, though, is that there's a genetic test that can divine your possible fate. A little bit of blood, some enzymes, a way to separate different-sized pieces of DNA, and you can find out if you'd better go to Disneyland while there's still time or you might want to sign up for that retirement plan that people are always telling you about.

Much of our knowledge about HD comes from work by Nancy Wexler. I was fortunate to hear Dr. Wexler talk about HD and her work with afflicted families in Venezuela at the University of Washington, a few years back. If you'd like to hear her for yourself, NPR has an interview with Dr. Wexler that's well worth a listen.

Hunting for reviews
Okay, the disease is horrible, but learning about it is interesting. Perhaps we can even learn some general things about biology along the way.

We'll begin by learning a bit about the gene and the disease. Both GeneTests and the Genetics Home Reference at the National Library of Medicine tell us a bit about the disease symptoms, and that the difference between having HD and not having HD is a few extra CAGs in the huntingtin gene. We can even find a lab that will do a test by clicking links at these sites.

We also find out that the HD gene is also called "huntingtin" and is quite large. Huntingtin is over 200,000 bases long and has 67 exons. Plus, the gene is quite polymorphic in its number of CAG repeats. Normal people have 10-35 CAGs in the huntingtin gene, whereas individuals with the disease can have as many as 40-55 CAGs.

Hunting for the gene
To confirm the presence of repeated CAGs, it's nice to be able to find the huntingtin gene sequence ourselves and take a look. If we go to the NCBI, choose the Gene database from the pull-down menu at the top of the page, and type "huntingtin," we get a list of genes that includes the huntingtin gene from multiple species, plus lots of genes for proteins that interact with huntingtin. So we do have to read carefully to pick the link for the right gene.

Going to the HD Gene record gives us lots more info. In the middle of this page is a picture of the introns and exons along with links to reference sequences from the contig (NC), mRNA (NM_002111), and protein (NP_002102). We click NM_002111 and choose FASTA to get the DNA sequence that corresponds to the huntingtin mRNA.

Now, we see a long sequence with lots of A's, G's, C's, and T's. How are we going to find the CAG repeats without going blind staring at a computer screen? No problem, we just need a little fancy footwork with our web browser. Most, if not all, web browsers have a way to search for text on a web page. You can find one by looking through menus or use whatever key commands you normally use with Microsoft Word (on Mac OS X, use Command + F; on Windows, use Ctrl + F). If you use Firefox, the search feature is really nice. Not only can you find the text, you can highlight all the places where it occurs. So, with Firefox, our search results look like this:
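If you'd rather let a script do the squinting, the same hunt takes only a few lines of Python. This is just a sketch under some assumptions: the short sequence below is made up for illustration (it is NOT the real NM_002111 sequence), the function name is my own, and the input is assumed to be the bare sequence with the FASTA header line already removed. Note that it reports the longest uninterrupted run of CAGs, which is a slightly different question from highlighting every CAG on the page.

```python
import re

def longest_cag_run(sequence: str) -> int:
    """Return how many CAGs are in the longest uninterrupted run of 'CAG'."""
    # Normalize case and strip line breaks left over from FASTA formatting.
    seq = sequence.upper().replace("\n", "")
    # Find every stretch made of one or more back-to-back CAGs.
    runs = re.findall(r"(?:CAG)+", seq)
    # Each CAG is 3 letters long; an empty list means no CAG at all.
    return max((len(run) // 3 for run in runs), default=0)

# Made-up toy sequence for illustration, not the real huntingtin mRNA
toy_sequence = "ATGGCG" + "CAG" * 5 + "CAACAG" + "CCGCCA"
print(longest_cag_run(toy_sequence))  # 5
```

To try it on the real thing, paste the sequence you downloaded (minus the header line) in place of the toy sequence.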

Hunting in the animal kingdom
Sure enough, a normal huntingtin gene has 19 CAGs (see above) and the disease-related mutant protein has about twice as many. This is unusual but it doesn't tell us why that would be a problem. Do the extra glutamines cause Huntington's disease or result from it?

We might be able to answer this question by looking at HD in other animals. If we search for the huntingtin gene at the UCSC genome browser, we find that mice (Mus), pigs (Sus), and fish (Danio) all have the HD gene, too, with the same organization of introns and exons. The exons are the straight lines in the picture below. These are the DNA sequences that get copied into mRNA.

Since mice have the huntingtin gene, maybe mutant mice can help us answer this question. The Jackson Labs have been able to make mice with the equivalent of HD by adding extra copies of CAG to the mouse version of the huntingtin gene.

If we go to the Jackson Labs site and search for Huntington, we find examples of five mice that develop HD symptoms, when they have extra CAGs. This result tells us that the extra CAGs are enough to spur the development of HD.

Still hunting for answers
The role of the extra CAGs is still unclear but we do know that CAG codes for glutamine.
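To see what "CAG codes for glutamine" means in practice, here's a small sketch that reads a DNA sequence three letters at a time and looks each codon up in a table. The codon assignments are from the standard genetic code, but the table is deliberately partial (just the codons this toy example needs), and the sequence itself is invented for illustration.

```python
# Partial codon table from the standard genetic code:
# only the codons needed for this toy example.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "CAG": "Q",  # glutamine
    "CAA": "Q",  # glutamine (the other glutamine codon)
    "CCG": "P",  # proline
    "TAA": "*",  # stop
}

def translate(dna: str) -> str:
    """Translate DNA codon by codon; codons missing from the table become 'X'."""
    usable_length = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable_length, 3)]
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

# Invented toy sequence: a start codon, a short CAG tract, and a stop
toy_dna = "ATG" + "CAG" * 4 + "CAA" + "CCG" + "TAA"
print(translate(toy_dna))  # MQQQQQP*
```

Every extra CAG in the DNA adds one more Q (glutamine) to the protein, which is why a longer repeat in the gene means a longer polyglutamine stretch in huntingtin.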

Since glutamine is able to form hydrogen bonds through the amino and keto groups (blue and red in the picture below), proteins with extra glutamines might cause problems by binding to other proteins and interfering with their normal activities.

I've been unsuccessful, so far, in finding protein structures with more than three glutamines in a row, but I did find polyglutamine tracts in lots of protein sequences and some tantalizing clues in the literature.

So, tune in next week to learn what we can find when we continue our hunting expedition and venture into the deep, deep, darkness of the digital databanks.
