The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.

This paper presents a tutorial in which students compare the gene for human beta hemoglobin with the gene for beta hemoglobin in the chimpanzee. This comparison is interesting because the two proteins are identical in amino acid sequence but, as students will find out, the genes that code for these proteins are similar but not identical. This tutorial introduces students to the NCBI (National Center for Biotechnology Information) genome databases and gives them some practice in using the powerful tools associated with them. These databases are where DNA sequences from labs all over the world are deposited. They are searchable and available at no cost to anyone with Internet access.

After students complete the tutorial, they will find that, although the beta hemoglobin proteins of humans and chimpanzees are identical, the genes that code for these proteins differ by one base pair in the exons (which code for proteins) and by 15 base pairs in the introns (which do not code for proteins). They can then understand that because the genetic code is degenerate (more than one codon can code for the same amino acid), two different genes can code for the same protein. They can also appreciate that introns evolve more rapidly than exons. Although mutations are assumed to occur at the same rate in all DNA, a mutation in an exon that changes the amino acid sequence of the protein will probably be selected against, whereas most mutations in introns face little or no selection pressure and so are more likely to remain in the gene pool. Thus, even though there are slightly fewer than twice as many base pairs of introns (1068) as of exons (538), there are 15 times as many differences in the introns as in the exons.
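The arithmetic behind this comparison can be checked with a few lines of Python. The sketch below is illustrative only: the lengths and difference counts are the ones quoted above, while the names `diffs_per_kb` and `translate_codon` and the four-codon table are assumptions added here. The table is a small subset of the standard genetic code, and its codons were chosen for illustration; they are not the actual human–chimpanzee difference.

```python
# Base-pair lengths and difference counts quoted in the text above.
INTRON_BP = 1068
EXON_BP = 538
INTRON_DIFFS = 15
EXON_DIFFS = 1


def diffs_per_kb(differences: int, length_bp: int) -> float:
    """Return the number of differences per 1000 base pairs."""
    return 1000 * differences / length_bp


intron_rate = diffs_per_kb(INTRON_DIFFS, INTRON_BP)
exon_rate = diffs_per_kb(EXON_DIFFS, EXON_BP)

print(f"intron/exon length ratio: {INTRON_BP / EXON_BP:.2f}")    # 1.99
print(f"introns: {intron_rate:.1f} differences per kb")          # 14.0
print(f"exons:   {exon_rate:.1f} differences per kb")            # 1.9
print(f"per-base ratio (intron/exon): {intron_rate / exon_rate:.1f}")  # 7.6

# A four-codon subset of the standard genetic code (illustrative choice).
CODON_SUBSET = {
    "CAT": "His",
    "CAC": "His",
    "AAA": "Lys",
    "AAG": "Lys",
}


def translate_codon(codon: str) -> str:
    """Look up one codon in the illustrative subset above."""
    return CODON_SUBSET[codon.upper()]


# Degeneracy: the two codons differ in their third base
# but specify the same amino acid.
print(translate_codon("CAT"), translate_codon("CAC"))  # His His
```

Because the introns are about twice as long as the exons, the 15-fold difference in raw counts corresponds to roughly a 7.6-fold difference per base pair, which is still a large gap.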

It is suggested that teachers complete this activity themselves before giving it to students. Finally, there are other Web sites, including the Bioserver tool at the Dolan DNA Learning Center (dnalc.org), that have useful tools for genome analysis.

BLAST Tutorial: Beta Hemoglobin

DNA sequencing has become very easy and rapid since the 1990s. By now, over 100 billion base pairs of DNA have been sequenced, and more sequence is being added at an astonishing rate. All this DNA sequence is deposited in an enormous database maintained by NCBI. This database is free and available to the public. It can be easily searched using a free program called BLAST (Basic Local Alignment Search Tool; Altschul et al., 1990). In this tutorial, you will learn how to use these databases and BLAST to compare human and chimpanzee hemoglobin.

Hemoglobin is a protein found in red blood cells that carries oxygen. One hemoglobin molecule consists of four amino acid chains: two alpha hemoglobin chains and two beta hemoglobin chains. The alpha and beta chains are coded for by separate genes; the alpha hemoglobin gene is found on chromosome 16 and the beta hemoglobin gene is found on chromosome 11. Each of these amino acid chains contains a heme group that can carry one oxygen molecule. This means that one molecule of hemoglobin can carry four molecules of oxygen. One beta hemoglobin chain contains 147 amino acids.

Humans and chimpanzees last had a common ancestor between 5 and 8 million years ago. Their beta hemoglobin proteins are identical in amino acid sequence and, of course, in secondary, tertiary, and quaternary structures. In this activity, you will explore whether the gene that codes for the beta hemoglobin protein is identical in humans and chimpanzees.

  1. Go to ncbi.nlm.nih.gov. This is the NCBI home page through which you can access both the DNA databases and BLAST.

  2. At the top of the home page, search All Databases (default) for HBB. HBB is the abbreviation for beta hemoglobin. Click “Go.”

  3. In the middle of the page on the left, click on “Nucleotide: core subset of nucleotide sequence records.”

  4. Near the top of the page, click on “HBB (Homo sapiens).” You may have to right-click with your mouse to open this link. A new screen will come up that gives you the nucleotide sequence of the gene that codes for the human beta-hemoglobin protein. There is a lot of additional information on this, and you will see links to many other pages. You can explore these after you have finished the basic tutorial.

  5. Scroll down a little until you see, in a gray box on the left, “Genomic regions, transcripts and products.” Click on “Reference sequence details.”

  6. On the left side of the screen, find “Genome Reference Consortium Human Build” in a gray box. Under this, find “Genomic,” then find “Download.” To the right of “Download,” click on “FASTA.” You will see that the beta hemoglobin gene contains 1606 nucleotides (“letters”), here shown in what we call FASTA format. FASTA is a standard format, used for both genes and proteins, in which the first line begins with a > symbol, followed by any text. The text is sometimes longer than one line, as it is in this example. This is OK so long as there is no “return” in the text. After this one line of text, there is a return, followed by the nucleotide sequence of the gene or the amino acid sequence of the protein. In cases of proteins, the single-letter abbreviations for each of the 20 amino acids are used.

  7. Copy the entire DNA sequence. Do not copy the first line.

  8. Go back to the NCBI home page (ncbi.nlm.nih.gov) by clicking on the NCBI logo at the top of the page.

  9. At the top of the page, click on “BLAST.”

  10. Under “BLAST Assembled Genomes” near the top of the page, click on “Pan troglodytes.” This will allow you to compare the human beta-hemoglobin gene with the chimpanzee genome, and not with the entire NCBI database. This saves time, because beta hemoglobins have been sequenced for innumerable organisms. Other BLAST searches could allow you to compare your sequence with the entire NCBI database or with different subsets of it.

  11. Paste the human beta-hemoglobin sequence that you have copied into the large box at the top of the page. This is your query. The directions for the box say “enter an accession….”

  12. Scroll to the bottom of the page and click on “Begin Search.”

  13. In the middle of the page, to the right of “Request ID,” click on “View Report” in a gray box.

  14. Scroll down and look at the Graphic Summary. Notice that there is one match, to a gene on chromosome 11 of the chimpanzee. Recall that the gene for human beta hemoglobin is also on chromosome 11. At the bottom of the box, you will see one bright red line indicating that there is one match. The fact that the line is bright red indicates that this is a very close match. Mouse over this line, and it will tell you, in technical terms, which chimpanzee gene your query (human beta hemoglobin) matches.

  15. Scroll down to “Alignments.” The “Query” is human beta hemoglobin. The “Subject” is chimpanzee beta hemoglobin. Notice that wherever the two sequences are identical, a vertical line connects the matching bases.

  16. Now go to the top of the page on the left and click on “Download.”

  17. Near the top of the page under “Alignment,” click on “text.”

  18. Click on “save to disk.”

  19. Open the file. This will give you a comparison of the two beta hemoglobins. You can print this out if a printout is not included in the packet your teacher has given you (it is presented here as Figure 2). When this is on your computer screen, you will be able to search for the beginnings and ends of the exons in later steps.

  20. Now look at the copy of the human beta-hemoglobin gene included in the packet your teacher has given you (Figure 1). This copy of the gene is from 1980 and uses the three-letter abbreviations for the 20 amino acids. There are three exons and two introns in this gene. You can identify the three exons because the amino acids they code for are given above the base sequences. Look at the sequences at the beginning and end of each exon. Use a highlighter to mark the nucleotide sequence of each of the three exons on this copy.

  21. Take out your printout of the human–chimpanzee beta-hemoglobin comparison from the BLAST search (Figure 2).

  22. Use the find options of the computer (Edit/Find) to identify the beginning and end of each exon of the human beta-hemoglobin gene. Highlight the three exons on your printout of the human–chimpanzee comparison. It is easiest to highlight the exons in the human gene. Hint: search for six bases at the beginning and end of each exon, which you have located in Figure 1. Six bases are enough to identify these positions.

  23. You can now answer the following questions (the answers are given in italics):

    a. Are the exons identical? No. Circle or highlight any differences you find. How many differences are there? One.

    b. Are the introns identical? No. Circle or highlight any differences you find. How many differences are there? Fifteen.

    c. Are there more differences in the exons or in the introns? In the introns. Can you think of an explanation for this? There would be no selection against a change in an intron.

    d. How can there be a difference in the exons if the proteins are identical? The genetic code is degenerate, so more than one codon can code for the same amino acid.
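For teachers who want to show what the FASTA handling in steps 6 and 7 and the six-base search in step 22 amount to, the following Python sketch may help. It is not part of the tutorial itself: the function names `parse_fasta` and `find_span` are hypothetical, and the record is a short made-up sequence, not the real HBB gene.

```python
def parse_fasta(text: str) -> tuple[str, str]:
    """Split a single-record FASTA string into (header, sequence).

    The header is the first line without its leading '>'; the
    sequence is every following line joined together.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must begin with a '>' header line")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def find_span(sequence: str, first_six: str, last_six: str) -> tuple[int, int]:
    """Locate a region by the six bases at its start and at its end.

    Returns (start, end) as 0-based indices, end exclusive, so that
    sequence[start:end] is the region including both six-base tags.
    """
    start = sequence.find(first_six)
    if start == -1:
        raise ValueError(f"start tag {first_six!r} not found")
    end_tag_at = sequence.find(last_six, start)
    if end_tag_at == -1:
        raise ValueError(f"end tag {last_six!r} not found after start")
    return start, end_tag_at + len(last_six)


# A made-up record for illustration only (not real HBB sequence).
toy_record = """>toy_record made-up sequence for illustration
TTGACCATGGTGCACC
TGACTGGTAAGTCCAG
"""

header, seq = parse_fasta(toy_record)
start, end = find_span(seq, "ATGGTG", "TGACTG")
print(header)
print(len(seq), "bases")
print(seq[start:end])
```

Dropping the header line in `parse_fasta` corresponds to the instruction in step 7 not to copy the first line. As in step 22, `find_span` relies on the six-base tags being unique enough to pin down the region; in a longer sequence it is worth confirming that each tag occurs only once.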

Figure 1.

The nucleotide sequence of the sense strand of beta hemoglobin DNA. Notice that three-letter abbreviations are used for the amino acids, because this figure was originally published in 1980 (Lawn et al., 1980), when single-letter abbreviations were not yet used. (Reprinted from Cell, 21, “The nucleotide sequence of the human beta-globin gene,” pp. 647–651, with permission from Elsevier. Previously reprinted in Vigue, 1987.)

Figure 2.

The BLAST printout page, showing the comparison between human beta hemoglobin (Query) and chimpanzee beta hemoglobin (Subject). This figure was printed from NCBI's BLAST site.
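Counting differences by eye on a printout like this is error-prone, so a short script can serve as a cross-check. The Python sketch below assumes the saved text alignment uses lines of the form label, start position, aligned sequence, end position, with the labels "Query" and "Sbjct"; that layout, the function names, and the toy alignment are all assumptions made here for illustration, and the toy sequences are not the real human or chimpanzee genes.

```python
def aligned_pairs(report: str) -> tuple[str, str]:
    """Collect the aligned query and subject strings from a text report.

    Only lines with four whitespace-separated fields that begin with
    'Query' or 'Sbjct' are used; match lines of '|' are ignored.
    """
    query_parts, subject_parts = [], []
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        if fields[0] == "Query":
            query_parts.append(fields[2])
        elif fields[0] == "Sbjct":
            subject_parts.append(fields[2])
    return "".join(query_parts), "".join(subject_parts)


def mismatch_positions(query: str, subject: str) -> list[int]:
    """Return 0-based alignment columns where the two strings differ.

    A gap character ('-') in either string counts as a difference.
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    return [
        i
        for i, (q, s) in enumerate(zip(query, subject))
        if q.upper() != s.upper()
    ]


# A made-up two-block alignment (illustrative, not real data).
toy_report = """
Query  1    ACGTACGTAC  10
            |||| |||||
Sbjct  101  ACGTTCGTAC  110

Query  11   GGATCC-TTA  19
            |||||| |||
Sbjct  111  GGATCCATTA  120
"""

q, s = aligned_pairs(toy_report)
diffs = mismatch_positions(q, s)
print(len(diffs), "differences at alignment columns", diffs)
```

Once the exon boundaries have been located, the column numbers returned here can be sorted into exon and intron differences, which is the same comparison students make by highlighting the printout.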

Acknowledgment

Thanks to Nadav Kupiek for expert preparation of the artwork.

References

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.

Lawn, R.M., Efstratiadis, A., O'Connell, C. & Maniatis, T. (1980). The nucleotide sequence of the human beta-globin gene. Cell, 21, 647–651.

Vigue, C.L. (1987). Murphy's law and the human beta-globin gene. American Biology Teacher, 49, 76–81.