California Association
for
Medical Laboratory Technology

Distance Learning Program

Human Genome Project: Second in a Series
(Note: this course is designed to follow CAMLT Basic Level Course 932
Human Genome Project: First in a series)


Author:
Jane Bruner, Ph.D., CLS, MT(ASCP)
Acting Dean, College of Natural Sciences
CSU Stanislaus; Turlock, CA

Course Number: DL-971
1.0 CE/Contact Hour
Level of Difficulty: Basic

© California Association for Medical Laboratory Technology.
Permission to reprint any part of these materials, other than for credit from CAMLT, must be obtained in writing from the CAMLT Executive Office.

CAMLT is approved by the California Department of Health Services
as a CA CLS Accrediting Agency (#0021)
and this course is approved by ASCLS for the P.A.C.E.® Program (#519).

1895 Mowry Ave, Suite 112
Fremont, CA 94538-1766
Phone: 510-792-4441
FAX:  510-792-3045

Notification of Distance Learning Deadline
DON'T PUT YOUR LICENSE IN JEOPARDY!

This is a reminder that all the continuing education units required to renew your license must be earned no later than the expiration date printed on your license.  If some of your units are made up of Distance Learning courses, please allow yourself enough time to retake the test in the event you do not pass on the first attempt.  CAMLT urges you to earn your CE units early!


This course is configured to be completed on-line. You can register for the course, submit secure payment using a credit card via PayPal, take the quiz on-line and receive your graded score.   If you pass, your certificate will be mailed to you from the CAMLT office.

If you fail, you must submit new payment and obtain a new PayPal receipt each time you take the test.   A certificate will be issued only if you have paid for re-taking the course and you pass the test.

If you want to submit your registration and quiz via fax or mail you should print the Adobe Acrobat version of the course which includes the required Registration/Quiz form.


Human Genome Project: Second in a Series

Abstract:
This course provides a discussion and overview of the Human Genome Project. From conception to completion, the project moved forward at a rate paced by technological advances and the competitive drive of researchers around the world. Much has been learned about the human genome as well as the genomes of other organisms due to the Human Genome Project. Much work is left in deciphering the data and putting it to useful applications. As stated by the U.S. Department of Energy (DOE), “Available to researchers worldwide, the human genome reference sequence provides a magnificent and unprecedented biological resource that will serve throughout the century as a basis for research and discovery and, ultimately, myriad practical applications”. Rapid progress in genome science and a glimpse into its potential applications have spurred observers to predict that biology will be the foremost science of the 21st century. Technology and resources generated by the Human Genome Project and other genomics research are already having a major impact on research across the life sciences as well as human diagnostics.

Objectives:
At the end of this exercise, participants will be able to:

    1. Define the Human Genome Project and its promise for future research.
    2. Discuss the changing goals of the project as it progressed.
    3. List several information points gathered during the Human Genome Project.
    4. Compare the genome size and estimated number of genes of humans and several other species.
    5. Describe the value of the Internet to sharing and accessing genomic databases and tools that can be used to investigate genetic disorders, chromosomes, genome maps, genes, sequence data, genetic variants and molecular structures.

Introduction:
The Human Genome Project (HGP) refers to the international 13-year effort to discover all the estimated 20,000-25,000 human genes and make them accessible for further biological study. Another project goal was to determine the complete sequence of the 3 billion DNA subunits (bases) in the human genome. As part of the HGP, parallel studies were carried out on selected model organisms such as the bacterium, Escherichia coli, and the mouse, Mus musculus, to help develop the technology and interpret human gene function. The DOE Human Genome Program and the National Institutes of Health (NIH) National Human Genome Research Institute (NHGRI) together sponsored the U.S. Human Genome Project.

Conceived in 1986, the U.S. Human Genome Project formally began in 1990 and ended in 2003. Interestingly, the project began as an initiative in the U.S. Department of Energy. The project originally was planned to last 15 years, but rapid technological advances accelerated the completion date to 2003. In February 1990, the DOE and the NIH brought a proposal entitled “Understanding Our Genetic Inheritance, The U.S. Human Genome Project: The First Five Years (FY 1991-1995)” to members of Congress. The proposal sought funding of $200 million per year for 15 years. The document outlined the state of genome science at that time, proposed joint roles for DOE and NIH in administering research agendas, set the stage for international collaboration and provided goals for the project.

The completion in the spring of 2003 of the human DNA sequence project proposed by the DOE and NIH coincided with the 50th anniversary of Watson and Crick’s description of the fundamental structure of DNA. Some 18 countries participated in the worldwide effort producing an amazing database of genomic information that continues to grow. The analytical power arising from the reference DNA sequences of the entire human genome and other model organisms determined during the Human Genome Project has created genomics resources that have jump-started what some call the “biology century.”

Discussion:
A genome is the entire DNA in an organism, including its genes. Genes carry information for making all the proteins required by all organisms. These proteins determine, among other things, how the organism looks, how well its body metabolizes food or fights infection, and sometimes even how it behaves. DNA is made up of four similar chemicals, adenine, thymine, cytosine and guanine, called bases and abbreviated A, T, C, and G. These are repeated millions or billions of times throughout a genome. The human genome, for example, has 3 billion pairs of bases. The particular order of As, Ts, Cs, and Gs is extremely important. The order underlies all of life’s diversity, dictating whether an organism is human or another species such as yeast, rice, or fruit fly, all of which have their own genomes and are themselves the focus of genome projects. Because all organisms are related through similarities in DNA sequences, insights gained from nonhuman genomes often lead to new knowledge about human biology. Knowledge about the effects of DNA variations among individuals can lead to revolutionary new ways to diagnose, treat, and someday prevent the thousands of disorders that affect us. Besides providing clues to understanding human biology, learning about nonhuman organisms’ DNA sequences can lead to an understanding of their natural capabilities that can be applied toward solving challenges in health care, agriculture, energy production, environmental remediation, and carbon sequestration.

Goals of the Human Genome Project were set forth from the beginning to help guide the process. The initial proposal set forth the following five-year goals for the human genome project:

This first five-year plan, intended to guide research from 1990 to 1995, was revised in 1993 due to unexpected progress. The second five-year plan, intended to carry the project through 1998, reflected the new technologies and the information gathered by that point. It added other model organisms and expanded the work on ethical, legal and social issues.

The third, and final, five-year plan with goals was developed during a series of DOE and NIH workshops attended by project personnel, researchers and interested parties.

Third five-year plan and goals:

Timeline: The timeline that follows comes from the U.S. Department of Energy’s Office of Science. It is a brief listing of significant achievements related to the Human Genome Project.

  •  1990: W. French Anderson applies gene therapy for the first time. The recipient is a young girl with ADA deficiency, an immune system disorder.
The Department of Energy and the National Institutes of Health formally launch the Human Genome Project, a 15-year international effort to locate all of the genes in the human genome and make them accessible for further biological study. Another goal of the project is to determine the complete sequence of the genome’s 3 billion DNA nucleotide base pairs.

  •  1992: Researchers at Lawrence Livermore National Laboratory and Lawrence Berkeley National Laboratory in California discover a gene present in 25 to 30 percent of the population that predisposes individuals to increased heart attack risk. Discovery of this marker for heart disease on chromosome 19 may make possible the development of a simple test to screen humans for susceptibility to heart disease.
England’s Wellcome Trust joins the Human Genome Project.

  •  1994: The Department of Energy launches its Microbial Genome Program.

  •  1995: Craig Venter and colleagues at The Institute for Genomic Research in Maryland decode the first whole genome of a free-living single-cell organism, the bacterium Haemophilus influenzae, using the whole-genome shotgun sequencing method.

  •  1996: Ian Wilmut and other researchers at Scotland’s Roslin Institute clone a sheep from the cell of an adult ewe. This non-sexually produced animal is named “Dolly.”
The complete genome of the E. coli bacterium is sequenced.

  •  1998: The first complete genome sequence of a multicellular organism, the roundworm C. elegans, is published.

  •  1999: The DOE Joint Genome Institute, a genome-sequencing center formed by Lawrence Berkeley, Lawrence Livermore, and Los Alamos national laboratories, dedicates its new production sequencing facility in Walnut Creek, California.
The complete genome of the Drosophila fruit fly is sequenced.

  •  2000: Working drafts of the human genome are completed by the public International Human Genome Project and by Craig Venter’s Celera Genomics, a private company.

  •  2001: The draft human genome sequence is published in the journals Nature and Science. Twenty sequencing centers in six countries – China, France, Germany, England, Japan, and the United States – contribute to the project. Most of the sequencing is done by five major centers: the Wellcome Trust’s Sanger Center in England, the DOE Joint Genome Institute in California, and three NIH-funded centers at Baylor College of Medicine in Texas, Washington University School of Medicine in Missouri, and the Whitehead Institute in Massachusetts.

  •  2001-2002: As rapid, highly accurate sequencing techniques become readily available, the complete genomes of a wide variety of microbes and model organisms, including the mouse, pufferfish, malaria mosquito, and sea squirt, are sequenced and analyzed. Genome comparisons yield significant new insights into the causes and progress of disease, biological evolution, and the relationship between organisms and the environment.
DOE launches its “Genomes to Life” program.

  •  2003: The finished human genome is published concurrent with the 50th Anniversary of the discovery of the double helix.

Results:
The Human Genome Project was marked by accelerated progress. In June 2000, the rough draft of the human genome was completed a year ahead of schedule. In February 2001, Science and Nature published special issues containing the working draft sequence of the human genome and analysis of that sequence. By 2003, two full years ahead of time, the sequencing of the human genome was complete. The rapid completion was due to the enormous commitment to developing technology and changing strategies that resulted in revised goals and plans as the project moved forward. However, the sequencing is more a beginning than a final step. Knowing the sequence is like having a dictionary when what is needed is an extraordinary proclamation: the words are all there, but their full meaning and interpretation have yet to be worked out.

A unique aspect of the U.S. Human Genome Project is that it was the first large scientific undertaking to address the potential ethical, legal, and social issues (ELSI) arising from project data. Another important feature of the project was the federal government’s long-standing dedication to the transfer of technology to the private sector. By licensing technologies to private companies and awarding grants for innovative research, the project catalyzed the multibillion-dollar U.S. biotechnology industry and fostered the development of new medical applications.
Sequence and analysis of the human genome working draft were published in the February 2001 and April 2003 issues of Nature and Science. The first panoramic views of the human genetic landscape have revealed a wealth of information and some early surprises. Much remains to be deciphered in this vast trove of information; as the consortium of HGP scientists concluded in their seminal paper, “… the more we learn about the human genome, the more there is to explore.”
A few highlights from the first publications analyzing the sequence follow:

  • The human genome contains 3 billion chemical nucleotide bases (A, C, T, and G).
  • The average gene consists of 3,000 bases, but sizes vary greatly, with the largest known human gene being dystrophin at 2.4 million bases.
  • The functions are unknown for more than 50% of discovered genes.
  • The human genome sequence is almost (99.9%) exactly the same in all people.
  • About 2% of the genome encodes instructions for the synthesis of proteins.
  • Repeat sequences that do not code for proteins make up at least 50% of the human genome.
  • Repeat sequences are thought to have no direct functions, but they shed light on chromosome structure and dynamics. Over time, these repeats reshape the genome by rearranging it, thereby creating entirely new genes or modifying and reshuffling existing genes.
  • The human genome has a much greater portion (50%) of repeat sequences than the mustard weed (11%), the worm (7%), and the fly (3%).
  • Over 40% of the predicted human proteins share similarity with fruit fly or worm proteins.
  • Genes appear to be concentrated in random areas along the genome, with vast expanses of noncoding DNA between.
  • Chromosome 1 (the largest human chromosome) has the most genes (2968), and the Y chromosome has the fewest (231).
  • Genes have been pinpointed and particular sequences in those genes associated with numerous diseases and disorders including breast cancer, muscle disease, deafness, and blindness.
  • Scientists have identified about 3 million locations where single-base DNA differences occur in humans. This information promises to revolutionize the processes of finding DNA sequences associated with such common diseases as cardiovascular disease, diabetes, arthritis, and cancers.
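
The percentages in the highlights above can be turned into approximate base counts with simple arithmetic. The following is a minimal sketch in Python, assuming the rounded figures quoted in this course (3 billion bases, 99.9% identity between people, about 2% protein-coding, at least 50% repeats). The function and variable names are illustrative only and do not come from any HGP tool.

```python
# Rounded genome size quoted in this course (an approximation, not an exact count).
GENOME_BASES = 3_000_000_000


def bases_for_fraction(total_bases, numerator, denominator):
    """Return how many bases make up numerator/denominator of the genome.

    Integer arithmetic is used so the results are exact for these round figures.
    """
    return total_bases * numerator // denominator


# 99.9% identical between people means about 0.1% (1 in 1,000) differs.
differing_bases = bases_for_fraction(GENOME_BASES, 1, 1000)

# About 2% of the genome encodes instructions for proteins.
coding_bases = bases_for_fraction(GENOME_BASES, 2, 100)

# Repeat sequences make up at least 50% of the genome.
repeat_bases = bases_for_fraction(GENOME_BASES, 50, 100)

print(f"Bases that differ between two people: about {differing_bases:,}")
print(f"Protein-coding bases: about {coding_bases:,}")
print(f"Bases in repeat sequences: at least {repeat_bases:,}")
```

Under these assumptions the sketch gives about 3 million differing bases, about 60 million protein-coding bases, and at least 1.5 billion bases in repeats. The first figure is of the same order as the roughly 3 million single-base difference locations noted above, although the two numbers measure different things.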

    The table that follows compares humans with other organisms in regards to genome size and the estimated number of genes.

    Organism                              Genome Size (Bases)   Estimated Genes
    Human (Homo sapiens)                  3 billion             30,000
    Laboratory mouse (M. musculus)        2.6 billion           30,000
    Mustard weed (A. thaliana)            100 million           25,000
    Roundworm (C. elegans)                97 million            19,000
    Fruit fly (D. melanogaster)           137 million           13,000
    Yeast (S. cerevisiae)                 12.1 million          6,000
    Bacterium (E. coli)                   4.6 million           3,200
    Human immunodeficiency virus (HIV)    9700                  9

    The estimated number of human genes is only one-third as great as previously thought, although the numbers may be revised as more computational and experimental analyses are performed. Scientists suggest that the genetic key to human complexity lies not in gene number but in how gene parts are used to build different products in a process called alternative splicing. Other underlying reasons for greater complexity are the thousands of chemical modifications made to proteins and the repertoire of regulatory mechanisms controlling these processes.
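
One way to see that gene number does not simply track genome size is to divide each genome size in the table by its estimated gene count. The sketch below does this in Python, using only the rounded figures from the table. The short organism labels and function names are illustrative assumptions, and "bases per gene" here is a crude average of total genome per gene, not a measure of actual gene length.

```python
# (genome size in bases, estimated genes), copied from the table in this course.
GENOMES = {
    "Human": (3_000_000_000, 30_000),
    "Mouse": (2_600_000_000, 30_000),
    "Mustard weed": (100_000_000, 25_000),
    "Roundworm": (97_000_000, 19_000),
    "Fruit fly": (137_000_000, 13_000),
    "Yeast": (12_100_000, 6_000),
    "E. coli": (4_600_000, 3_200),
    "HIV": (9_700, 9),
}


def bases_per_gene(size_bases, gene_count):
    """Average amount of genome (in bases) per estimated gene."""
    return size_bases / gene_count


def ranked_by_bases_per_gene(genomes):
    """Return (name, bases per gene) pairs, largest value first."""
    rows = [
        (name, bases_per_gene(size, genes))
        for name, (size, genes) in genomes.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, ratio in ranked_by_bases_per_gene(GENOMES):
    print(f"{name:<13} {ratio:>10,.0f} bases of genome per gene")
```

With these figures the human and mouse genomes carry far more DNA per gene than the smaller genomes do, which fits the observation that only a small fraction of the human genome encodes proteins.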

    Some current and potential applications of genome research include:
      •  Molecular medicine
      •  Energy sources and environmental applications
      •  Risk assessment
      •  Bioarchaeology, anthropology, evolution, and human migration
      •  DNA forensics (identification)
      •  Agriculture, livestock breeding, and bioprocessing

    In addition to the genomic research, the Department of Energy and the National Institutes of Health Genome Programs set aside 3% to 5% of their respective annual HGP budgets for the study of the project’s ethical, legal, and social issues. Nearly $1 million was spent on HGP ELSI research during the Human Genome Project.

    A tabular summary of some of the sequencing milestones from the Human Genome Project follows:

    Human Genome Project Goals, Achievement and Completion Dates

    Area | HGP Goal | Standard Achieved | Date Achieved
    Genetic Map | 2- to 5-cM resolution map (600 – 1,500 markers) | 1-cM resolution map (3,000 markers) | September 1994
    Physical Map | 30,000 STSs | 52,000 STSs | October 1998
    DNA Sequence | 95% of gene-containing part of human sequence finished to 99.99% accuracy | 99% of gene-containing part of human sequence finished to 99.99% accuracy | April 2003
    Capacity and Cost of Finished Sequence | Sequence 500 Mb/year at < $0.25 per finished base | Sequence > 1,400 Mb/year at < $0.09 per finished base | November 2002
    Human Sequence Variation | 100,000 mapped human SNPs | 3.7 million mapped human SNPs | February 2003
    Gene Identification | Full-length human cDNAs | 15,000 full-length human cDNAs | March 2003
    Model Organisms | Complete genome sequences of E. coli, S. cerevisiae, C. elegans, D. melanogaster | Finished genome sequences of E. coli, S. cerevisiae, C. elegans, D. melanogaster, plus whole-genome drafts of several others, including C. briggsae, D. pseudoobscura, mouse and rat | April 2003
    Functional Analysis | Develop genomic-scale technologies | High-throughput oligonucleotide synthesis | 1994
    Functional Analysis (continued) | | DNA microarrays | 1996
    Functional Analysis (continued) | | Eukaryotic, whole-genome knockouts (yeast) | 1999
    Functional Analysis (continued) | | Scale-up of two-hybrid system for protein-protein interaction | 2002

    Source: Science 300:286 (2003)
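
The numeric goals in the milestone summary can be compared with what was achieved by taking simple ratios. The Python sketch below is illustrative only: it uses the threshold values quoted in the summary (which are stated as "greater than" or "less than" bounds, not exact measurements), and the names in it are assumptions made for this example.

```python
# (goal, achieved) pairs copied from the milestone summary in this course.
MILESTONES = {
    "Physical map (STSs)": (30_000, 52_000),
    "Mapped human SNPs": (100_000, 3_700_000),
    "Sequencing capacity (bases per year)": (500_000_000, 1_400_000_000),
}


def achievement_ratio(goal, achieved):
    """How many times the stated goal was reached."""
    return achieved / goal


def cost_at_quoted_rate(bases_per_year, cents_per_base):
    """Yearly cost in whole dollars at the quoted threshold values.

    Cents and integer division are used to avoid floating-point rounding.
    """
    return bases_per_year * cents_per_base // 100


for area, (goal, achieved) in MILESTONES.items():
    print(f"{area}: {achievement_ratio(goal, achieved):.1f}x the goal")

# 1 Mb = 1 million bases; goal of 500 Mb/year at 25 cents per base
# versus 1,400 Mb/year at 9 cents per base achieved.
print(f"Goal rate:     ${cost_at_quoted_rate(500_000_000, 25):,} per year")
print(f"Achieved rate: ${cost_at_quoted_rate(1_400_000_000, 9):,} per year")
```

At the quoted thresholds, nearly three times the planned sequencing capacity corresponds to roughly the same yearly cost figure as the original goal, which illustrates how much the cost per finished base fell during the project.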

    Conclusion:
    The Human Genome Project is completed, but the door has merely been nudged open to the scientific research and discovery that lie ahead as science puts the information to further test and application. Medical science will benefit greatly from this process. An excellent source of information for those not current in the genetic aspects and applications of disease analysis is Gene Gateway, a DOE website: go to http://doegenomes.org/ and click on Gene Gateway.
      References:
    1. Jasny BR, Roberts L. Building on the DNA revolution, Introduction. Science. 11 Apr 2003;300:277.
    2. Collins FS, Morgan M, Patrinos A. The human genome project: lessons from large-scale biology. Science. 11 Apr 2003;300:286-290.
    3. Frazier ME, Johnson GM, Thomassen CE, Patrinos A. Realizing the potential of the genome revolution: The genomes to life program. Science. 11 Apr 2003;300:290-293.
    4. Collins FS, Green ED, Guttmacher AE, Guyer MS. A vision for the future of genomics research. A blueprint for the genomic era. Nature. 24 Apr 2003;422:835.
    5. Carroll SB. Genetics and the making of Homo sapiens. Nature. 24 Apr 2003;422:849.
    6. Arnold A, Hilton N. Genome sequencing: revelations from a bread mould. Nature. 24 Apr 2003;422:821.
    7. Information compiled from the U.S. Department of Energy



    Review Questions - Course #DL-971 - Choose the one best answer for each question

    1. An organism’s genome is:
         a. a protein data base
         b. an NIH project
         c. the entire DNA content of the organism
         d. the result of replication
    2. The Human Genome Project was initially conceived in:
         a. 1986
         b. 1990
         c. 1998
         d. 2003
    3. Genes in the human genome appear to be arranged:
         a. at regular intervals along the chromosome
         b. at random intervals along the chromosome
         c. at the centromere of the chromosome
         d. at every third juncture of the chromosome
    4. The human genome contains 3 billion bases. What portion of the bases code for proteins?
         a. less than 50%
         b. 60%
         c. 70%
         d. more than 80%
    5. Which of the following organisms has the fewest genes?
         a. mustard weed
         b. yeast
         c. mouse
         d. fruit fly
    6. A goal of the original five-year plan was:
         a. to minimize industry involvement in the project
         b. to decrease the throughput of sequencing technology
         c. to sidestep congressional involvement
         d. to engage industry participation in the project
    7. STS is an abbreviation for
         a. start the sequencer
         b. standard test sequestering
         c. sequence tagged site
         d. separate technology steps
    8. Gene therapy had its first trial in a patient with
         a. cardiovascular disease
         b. type II diabetes
         c. ADA deficiency
         d. breast cancer
    9. The first organism sequenced was:
         a. E. coli
         b. C. elegans
         c. H. influenzae
         d. S. cerevisiae
    10. Human genetic complexity is dependent on:
         a. nucleotide bases found only in humans
         b. gene numbers
         c. chemical modifications made to proteins
         d. sequencing analysis