International scientists from the Whitehead/MIT Center for Genome Research and other institutions announced Monday that more than 96 percent of the genetic blueprint for the mouse, the most important animal model in biomedical research, has been deposited into public databases.
The International Mouse Genome Sequencing Consortium said it had assembled and published an advanced draft sequence of the mouse genome, which has roughly the same number of genes as the human genome.
The sequence is posted on the Internet, where it is freely available to all scientists. The mouse genome was previously sequenced privately by Celera Genomics, but that sequence is available only to the company's subscribers, The New York Times reported.
The mouse genome is contained in 20 chromosome pairs, and the current results suggest that it is about 2.7 billion base pairs in size, or about 15 percent smaller than the human genome. The human genome is 3.1 billion base pairs spread out over 23 pairs of chromosomes (22 autosomes and the X and Y sex chromosomes). Analysis of the genome assembly so far has found more than 22,500 high-quality gene predictions, with additional predictions expected to take the total to about 30,000.
The draft sequence was assembled by the Mouse Genome Sequencing Consortium, an international team of researchers from the Whitehead Institute, Washington University in St. Louis, and the Wellcome Trust Sanger Institute and European Bioinformatics Institute in England, with funding from the National Human Genome Research Institute of the National Institutes of Health, and the Wellcome Trust in the U.K.
The results from this analysis can be found at several websites, including those of the European Bioinformatics Institute, the National Center for Biotechnology Information at the National Library of Medicine and the University of California at Santa Cruz. A comparison between the mouse sequence and the human sequence can be found at all three sites.
"The mouse sequence provides a very important chapter from evolution's lab notebook," said Professor Eric Lander of biology, director of the Whitehead/MIT Center for Genome Research. "Being able to read evolution's notebook and compare genomic information across species will allow us to glean important information about ourselves. That's because evolution preserves the most important genetic information across species. If specific DNA sequences have been preserved by evolution over hundreds of millions of years, then they must be functionally important."
Francis S. Collins, director of the National Human Genome Research Institute, commented, "The mouse sequence is much further along in the process than the human sequence was at the draft stage. Methods for efficient sequencing of large genomes continue to advance dramatically, and the sophistication of the team that accomplished this goal is truly impressive. This sets a new standard for speed, accuracy and public accessibility."
The achievement represents a major milestone for the Human Genome Project because it provides a key tool needed by scientists around the world to interpret the human sequence, a draft version of which was published last year. This information will allow researchers to gain insights into the function of many human genes because the mouse carries virtually the same set of genes as the human, but can be used in laboratory research.
The draft sequence shows the order of the DNA chemical bases A, T, C and G along the 20 mouse chromosomes. It includes more than 96 percent of the mouse genome with long, continuous stretches of DNA and represents a seven-fold coverage of the genome. This means that the location of every base, or DNA letter, in the mouse genome was determined an average of seven times, a frequency that ensures a high degree of accuracy.
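The arithmetic behind "seven-fold coverage" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 500-base read length is a hypothetical round number, not a figure from the consortium, and the covered-fraction estimate uses the idealized Lander-Waterman model, which assumes reads land uniformly at random and ignores repeats and cloning bias. Real assemblies therefore capture less of the genome than the model suggests.

```python
import math

# Genome size from the article: about 2.7 billion base pairs.
GENOME_SIZE = 2_700_000_000
# Hypothetical average read length (an assumption, not from the article).
READ_LENGTH = 500
TARGET_COVERAGE = 7


def fold_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return num_reads * read_length / genome_size


def expected_fraction_covered(coverage: float) -> float:
    """Idealized (Lander-Waterman) fraction of bases hit by at least one read."""
    return 1.0 - math.exp(-coverage)


# Number of reads needed to reach the target coverage at this read length.
reads_needed = math.ceil(TARGET_COVERAGE * GENOME_SIZE / READ_LENGTH)

if __name__ == "__main__":
    print(f"reads needed: {reads_needed:,}")
    print(f"coverage: {fold_coverage(reads_needed, READ_LENGTH, GENOME_SIZE):.1f}x")
    print(f"idealized fraction covered: {expected_fraction_covered(TARGET_COVERAGE):.4f}")
```

Under these assumptions, seven-fold coverage of a 2.7-billion-base genome corresponds to tens of millions of individual reads. Because each base is read several times on average, isolated errors in single reads can be outvoted by the other reads covering the same position.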
The working draft sequence far exceeds the consortium's original quality expectations for this stage and was completed much sooner than initially expected, reflecting the tremendous efficiencies gained in sequencing and computational technologies in the past few years.
This milestone concludes the second phase of the consortium's mouse sequencing effort: the production of a draft sequence by the whole-genome shotgun method. In Phase III, the consortium will produce a finished version with the remaining gaps filled in and errors resolved. This phase will use clone-based, or hierarchical, sequencing guided by the publicly available mouse genome clone map.
A version of this article appeared in MIT Tech Talk on May 8, 2002.