
Number of genes in human genome lower than previously estimated

A team of more than 2,800 scientists, including several from MIT, has published its scientific description of the finished human genome sequence, reducing its estimate of the number of human protein-coding genes from 35,000 to only 20,000-25,000, a surprisingly low number for our species.

In the Oct. 21 issue of Nature, researchers with the International Human Genome Sequencing Consortium describe the final product of the Human Genome Project, the 13-year effort to read the information encoded in the human chromosomes. One of the central goals of the effort was to identify all genes, which are generally defined as stretches of DNA that code for particular proteins.

The Nature paper provides rigorous scientific evidence that the genome sequence produced by the Human Genome Project has both the high coverage and accuracy needed to perform sensitive analyses, such as those focusing on the number of genes, segmental duplications involved in disease, and the "birth" and "death" of genes over the course of evolution.

"The human genome sequence far exceeds our expectations in terms of accuracy, completeness and continuity. It reflects the dedication of hundreds of scientists working together toward a common goal: creating a solid foundation for biomedicine in the 21st century," said Eric Lander, director of the Broad Institute of MIT and Harvard and a professor in MIT's Department of Biology.

Francis S. Collins, director of the National Human Genome Research Institute (NHGRI), said, "Only a decade ago, most scientists thought humans had about 100,000 genes. When we analyzed the working draft of the human genome sequence three years ago, we estimated there were about 30,000 to 35,000 genes, which surprised many. This new analysis reduces that number even further and provides us with the clearest picture yet of our genome." In the United States, the International Human Genome Sequencing Consortium is led by NHGRI and the Department of Energy (DOE).

The Nature paper also provides the scientific community with a peer-reviewed description of the finishing process and an assessment of the quality of the finished human genome sequence. The assessment confirms that the finished sequence now covers more than 99 percent of the euchromatic (or gene-containing) portion of the human genome and was sequenced to an accuracy of 99.999 percent, 10 times more accurate than the original goal.

"Finished" doesn't mean that the human genome sequence is perfect. There still remain 341 gaps in the sequence, in contrast to the 150,000 gaps in the working draft announced in June 2000. The technology now available can't readily close these gaps; doing so will require more research and new technologies.

The human genome sequence and its annotations can be accessed through several public genome browsers, including GenBank at the National Center for Biotechnology Information.

The International Human Genome Sequencing Consortium includes scientists from 20 institutions in six countries. The five largest sequencing centers are located at Baylor College of Medicine, the Broad Institute of MIT and Harvard, DOE's Joint Genome Institute, Washington University School of Medicine, and the Wellcome Trust Sanger Institute.

A version of this article appeared in MIT Tech Talk on November 10, 2004.
