Scientists at the Whitehead Institute Center for Genome Research have found that single nucleotide polymorphisms (SNPs) in northern Europeans -- the single-letter DNA differences that underlie disease susceptibility and individual variation -- travel together in blocks that are much larger than previously thought.
The finding has major implications for mapping disease genes and dissecting human population history. It suggests that mapping the genes for common diseases might be as much as eight times easier, and therefore far more practical, than previously thought. It also suggests that not every SNP needs to be identified and cataloged before scientists can map genes for human diseases; they can begin this task with the existing collection of SNPs.
Finally, the occurrence of SNPs in large blocks was seen in the northern European but not Nigerian populations studied, suggesting that something happened in the population history of the northern Europeans -- a recent "bottleneck" that shaped the genetic history of this population. Scientists speculate that this bottleneck could be related to the founding of Europe or the migration of a small population out of Africa as recently as 50,000 years ago.
The population bottleneck was enormous. Weaving the strands of evidence together, the authors showed that some time during the migration and expansion out of Africa -- and after the separation from sub-Saharan African populations -- no more than 50 individuals probably gave rise to most of the modern northern European gene pool.
"Our results have implications for disease gene mapping, suggesting a possible two-tiered strategy," said Professor Eric Lander, director of Whitehead's Center for Genome Research and one of the leaders of the Human Genome Project. "The large blocks in northern European populations will help us to easily map to a first approximation the location of disease genes. And the presence of small blocks in other populations like the Nigerians we studied will allow us to hone in on the specific single letter difference responsible for a disease."
The results were published in the May 10 issue of Nature by scientists at Whitehead and the Institute of Biological Anthropology, University of Oxford (UK).
SNPs AND DISEASE MAPPING
Although the recently published human genome sequence provides a reference for genetic commonalities among humans as a species (all humans are 99.9 percent similar), it's the 0.1 percent difference among us that contributes to our individuality and which, taken as a whole, can explain the genetic basis of diseases.
Single nucleotide polymorphisms are the bedrock of human genetics: they can be used to track inheritance of any gene, contribute to the traits that make us unique, and underlie our susceptibilities to common diseases such as cancer, diabetes and heart disease. It is also believed that SNPs may help explain why individuals respond differently to drugs.
As a result, for more than two years, scientists have been creating a companion volume to the Human Genome Project's "Book of Life" and have accumulated the largest publicly available catalog of single-letter DNA differences (SNPs) -- 1.4 million of them -- with their exact location in the human genome.
Scientists have also known that SNPs travel together; one SNP carries information about its nearby SNP neighbors. Hence they have assumed that once they have a large collection of SNPs, they can use a method called linkage disequilibrium mapping, where they correlate a block or neighborhood of SNPs back to an ancestral chromosome. This meant that the larger the block, the easier it would be to detect a region containing a disease gene. But the size of the SNP blocks remained a subject of intense debate.
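As a rough illustration of how one SNP "carries information" about a neighbor, the sketch below computes two standard linkage disequilibrium statistics, D and r-squared, for a pair of SNPs. The function name, the two-letter haplotype encoding and the sample data are assumptions made for illustration; they are not taken from the Nature paper and do not represent the authors' analysis.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    `haplotypes` is a list of two-character strings (a hypothetical
    encoding): the first character is the allele at SNP 1 and the
    second is the allele at SNP 2 on the same chromosome.
    """
    n = len(haplotypes)
    # Use the alleles on the first chromosome as the reference alleles.
    a, b = haplotypes[0][0], haplotypes[0][1]

    p_a = sum(h[0] == a for h in haplotypes) / n    # frequency of allele a
    p_b = sum(h[1] == b for h in haplotypes) / n    # frequency of allele b
    p_ab = sum(h == a + b for h in haplotypes) / n  # frequency of a and b together

    # D measures how far the pair frequency departs from independence.
    d = p_ab - p_a * p_b

    # r^2 rescales D^2 to the range 0..1 (0 if either SNP does not vary).
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Made-up example: the two SNPs always travel together.
linked = ["AG"] * 5 + ["CT"] * 5
# Made-up example: every combination is equally common.
shuffled = ["AG", "AT", "CG", "CT"]

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(shuffled))
```

In the first sample, knowing the allele at one SNP fully predicts the other (r-squared of 1); in the second it predicts nothing (r-squared of 0). Large blocks correspond to many neighboring SNP pairs that look like the first case.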
Until recently, the majority opinion held that SNPs traveled together in tiny blocks of about 3,000 DNA letters each, which would divide the genome into about one million segments. This made mapping SNPs and linking them to common diseases a mammoth task. The assumption that SNP blocks were tiny came from studies that examined only a few blocks and used different populations.
"The results described in the Nature paper emerged from a large-scale experiment using a uniform protocol looking at 19 randomly selected regions of SNPs all along the human genome," said David Reich, a graduate student in the Harvard-MIT Division of Health Sciences and Technology and first author on the paper. The results suggest that at least in northern European populations, the blocks are as much as eight times larger than previously thought, making the task eight times easier.
HARRY POTTER EDITIONS
To understand this concept, consider the example of Harry Potter and the Goblet of Fire, published in both British and American English, Professor Lander said. Although the two versions tell the exact same story, small spelling differences -- "color" instead of "colour" -- or choices of words such as a car's "boot" versus "trunk" enable us to trace the text to the British or the American version. Even if the two versions were shuffled and spelling differences occurred sporadically within the text, we would be able to tell that pages 7-9 came from the British version and that pages 12-17 came from the American version, he explained.
"Our results show that the shuffling at least in the northern European population has occurred in such a way that the blocks of interspersed text are large enough for us to trace them back to the original version. The shuffling has not resulted in interspersed text at the sentence level, which would make it difficult to sift through, but at page, or chapter levels, making it easier to sift," Professor Lander said.
SNPs, HUMAN POPULATION HISTORY
This same information also provides a rich "fossil record" of human population history. Previous models based on genetic as well as archeological data record the tale of a small group -- about 10,000 people -- expanding rapidly out of Africa to populate the whole Earth in the last 50,000 to 100,000 years. But the data in this new study add a major new wrinkle to this "out of Africa" story.
Why are the blocks of linkage disequilibrium in northern Europeans so big? "Linkage disequilibrium is created when populations are small, and gets shorter as random shuffling between chromosomes breaks it down. The size of modern blocks of linkage disequilibrium thus indicates when the population last contracted to a small size," Mr. Reich said. The eight-times-larger blocks of linkage disequilibrium in northern Europeans thus suggest a relatively recent contraction. Indeed, the authors infer that the ancestors of northern Europeans last experienced a major bottleneck about 27,000-53,000 years ago.
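The link between block size and time can be sketched with the textbook relation for the decay of linkage disequilibrium, in which the association D between two sites shrinks by a factor of (1 - c) each generation, where c is the chance of shuffling (recombination) between them. This is a simplified model under stated assumptions, not the authors' method, and the recombination rate used below is an illustrative value rather than a figure from the study.

```python
import math


def ld_after(d0, recomb_rate, generations):
    """Expected LD after `generations`, using D_t = D_0 * (1 - c)**t."""
    return d0 * (1.0 - recomb_rate) ** generations


def generations_to_decay(fraction_remaining, recomb_rate):
    """Generations needed for LD to fall to a given fraction of its start value."""
    return math.log(fraction_remaining) / math.log(1.0 - recomb_rate)


# Illustrative value only: a 0.1 percent chance per generation of
# shuffling between two sites.
c = 0.001
half_life = generations_to_decay(0.5, c)
print(f"LD halves after about {half_life:.0f} generations")
print(f"LD left after 2,000 generations: {ld_after(1.0, c, 2000):.3f}")
```

Because sites that are farther apart are shuffled more often, associations over long distances disappear first. Long blocks that still survive therefore point to a recent contraction, while short blocks point to a population that has had more time to shuffle.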
To gain a better understanding, the authors also studied a Nigerian population and found short blocks of linkage disequilibrium that were in fact much more consistent with the story of simple population expansion than what is seen in northern Europeans. Hence, the bottleneck among the ancestors of northern Europeans must have occurred after the divergence from the sub-Saharan Africans, which estimates say occurred about 100,000 years ago.
The exact nature of the bottleneck event is unclear, the researchers say. It could have been a contraction associated with only a few people emerging from Africa to found most of the modern world population. Alternatively, it could be a bottleneck associated with the founding of the first European populations, or even the recolonization of northern Europe after the last ice age.
A version of this article appeared in MIT Tech Talk on June 6, 2001.